Re: [PATCH 1/3] accounting: task counters for disk/network
- From: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
- Date: Mon, 7 Apr 2008 23:10:07 -0700
On Tue, 8 Apr 2008 07:48:37 +0200 Gerlof Langeveld <gerlof@xxxxxxxxxxxxxx> wrote:
> > > --- linux-2.6.24.4-vanilla/block/ll_rw_blk.c 2008-03-24 19:49:18.000000000 +0100
> > > +++ linux-2.6.24.4-modified/block/ll_rw_blk.c 2008-03-25 13:52:14.000000000 +0100
> > > @@ -2739,6 +2739,19 @@ static void drive_stat_acct(struct reque
> > >  		disk_round_stats(rq->rq_disk);
> > >  		rq->rq_disk->in_flight++;
> > >  	}
> > > +
> > > +#ifdef CONFIG_TASK_IO_ACCOUNTING
> > > +	switch (rw) {
> > > +	case READ:
> > > +		current->group_leader->ioac.dsk_rio += new_io;
> > > +		current->group_leader->ioac.dsk_rsz += rq->nr_sectors;
> > > +		break;
> > > +	case WRITE:
> > > +		current->group_leader->ioac.dsk_wio += new_io;
> > > +		current->group_leader->ioac.dsk_wsz += rq->nr_sectors;
> > > +		break;
> > > +	}
> > > +#endif
> > For many workloads, this will cause almost all writeout to be accounted to
> > pdflush and perhaps kswapd. This makes the per-task write accounting
> > largely unuseful.
> There are several situations in which writeouts are accounted to the user
> process itself, e.g. when issuing direct writes (open mode O_DIRECT) or
> synchronous writes (open mode O_SYNC, syscall sync/fsync, synchronous file
> attribute, synchronously mounted filesystem).
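For reference, two of the cases listed above can be sketched from user space. This is a minimal illustration, not part of the patch: the function names are made up for the example, and the premise that these writes are submitted in the calling process's context is the one stated in the quoted paragraph. O_DIRECT is left out because it additionally needs aligned buffers.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes, retrying on short writes. Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0)
			return -1;
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Hypothetical helper: with O_SYNC, each write() returns only after the
 * data has been handed to the storage device. */
int write_with_o_sync(const char *path, const char *buf, size_t len)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0600);
	int ret;

	if (fd < 0)
		return -1;
	ret = write_all(fd, buf, len);
	if (close(fd) < 0)
		ret = -1;
	return ret;
}

/* Hypothetical helper: buffered write, then an explicit fsync() issued
 * by the same process that wrote the data. */
int write_then_fsync(const char *path, const char *buf, size_t len)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	int ret;

	if (fd < 0)
		return -1;
	ret = write_all(fd, buf, len);
	if (ret == 0 && fsync(fd) < 0)
		ret = -1;
	if (close(fd) < 0)
		ret = -1;
	return ret;
}
```

Both helpers return 0 on success. A plain buffered write() without the fsync() is the case where writeout is deferred and ends up being submitted by some other context.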
yup.
> Apart from that, swapping out of process pages by kswapd is currently not
> accounted at all as shown by the following snapshot of 'atop' on a heavily
> swapping system:
Under heavy load, callers into alloc_pages() will themselves perform disk
writeout. So under the proposed scheme, process A will be accounted for
writeout which was in fact caused by process B.
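The misattribution can be shown with a small user-space simulation. This is only a sketch: sim_task, sim_page and reclaim_writeout() are hypothetical stand-ins, not kernel structures, and the accounting is simplified to charge the submitting task directly rather than its group leader.

```c
#include <stddef.h>

/* Hypothetical stand-in for a task with the proposed write counters. */
struct sim_task {
	const char *name;
	unsigned long dsk_wio;	/* write requests charged to this task */
	unsigned long dsk_wsz;	/* sectors charged to this task */
};

/* Hypothetical stand-in for a page; dirtied_by is recorded only so the
 * caller can see who caused the writeout, it is not used for accounting. */
struct sim_page {
	struct sim_task *dirtied_by;
	int dirty;
};

#define SECTORS_PER_PAGE 8	/* assumes 4096-byte pages, 512-byte sectors */

/* Charge whoever submits the I/O, as accounting against "current" does. */
static void account_write(struct sim_task *submitter, unsigned long sectors)
{
	submitter->dsk_wio += 1;
	submitter->dsk_wsz += sectors;
}

/* Simulate a task that enters reclaim from an allocation and writes out
 * every dirty page it finds. Returns the number of pages written. */
size_t reclaim_writeout(struct sim_task *submitter, struct sim_page *pages,
			size_t n)
{
	size_t written = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!pages[i].dirty)
			continue;
		account_write(submitter, SECTORS_PER_PAGE);
		pages[i].dirty = 0;
		written++;
	}
	return written;
}
```

If task B dirties the pages and task A then calls reclaim_writeout(), all counters move on A while B's stay at zero.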
> So the extra counters can be considered as a useful addition to the I/O
> counters that are currently maintained.
mmm, maybe. But if we implement a partial solution like this we really
should have a plan to finish it off.
There have been numerous attempts at this, which tend to involve adding
backpointers to the pageframe structure and such.
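As a rough illustration of the backpointer idea (not the code of any of those attempts), each page could remember which task dirtied it, and writeout would be charged to that task regardless of who submits the I/O. All names below (bp_task, bp_page, bp_dirty_page, bp_writeout) are hypothetical, and the sketch runs in user space.

```c
#include <stddef.h>

/* Hypothetical stand-in for a task with write counters. */
struct bp_task {
	unsigned long dsk_wio;	/* write requests charged to this task */
	unsigned long dsk_wsz;	/* sectors charged to this task */
};

/* Hypothetical stand-in for a page frame carrying a backpointer. */
struct bp_page {
	struct bp_task *owner;	/* task that last dirtied the page */
	int dirty;
};

#define BP_SECTORS_PER_PAGE 8	/* assumes 4096-byte pages, 512-byte sectors */

/* Record the dirtying task at the time the page is dirtied. */
void bp_dirty_page(struct bp_task *task, struct bp_page *page)
{
	page->owner = task;
	page->dirty = 1;
}

/* Write out all dirty pages. The submitter is deliberately ignored for
 * accounting; the charge goes to each page's recorded owner.
 * Returns the number of pages written. */
size_t bp_writeout(struct bp_task *submitter, struct bp_page *pages, size_t n)
{
	size_t written = 0;
	size_t i;

	(void)submitter;
	for (i = 0; i < n; i++) {
		if (!pages[i].dirty)
			continue;
		if (pages[i].owner) {
			pages[i].owner->dsk_wio += 1;
			pages[i].owner->dsk_wsz += BP_SECTORS_PER_PAGE;
		}
		pages[i].dirty = 0;
		written++;
	}
	return written;
}
```

The price is an extra pointer in every page frame, plus the question of what to do when several tasks dirty the same page; the sketch simply keeps the last one.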
This sort of accounting will presumably be needed by a disk bandwidth
cgroup controller. Perhaps the containers/cgroup people have plans or code
already?