Re: Buffered I/O slowness
From: Jesse Barnes (jbarnes_at_engr.sgi.com)
Date: 10/29/04
- Previous message: Andrew A.: "RE: Consistent lock up 2.6.10-rc1-bk7 (mutex/SCHED_RR bug?)"
- In reply to: Jesse Barnes: "Buffered write slowness"
- Next in thread: Andrew Morton: "Re: Buffered I/O slowness"
- Reply: Andrew Morton: "Re: Buffered I/O slowness"
To: linux-kernel@vger.kernel.org
Date: Fri, 29 Oct 2004 10:46:48 -0700
On Monday, October 25, 2004 6:14 pm, Jesse Barnes wrote:
> I've been doing some simple disk I/O benchmarking with an eye towards
> improving large, striped volume bandwidth. I ran some tests on individual
> disks and filesystems to establish a baseline and found that things
> generally scale quite well:
>
> o one thread/disk using O_DIRECT on the block device
>     read avg:  2784.81 MB/s
>     write avg: 2585.60 MB/s
>
> o one thread/disk using O_DIRECT + filesystem
>     read avg:  2635.98 MB/s
>     write avg: 2573.39 MB/s
>
> o one thread/disk using buffered I/O + filesystem
>     read w/default (128) block/*/queue/read_ahead_kb avg: 2626.25 MB/s
>     read w/max (4096) block/*/queue/read_ahead_kb avg:    2652.62 MB/s
>     write avg: 1394.99 MB/s
>
> Configuration:
>   o 8p sn2 ia64 box
>   o 8GB memory
>   o 58 disks across 16 controllers
>     (4 disks for 10 of them and 3 for the other 6)
>   o aggregate I/O bw available is about 2.8GB/s
>
> Test:
>   o one I/O thread per disk, round robined across the 8 CPUs
>   o each thread did ~450MB of I/O depending on the test (ran for 10s)
>     Note: the total was > 8GB so in the buffered read case not everything
>     could be cached

More results here. I've run some tests on a large dm striped volume formatted
with XFS. It has 64 disks with a 64k stripe unit (XFS was made aware of this
at format time), and I explicitly set the readahead to 524288 blocks using
blockdev. The results aren't as bad as in my previous runs, but I think they
are still much slower than they ought to be, given the direct I/O results
above. This was after a fresh mount, so the pagecache was empty when I started
the tests.
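
For scale, the sizes above can be worked out as a rough sketch. The 512-byte
unit is an assumption: blockdev --setra takes its argument in 512-byte
sectors, but the message only says "blocks".

```python
# Back-of-the-envelope sizes for the striped volume described above.
# Assumption: a "block" is a 512-byte sector (the unit blockdev --setra uses).
SECTOR_BYTES = 512
KIB = 1024
MIB = 1024 * KIB

disks = 64                       # from the message
stripe_unit_bytes = 64 * KIB     # 64k stripe unit, from the message
readahead_blocks = 524288        # from the message

# One full stripe touches every disk once.
full_stripe_bytes = disks * stripe_unit_bytes
readahead_bytes = readahead_blocks * SECTOR_BYTES

full_stripe_mib = full_stripe_bytes // MIB
readahead_mib = readahead_bytes // MIB
stripes_per_readahead = readahead_bytes // full_stripe_bytes

print(f"full stripe: {full_stripe_mib} MiB")
print(f"readahead window: {readahead_mib} MiB")
print(f"full stripes per readahead window: {stripes_per_readahead}")
```

Under that assumption the readahead window spans many full stripes, so in
principle every disk could be kept busy by readahead alone.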

o one thread on one large volume using buffered I/O + filesystem
    read  (1 thread, one volume, 131072 blocks/request) avg: ~931 MB/s
    write (1 thread, one volume, 131072 blocks/request) avg: ~908 MB/s
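
The read test has roughly the following shape. This is a minimal sketch, not
the benchmark actually used for the numbers above: the function name is
hypothetical, and the 64 MiB default request size assumes 131072 blocks of
512 bytes each.

```python
import os
import time

def buffered_read_throughput(path, request_bytes=64 * 1024 * 1024):
    """Read `path` sequentially with large read() calls.

    Returns (total_bytes_read, megabytes_per_second).
    """
    fd = os.open(path, os.O_RDONLY)  # plain buffered open, no O_DIRECT
    total = 0
    start = time.perf_counter()
    try:
        while True:
            chunk = os.read(fd, request_bytes)  # may return fewer bytes
            if not chunk:                       # empty result means EOF
                break
            total += len(chunk)
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    mb_per_s = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, mb_per_s
```

For the result to reflect disk bandwidth rather than cached data, the
pagecache has to be cold, e.g. after a fresh mount as in the runs above.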

I'm intentionally issuing very large reads and writes here to take advantage
of the striping, but it looks like both the readahead and regular buffered
I/O code split the I/O into page-sized chunks. The call chain is pretty
long, but it looks to me like do_generic_mapping_read() splits the reads
up by page and issues them independently to the lower levels. In the direct
I/O case, up to 64 pages are issued at a time, which seems like it would help
throughput quite a bit. The attached profile seems to confirm this.
Unfortunately I didn't save the vmstat output for this run (and now the FC
switch is misbehaving, so I have to fix that before I can run again), but
iirc the system time was pretty high given that only one thread was issuing
I/O.
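
To get a feel for the difference, here is a rough count of how many
submissions one large request turns into in each case. This only models the
behavior suspected above, not what the kernel is confirmed to do. The 16 KiB
page size (a common ia64 configuration) and the 512-byte block size are both
assumptions.

```python
def submission_count(request_bytes, page_bytes, pages_per_submission):
    """Number of submissions needed when each carries a fixed page count."""
    pages = -(-request_bytes // page_bytes)      # ceiling division
    return -(-pages // pages_per_submission)     # ceiling division

PAGE_BYTES = 16 * 1024         # assumption: 16 KiB pages
REQUEST_BYTES = 131072 * 512   # assumption: 512-byte blocks, so 64 MiB

per_page = submission_count(REQUEST_BYTES, PAGE_BYTES, 1)    # buffered path
batched = submission_count(REQUEST_BYTES, PAGE_BYTES, 64)    # direct I/O path

print(f"page at a time: {per_page} submissions")
print(f"64 pages at a time: {batched} submissions")
```

With those numbers the page-at-a-time path makes 64 times as many trips
through the lower levels for the same request.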

So maybe a few things need to be done:
  o set readahead to larger values by default for dm volumes at setup time
    (the default was very small)
  o maybe bypass readahead for very large requests?
    If a process is doing a huge request, chances are that readahead won't
    benefit it as much as it would a process doing small requests.
  o not sure about writes; I haven't looked at that call chain much yet
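
The first two ideas can be sketched from userspace as follows. The helper
names are hypothetical, and the bypass rule is only one possible policy (skip
readahead once a single request already covers the readahead window), not
anything the kernel implements. The sysfs file is the
block/*/queue/read_ahead_kb setting mentioned in the quoted results.

```python
import os

def readahead_path(dev, sysfs_root="/sys"):
    """Path of the per-device readahead setting, in KiB."""
    return os.path.join(sysfs_root, "block", dev, "queue", "read_ahead_kb")

def get_read_ahead_kb(dev, sysfs_root="/sys"):
    with open(readahead_path(dev, sysfs_root)) as f:
        return int(f.read().strip())

def set_read_ahead_kb(dev, kb, sysfs_root="/sys"):
    # Writing the real sysfs file normally requires root.
    with open(readahead_path(dev, sysfs_root), "w") as f:
        f.write(str(kb))

def should_bypass_readahead(request_bytes, read_ahead_kb):
    """Hypothetical policy: bypass when the request spans the whole window."""
    return request_bytes >= read_ahead_kb * 1024
```

For example, with the default of 128 KiB, should_bypass_readahead() returns
True for a 64 MiB request and False for a 4 KiB one.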
Does any of this sound reasonable at all? What else could be done to make the
buffered I/O layer friendlier to large requests?
Thanks,
Jesse
- text/plain attachment: vol-buffered-read-profile.txt