Q: maximizing sequential file read performance
From: Francois Grieu (fgrieu_at_micronet.fr)
Date: 03/08/04
Date: Mon, 08 Mar 2004 16:24:17 +0100
I'm making a tool that performs lightweight processing
on huge input files, where the CPU cost per byte is not
many times that of memcpy, and not many times below that of
hardware disk I/O (think of a 32-bit CRC tool).
I want the optimal speed under modern kernels (say >= 2.4).
I see a number of choices.
A) fopen & fread
B) open & read
C) open with O_DIRECT, and read into two alternating
buffers: one is filled while I process the data in the other,
in another thread, kept in sync with some mutex;
possibly with a "quick start" strategy that
progressively increases the buffer size.
D) mmap() plus things I don't know how to do in order to
hint the kernel to read-ahead.
E) other?
Would A) allow me to read files >2GB?
In B), if the first read asks for a huge block, what will
be the kernel's strategy?
1) read the first block from disk into memory, then return
to my code; meaning much CPU time is not used, and
performance will decrease with the block size when
it becomes comparable to the file size
2) read little data into memory at the beginning
of my buffer, up to the next page boundary (possibly
no data), and set up the remaining pages so that
they page fault if I read from them before disk data
has filled them.
Then, will the system read-ahead? Will it move the
bulk of my data to where I ask for it in the next read()
using memcpy, or using some magic on pages?
All in all, when doing B, what buffer size should I use?
Fixed, or of increasing size so as to minimize latency
on the first block?
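To make option B concrete, here is a minimal sketch of a fixed-size read() loop, not a tuned implementation. The function name sum_file_read and the byte sum (a stand-in for the real CRC) are hypothetical. It assumes Linux with glibc, where _FILE_OFFSET_BITS=64 gives a 64-bit off_t on 32-bit systems. The posix_fadvise() call is only a hint that access will be sequential; it appeared with the 2.6 kernel series, so on 2.4 it may simply fail, which the sketch ignores.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_fadvise */
#define _FILE_OFFSET_BITS 64     /* 64-bit off_t on 32-bit Linux/glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: sums all bytes of a file using plain read() calls.
   The byte sum stands in for the real per-byte work (e.g. a CRC).
   Returns 0 on success, -1 on error. */
int sum_file_read(const char *path, size_t buf_size, uint64_t *out_sum)
{
    assert(buf_size > 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Advisory hint only: a failure here is deliberately ignored. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    unsigned char *buf = malloc(buf_size);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    uint64_t sum = 0;
    int rc = 0;
    for (;;) {
        ssize_t n = read(fd, buf, buf_size);
        if (n > 0) {
            for (ssize_t i = 0; i < n; i++)
                sum += buf[i];
        } else if (n == 0) {
            break;                 /* end of file */
        } else if (errno != EINTR) {
            rc = -1;               /* real I/O error */
            break;
        }                          /* EINTR: just retry the read */
    }

    free(buf);
    close(fd);
    if (rc == 0)
        *out_sum = sum;
    return rc;
}
```

A caller would pass a buffer size to experiment with, for example sum_file_read("big.dat", 256 * 1024, &sum); the file name and the 256 KiB figure are placeholders to be measured, not recommendations.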
Can C) work reliably, or is it file-system dependent,
or likely to trigger many bugs? Can there be a perceptible
gain? On an obsolete non-Unix OS (MacOS 9) I'm familiar
with, there are big gains with explicit application-level,
double-buffered, explicitly non-cached, asynchronous disk
operation; no thread is necessary, making it (and the
quick start thing) a manageable annoyance.
As for D), I saw Linus hates O_DIRECT, and advocates mmap().
Again, how robust? What gains? Any snippet or handwaving on
how to perform this? Pointers much appreciated.
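For option D, the usual way to hint the kernel is madvise() with MADV_SEQUENTIAL on the mapped range. Here is a minimal sketch under the same assumptions as before (Linux with glibc; sum_file_mmap and the byte sum are hypothetical stand-ins for the real processing). It maps the whole file at once, which is an assumption that only holds when the file fits in the process address space.

```c
#define _DEFAULT_SOURCE          /* for madvise on glibc */
#define _FILE_OFFSET_BITS 64     /* 64-bit off_t on 32-bit Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sums all bytes of a file through a read-only
   mapping. Returns 0 on success, -1 on error. */
int sum_file_mmap(const char *path, uint64_t *out_sum)
{
    assert(path != NULL && out_sum != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    uint64_t sum = 0;
    if (st.st_size > 0) {          /* mmap rejects a zero length */
        size_t len = (size_t)st.st_size;
        if ((off_t)len != st.st_size) {
            close(fd);             /* file too large for size_t */
            return -1;
        }

        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }

        /* Advisory hint: expect sequential access, so read ahead
           aggressively. A failure here is deliberately ignored. */
        (void)madvise(p, len, MADV_SEQUENTIAL);

        for (size_t i = 0; i < len; i++)
            sum += p[i];

        munmap(p, len);
    }

    close(fd);
    *out_sum = sum;
    return 0;
}
```

On a 32-bit system a multi-gigabyte file cannot be mapped in one piece, so a real tool would have to map and unmap successive page-aligned windows instead; that variant is not shown here.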
TIA,
Francois Grieu