Q: maximizing sequential file read performance

From: Francois Grieu (fgrieu_at_micronet.fr)
Date: 03/08/04


Date: Mon, 08 Mar 2004 16:24:17 +0100

I'm making a tool that performs lightweight processing
on huge input files, where the CPU cost per byte is within
a small factor of memcpy, and the processing rate is within
a small factor of raw disk I/O speed (think of a 32-bit CRC tool).

I want the optimal speed under modern kernels (say >= 2.4).
I see a number of choices.

A) fopen & fread
B) open & read
C) open with O_DIRECT, and read into two alternating
   buffers while I process the data in the other buffer,
   in another thread, kept in sync with some mutex;
   possibly with a "quick start" strategy that
   progressively increases the buffer size.
D) mmap() plus things I don't know how to do in order to
   hint the kernel to read-ahead.
E) other?

Would A) allow me to read files >2GB?

In B), if the first read asks for a huge block, what will
be the kernel's strategy?

1) read the first block from disk into memory, then return
   to my code; meaning much CPU time is not used, and
   performance will decrease with the block size when
   it becomes comparable to the file size

2) read little data into memory at the beginning
   of my buffer, up to the next page boundary (possibly
   no data), and setup the remaining pages so that
   they page fault if I read from them before disk data
   has filled them.

Then, will the system read ahead? Will it move the
bulk of my data to where I ask for it in the next read()
using memcpy, or using some magic on pages?

All in all, when doing B), what buffer size should I use?
Fixed, or of increasing size so as to minimize latency
on the first block?
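
To make option B) concrete, here is a minimal sketch, not a tuned
implementation. It assumes glibc, where defining _FILE_OFFSET_BITS=64
is the usual way to get 64-bit file offsets (and so files >2GB) on
32-bit systems. It also assumes posix_fadvise() is available; that call
is only a hint and appeared in 2.6-series kernels, so on 2.4 it may
fail or do nothing. The name sum_file_read and the byte sum (standing
in for the CRC) are hypothetical.

```c
#define _XOPEN_SOURCE 600      /* exposes posix_fadvise() on glibc */
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t on 32-bit systems */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: sums every byte of a file using plain read().
 * The byte sum is a stand-in for the real per-byte work (e.g. a CRC).
 * Returns 0 on success, -1 on error. */
int sum_file_read(const char *path, size_t buf_size, uint64_t *sum_out)
{
    assert(buf_size > 0 && sum_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Advisory only: tell the kernel we will read sequentially.
     * The return value is ignored on purpose; failure is harmless. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    unsigned char *buf = malloc(buf_size);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    uint64_t sum = 0;
    int rc = 0;
    for (;;) {
        ssize_t n = read(fd, buf, buf_size);
        if (n == 0)
            break;                      /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry */
            rc = -1;
            break;
        }
        for (ssize_t i = 0; i < n; i++)
            sum += buf[i];
    }

    free(buf);
    close(fd);
    if (rc == 0)
        *sum_out = sum;
    return rc;
}
```

A call such as sum_file_read("big.dat", 1 << 20, &sum) would use a
fixed 1 MiB buffer; the best size is exactly the open question here,
so that value is only a placeholder to benchmark against.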

Can C) work reliably, or is it file-system dependent,
or likely to trigger many bugs? Can there be a perceptible
gain? On an obsolete non-Unix OS (MacOS 9) I'm familiar
with, there are big gains with explicit application-level,
double-buffered, explicitly non-cached, asynchronous disk
operation; no thread is necessary, making it (and the
quick start thing) a manageable annoyance.
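
For discussion, option C) could be sketched roughly as follows. This is
an illustration under assumptions, not a verified design: the 4096-byte
alignment (DB_ALIGN) is a guess, since the real O_DIRECT alignment rule
depends on the kernel, device and file system; the fallback to a normal
open() when O_DIRECT is refused with EINVAL is a convenience of this
sketch; and all names (struct dbuf, db_reader, sum_file_direct) are
hypothetical. The "quick start" idea is left out to keep it short.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define DB_ALIGN 4096          /* assumed alignment, see the text above */

struct dbuf {
    int fd;
    size_t size;               /* bytes per buffer, multiple of DB_ALIGN */
    unsigned char *buf[2];
    ssize_t len[2];            /* read() result for each buffer */
    int full[2];               /* 1 while a buffer holds unprocessed data */
    pthread_mutex_t mu;
    pthread_cond_t cv;
};

/* Block until buffer i is in the wanted state (full or empty). */
static void db_wait(struct dbuf *d, int i, int want_full)
{
    pthread_mutex_lock(&d->mu);
    while (d->full[i] != want_full)
        pthread_cond_wait(&d->cv, &d->mu);
    pthread_mutex_unlock(&d->mu);
}

/* Mark buffer i full or empty and wake the other thread. */
static void db_set(struct dbuf *d, int i, int full)
{
    pthread_mutex_lock(&d->mu);
    d->full[i] = full;
    pthread_cond_broadcast(&d->cv);
    pthread_mutex_unlock(&d->mu);
}

/* Reader thread: fills the two buffers alternately until EOF or error. */
static void *db_reader(void *arg)
{
    struct dbuf *d = arg;
    for (int i = 0;; i ^= 1) {
        db_wait(d, i, 0);
        ssize_t n;
        do {
            n = read(d->fd, d->buf[i], d->size);
        } while (n < 0 && errno == EINTR);
        d->len[i] = n;
        db_set(d, i, 1);
        if (n <= 0)
            return NULL;       /* EOF or error was handed to the consumer */
    }
}

/* Sums every byte of a file; the sum stands in for the real CRC work.
 * Returns 0 on success, -1 on error. */
int sum_file_direct(const char *path, size_t size, uint64_t *sum_out)
{
    assert(size > 0 && size % DB_ALIGN == 0 && sum_out != NULL);
    struct dbuf d = { .size = size };
    void *b0 = NULL, *b1 = NULL;
    pthread_t t;
    int rc = -1;

    d.fd = open(path, O_RDONLY | O_DIRECT);
    if (d.fd < 0 && errno == EINVAL)
        d.fd = open(path, O_RDONLY);   /* file system refused O_DIRECT */
    if (d.fd < 0)
        return -1;
    if (posix_memalign(&b0, DB_ALIGN, size) != 0) b0 = NULL;
    if (posix_memalign(&b1, DB_ALIGN, size) != 0) b1 = NULL;
    if (b0 == NULL || b1 == NULL)
        goto out;
    d.buf[0] = b0;
    d.buf[1] = b1;
    pthread_mutex_init(&d.mu, NULL);
    pthread_cond_init(&d.cv, NULL);
    if (pthread_create(&t, NULL, db_reader, &d) == 0) {
        uint64_t sum = 0;
        for (int i = 0;; i ^= 1) {
            db_wait(&d, i, 1);
            ssize_t n = d.len[i];
            if (n <= 0) {
                rc = (n < 0) ? -1 : 0;
                break;
            }
            for (ssize_t k = 0; k < n; k++)
                sum += d.buf[i][k];
            db_set(&d, i, 0);          /* hand the buffer back */
        }
        pthread_join(t, NULL);
        if (rc == 0)
            *sum_out = sum;
    }
    pthread_cond_destroy(&d.cv);
    pthread_mutex_destroy(&d.mu);
out:
    free(b0);
    free(b1);
    close(d.fd);
    return rc;
}
```

This would typically be built with -pthread. The mutex and condition
variable only guard the full/empty flags, so the disk read into one
buffer can overlap with the computation on the other.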

As for D), I saw that Linus hates O_DIRECT and advocates mmap().
Again, how robust? What gains? Any snippet or handwaving on
how to perform this? Pointers much appreciated.
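
As a starting point for D), one commonly suggested shape is mmap()
followed by madvise(MADV_SEQUENTIAL) as the read-ahead hint. The sketch
below assumes that approach; whether it actually beats read() is not
established here. It maps the whole file at once, which cannot work for
files larger than the address space (a 32-bit process would have to map
the file window by window instead). The name sum_file_mmap and the byte
sum (standing in for the CRC) are hypothetical.

```c
#define _DEFAULT_SOURCE        /* exposes madvise() on glibc */
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t on 32-bit systems */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sums every byte of a file through a read-only
 * mapping. Returns 0 on success, -1 on error. */
int sum_file_mmap(const char *path, uint64_t *sum_out)
{
    assert(sum_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    uint64_t sum = 0;
    if (st.st_size > 0) {              /* mmap() rejects a zero length */
        if ((uint64_t)st.st_size > SIZE_MAX) {
            close(fd);                 /* too big to map in one piece */
            return -1;
        }
        size_t len = (size_t)st.st_size;
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }

        /* Advisory only: hint that pages will be touched in order, so
         * the kernel may read ahead more aggressively. */
        (void)madvise(p, len, MADV_SEQUENTIAL);

        for (size_t i = 0; i < len; i++)
            sum += p[i];

        munmap(p, len);
    }

    close(fd);
    *sum_out = sum;
    return 0;
}
```

Compared with the read() loop, there is no explicit buffer size to
choose here; the kernel's paging and read-ahead decide how much is
fetched at a time, which is both the appeal and the uncertainty of
this option.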

TIA,

   Francois Grieu


