Re: need fastest way to write 2gig array to disk file
From: Eric Taylor (et1_at_rocketship1.com)
Date: 09/15/05
- Next message: Nils O. Selåsdal: "Re: autoconf automake flex and bison problem"
- Previous message: Binary: "Re: Strange! Why this will cause program hang?"
- In reply to: John Fusco: "Re: need fastest way to write 2gig array to disk file"
- Next in thread: Basile Starynkevitch [news]: "Re: need fastest way to write 2gig array to disk file"
Date: Wed, 14 Sep 2005 21:13:30 -0700
Something strange was going on. I have no idea why it would
not cache beyond about 820 megs when writing to one particular
directory, so I just cloned the directory and removed the old one.
This fixed that problem. (I've been told the disk had been freshly
formatted only 1 week ago).
However, I now find other strange behavior. I can write three 2 gig
files in a row, and each takes about 10 seconds going into the cache.
But the 4th one can hang for up to 70 seconds at about
600 meg written.
I can somewhat unload the cache by writing enough files and
then rm-ing them all. This brings the cache down to under 1 gig. I
can grow it to almost 12 gigs.
So, here is top after I've got the cache empty of writes:
top - 20:31:56 up 6:48, 2 users, load average: 7.83, 5.92, 2.65
Tasks: 86 total, 1 running, 85 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0% us, 2.2% sy, 0.0% ni, 97.8% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu1 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 12475076k total, 2800904k used, 9674172k free, 18276k buffers
Swap: 12578704k total, 0k used, 12578704k free, 758344k cached
Now I run my program 4 times.
The first 3 runs are fast (10 secs), then on the fourth one, at about 400-600 meg written, it
hangs for 30-70 secs. Here is the top of top's output at that point:
top - 20:33:10 up 6:49, 2 users, load average: 4.15, 5.11, 2.60
Tasks: 84 total, 2 running, 82 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0% us, 0.0% sy, 0.0% ni, 0.0% id, 98.9% wa, 0.3% hi, 0.9% si
Cpu1 : 0.0% us, 0.0% sy, 0.0% ni, 0.0% id, 100.0% wa, 0.0% hi, 0.0% si
Mem: 12475076k total, 9501264k used, 2973812k free, 24716k buffers
Swap: 12578704k total, 0k used, 12578704k free, 7249044k cached
So, I guess the wa state is the key, but I'm not sure what it's "wa"-ing for. I assume
this is the infamous io-wait state, but I'm new to 2.6 (actually RHEL 4's version of 2.6).
Could some flushing process lock down the cache so the program cannot
continue to write to the cache? But even if the cache is locked, why can't it
just continue by writing to the disk instead? I guess all writes must travel through the
cache.
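One way to test the guess above is to watch how much dirty (not yet written) data the kernel is holding while the fourth file stalls. The sketch below is only an illustration, not something from this thread: it assumes the kernel exposes the Dirty and Writeback fields in /proc/meminfo and the dirty_ratio and dirty_background_ratio files under /proc/sys/vm, and the function names are made up for the example. If Dirty climbs toward the dirty_ratio share of memory just before the hang, that would be consistent with the writer being throttled until writeback catches up, but that is a hypothesis to check, not a diagnosis.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name:   1234 kB" (the unit is optional).
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_int(path):
    """Return the first integer in a small /proc file, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def writeback_snapshot():
    """Collect dirty-page counters and thresholds (assumed paths, see above)."""
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
    except OSError:
        info = {}
    return {
        "dirty_kb": info.get("Dirty"),
        "writeback_kb": info.get("Writeback"),
        "dirty_ratio": read_int("/proc/sys/vm/dirty_ratio"),
        "dirty_background_ratio": read_int("/proc/sys/vm/dirty_background_ratio"),
    }
```

Calling writeback_snapshot() once a second from a second terminal while the test program runs would show whether the stall lines up with a large Dirty value.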
Oh, and I am writing the files with open/write (not the f-versions). If I strace the
program, it is stopped in the write call while it is hung.
After the long delay, it takes off again, and finally I get the top output below; in a few more
seconds it's all idle. As you can see, we've increased the cache by about 8 gigs
for the 4 x 2 gig files I just wrote.
top - 20:33:59 up 6:50, 2 users, load average: 3.61, 4.84, 2.64
Tasks: 84 total, 2 running, 82 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.2% us, 7.1% sy, 0.0% ni, 0.0% id, 91.5% wa, 0.2% hi, 1.0% si
Cpu1 : 0.1% us, 11.4% sy, 0.0% ni, 28.0% id, 60.6% wa, 0.0% hi, 0.0% si
Mem: 12475076k total, 11087248k used, 1387828k free, 26244k buffers
Swap: 12578704k total, 0k used, 12578704k free, 8757856k cached
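To pin down where in the file the stall happens without strace, the write can be split into chunks and each write call timed. This is a minimal sketch of that idea, not the program discussed in the post: write_all, the 64 MB chunk size, and the one-second threshold are all hypothetical choices for illustration. It uses the same unbuffered open/write path (os.open and os.write) rather than the f-versions.

```python
import os
import time


def write_all(fd, data, chunk_size=64 * 1024 * 1024, slow_secs=1.0):
    """Write all of `data` to `fd` in chunks.

    Returns a list of (offset, seconds) for every write call that took at
    least `slow_secs`, so a stall can be located within the file.
    """
    view = memoryview(data)  # slicing a memoryview does not copy the buffer
    slow = []
    offset = 0
    while offset < len(view):
        start = time.monotonic()
        # os.write may write fewer bytes than requested, so advance by
        # the count it actually returns.
        written = os.write(fd, view[offset:offset + chunk_size])
        elapsed = time.monotonic() - start
        if elapsed >= slow_secs:
            slow.append((offset, elapsed))
        offset += written
    return slow
```

A caller would open the target with os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644), pass the descriptor and the array's buffer to write_all, and print the returned list. An entry near the 400-600 meg mark on the fourth file would match the hang described above.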
John Fusco wrote:
> Eric Taylor wrote:
> > Actually, I think what I need is to be able to write entirely to the cache, so
> > my program can continue as soon as possible. So...
> >
> > Are there some /proc parameters I can set so the write cache can be
> > made as large as, say, 75% of all free memory? My systems all have
> > at least 12 gigs, and only this one program will be running (a simulation).
> >
> > And also, if a lot of memory is already cached (e.g. read cache), is there a
> > way to flush it just prior to writing my file, so I can have it all to myself? Can
> > I flush the write cache, once the file has been written?
> >
> > The file I am writing is a saved state of a large simulation. As soon as I write
> > the file, the simulation can proceed. I have to save this state every 20 minutes
> > or so.
> >
> > Before I try things like mmap, I'd like to see if I can simply tune linux. In prior
> > versions of the linux kernel, this worked rather well. But since we've installed
> > the rhel4 system with the 2.6 kernel that redhat starts with, we've seen this
> > slowdown.
> >
> >
>
> Actually, Linux already uses nearly 100% of free memory for filesystem
> cache. There are some tuning parameters in /proc/sys/vm to tell how much
> it must keep available for allocations, but I have never dived deep into
> these.
>
> You touch on a tricky problem with the cache. Once data is in there it
> can be hard to get rid of. There is no way I know of to just discard all
> the unmodified pages in memory. These pages include "read cache" as well
> as any pages that have been sync'd to disk. These do slow you down when
> you try to write new data as the system has to search through all this
> crap and apply its algorithms to decide exactly which page is first to go.
>
> Based on your description, mmap may be the way to go since you can lock
> these pages down (if you have root privilege). The trick is finding 2G
> of contiguous virtual memory to do your mmap. On IA32 this can be a problem.
>
> I'm surprised to hear that you see a slowdown with 2.6 because my
> experience has been the exact opposite.
>
> John
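On the question of getting rid of cached pages once a file has been written: one per-file option that may be worth trying is posix_fadvise with POSIX_FADV_DONTNEED, issued after the data has been synced. The sketch below is an assumption-laden illustration, not advice from this thread. flush_and_drop is a made-up name, the advice is only a hint that the kernel may ignore, and whether it helps on a given kernel would need measuring. Note that fsync blocks until the data is on disk, so the simulation itself would not want to call this inline; a helper process or thread could run it after the save completes.

```python
import os


def flush_and_drop(path):
    """Sync `path` to disk, then hint that its cached pages can be dropped.

    Returns True if the hint was issued, False if this platform's os module
    has no posix_fadvise. The file contents are not modified.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages cannot be discarded, so force them out first.
        os.fsync(fd)
        if not hasattr(os, "posix_fadvise"):
            return False
        # offset=0, len=0 covers the whole file; this is advisory only.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return True
    finally:
        os.close(fd)
```

Comparing the "cached" figure in top before and after calling flush_and_drop on the four 2 gig files would show whether the hint has any effect on the system in question.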