Re: [NFS] NFS on loopback locks up entire system(2.6.23-rc6)?



On 9/21/07, Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> wrote:
> No. The requirement for 'hard' mounts is not that the server be up all
> the time. The server can go up and down as it pleases: the client can
> happily recover from that.
>
> The requirement is rather that nobody remove it permanently before the
> application is done with it, and the partition is unmounted. That is
> hardly unreasonable (it is the only way I know of to ensure data
> integrity), and it is much less strict than the requirements for local
> disks.

Yes. I completely agree. This is required for data consistency.

But in my testing, if one of the NFS servers/mounts goes offline for
some period of time, the entire system slows down, especially I/O.

In my test program, I started 50 threads, each doing 4K writes to its
own file (50 files in total) in an NFS-mounted directory.
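For anyone who wants to reproduce this, the workload described above could be sketched roughly as follows. This is not the actual test program, only a minimal illustration: the function names, the file naming scheme and the number of writes per file are assumptions, and the target directory is whatever NFS mount is under test.

```python
import os
import threading

BLOCK = b"\0" * 4096  # 4K payload, matching the write size described above


def writer(path, n_writes):
    """Write n_writes 4K blocks to path, one write() call per block."""
    # buffering=0 so each f.write() maps to one write() system call
    with open(path, "wb", buffering=0) as f:
        for _ in range(n_writes):
            f.write(BLOCK)


def run_writers(directory, n_threads=50, n_writes=1000):
    """Start one writer thread per file and wait for all of them to finish."""
    # File names are hypothetical; any 50 distinct files would do.
    paths = [os.path.join(directory, "file%02d" % i) for i in range(n_threads)]
    threads = [threading.Thread(target=writer, args=(p, n_writes)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return paths
```

Calling run_writers() on a directory inside the NFS mount and then taking the server down should leave the writer threads blocked in write() once enough dirty pages have accumulated.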

I then turned off the NFS server and started a separate dd process
on the local disk ("dd if=/dev/zero of=/tmp/x count=1000"), and this dd
process never makes progress.

I see I/O wait of 100% in vmstat.
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy id  wa st
 0 21      0 2628416  15152 551024    0    0     0     0   28  344  0  0  0 100  0
 0 21      0 2628416  15152 551024    0    0     0     0    8  340  0  0  0 100  0
 0 21      0 2628416  15152 551024    0    0     0     0   26  343  0  0  0 100  0
 0 21      0 2628416  15152 551024    0    0     0     0    8  341  0  0  0 100  0
 0 21      0 2628416  15152 551024    0    0     0     0   26  357  0  0  0 100  0
 0 21      0 2628416  15152 551024    0    0     0     0    8  325  0  0  0 100  0
 0 21      0 2628416  15152 551024    0    0     0     0   26  343  0  0  0 100  0
 0 21      0 2628416  15152 551024    0    0     0     0    8  325  0  0  0 100  0

I have about 4 GB of RAM in the system and most of the memory is free.
I see only about 550 MB in buffers/cache; the rest is pretty much all
available.

[root@h46 ~]# free
             total       used       free     shared    buffers     cached
Mem:       3238004     609340    2628664          0      15136     551024
-/+ buffers/cache:      43180    3194824
Swap:      4096532          0    4096532

Here are the stack traces for the dd process and for one of my test
program threads. Both of them are stuck in congestion_wait.
--------------------------------------
PID: 3552 TASK: cb1fc610 CPU: 0 COMMAND: "dd"
#0 [f5c04c38] schedule at c0624a34
#1 [f5c04cac] schedule_timeout at c06250ee
#2 [f5c04cf0] io_schedule_timeout at c0624c15
#3 [f5c04d04] congestion_wait at c045eb7d
#4 [f5c04d28] balance_dirty_pages_ratelimited_nr at c045ab91
#5 [f5c04d7c] generic_file_buffered_write at c0457148
#6 [f5c04e10] __generic_file_aio_write_nolock at c04576e5
#7 [f5c04e84] generic_file_aio_write at c0457799
#8 [f5c04eb4] ext3_file_write at f8888fd7
#9 [f5c04ed0] do_sync_write at c0472e27
#10 [f5c04f7c] vfs_write at c0473689
#11 [f5c04f98] sys_write at c0473c95
#12 [f5c04fb4] sysenter_entry at c0404ddf
------------------------------------------
#0 [f6050c10] schedule at c0624a34
#1 [f6050c84] schedule_timeout at c06250ee
#2 [f6050cc8] io_schedule_timeout at c0624c15
#3 [f6050cdc] congestion_wait at c045eb7d
#4 [f6050d00] balance_dirty_pages_ratelimited_nr at c045ab91
#5 [f6050d54] generic_file_buffered_write at c0457148
#6 [f6050de8] __generic_file_aio_write_nolock at c04576e5
#7 [f6050e40] enqueue_entity at c042131f
#8 [f6050e5c] generic_file_aio_write at c0457799
#9 [f6050e8c] nfs_file_write at f8f90cee
#10 [f6050e9c] getnstimeofday at c043d3f7
#11 [f6050ed0] do_sync_write at c0472e27
#12 [f6050f7c] vfs_write at c0473689
#13 [f6050f98] sys_write at c0473c95
#14 [f6050fb4] sysenter_entry at c0404ddf
-----------------------------------
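Both traces pass through balance_dirty_pages_ratelimited_nr, so it may be worth watching how much dirty and writeback memory the kernel reports while the server is down. A minimal sketch, assuming the Dirty, Writeback and NFS_Unstable fields are present in /proc/meminfo on this kernel (fields that are missing come back as None); the helper names are made up for illustration.

```python
def parse_meminfo(text):
    """Return {field: value} for the lines of a /proc/meminfo-style text."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   <number> [kB]" lines; skip anything malformed.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def writeback_summary(text, fields=("Dirty", "Writeback", "NFS_Unstable")):
    """Pick out the writeback-related counters (in kB); None if absent."""
    info = parse_meminfo(text)
    return {f: info.get(f) for f in fields}
```

Running writeback_summary(open("/proc/meminfo").read()) once a second alongside vmstat would show whether the dirty and unstable page counts stay pinned while the writers are blocked.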

Can this be worked around? Since most of the RAM is available, the dd
process could in fact find more memory for its buffers rather than
waiting on the stalled NFS requests. I believe this could be one reason
why file systems like VxFS use their own buffer cache, separate from the
system-wide buffer cache.

Thanks
--Chakri