Random Filesystem Lag / Delay
From: James Sella (alles_at_digital-genesis.com)
Date: 10/06/03
- Next message: pacifican: "Re: cannot get a pid file in the /var/run directory, how do you do it?"
- Previous message: Flip: "cannot get a pid file in the /var/run directory, how do you do it?"
- Next in thread: Paul Lutus: "Re: Random Filesystem Lag / Delay"
- Reply: Paul Lutus: "Re: Random Filesystem Lag / Delay"
- Reply: Michael Heiming: "Re: Random Filesystem Lag / Delay"
Date: Mon, 06 Oct 2003 02:25:28 GMT
Looking for some ideas on how to troubleshoot and/or identify the cause
of the issues I've been experiencing.
The system will randomly (every 20 seconds to 15 minutes) pause or delay
when running commands, such as a simple 'ls' or even 'uptime'. The
command will generally delay for 1 to 15 seconds. Every shell
experiences the lag at the same time. When I notice the first command
lag, I can trigger a different command in another window and it will lag
as well. Both will complete at the exact same time, as if they were both
waiting for something. The server's 1-minute load average jumps by 1-2
(e.g., 0.21 -> 2.10) when the lag occurs.
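One way to pin down when the stalls happen is to time a trivial command in a loop and keep only the slow runs, together with the load average at that moment. The following is a minimal sketch, not something from the original setup: the function names, the choice of probe command, and the 1-second threshold are all assumptions made for illustration.

```python
import os
import subprocess
import sys
import time


def probe(command, count, interval):
    """Run `command` `count` times and record (timestamp, elapsed, load1) per run."""
    samples = []
    for _ in range(count):
        start = time.monotonic()
        subprocess.run(command, stdout=subprocess.DEVNULL, check=False)
        elapsed = time.monotonic() - start
        try:
            load1 = os.getloadavg()[0]  # 1-minute load average
        except OSError:
            load1 = None  # load average not obtainable on this platform
        samples.append((time.time(), elapsed, load1))
        time.sleep(interval)
    return samples


def find_stalls(samples, threshold):
    """Keep only the samples whose elapsed time exceeds `threshold` seconds."""
    return [sample for sample in samples if sample[1] > threshold]
```

For example, `find_stalls(probe(["uptime"], count=900, interval=1.0), threshold=1.0)` would watch for about 15 minutes and return only the runs that took longer than a second. Correlating those timestamps with cron jobs or log entries may narrow down the trigger.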
Network I/O isn't affected. Sessions don't lock up: you can continue to
type, just not run commands, and the server continues to respond to ICMP.
The server is:
Dual Athlon MP 2400+
2G RAM
(2) 80G IDE Drives (Soft RAID1)
Drives are installed as hda and hdc, both are UDMA(100).
Partitions are all ext3 in the default mode (ordered).
So far, I've lowered the elevator's read and write latency from 2048/8096:
# /sbin/elvtune /dev/hda
/dev/hda elevator ID 1
read_latency: 1024
write_latency: 4096
max_bomb_segments: 0
# /sbin/elvtune /dev/hdc
/dev/hdc elevator ID 2
read_latency: 1024
write_latency: 4096
max_bomb_segments: 0
The server generally has up to 20 users logged in at the same time, and
around 200 processes running on it. The RAID1 set is fully functional as
reported in /proc/mdstat. I can see no task that pegs the system and
causes the sudden lag when watching top (even at 0.1-second updates), and
I don't see any noticeable change in I/O activity when watching vmstat.
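A load average that jumps while nothing shows up as busy in top is consistent with tasks sitting in uninterruptible sleep (state "D"), since Linux counts those toward the load average even though they use no CPU. Whether that is what happens here is only a guess. The sketch below, with hypothetical function names, lists such tasks by reading /proc/<pid>/stat, so it could be run in a loop during a stall.

```python
import os


def parse_state(stat_line):
    """Return (comm, state) from the contents of a /proc/<pid>/stat file."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST closing parenthesis.
    head, _, tail = stat_line.rpartition(")")
    comm = head.split("(", 1)[1]
    state = tail.split()[0]
    return comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for tasks currently in state 'D'."""
    found = []
    for entry in sorted(os.listdir(proc_root)):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as handle:
                comm, state = parse_state(handle.read())
        except OSError:
            continue  # the process exited between listdir() and open()
        if state == "D":
            found.append((int(entry), comm))
    return found
```

If the same few processes, or a kernel thread, appear in the list every time the lag hits, that would point at whatever they are waiting on.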
Output from 'time' is strange. No time is spent in user or sys, but real
time passes:
# time w
real 0m32.005s
user 0m0.000s
sys 0m0.010s
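The gap between "real" and "user" + "sys" above suggests the command spent nearly all of its time blocked rather than computing. The same comparison can be scripted so it can be logged over many runs. This is a minimal sketch with a hypothetical function name, assuming a Unix-like system where os.times() reports child CPU times.

```python
import os
import subprocess
import sys
import time


def wall_vs_cpu(command):
    """Run `command`; return (wall, cpu) seconds, cpu being the child's user + sys."""
    before = os.times()
    start = time.monotonic()
    subprocess.run(command, stdout=subprocess.DEVNULL, check=False)
    wall = time.monotonic() - start
    after = os.times()  # child times are updated once the child has been waited for
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    return wall, cpu
```

Calling `wall_vs_cpu(["w"])` returns both figures. A large `wall - cpu` means the process was waiting on something (disk, a lock, a lookup) rather than using the processor.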
I'm not receiving any IDE errors from the kernel, so it doesn't appear
to be the drive itself hanging.
Any ideas??
-Jim