Re: Very high load on P4 machines with 2.4.28

From: Marek Habersack (grendel_at_caudium.net)
Date: 01/05/05

  • Next message: Christoph Hellwig: "Re: XFS: inode with st_mode == 0"
    Date:	Wed, 5 Jan 2005 12:32:44 +0100
    To: Willy Tarreau <willy@w.ods.org>
    
    
    

    On Wed, Jan 05, 2005 at 06:28:41AM +0100, Willy Tarreau scribbled:
    > Hi,
    Hello,

    > On Wed, Jan 05, 2005 at 12:07:33AM +0100, Marek Habersack wrote:
    > > Interestingly enough, the machine with the highest load average is the
    > > one generating 4Mbit/s and the one with 24Mbit/s has the smallest load
    > > average value.
    >
    > This is common with multi-process servers like apache if the link is
    > saturated, because data takes more time to reach the client, so you have
    > a higher concurrency.
    The link isn't saturated - we have a 200Mbit/s margin atm. It's not a
    bandwidth problem, that's certain.

    > > The latter also suffers from the biggest loadavg increase.
    > > All of the virtual machines have iptables accounting chains for each
    > > configured IP (there are 62 IP numbers on one and 32 on the other).
    > > The virtual boxes have two 80GB SATA drives raided with softraid. The
    > > non-virtual box has a single IDE drive, no raid.
    >
    > > (virtual #2, the 24Mbit/s one)
    > > # vmstat
    > > procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
    > >  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
    > >  5  3 172448  13084   1208 304048    4    4    90    50  109   117 19  8 73  0
    >
    > I don't like something : with 73% idle, you have 5 processes in the rq. I think
    > this machine writes logs synchronously to disks, or stores SSL sessions on a
    The only synchronously written logs are auth.log and mail.err. SSL is there
    indeed, but the site is hardly ever accessed. (As of a while ago, the box has
    a load of 0.75 while pushing out 14Mbit/s; with 2.4.28 last week it was
    around 10.0 under the same conditions.)

    > real disk and waits for writes. A tmpfs would be a great help.
    The only things writing to disk on a regular basis (except for syslog and
    apache for logs) are the php session files, one tdb database for traffic
    data and mysql (which might be using fsync - can that be the cause of the
    i/o slowness?). But, in any case, the machine behaves well under kernels
    other than 2.4.28.
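
The vmstat reading quoted above (5 in `r`, 3 in `b`, 73% idle) can be cross-checked per process, since the Linux load average counts both runnable tasks and tasks in uninterruptible sleep. A minimal sketch, assuming a procps-style `ps` that accepts `-eo stat=`; the helper name `count_states` is made up for illustration and is not something used in this thread.

```shell
# count_states: read one process-state string per line (as printed by
# `ps -eo stat=`) and report how many tasks are runnable (R) and how many
# are in uninterruptible sleep (D, usually waiting on disk I/O).
count_states() {
    awk '
        { s = substr($1, 1, 1) }   # first character is the main state letter
        s == "R" { r++ }
        s == "D" { d++ }
        END { printf "R=%d D=%d\n", r, d }'
}

# Assumed usage on a live box:
#   ps -eo stat= | count_states
```

A persistently high D count alongside a mostly idle CPU would point at I/O waits rather than CPU contention.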

    > You can try to trace the processes activity with :
    >
    > # strace -T -e trace=write -p <process pid>
    > It will display the time elapsed in each write() syscall, you'll find the
    > fds in /proc/<pid>/fd. You may notice big times on logs or ssl sessions.
    nope... the times are in the range 0.000008 to 0.000045...
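
When a process issues many writes, eyeballing the `strace -T` output for slow calls gets tedious. A minimal sketch of a filter that keeps only the calls slower than a given threshold, assuming the usual `strace -T` format in which each line ends with the elapsed time in angle brackets; the helper name `slow_writes` and the 0.01 s threshold are hypothetical.

```shell
# slow_writes THRESHOLD: read `strace -T` output on stdin and print only the
# lines whose trailing <seconds> field is greater than THRESHOLD.
slow_writes() {
    awk -v limit="$1" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            # drop the angle brackets to get the elapsed seconds
            t = substr($0, RSTART + 1, RLENGTH - 2)
            if (t + 0 > limit + 0) print
        }'
}

# Assumed usage (strace writes its trace to stderr):
#   strace -T -e trace=write -p <pid> 2>&1 | slow_writes 0.01
```

With times in the 0.000008 to 0.000045 range, as reported here, such a filter would print nothing.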
     
    > > (the non-virtual)
    > > # vmstat
    > > procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
    > >  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
    > > 60  0  70300 115960      0 369244    0    0    79    32   90    45 73  7 21  0
    >
    > Same note for this one, although it does more user space work (php? ssl?).
    poorly written perl scripts

    > It's possible that some change in 2.4.28 touches the I/O subsystem and
    > increases your wait I/O time in this particular application.
    > (...)
    Any clues as to where to look? I examined the 2.4.28 changelog and saw
    nothing that would suggest such a change, but then I'm not a kernel hacker, I
    might have easily missed something important.

    > > One other interesting thing to note is that we have one
    > > other box with the similar configuration to the virtuals (also a virtual
    > > host) but it runs 2.4.28 with SMP+HT enabled - no load problems there at
    > > all.
    >
    > So, to contradict myself, have you tried enabling HT on other boxes which
    > suffer from the load ?
    Yep, only one box boots fine with HT enabled (out of the ones with
    problems), the others just freeze (we thought it could have been the machine
    BIOS, but updating it didn't help)

    > > Let me know if you need more info,
    >
    > You have sent fairly enough info right now. Other than I/O work, I have no
    > idea. You may want to play with /proc/sys/vm/{bdflush,max-readahead} and
    > others to see if it changes things.
    At this point I think we're gonna run them under the older kernels and wait
    for 2.4.29 to see whether the problem still exists there. If it does, we'll
    try 2.6 on the machines and if that doesn't help, we'll do some more testing
    with 2.4.28 - we have our hands tied, since they are production machines and
    we cannot let them run with such degraded performance for too long...

    > If your load is bursty, it might help to reduce the ratio of dirty blocks
    > before flushing (first field in bdflush), because although writes will
    > start more often, they will take less time.
    what about nfract_sync? Does it make sense to make it smaller as well? I've
    also decreased age_buffer to 15s
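
A minimal sketch of the bdflush tweak discussed above, assuming the 2.4 layout in which /proc/sys/vm/bdflush holds a single line of whitespace-separated integers with nfract as the first field. The helper name `set_nfract` and the value 20 are illustrative only, not a recommendation from this thread.

```shell
# set_nfract NEW_VALUE: read a bdflush parameter line on stdin and print it
# with the first field (assumed to be nfract) replaced by NEW_VALUE.
# All other fields are passed through unchanged, separated by single spaces.
set_nfract() {
    awk -v n="$1" '{ $1 = n; print }'
}

# Assumed usage on a 2.4 kernel, as root:
#   set_nfract 20 < /proc/sys/vm/bdflush > /tmp/bdflush.new
#   cat /tmp/bdflush.new > /proc/sys/vm/bdflush
```

Rewriting the whole line keeps the remaining parameters, such as age_buffer and nfract_sync, at whatever values they already had.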
     
    > I already have solved similar problems by disabling keep-alive to decrease
    > the number of processes.
    Disabling keep-alive is a routine here... :) But, that is unlikely to be the
    cause since it's evidently a kernel thing.

    Well, I'll see what good the bdflush changes do for the machines when they
    run under the "good" kernel, and we'll schedule some testing with 2.4.28
    at some point.

    thanks for your help, it's greatly appreciated!

    best regards,

    marek

    
    



