Re: kernel BUG at mm/filemap.c:332!

From: Mihai RUSU (dizzy_at_roedu.net)
Date: 12/04/03

  • Next message: Jörn Engel: "Re: partially encrypted filesystem"
    Date:	Thu, 4 Dec 2003 19:26:38 +0200 (EET)
    To: Linus Torvalds <torvalds@osdl.org>
    
    

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1

    Hi Linus

    First of all thanks for the answer!

    On Thu, 4 Dec 2003, Linus Torvalds wrote:

    >
    > Nathan,
    > you're not off the hook yet. This is a smoking gun on XFS, and this time
    > with a big clue: large directories, and a low-memory situation.

    Sorry to have misled you guys in the first post. After rebooting the
    machine I have some more information: the actual directory size is a
    few hundred entries (~400), not thousands as I previously speculated
    (I didn't know the exact number until I could ls it, and I couldn't do
    that until I had rebooted the machine).

    Since that is only several hundred entries, I know I have at least one
    more 2.6.0-test11 machine (SMP, no MD but hw DAC960 RAID) with more
    entries in one directory, and I haven't gotten any such message there
    (yet). It has only 5 days of uptime, so we will see if I get anything
    there too. It could just be that on the other machine there isn't much
    activity in the directories with many entries. The machine which got
    the kernel error has a lot going on in that directory (mostly stats
    gathered every 5 minutes from cron with mrtg and written to binary
    image files).

    However, I have some more useful (I hope) information on the subject.
    Before rebooting I wanted to install a do_brk() patched 2.4.21-xfs
    kernel with lilo. Unfortunately lilo got stuck in an fsync() call after
    printing that it had added all kernel images to the MBR as configured
    in lilo.conf. After the reboot I had no problem booting the new
    do_brk() fixed kernel, so lilo seems to have done its job; I don't know
    why it got stuck in fsync().

    ctrl-alt-del didn't do the trick (I had a live ssh session on the
    machine which was still working, I could do ps ax, vmstat etc., but
    init was probably doing something which also got stuck in D state), so
    I had to reboot it "hard". After power-on, a colleague complained that
    a file he had worked on a couple of minutes before I took the machine
    down had NULL bytes instead of its actual content. I know that "dirty"
    data gets flushed to disk every 30 seconds, so this seems a little
    strange (in general I know that XFS leaves NULL bytes in files modified
    just before an unclean reboot, but this file was modified some 5
    minutes before the "hard" reboot).

    > Also, this time the config file doesn't have any MD/RAID support according
    > to the attachment:
    >
    > # Multi-device support (RAID and LVM)
    > #
    > # CONFIG_MD is not set
    >
    > so it looks like the XFS and MD issues really are totally unrelated.

    Yep, I'm very conservative about the features I use in the kernel :)

    > Mihai: the oops itself is in this case not very telling, since it's just a
    > result of corruption of some fundamental data structures (probably
    > somebody using a page cache page after having free'd it - and it probably
    > only shows up when memory gets low and pages have to be cleaned). Can you
    > tell Nathan more about the filesystem setup (block size, as much as
    > possible about the affected directory, etc).

    Ok.

    $ xfs_info /var
    meta-data=/var           isize=256    agcount=18, agsize=262144 blks
    data     =               bsize=4096   blocks=4482127, imaxpct=25
             =               sunit=0      swidth=0 blks, unwritten=0
    naming   =version 2      bsize=4096
    log      =internal       bsize=4096   blocks=1200
    realtime =none           extsz=65536  blocks=0, rtextents=0

    Mount options are "rw,noatime".
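
For anyone who wants the geometry in more familiar units, here is a minimal sketch that just does the arithmetic on the xfs_info numbers above. The constant and variable names are only illustrative (they are not XFS identifiers), and it assumes every size reported in "blks" uses the 4096-byte block size shown.

```python
# Numbers copied from the xfs_info output above; all names are illustrative.
BLOCK_SIZE = 4096       # data bsize, in bytes
DATA_BLOCKS = 4482127   # data blocks
AG_COUNT = 18           # agcount
AG_BLOCKS = 262144      # agsize, in blocks
LOG_BLOCKS = 1200       # internal log blocks

MIB = 1024 ** 2
GIB = 1024 ** 3

fs_bytes = DATA_BLOCKS * BLOCK_SIZE    # total data section size
ag_bytes = AG_BLOCKS * BLOCK_SIZE      # size of one full allocation group
log_bytes = LOG_BLOCKS * BLOCK_SIZE    # internal log size

# How many full allocation groups fit, and how many blocks are left over
# for a final, shorter allocation group.
full_ags, last_ag_blocks = divmod(DATA_BLOCKS, AG_BLOCKS)
computed_ag_count = full_ags + (1 if last_ag_blocks else 0)

print(f"filesystem: {fs_bytes / GIB:.2f} GiB")
print(f"full AG:    {ag_bytes / GIB:.2f} GiB")
print(f"log:        {log_bytes / MIB:.2f} MiB")
print(f"AGs:        {full_ags} full + last one of {last_ag_blocks} blocks")
print(f"agcount matches: {computed_ag_count == AG_COUNT}")
```

So this is a filesystem of roughly 17 GiB, made up of 1 GiB allocation groups with a short last one, and a log of a little under 5 MiB.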

    Please let me know if you need any other info. Thanks!

    > Linus

    - --
    Mihai RUSU Email: dizzy@roedu.net
    GPG : http://dizzy.roedu.net/dizzy-gpg.txt WWW: http://dizzy.roedu.net
                           "Linux is obsolete" -- AST
    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.2.3 (GNU/Linux)

    iD8DBQE/z25QPZzOzrZY/1QRAnvLAKDmlFPQEYyzVmxgAgopuar3hhGZ5ACeOq6H
    Zwty+roqa5JqjBZBJDF0xnc=
    =gPtb
    -----END PGP SIGNATURE-----
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

