Re: Fw: 2.6.17 oops, possibly ntfs/mmap related



Hi,

On Thu, 2006-09-21 at 10:54 +0100, Anton Altaparmakov wrote:
> On Tue, 2006-09-12 at 20:56 -0700, Andrew Morton wrote:
> Andrew, thanks for forwarding me the message...
> Begin forwarded message:
>
> We have a machine which is currently making heavy use of a usb hard disc
> formatted with ntfs. There have been two occasions where the kernel has
> oopsed while this disc was being accessed heavily. Before adding this HDD
> the machine in question was rock solid which leads me to think that it
> might be related to ntfs. USB drives formatted with other filesystems do
> not appear to suffer from this problem.

I have now seen such an oops too, with a 2.6.18 kernel. Note that no NTFS
file systems were mounted at the time (though I had an NTFS file system
mounted earlier in the day).

The oops occurs in the kswapd0 kernel thread; the stack trace is:

Call Trace:
[<c10470a3>] shrink_inactive_list+0x46b/0x790
[<c104747c>] shrink_zone+0xb4/0xd3
[<c104797d>] kswapd+0x2de/0x3cf
[<c102c18e>] kthread+0xc2/0xf0
[<c1000bf1>] kernel_thread_helper+0x5/0xb
DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
Leftover inexact backtrace:
[<c1003e6c>] show_stack_log_lvl+0x8c/0x97
[<c1003fc8>] show_registers+0x151/0x1c6
[<c10041af>] die+0x172/0x27b
[<c145f22c>] do_page_fault+0x42c/0x4f9
[<c10037dd>] error_code+0x39/0x40
[<c10470a3>] shrink_inactive_list+0x46b/0x790
[<c104747c>] shrink_zone+0xb4/0xd3
[<c104797d>] kswapd+0x2de/0x3cf
[<c102c18e>] kthread+0xc2/0xf0
[<c1000bf1>] kernel_thread_helper+0x5/0xb

The EIP is in try_to_release_page() in fs/buffer.c, the code of which
is here:

int try_to_release_page(struct page *page, gfp_t gfp_mask)
{
	struct address_space * const mapping = page->mapping;

	BUG_ON(!PageLocked(page));
	if (PageWriteback(page))
		return 0;

	if (mapping && mapping->a_ops->releasepage)

^^^ bug happens here when the value of mapping->a_ops is used to obtain
    mapping->a_ops->releasepage

		return mapping->a_ops->releasepage(page, gfp_mask);
	return try_to_free_buffers(page);
}

This seems to suggest that the kernel is trying to release the private
data of a page whose page->mapping is set to a valid value but whose
page->mapping->a_ops is apparently set to an invalid value, so that
dereferencing page->mapping->a_ops->releasepage causes an oops, with the
kernel saying:

BUG: unable to handle kernel paging request at virtual address 020030d2

The values of the relevant variables from the oops are:

page = 0xc2248fa0
page->mapping = 0xe3a79eac
page->mapping->a_ops = 0x020030aa

Note that 0x020030aa + 0x28 = 0x020030d2, which is the address that
caused the oops, and 0x28 is the offset of the releasepage function
pointer in the address space operations structure...
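
The arithmetic above can be double-checked in user space. The following is
only a sketch under stated assumptions: the struct is a hypothetical model
of the i386 layout in which every function pointer is a 4-byte slot, and the
field order before releasepage is recalled from 2.6.18-era headers rather
than taken from this thread. The names aops_i386_model and
faulting_address() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of struct address_space_operations on i386.
 * Each function pointer is represented as a 4-byte slot so that the
 * offsets come out the same on any build host. The member order is an
 * assumption (recalled from 2.6.18-era headers), not copied from the
 * kernel tree.
 */
struct aops_i386_model {
	uint32_t writepage;
	uint32_t readpage;
	uint32_t sync_page;
	uint32_t writepages;
	uint32_t set_page_dirty;
	uint32_t readpages;
	uint32_t prepare_write;
	uint32_t commit_write;
	uint32_t bmap;
	uint32_t invalidatepage;
	uint32_t releasepage;	/* the slot read by try_to_release_page() */
};

/*
 * Address the CPU touches when loading a_ops->releasepage, given the
 * (possibly bogus) value found in mapping->a_ops.
 */
uint32_t faulting_address(uint32_t a_ops)
{
	return a_ops + (uint32_t)offsetof(struct aops_i386_model, releasepage);
}
```

Under this model, passing the a_ops value from the oops (0x020030aa) to
faulting_address() gives the address reported in the "unable to handle
kernel paging request" line, which is consistent with the fault happening
on the load of mapping->a_ops->releasepage rather than on the call itself.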

This oops is not identical to the oopses pointed out by Jonathan at:

http://www.atrad.com.au/~jwoithe/kernel/oopses-20060913.txt

But those oopses also involve pages, so they could be related...

Anyone have any ideas how a page can end up in such a weird state?

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://www.linux-ntfs.org/ & http://www-stu.christs.cam.ac.uk/~aia21/

