Re: [bug] SLUB + mm/slab.c boot crash in -rc9

* Christoph Lameter <clameter@xxxxxxx> wrote:

> > Pretty please, could you pay more than cursory attention to this bug
> > i already spent two full days on and which is blocking the v2.6.25
> > release?

> Yeah trying to get to understand how exactly sparsemem works and how
> the 32 bit highmem stuff interacts with it... Sorry not code that I am
> an expert in nor the platform that I am familiar with. Code mods there
> required heavy review from multiple parties with expertise in various
> subjects.

yeah - sorry about that impatient flame. And it could still be anything
from the page allocator to bootmem - or some completely unrelated piece
of code corrupting some key data structure.

sparsemem is supposed to work roughly like this on x86 (32-bit):

- the x86 memory map comes from the BIOS via e820.

- those individual chunks of e820-enumerated memory get
registered with mm/sparse.c's data structures via memory_present()
callbacks. [btw., this should be renamed to register_memory_present()
or register_sparse_range() - something less opaque.]

- there's really just 3 RAM areas that matter on this box, and the last
one is unusable for !PAE, which leaves 2.

- there's a 256 MB PCI aperture hole at 0xf0000000.

- out of the 64 sparse memory chunks the first 60 get filled in (all have
at least some RAM content) - the last 4 [the PCI aperture hole] remain
!present.

- we pass in an array of 3 zones to free_area_init_nodes().

- we free the lowmem pages into the buddy allocator via the usual
generic setup.

- we have a special loop for highmem pages in arch/x86/mm/init_32.c,
set_highmem_pages_init(). This just goes through the PFNs one by one
and does an explicit __free_page() on all RAM pages that are in the
mem_map[] and which are non-reserved.

and that's it roughly.
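
to make the section arithmetic above concrete, here is a rough user-space
model of it (a sketch, not kernel code). It assumes 64 MB sections
(SECTION_SIZE_BITS of 26) in a 32-bit physical address space, which is what
yields 64 chunks. The example_map[] ranges and all the model_*() names are
made up for illustration - the real e820 map of this box is not reproduced
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed constants for 32-bit x86 without PAE. */
#define PAGE_SHIFT        12
#define SECTION_SIZE_BITS 26                      /* 64 MB per section */
#define MAX_PHYSMEM_BITS  32                      /* 4 GB addressable */
#define PFN_SECTION_SHIFT (SECTION_SIZE_BITS - PAGE_SHIFT)
#define NR_MEM_SECTIONS   (1u << (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS))

struct ram_range { uint64_t start, end; };        /* bytes, end exclusive */

/* Hypothetical map: 3 RAM areas, the last one out of reach without PAE. */
static const struct ram_range example_map[] = {
    { 0x000000000ULL, 0x00009f000ULL },           /* low RAM */
    { 0x000100000ULL, 0x0f0000000ULL },           /* RAM up to the PCI hole */
    { 0x100000000ULL, 0x110000000ULL },           /* above 4 GB */
};

static bool section_present[NR_MEM_SECTIONS];

/* Stand-in for memory_present(): mark every section the PFN range touches. */
void model_memory_present(uint32_t start_pfn, uint32_t end_pfn)
{
    if (start_pfn >= end_pfn)
        return;
    uint32_t first = start_pfn >> PFN_SECTION_SHIFT;
    uint32_t last = (end_pfn - 1) >> PFN_SECTION_SHIFT;
    assert(last < NR_MEM_SECTIONS);
    for (uint32_t s = first; s <= last; s++)
        section_present[s] = true;
}

/* Register each RAM range, clamping to what a !PAE kernel can address. */
void model_register_map(const struct ram_range *map, size_t n)
{
    const uint64_t limit = 1ULL << MAX_PHYSMEM_BITS;
    for (size_t i = 0; i < n; i++) {
        uint64_t start = map[i].start, end = map[i].end;
        if (start >= limit)
            continue;                             /* unusable without PAE */
        if (end > limit)
            end = limit;
        model_memory_present((uint32_t)(start >> PAGE_SHIFT),
                             (uint32_t)(end >> PAGE_SHIFT));
    }
}

unsigned int count_present_sections(void)
{
    unsigned int n = 0;
    for (uint32_t s = 0; s < NR_MEM_SECTIONS; s++)
        n += section_present[s];
    return n;
}

/*
 * Stand-in for the highmem walk: go PFN by PFN and count the pages that
 * would be handed to __free_page(). is_reserved may be NULL.
 */
uint32_t model_free_highmem(uint32_t highstart_pfn, uint32_t highend_pfn,
                            bool (*is_reserved)(uint32_t pfn))
{
    uint32_t freed = 0;
    assert(highend_pfn <= (NR_MEM_SECTIONS << PFN_SECTION_SHIFT));
    for (uint32_t pfn = highstart_pfn; pfn < highend_pfn; pfn++) {
        if (!section_present[pfn >> PFN_SECTION_SHIFT])
            continue;                             /* no mem_map[] entry */
        if (is_reserved && is_reserved(pfn))
            continue;
        freed++;                                  /* kernel: __free_page() */
    }
    return freed;
}
```

feeding example_map[] to model_register_map() should leave sections 0-59
marked and sections 60-63 unmarked, i.e. the 60-of-64 picture described
above, with the PCI aperture hole staying !present.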

my current guess would have been some bootmem regression/interaction
that messes up the buddy bitmaps - but i just reverted to the v2.6.24
version of bootmem.c and that crashes too ...

Ingo