Re: 2.6.25-rc7-git2: Reported regressions from 2.6.24



On Sat, 29 Mar 2008, Linus Torvalds wrote:


> You don't have a f*cking clue about this code that you're supposed to be
> maintaining, do you?
>
> See "slab_alloc()". See the code:
>
>	if (unlikely((gfpflags & __GFP_ZERO) && object))
>		memset(object, 0, c->objsize);
>
> and see how it does it regardless of anything else.

Yes I am very aware of that.

> In short, if *any* code-path calls down to any allocator from that routine
> with GFP_ZERO set, it's a bug. No ifs, buts or maybes about it. It
> shouldn't do that, because the actual memset() is done by slab_alloc(),
> and should not be done ANYWHERE ELSE.
>
> It has *nothing* to do with "object is too big" or anything else.

It has to do with how large objects are allocated through kmalloc_large().
kmalloc_large() is elsewhere called with unfiltered gfpflags and relies
on zeroing being handled by the page allocator. It can take unfiltered gfp
flags.
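
To illustrate the pass-through described above, here is a minimal user-space sketch, not kernel code. The names sketch_get_pages() and sketch_kmalloc_large() and the flag value SKETCH_GFP_ZERO are hypothetical stand-ins invented for this model; malloc() plays the role of the page allocator, and a poison byte makes non-zeroed memory visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_GFP_ZERO 0x1u	/* hypothetical value, for this model only */

/* Stand-in for the page allocator: it honors the zero flag itself. */
static void *sketch_get_pages(size_t size, unsigned int flags)
{
	void *p = malloc(size);

	/* Zero on request, otherwise poison so that stale data is visible. */
	if (p)
		memset(p, (flags & SKETCH_GFP_ZERO) ? 0 : 0x5a, size);
	return p;
}

/* Stand-in for kmalloc_large(): forwards the caller's flags unfiltered. */
static void *sketch_kmalloc_large(size_t size, unsigned int flags)
{
	return sketch_get_pages(size, flags);
}
```

In this model a caller that passes SKETCH_GFP_ZERO gets zeroed memory without any memset() in the large-object path, which is why that path can take unfiltered flags.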

The filtering of __GFP_ZERO that you added avoids the double zeroing for
the fallback path (which is only called if all the partial lists are empty
and after the page allocator went through reclaim and did not get the
large sized memory we wanted). So okay, the patch could be a performance
enhancement. But then it adds the filtering to the hot path instead of the
code path that contains the kmalloc_large() call, which is executed once in a
blue moon. The hot path should only filter when we actually decide that we
need to allocate a new slab from the page allocator.
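
The placement argued for above can be sketched in a small user-space model. This is an illustration under stated assumptions, not the SLUB code: model_page_alloc(), model_slow_path(), model_slab_alloc(), the flag value MODEL_GFP_ZERO and the counter page_alloc_zeroings are all hypothetical names made up for this sketch. The hot path leaves the flags alone and zeroes the object once; the flag is masked only at the point where the slow path asks the page allocator for new memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_GFP_ZERO 0x1u	/* hypothetical value, for this model only */

/* Counts how often the model page allocator did the zeroing itself. */
static unsigned int page_alloc_zeroings;

/* Stand-in for the page allocator: zeroes only when asked to. */
static void *model_page_alloc(size_t size, unsigned int flags)
{
	void *p = malloc(size);

	if (!p)
		return NULL;
	if (flags & MODEL_GFP_ZERO) {
		memset(p, 0, size);
		page_alloc_zeroings++;
	} else {
		memset(p, 0xaa, size);	/* poison: missing zeroing shows up */
	}
	return p;
}

/* Slow path: the zero flag is masked here, where new memory is requested. */
static void *model_slow_path(size_t size, unsigned int flags)
{
	return model_page_alloc(size, flags & ~MODEL_GFP_ZERO);
}

/* Hot path: no filtering; the object is zeroed exactly once, here. */
static void *model_slab_alloc(size_t size, unsigned int flags)
{
	void *object = model_slow_path(size, flags);

	if ((flags & MODEL_GFP_ZERO) && object)
		memset(object, 0, size);
	return object;
}
```

With this arrangement a zeroing allocation still returns zeroed memory, and the counter stays at 0, so nothing is zeroed twice and the common path pays no extra masking cost.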

It seemed to me that the reason for inserting the filtering of __GFP_ZERO
there was the belief that the page allocator cannot take __GFP_ZERO
through kmalloc_large() if we are in an interrupt.

The use of kmalloc_large() in __slab_alloc() is a bit strange at this
point. The cleanup work in 2.6.26 will make this all nice again.