Re: [PATCH 0/5] make slab gfp fair



On Thu, 2007-05-17 at 15:27 -0700, Christoph Lameter wrote:
On Thu, 17 May 2007, Peter Zijlstra wrote:

The way I read the cpuset page allocator, it will only respect the
cpuset if there is memory aplenty. Otherwise it will grab whatever. So
still, it will only ever use ALLOC_NO_WATERMARKS if the whole system is
in distress.

Sorry no. The purpose of the cpuset is to limit memory for an application.
If the boundaries would be fluid then we would not need cpusets.

Right, I see that I missed an ALLOC_CPUSET yesterday; but like Paul
said, cpusets are ignored when in dire straits for a kernel alloc.

Just not enough to make inter-cpuset interaction on slabs go away wrt
ALLOC_NO_WATERMARKS :-/

But the same principles also apply for allocations to different zones in
an SMP system. There are 4 zones (DMA, DMA32, NORMAL and HIGHMEM) and we have
general slabs for DMA and NORMAL. A slab that uses zone NORMAL falls back
to DMA32 and DMA depending on the watermarks of the 3 zones. So a
ZONE_NORMAL slab can exhaust memory available for ZONE_DMA.
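The fallback described above can be sketched as a small userspace model. This is only an illustration, not kernel code: struct toy_zone, toy_alloc_page() and the page counts are made up for the example, and the real page allocator checks watermarks in a more involved way.

```c
/* Toy zones in fallback order, highest first; not the kernel's struct zone. */
enum toy_zone_id { TOY_NORMAL, TOY_DMA32, TOY_DMA, TOY_NR_ZONES };

struct toy_zone {
    const char *name;
    long free_pages;
    long watermark;   /* zone is skipped once free_pages <= watermark */
};

/*
 * Walk the fallback list starting at 'first' and take one page from the
 * first zone that is still above its watermark.
 * Returns the index of the zone used, or -1 if every zone was skipped.
 */
static int toy_alloc_page(struct toy_zone *zones, int first)
{
    for (int i = first; i < TOY_NR_ZONES; i++) {
        if (zones[i].free_pages > zones[i].watermark) {
            zones[i].free_pages--;
            return i;
        }
    }
    return -1;
}
```

In this model a caller that starts at TOY_NORMAL keeps succeeding until TOY_DMA is also down to its watermark, at which point a caller that may only use TOY_DMA fails as well, even though it never allocated anything itself.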

Again the question is: the watermarks of which zone? In the case of the
ZONE_NORMAL allocation you have 3 to pick from. Is it the last one? Then it's
the same as ZONE_DMA, and you get a collision with the corresponding
DMA slab. Depending on which zone the system decides to allocate the
page from, you may get a different watermark situation.

Isn't the zone mask the same for all allocations from a specific slab?
If so, then the slab wide ->reserve_slab will still dtrt (barring
cpusets).

On x86_64 systems you have the additional complication that there are
even multiple DMA32 or NORMAL zones per node. Some will have DMA32 and
NORMAL, others DMA32 alone or NORMAL alone. Which watermarks are we
talking about?

The watermarks as used by the page allocator, given the slab's zone mask.
The page allocator will only fall back to ALLOC_NO_WATERMARKS when all
target zones are exhausted.
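As a rough model of that rule: a first pass honours the watermarks of every target zone, and only when that pass fails for all of them is a second pass made that ignores the watermarks. The names below (struct sim_zone, sim_try_zones(), sim_alloc(), SIM_ALLOC_NO_WATERMARKS and its value) are assumptions made for this sketch and do not match the actual page allocator code.

```c
#include <stdbool.h>
#include <stddef.h>

#define SIM_ALLOC_NO_WATERMARKS 0x1u  /* hypothetical flag value, model only */

struct sim_zone {
    long free_pages;
    long watermark;
};

/* One pass over the target zones; returns the index of the zone used or -1. */
static int sim_try_zones(struct sim_zone *zones, size_t nr, unsigned int flags)
{
    for (size_t i = 0; i < nr; i++) {
        /* With the flag set the watermark is ignored and only 0 is the floor. */
        long floor = (flags & SIM_ALLOC_NO_WATERMARKS) ? 0 : zones[i].watermark;

        if (zones[i].free_pages > floor) {
            zones[i].free_pages--;
            return (int)i;
        }
    }
    return -1;
}

/*
 * Try all target zones with watermarks first. Only if every one of them
 * fails, and the caller is entitled to the reserves, retry without
 * watermarks. *used_reserve reports whether the second pass was needed.
 */
static int sim_alloc(struct sim_zone *zones, size_t nr,
                     bool may_use_reserve, bool *used_reserve)
{
    int z = sim_try_zones(zones, nr, 0);

    *used_reserve = false;
    if (z >= 0 || !may_use_reserve)
        return z;

    z = sim_try_zones(zones, nr, SIM_ALLOC_NO_WATERMARKS);
    *used_reserve = (z >= 0);
    return z;
}
```

In this model used_reserve can only become true after every zone in the set has been refused at its watermark, so for a fixed zone set it says something about the whole set rather than about one zone. It deliberately leaves out cpusets and memory policies.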

The use of ALLOC_NO_WATERMARKS depends on the constraints of the allocation
in all cases. You can only compare the stress level (rank?) of allocations
that have the same allocation constraints. The allocation constraints are
a result of gfp flags,

The gfp zone mask is constant per slab, no? It has to be, because the zone
mask is only used when the slab is extended, other allocations live off
whatever was there before them.

cpuset configuration and memory policies in effect.

Yes, I see now that these might become an issue, I will have to think on
this.
