Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;



Andi Kleen wrote:
Is it possible for virt_to_slab(objp)->nodeid to differ from pfn_to_nid(objp)?
It is possible for the page allocator to fall back to a node other than the one requested. We would need to check that this never occurs.

The only way to ensure that would be to set a strict mempolicy.
But I'm not sure that's a good idea -- after all you don't want
to fail an allocation in this case.

But pfn_to_nid on the object like proposed by Eric should work anyways.
But I'm not sure the tables used for that will be more often cache hot
than the slab.

pfn_to_nid() on most x86_64 machines accesses one cache line (struct memnode).

Node 0 MemBase 0000000000000000 Limit 0000000280000000
Node 1 MemBase 0000000280000000 Limit 0000000480000000
NUMA: Using 31 for the hash shift.

In this example, we use only 8 bytes of memnode.embedded_map[] to find the nid for all 16 GB of RAM. In the profiles I have, memnode is always hot (no cache misses on it).
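To make the lookup concrete, here is a rough userspace sketch of a shift-and-index node table, filled from the two node ranges and the hash shift printed above. The names (toy_memnode, toy_memnode_fill, toy_phys_to_nid, toy_example_memnode) and the 48-entry map size are assumptions made for illustration only; this is not the kernel's struct memnode or its code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a small node lookup table; the size is an assumption. */
#define TOY_MAP_ENTRIES 48

struct toy_memnode {
    int shift;                      /* hash shift, 31 in the boot log above */
    uint8_t map[TOY_MAP_ENTRIES];   /* one byte (a node id) per 2^shift bytes of memory */
};

/* Mark every 2^shift chunk in [base, limit) as belonging to node nid. */
static void toy_memnode_fill(struct toy_memnode *m, uint64_t base,
                             uint64_t limit, uint8_t nid)
{
    uint64_t first = base >> m->shift;
    uint64_t last = (limit - 1) >> m->shift;

    assert(last < TOY_MAP_ENTRIES);
    for (uint64_t i = first; i <= last; i++)
        m->map[i] = nid;
}

/* The whole lookup: one shift and one byte load from a tiny table. */
static int toy_phys_to_nid(const struct toy_memnode *m, uint64_t phys_addr)
{
    return m->map[phys_addr >> m->shift];
}

/* Build the table for the two nodes listed in the boot log above. */
static struct toy_memnode toy_example_memnode(void)
{
    struct toy_memnode m = { .shift = 31 };

    toy_memnode_fill(&m, 0x0000000000000000ULL, 0x0000000280000000ULL, 0);
    toy_memnode_fill(&m, 0x0000000280000000ULL, 0x0000000480000000ULL, 1);
    return m;
}
```

With this table, toy_phys_to_nid() on any address below 0x280000000 yields node 0 and anything from there up to the limit yields node 1, and only the first few bytes of the map are ever touched.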

Whereas virt_to_slab() has to access:

1) struct page -> page_get_slab() (page->lru.prev) (one cache miss)
2) struct slab -> nodeid (another cache miss)


So using pfn_to_nid() would avoid 2 cache misses.
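For comparison, the two dependent loads on the slab path can be modeled roughly as follows. This is only a toy model: the toy_ structures and helpers are hypothetical, reduced to the fields named in this mail, and are not the kernel definitions.

```c
#include <stddef.h>

/* Hypothetical, stripped-down models of the structures named above. */
struct toy_list_head {
    struct toy_list_head *next, *prev;
};

struct toy_page {
    struct toy_list_head lru;   /* lru.prev is reused to point at the slab */
};

struct toy_slab {
    void *s_mem;                /* placeholder for the other slab fields */
    int nodeid;                 /* node recorded when the slab was allocated */
};

/* Stash the owning slab in page->lru.prev. */
static void toy_page_set_slab(struct toy_page *page, struct toy_slab *slab)
{
    page->lru.prev = (struct toy_list_head *)slab;
}

/* Load 1: read struct page to find the slab descriptor. */
static struct toy_slab *toy_page_get_slab(const struct toy_page *page)
{
    return (struct toy_slab *)page->lru.prev;
}

/* Load 2: read struct slab to find the node id. */
static int toy_nodeid_via_slab(const struct toy_page *page)
{
    return toy_page_get_slab(page)->nodeid;
}
```

Each step dereferences a different object, so each one can miss the cache independently. Note also that the value returned is whatever was recorded at allocation time, which is the value the question above is about.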

I understand we want to do special things (fallback and similar tricks) at allocation time, but I believe we can simply trust the real nid of the memory at free time.


