Re: remove zero_page (was Re: -mm merge plans for 2.6.24)
- From: Hugh Dickins <hugh@xxxxxxxxxxx>
- Date: Tue, 9 Oct 2007 14:00:25 +0100 (BST)
On Tue, 9 Oct 2007, Nick Piggin wrote:
The commit b5810039a54e5babf428e9a1e89fc1940fabff11 contains the note
A last caveat: the ZERO_PAGE is now refcounted and managed with rmap
(and thus mapcounted and count towards shared rss). These writes to
the struct page could cause excessive cacheline bouncing on big
systems. There are a number of ways this could be addressed if it is
an issue.
And indeed this cacheline bouncing has shown up on large SGI systems.
There was a situation where an Altix system was essentially livelocked
tearing down ZERO_PAGE pagetables when an HPC app aborted during startup.
This situation can be avoided in userspace, but it does highlight the
potential scalability problem with refcounting ZERO_PAGE, and corner
cases where it can really hurt (we don't want the system to livelock!).
There are several broad ways to fix this problem:
1. add back some special casing to avoid refcounting ZERO_PAGE
2. per-node or per-cpu ZERO_PAGES
3. remove the ZERO_PAGE completely
I will argue for 3. The others should also fix the problem, but they
result in more complex code than does 3, with little or no real benefit
that I can see. Why?
Sorry, I've no useful arguments to add (and my testing was too much
like yours to add any value), but I do want to go on record as still
a strong supporter of approach 3 and your patch.
Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- References:
- -mm merge plans for 2.6.24
- From: Andrew Morton
- remove zero_page (was Re: -mm merge plans for 2.6.24)
- From: Nick Piggin
- Re: remove zero_page (was Re: -mm merge plans for 2.6.24)
- From: Linus Torvalds
- Re: remove zero_page (was Re: -mm merge plans for 2.6.24)
- From: Nick Piggin
- -mm merge plans for 2.6.24
- Prev by Date: Re: [RFC PATCH RT] push waiting rt tasks to cpus with lower prios.
- Next by Date: Re: __LITTLE_ENDIAN vs. __LITTLE_ENDIAN_BITFIELD
- Previous by thread: Re: remove zero_page (was Re: -mm merge plans for 2.6.24)
- Next by thread: A kernel Tracing interface (was Re: -mm merge plans for 2.6.24)
- Index(es):