[RFC/PATCH] Optimize zone allocator synchronization



From: Donald E. Porter <porterde@xxxxxxxxxxxxx>

In the bulk page allocation/free routines in mm/page_alloc.c, the zone
lock is held across all iterations. For certain parallel workloads, I
have found that releasing and reacquiring the lock for each iteration
yields better performance, especially at higher CPU counts. For
instance, kernel compilation is sped up by 5% on an 8-CPU test
machine. In most other cases, there is no significant effect on
performance (although the effect tends to be slightly positive). This
seems quite reasonable for the very small scope of the change.
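
To make the two locking granularities concrete, here is a minimal
userspace sketch. It is not kernel code: struct toy_zone, the toy_*
functions, and the use of a pthread mutex in place of the zone
spinlock are all assumptions made for illustration only.

```c
#include <pthread.h>

/* Hypothetical stand-in for a zone: one lock guarding a free-page count. */
struct toy_zone {
	pthread_mutex_t lock;
	long free_pages;
};

/* Coarse variant: the lock is held across every iteration of the batch. */
static void toy_free_bulk_coarse(struct toy_zone *z, int count)
{
	pthread_mutex_lock(&z->lock);
	while (count--)
		z->free_pages++;
	pthread_mutex_unlock(&z->lock);
}

/* Fine variant: the lock is dropped and retaken for each page. */
static void toy_free_bulk_fine(struct toy_zone *z, int count)
{
	while (count--) {
		pthread_mutex_lock(&z->lock);
		z->free_pages++;
		pthread_mutex_unlock(&z->lock);
	}
}

/*
 * Fine variant of a bulk allocation. Returns the number of pages taken.
 * The early-exit path must release the lock before leaving the loop.
 */
static int toy_alloc_bulk_fine(struct toy_zone *z, int count)
{
	int i;

	for (i = 0; i < count; ++i) {
		pthread_mutex_lock(&z->lock);
		if (z->free_pages == 0) {
			pthread_mutex_unlock(&z->lock);
			break;
		}
		z->free_pages--;
		pthread_mutex_unlock(&z->lock);
	}
	return i;
}
```

Both free variants leave the counter in the same state; they differ
only in how long other threads can be kept waiting on the lock.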

My intuition is that this patch prevents smaller requests from waiting
on larger ones. While grabbing and releasing the lock within the loop
adds a few instructions, it can lower the latency of a particular
thread's allocation, which is often on the thread's critical path.
Lowering the average latency for allocation can increase system throughput.
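
The tradeoff can be put into a rough back-of-the-envelope model. The
sketch below is only illustrative: the function names, the parameters,
and the assumption that a waiter wins the lock at the next release
(which a plain spinlock does not guarantee) are mine and are not
measurements.

```c
/*
 * Toy model; every value is in arbitrary, hypothetical time units.
 *   batch     - pages handled by one bulk call
 *   per_page  - time spent on one page while holding the lock
 *   lock_cost - extra time for one unlock/lock pair
 */

/* Lock held across the batch: a late arrival may wait for all of it. */
static long worst_wait_coarse(long batch, long per_page)
{
	return batch * per_page;
}

/*
 * Lock retaken per page: a late arrival waits for roughly one
 * iteration, assuming it acquires the lock at the next release.
 */
static long worst_wait_fine(long per_page, long lock_cost)
{
	return per_page + lock_cost;
}

/* Uncontended time for the whole batch with per-page locking. */
static long batch_time_fine(long batch, long per_page, long lock_cost)
{
	return batch * (per_page + lock_cost);
}
```

With the made-up values batch = 16, per_page = 10 and lock_cost = 2,
the worst-case wait drops from 160 to 12 units, while the batch itself
grows from 160 to 192 units.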

More detailed information, including data from the tests I ran to
validate this change, is available at
http://www.cs.utexas.edu/~porterde/kernel-patch.html .

Thanks in advance for your consideration and feedback.

Don

Signed-off-by: Donald E. Porter <porterde@xxxxxxxxxxxxx>

---

diff -uprN linux-2.6.23.1/mm/page_alloc.c linux-2.6.23.1-opt/mm/page_alloc.c
--- linux-2.6.23.1/mm/page_alloc.c 2007-10-12 11:43:44.000000000 -0500
+++ linux-2.6.23.1-opt/mm/page_alloc.c 2007-10-29 18:29:05.000000000 -0500
@@ -477,19 +477,19 @@ static inline int free_pages_check(struc
 static void free_pages_bulk(struct zone *zone, int count,
 					struct list_head *list, int order)
 {
-	spin_lock(&zone->lock);
 	zone->all_unreclaimable = 0;
 	zone->pages_scanned = 0;
 	while (count--) {
 		struct page *page;
+		spin_lock(&zone->lock);
 
 		VM_BUG_ON(list_empty(list));
 		page = list_entry(list->prev, struct page, lru);
 		/* have to delete it as __free_one_page list manipulates */
 		list_del(&page->lru);
 		__free_one_page(page, zone, order);
+		spin_unlock(&zone->lock);
 	}
-	spin_unlock(&zone->lock);
 }
 
 static void free_one_page(struct zone *zone, struct page *page, int order)
@@ -665,14 +665,17 @@ static int rmqueue_bulk(struct zone *zon
 {
 	int i;
 
-	spin_lock(&zone->lock);
 	for (i = 0; i < count; ++i) {
-		struct page *page = __rmqueue(zone, order);
+		struct page *page;
+		spin_lock(&zone->lock);
+
+		page = __rmqueue(zone, order);
 		if (unlikely(page == NULL))
 			break;
 		list_add_tail(&page->lru, list);
+		spin_unlock(&zone->lock);
 	}
-	spin_unlock(&zone->lock);
+
 	return i;
 }
