Re: Kernel RCU: shrink the size of the struct rcu_head
- From: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx>
- Date: Fri, 23 Oct 2009 08:29:38 -0400
* Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
On Wed, Oct 21, 2009 at 10:53:15AM -0400, Mathieu Desnoyers wrote:
* Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
On Sun, Oct 18, 2009 at 07:29:18PM -0400, Mathieu Desnoyers wrote:
Hi Paul,
I noticed that you already discussed the possibility of shrinking the
struct rcu_head by removing the function pointer.
(http://kernel.org/pub/linux/kernel/people/paulmck/rcutodo.html)
The ideas brought in so far require having per-callback lists, which
involve a bit of management overhead and don't permit keeping the
call_rcu() callbacks in CPU order.
But please note that this is on the "Possibly Dubious Changes" list. ;-)
You might want to look into the Userspace RCU urcu-defer.c
implementation, where I perform pointer encoding to compact the usual
case, expected to be the same callback passed as parameter multiple
times in a row to call_rcu(). This is very typical with multiple free()
calls for different data structures next to each other.
This typically keeps the size of the information to encode per callback
down to a minimum: the size of a single pointer. It would be good to
trace the kernel usage of call_rcu() to see if my assumption holds.
I just thought I should tell you before you start looking at this
issue further.
So the idea is to maintain a per-CPU queue of function pointers, but
with the pointers on this queue encoded to save space, correct?
Yes, exactly.
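To make the encoding idea above concrete, here is a minimal sketch. It is
not a copy of the urcu-defer.c encoding: all names (dq_push, dq_run,
DQ_MARK, DQ_SIZE) are hypothetical, and it assumes a single-threaded toy
queue with a reserved marker value. A run of identical callbacks costs one
slot per entry; only a change of callback costs three.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DQ_SIZE 64                        /* toy capacity, in slots */
#define DQ_MARK ((void *)~(uintptr_t)0)   /* reserved: "a function follows" */

typedef void (*defer_fn)(void *);

union dq_slot {
	void *data;
	defer_fn fct;
};

struct defer_queue {
	union dq_slot q[DQ_SIZE];
	size_t head;         /* next free slot */
	defer_fn last_fct;   /* callback of the most recently queued entry */
};

static void dq_init(struct defer_queue *dq)
{
	dq->head = 0;
	dq->last_fct = NULL;
}

/* Returns the number of slots consumed, or 0 if the queue is full. */
static size_t dq_push(struct defer_queue *dq, defer_fn fct, void *p)
{
	/* A data value equal to the marker must be escaped as well. */
	size_t need = (fct != dq->last_fct || p == DQ_MARK) ? 3 : 1;

	assert(fct != NULL);
	if (dq->head + need > DQ_SIZE)
		return 0;
	if (need == 3) {
		dq->q[dq->head++].data = DQ_MARK;
		dq->q[dq->head++].fct = fct;
		dq->last_fct = fct;
	}
	dq->q[dq->head++].data = p;
	return need;
}

/* Decode and run every entry, then reset. Returns the callback count. */
static size_t dq_run(struct defer_queue *dq)
{
	defer_fn fct = NULL;
	size_t i = 0, n = 0;

	while (i < dq->head) {
		void *p = dq->q[i++].data;

		if (p == DQ_MARK) {
			fct = dq->q[i++].fct;
			p = dq->q[i++].data;
		}
		assert(fct != NULL);
		fct(p);
		n++;
	}
	dq_init(dq);
	return n;
}

/* Example callback for experiments: bumps the int that p points to. */
static void dq_count_cb(void *p)
{
	++*(int *)p;
}
```

Queuing the same callback three times in a row consumes 3 + 1 + 1 slots,
which is the "usual case" being optimized for.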
OK, I will add something to this effect on my rcutodo page.
If I
understand correctly, the user-level rcu-defer implementation relies on
the following:
1. It is illegal to call _rcu_defer_queue() within an RCU read-side
critical section (due to the call to rcu_defer_barrier_thread(),
which in turn calls synchronize_rcu()). This is necessary to
handle queue overflow. (Which appears to be why you introduce
a new API, as it is legal to invoke call_rcu() from within an
RCU read-side critical section.)
When dealing with queue overflow, I figured we have 4 alternatives.
Either:
1, 2, 3) We proceed to execution of {the single, all, thread local}
callback(s) on the spot after a synchronize_rcu().
4) We expand the queue by allocating more memory.
The idea of pointer encoding to save space could be used with any of 1,
2, 3, or 4. As you say, call_rcu() requires (4), because it tolerates
being called from an RCU read-side C.S. Options 1, 2 and 3 are
incompatible with read-side C.S. context because they require calling
synchronize_rcu() within the C.S., which would deadlock: the grace period
cannot end until the calling reader itself leaves its C.S.
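As a rough illustration of a fixed-size, thread-local queue that runs its
own callbacks on overflow, consider the sketch below. Everything in it is
hypothetical (toy_defer, toy_drain, TOY_QUEUE_LEN), and
toy_synchronize_rcu() is only a counting stand-in for a real grace-period
wait, so this shows the control flow rather than a working RCU.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QUEUE_LEN 4   /* tiny on purpose, to make overflow easy to see */

typedef void (*defer_fn)(void *);

struct toy_entry {
	defer_fn fct;
	void *p;
};

struct toy_queue {
	struct toy_entry e[TOY_QUEUE_LEN];
	size_t n;                      /* entries currently queued */
	unsigned long grace_periods;   /* how often this thread had to wait */
};

/*
 * Stand-in for synchronize_rcu(). A real one blocks until all
 * pre-existing readers are done, which is why the caller must not be
 * inside a read-side critical section.
 */
static void toy_synchronize_rcu(struct toy_queue *q)
{
	q->grace_periods++;
}

/* Wait for a grace period, then run this thread's own callbacks. */
static void toy_drain(struct toy_queue *q)
{
	toy_synchronize_rcu(q);
	for (size_t i = 0; i < q->n; i++)
		q->e[i].fct(q->e[i].p);
	q->n = 0;
}

/* Queue a callback; on overflow the enqueuing thread pays for the drain. */
static void toy_defer(struct toy_queue *q, defer_fn fct, void *p)
{
	assert(fct != NULL);
	if (q->n == TOY_QUEUE_LEN)
		toy_drain(q);
	q->e[q->n].fct = fct;
	q->e[q->n].p = p;
	q->n++;
}

/* Example callback for experiments: bumps the int that p points to. */
static void toy_count(void *p)
{
	++*(int *)p;
}
```

With a queue length of 4, the fifth toy_defer() call triggers one
grace-period wait and runs the first four callbacks before queuing the
fifth.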
Now, there is a rationale for the choice of (3) in my urcu-defer
implementation:
* It's how I can deal with memory full (-ENOMEM) without letting the
system die with exit(). How does the kernel call_rcu() deal with this
currently ? BUG_ON, WARN_ON ?
It doesn't have to do anything -- the caller of call_rcu() is responsible
for allocating any required memory. So call_rcu() never allocates memory
and thus never needs to worry about a memory-allocation failure.
I see. So it puts the burden of memory allocation onto the creation of
the original object we are dealing with. I understand why it's done like
that now: it does not add another memory allocation failure point.
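The point about allocation can be sketched in userspace as follows. This
is a single-threaded toy, not kernel code: the toy_ names are made up, and
the two-pointer head simply mirrors the next/func layout whose size this
thread is about. Because the head is embedded in the object, enqueuing
only links memory the caller already owns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct foo {
	int value;
	struct toy_rcu_head rcu;   /* callback storage travels with the object */
};

static struct toy_rcu_head *toy_cb_list;   /* pending callbacks */
static int toy_freed;                      /* objects reclaimed so far */

/* Enqueue without allocating: only links the caller-provided head. */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = toy_cb_list;   /* LIFO for brevity, unlike a real queue */
	toy_cb_list = head;
}

/* Recover the enclosing object from its embedded head and free it. */
static void foo_reclaim(struct toy_rcu_head *head)
{
	struct foo *f = toy_container_of(head, struct foo, rcu);

	free(f);
	toy_freed++;
}

/* Pretend a grace period has elapsed and invoke everything pending. */
static int toy_invoke_callbacks(void)
{
	int n = 0;

	while (toy_cb_list) {
		struct toy_rcu_head *head = toy_cb_list;

		toy_cb_list = head->next;   /* read before func() frees the object */
		head->func(head);
		n++;
	}
	return n;
}
```

The only allocation is the malloc() of struct foo itself, so the enqueue
path has no failure case to report.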
* It acts as a rate limiter for urcu_defer_queue(). Basically, if a
thread starts enqueuing callbacks too fast, it will eventually fill its
queue and have to empty it itself. AFAIK, it's not possible to do that
if you allow call_rcu() to be called from a read-side C.S.
Yep! ;-)
From a userland point of view, it would be good to provide an API that
allows rate-limiting, especially to deal with DoS types of attacks.
I just renamed rcu_defer_queue() to defer_rcu() and removed the define
for call_rcu(), because this last mapping introduced an API with the
same name as the kernel API, but with different parameters.
So the idea is to have:
* defer_rcu(fct, p): fixed-sized queue rcu work deferral. Cannot be
called from rcu read-side C.S..
* defer_rcu_ratelimit(fct, p, rl): same as above, but with added rate
limiter callback.
and, eventually, to add back a standard call_rcu(), which can be called
from within RCU read-side C.S., but which provides no rate limiting
whatsoever.
I could even extend rcu_defer_queue() to take a second rate-limiter
callback, which would check if the thread went over some threshold and
give a more precise limit (e.g. amount of memory to be freed) on the
rate than the "4096 callbacks in flight max", which was chosen through
benchmarks but is a bit arbitrary in terms of overall callback effect.
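One possible shape for such a rate-limiter callback is sketched below. The
signature of the limiter is not given in this thread, so the ratelimit_fn
type (return nonzero to request a flush), the byte-budget policy, and all
rl_ and blob_ names are assumptions made for illustration. The flush omits
the grace-period wait a real implementation would need.

```c
#include <assert.h>
#include <stddef.h>

#define RL_QUEUE_LEN 8
#define BYTE_BUDGET 1024   /* hypothetical cap on memory awaiting release */

typedef void (*defer_fn)(void *);
typedef int (*ratelimit_fn)(void *p);   /* assumed: nonzero means "flush now" */

struct rl_entry {
	defer_fn fct;
	void *p;
};

static struct rl_entry rl_queue[RL_QUEUE_LEN];
static size_t rl_len;
static int rl_flushes;
static size_t pending_bytes;

/* Run all queued callbacks (a real version waits for a grace period first). */
static void rl_flush(void)
{
	for (size_t i = 0; i < rl_len; i++)
		rl_queue[i].fct(rl_queue[i].p);
	rl_len = 0;
	rl_flushes++;
}

/* Queue a callback, then let the limiter decide whether to flush early. */
static void toy_defer_rcu_ratelimit(defer_fn fct, void *p, ratelimit_fn rl)
{
	assert(fct != NULL);
	if (rl_len == RL_QUEUE_LEN)
		rl_flush();
	rl_queue[rl_len].fct = fct;
	rl_queue[rl_len].p = p;
	rl_len++;
	if (rl && rl(p))
		rl_flush();
}

struct blob {
	size_t size;
};

/* Limiter: account for the new object and report when over budget. */
static int blob_over_budget(void *p)
{
	pending_bytes += ((struct blob *)p)->size;
	return pending_bytes > BYTE_BUDGET;
}

/* Callback: a real one would free the object; this only fixes accounting. */
static void blob_release(void *p)
{
	pending_bytes -= ((struct blob *)p)->size;
}
```

With a 1024-byte budget, queuing three 400-byte objects flushes on the
third call, well before the queue-length limit is reached.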
How important is it to permit enqueuing callbacks from within rcu
read-side C.S. in terms of real-life usage ? If it is really that
important to fill this use-case, then I could have a mode for call_rcu()
that expands the RCU callback queue upon overflow. But as I argue above,
I really prefer the control we have with a fixed-sized queue.
There are occurrences of this in the Linux kernel. In theory, you could
always hang onto the object until leaving the outermost RCU read-side
critical section, but in practice this is not always consistent with
good software-engineering practice.
One use case is when you have an RCU-protected list, each element of
which has an RCU-protected list hanging off it. In this case, you might
scan the upper-level list under RCU protection, but during the scan you
might need to remove elements from the lower-level list and pass them
to call_rcu().
So it really needs to be legal for call_rcu() to be invoked from within
an RCU read-side critical section.
I agree that not being able to use call_rcu() from a read-side C.S.
could really be a problem here.
This is why I think having two interfaces seems like a good tradeoff:
one that permits calling call_rcu() from within a read-side C.S. but
requires the added struct rcu_head, and another that uses per-thread
queues with a maximum size and rate limiting but cannot be used in a
read-side C.S.
2. It is OK to wait for a grace period when a thread calls
rcu_defer_unregister_thread() while exiting. In the kernel,
this is roughly equivalent to the CPU_DYING notifier, which
cannot block, thus cannot wait for a grace period.
I could imagine copying the per-CPU buffer somewhere, though
my experience with the RCU/CPU-hotplug interface does not
encourage me in this direction. ;-)
As you say, we don't _have_ to empty the queue before putting a
thread/cpu offline. We could simply copy the unplugged cpu queue to an
orphan queue, as you currently do in your implementation. I agree that
it would be more suitable to the cpu hotplug CPU_DYING execution
context, due to its inherent trickiness.
Especially if you want something like rcu_barrier() to continue working.
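Moving a dying CPU's (or exiting thread's) callbacks to an orphan queue
without blocking can be sketched as a constant-time list splice. The
cb_list type and function names below are hypothetical, and the locking a
real implementation would need around both lists is left out.

```c
#include <assert.h>
#include <stddef.h>

struct cb {
	struct cb *next;
};

struct cb_list {
	struct cb *head;
	struct cb **tail;   /* address of the terminating NULL link */
	long len;
};

static void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->len = 0;
}

/* Append one callback, preserving queuing order. */
static void cb_list_add(struct cb_list *l, struct cb *c)
{
	c->next = NULL;
	*l->tail = c;
	l->tail = &c->next;
	l->len++;
}

/*
 * Append everything on 'from' to 'to' in O(1) and leave 'from' empty.
 * No callback is invoked and nothing waits for a grace period, so this
 * is usable from a context that cannot block.
 */
static void cb_list_splice(struct cb_list *to, struct cb_list *from)
{
	if (!from->head)
		return;
	*to->tail = from->head;
	to->tail = from->tail;
	to->len += from->len;
	cb_list_init(from);
}
```

Because the length moves along with the callbacks, a barrier-style
operation that counts outstanding callbacks can still find them on the
orphan list.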
Hmmm... Can user applications unload dynamically linked libraries? ;-)
Yes. With dlclose(). But I expect library developers to call
rcu_defer_barrier() in their destructor if they have queued any work.
(/me sneaks a README update in the git tree to that effect) ;)
Thanks,
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68