Re: [patch 06/11] syslets: core, documentation



On Tue, 13 Feb 2007, Ingo Molnar wrote:


* Davide Libenzi <davidel@xxxxxxxxxxxxxxx> wrote:

+The Syslet Atom:
+----------------
+
+The syslet atom is a small, fixed-size (44 bytes on 32-bit) piece of
+user-space memory, which is the basic unit of execution within the syslet
+framework. A syslet represents a single system-call and its arguments.
+In addition it also has condition flags attached to it that allows the
+construction of larger programs (syslets) from these atoms.
+
+Arguments to the system call are implemented via pointers to arguments.
+This not only increases the flexibility of syslet atoms (multiple syslets
+can share the same variable for example), but is also an optimization:
+copy_uatom() will only fetch syscall parameters up until the point it
+meets the first NULL pointer. 50% of all syscalls have 2 or less
+parameters (and 90% of all syscalls have 4 or less parameters).

Why do you need to have an extra memory indirection per parameter in
copy_uatom()? [...]

yes. Try to use them in real programs, and you'll see that most of the
time the variable an atom wants to access should also be accessed by
other atoms. For example a socket file descriptor - one atom opens it,
another one reads from it, a third one closes it. By having the
parameters in the atoms we'd have to copy the fd to two other places.

Yes, of course we have to support the indirection, otherwise chaining
won't work. But ...



I can understand that chaining syscalls requires variable sharing, but
the majority of the parameters passed to syscalls are just direct
ones. Maybe a smart method that allows you to know if a parameter is a
direct one or a pointer to one? An "unsigned int pmap" where bit N is
1 if param N is an indirection? Hmm?

adding such things tends to slow down atom parsing.

I really think it simplifies it. You simply *copy* the parameter (I'd say
that 70+% of cases falls inside here), and if the current "pmap" bit is
set, then you do all the indirection copy-from-userspace stuff.
It also simplify userspace a lot, since you can now pass arrays and
structure pointers directly, w/out saving them in a temporary variable.




Sigh, I really dislike shared userspace/kernel stuff, when we're
transfering pointers to userspace. Did you actually bench it against
a:

int async_wait(struct syslet_uatom **r, int n);

I can fully understand sharing userspace buffers with the kernel, if
we're talking about KB transferd during a block or net I/O DMA
operation, but for transfering a pointer? Behind each pointer
transfer(4/8 bytes) there is a whole syscall execution, [...]

there are three main reasons for this choice:

- firstly, by putting completion events into the user-space ringbuffer
the asynchronous contexts are not held up at all, and the threads are
available for further syslet use.

- secondly, it was the most obvious and simplest solution to me - it
just fits well into the syslet model - which is an execution concept
centered around pure user-space memory and system calls, not some
kernel resource. Kernel fills in the ringbuffer, user-space clears it.
If we had to worry about a handshake between user-space and
kernel-space for the completion information to be passed along, that
would either mean extra buffering or extra overhead. Extra buffering
(in the kernel) would be for no good reason: why not buffer it in the
place where the information is destined for in the first place. The
ringbuffer of /pointers/ is what makes this really powerful. I never
really liked the AIO/etc. method /event buffer/ rings. With syslets
the 'cookie' is the pointer to the syslet atom itself. It doesnt get
any more straightforward than that i believe.

- making 'is there more stuff for me to work on' a simple instruction in
user-space makes it a no-brainer for user-space to promptly and
without thinking complete events. It's also the right thing to do on
SMP: if one core is solely dedicated to the asynchronous workload,
only running on kernel mode, and the other code is only running
user-space, why ever switch between protection domains? [except if any
of them is idle] The fastest completion signalling method is the
/memory bus/, not an interrupt. User-space could in theory even use
MWAIT (in user-space!) to wait for the other core to complete stuff.
That makes for a hell of a fast wakeup.

That makes also for a hell ugly retrieval API IMO ;)
If it'd be backed up but considerable performance gains, then it might be OK.
But I believe it won't be the case, and that leave us with an ugly API.
OTOH, if noone else object this, it means that I'm the only wierdo :) and
the API is just fine.




- Davide


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages