Re: shared memory between processes

From: Kasper Dupont (kasperd_at_daimi.au.dk)
Date: 05/04/05


Date: Wed, 04 May 2005 19:42:57 +0200


"Peter T. Breuer" wrote:
>
> [unaligned variables]
>
> That's never happened to me in user-space. I thought the kernel did
> some magic to fix that up so that users only see confusing results, they
> don't actually cause a bus fault.

The reason you don't see it is that the compiler
doesn't lay out variables that way. You have to do
something to actually get an unaligned variable.
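
One way to "do something" is a packed struct. The sketch
below assumes GCC or Clang, since __attribute__((packed))
is a compiler extension and not standard C. The struct
and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ordinary struct: the compiler pads after `tag` so that `value`
 * starts at an offset that is a multiple of its alignment. */
struct natural {
    char     tag;
    uint32_t value;
};

/* Packed struct (GCC/Clang extension, an assumption of this sketch):
 * the padding is removed, so `value` starts at offset 1 and is
 * misaligned for a 4-byte type. */
struct __attribute__((packed)) squeezed {
    char     tag;
    uint32_t value;
};

/* Compile-time check that packing really removed the padding. */
static_assert(offsetof(struct squeezed, value) == 1,
              "packed layout should place value right after tag");

/* Offset of `value` in each layout. */
size_t natural_offset(void)  { return offsetof(struct natural, value); }
size_t squeezed_offset(void) { return offsetof(struct squeezed, value); }
```

With the natural layout the offset is a multiple of the
alignment of uint32_t. With the packed layout it is 1.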

If you somehow manage to do an unaligned access, what
happens depends on the architecture. Some CPUs will
fault; others will do the access in multiple steps,
which costs performance. For that reason, even if the
architecture allows unaligned access, the compiler
will avoid it.

If the CPU doesn't allow unaligned access it can be
faked by the kernel. But that is going to be even
slower than if the CPU itself did it in multiple
steps. This is something you'd never do unless you
needed it for some weird compatibility reasons.

Now what Todd told us about was not an unaligned
access causing a bus fault, but rather an unaligned
access that was no longer atomic, thus causing
synchronization problems on an SMP system. There is
nothing the kernel could do about that, because the
kernel doesn't even get involved when it happens.
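
Since the kernel cannot help, the program has to make
sure the shared variable is aligned itself. A minimal
sketch using C11 <stdatomic.h>, which did not exist when
this thread was written. The helper names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if p is suitably aligned to be used as an
 * atomic 32-bit variable on this platform. */
int is_aligned_for_atomic_u32(const void *p) {
    return (uintptr_t)p % _Alignof(_Atomic uint32_t) == 0;
}

/* Atomically increments a shared counter and returns the previous
 * value. The assert catches a misaligned counter, for example one
 * placed at an odd offset inside a shared memory segment. */
uint32_t counter_bump(_Atomic uint32_t *counter) {
    assert(is_aligned_for_atomic_u32(counter));
    return atomic_fetch_add(counter, 1);
}
```

The alignment check is the part that matters here: an
offset chosen by hand inside a shared segment gets no
help from the compiler.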

If the variable is too large to be transferred from/
to a register in a single instruction, the compiler
may generate multiple instructions for that. The
same approach could be used if the variable was
unaligned and the CPU didn't support that. And that
is going to be faster than doing it inside the
kernel. Unfortunately, that means you need to know
statically whether it is required.
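
A rough illustration of doing the access in multiple
steps in user space. The function names are made up. The
choice is static in the sense that the source picks one
of these loaders at compile time and does not rely on a
fault handler at run time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Builds a 32-bit value from four single-byte loads, reading the
 * bytes in little-endian order. Byte loads are always aligned, so
 * this works at any address on any CPU. */
uint32_t load_u32_le(const unsigned char *p) {
    assert(p != NULL);
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Same idea in native byte order: memcpy lets the compiler pick
 * the cheapest instruction sequence that is safe for the target. */
uint32_t load_u32_native(const void *p) {
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Neither version is atomic. Each is just a sequence of
smaller accesses that the CPU is guaranteed to accept.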

If the access to the variable requires multiple
instructions, again you don't have atomicity. And
if you do that in user mode, you may even run into
trouble on a single-CPU system, because you could
take an interrupt between two of the instructions.
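
The effect can be simulated in a single thread. The
sketch below models a 64-bit variable as two 32-bit
halves and performs a read between the two stores, at
the point where an interrupt or another CPU could look
at it. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value stored as two halves, as on a CPU that needs two
 * instructions to write it. */
struct split64 {
    uint32_t lo;
    uint32_t hi;
};

static uint64_t split_load(const struct split64 *v) {
    return (uint64_t)v->hi << 32 | v->lo;
}

/* Replaces `old` with `next` in two steps and returns what an
 * observer would see if it read the variable between the steps. */
uint64_t torn_read(uint64_t old, uint64_t next) {
    struct split64 v = { (uint32_t)old, (uint32_t)(old >> 32) };
    uint64_t seen;

    v.lo = (uint32_t)next;          /* first instruction            */
    seen = split_load(&v);          /* interrupt / other CPU reads  */
    v.hi = (uint32_t)(next >> 32);  /* second instruction           */

    assert(split_load(&v) == next); /* consistent once both are done */
    return seen;
}
```

Going from 0x00000000FFFFFFFF to 0x0000000100000000, the
observer in the middle sees 0, which is neither the old
value nor the new one.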

-- 
Kasper Dupont -- who spends too much time on usenet.
Note to self: Don't try to allocate 256000 pages
with GFP_KERNEL on x86.

