Re: thread memory size
- From: Jan Helgesen <spam@xxxxxxxxxx>
- Date: Sat, 10 Oct 2009 10:34:13 +0200
David Schwartz wrote:
You are still confused. "Lock-free" is a misnomer. There are still
locks, they are just not locks that block. For example, the primitive
lock-free algorithms are made out of contain atomic 'compare and
replace' operations or atomic 'swap' operations. These operations
internally contain locks, for example, they lock cache lines.
The fact that in the strictest technical terms a CAS operation does
create a lock, but does not block, is true. But a CAS operation returns
if it fails to succeed. That gives it the opportunity to try again. At
the hardware level it will succeed, eventually. Even with four billion
threads contending for the value, it will succeed almost instantly (It
does not take that long to 4 billion such operations)
Also, CAS and similar operations depend on the cpu architecture and
which specific instruction it has implemented. The simpler ones allow
for less concurrency than the more advanced ones. The latest addition of
advanced concurrent CAS operations are AMDs ASF for AMD64 (Advanced
Synchorinisation Facility). Its experimental yet, but seems to promise
even more concurrency within CAS operations than previous types of CAS
operation.
On a software level, the operations are not single instruction atomic.
This means that a critical region can obtain one lock, then obtain
another lock and cause a deadlock. Even how one accesses a single
variable can be subject to the lock strategy chosen. If one chooses an
operation that specifically uses a CAS instruction only, then its the
same as if it were hardware, but if one uses critical regions or the
like, then there is a high probability it will cause problems.
So even if there is locking in the cpu, it is irrelevant. The reason for
this is that a (software level) lock-free algorithm, can use CAS
operations and obtain a much higher level of concurrency than a lock
based algorithm. Now, with a message passing implementation that is so
simple, in both implementation and model, it will be able to scale and
perform pretty well using these techniques.
+++++++++++++++++++++++++++++++++++
Now lets get back on track again. Unfortunately this discussion have
become a little pedantic and possibly a little self serving, that is
because it was sidetracked into discussing technical details of lesser
importance.
If I were to implement this within one process (i.e. threads shared
memory), I would probably implement a message queue algorithm, that uses
optimistic locking by help of CAS operations or using a state machine.
It depends on which gives the best performance.
That's a good implementation for single process communication between
threads. It can use message passing which reduced the thread level
contention one would normally get if one used lock based algorithms. So
it increases concurrency, scalability and performance by concurrency. It
does come with a small threads level decrease in efficiency, but the
scalability and concurrency outweighs that disadvantages by so much that
it is nothing to care about. As concurrency mechanisms increase its
performance other strategies can be chosen.
But that misses the functional requirement of being able to communicate
between processes on same or different machines. I.e the truly scalable
solution (the ultimate scalability would be atomic computing, but that's
a different matter). To be able to communicate between processes on
different machines network communication is need, that frees us from the
concurrency algorithms one normally would use if the program where
running in a single process. If set up properly, i,e, fiber cables for
networking and so on
But, using network to communicate messages within the same machine or
process is not very efficient so the implementation could be smart and
choose the mechanism depending on the proximity of the peer process.
That means it would use network, Shared Memory or process memory as needed.
But that leads us to a different discussion, how are the processes
interconnected, by 100/1000Mb network, by fibre, by supercomputer fibre
cable switches etc. It all depends on the efficiency of the
communicating technology chosen. With s slow network connection or IP
stack implementation one might want to try to use a Shared Memory
message passing implementation to make it more efficient.
The forest is this -- message passing is inefficient.
for the sake of clarity can you give some definitions and explain a
couple of things before we continue:
how would you implement it with shared memory and without message
passing? (overall solution)
what do you define as efficient and what do you define as inefficient?
The reason
people use it for performance and scalability is because if you limit
yourself to message passing only, you can get the concurrency of
threads without the inherent performance penalties of threads.
And that is not a good reason to use message passing?
Right, the advantage of message passing is that it doesn't have the
limitations of shared memory.
What limitations are you thinking of?
You would suffer all the disadvantages of
threads and get absolutely *none* of the benefits.
And what do you define as the disadvantages and benefits?
Regards
Jan
.
- Follow-Ups:
- Re: thread memory size
- From: David Schwartz
- Re: thread memory size
- References:
- thread memory size
- From: Jan Helgesen
- Re: thread memory size
- From: David Schwartz
- Re: thread memory size
- From: Jan Helgesen
- Re: thread memory size
- From: Rainer Weikusat
- Re: thread memory size
- From: David Schwartz
- Re: thread memory size
- From: Jan Helgesen
- Re: thread memory size
- From: David Schwartz
- Re: thread memory size
- From: Jan Helgesen
- Re: thread memory size
- From: David Schwartz
- Re: thread memory size
- From: Jan Helgesen
- Re: thread memory size
- From: David Schwartz
- Re: thread memory size
- From: Jan Helgesen
- Re: thread memory size
- From: David Schwartz
- Re: thread memory size
- From: Jan Helgesen
- Re: thread memory size
- From: David Schwartz
- Re: thread memory size
- From: Jan Helgesen
- Re: thread memory size
- From: David Schwartz
- Re: thread memory size
- From: Jan Helgesen
- Re: thread memory size
- From: David Schwartz
- thread memory size
- Prev by Date: Re: typical size of task_struct
- Next by Date: Re: thread memory size
- Previous by thread: Re: thread memory size
- Next by thread: Re: thread memory size
- Index(es):
Relevant Pages
|