Re: [RFC][PATCH 04/20] pspace: Allow multiple instaces of the process id namespace
- From: Herbert Poetzl <herbert@xxxxxxxxxxxx>
- Date: Mon, 20 Feb 2006 16:08:11 +0100
On Mon, Feb 20, 2006 at 05:11:40PM +0300, Kirill Korotaev wrote:
I would disagree with you. These discussions IMHO led us to the wrong
direction.
Can I ask a bunch of questions which are related to other
virtualization issues, but which are not addressed by Eric anyhow?
- How are you planning to make hierarchical namespaces for such
resources as IPC? Sockets? Unix sockets?
in the same way as for resources or filesystems -
management is within the parent, usage within the
child
So taking example with IPC, you propose the following:
- parent is able to setup limits on segments, sizes, messages etc.
- parent doesn't see child objects itself, i.e. it is unable to share
segments with a child, send messages to child etc.
Am I correct?
that is limits, not resources, but for them, you are
right ...
Provided I got it correctly, how does this differ from the situation,
when one container is granted rights to manage another container?
So where is hierarchy?
this is permissions, yet another *space we have not
really covered that well ...
Moreover, granting/revoking rights is more fine grained I suppose. And
it is more secure, since it uses the model "allow only things which are
safe", while hierarchy uses the model "allow everything" to do with a child
and leads to possible DoS.
DoS of what? of course a hierarchical model would allow
to control 'everything' for/within a child space. this
doesn't mean that it would escape the refinements it
has received from the parent ...
in this way it is _very_ different from one container
being able to administrate a completely different
container (as there is no inheritance), which would
easily cause the aforementioned DoS situation ...
Process tree is hierarchical by its nature. But what is
hierarchical IPC and other resources?
for resources it's simple, you have to 'give away'
a certain share to your children, which in turn will
enable them to utilize them up to the given amount,
or (depending on the implementation) up to the total
amount of the parent.
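A minimal sketch of the 'give away a share' model, in Python for brevity. The `Space` class and its fields are hypothetical and not taken from any of the patches discussed in this thread; it assumes the variant where a child's usage is also counted against every ancestor.

```python
class Space:
    """Hypothetical resource space: a limit, a usage counter, and a parent."""

    def __init__(self, limit, parent=None):
        self.limit = limit
        self.used = 0
        self.parent = parent

    def ancestors_and_self(self):
        node = self
        while node is not None:
            yield node
            node = node.parent

    def charge(self, amount):
        """Charge `amount` to this space and every ancestor, or refuse."""
        chain = list(self.ancestors_and_self())
        # Check every level first so a refusal leaves no partial charge.
        if any(s.used + amount > s.limit for s in chain):
            return False
        for s in chain:
            s.used += amount
        return True


root = Space(limit=100)
child = Space(limit=40, parent=root)

ok_within_share = child.charge(30)   # fits the child's share of 40
denied_by_share = child.charge(20)   # 30 + 20 would exceed the child's limit
```

Here the child can use up to its given amount, and whatever it uses is visible in the parent's accounting, which is what keeps management in the parent and usage in the child.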
Again, how does this differ from the situation when one container is
granted the right to manage another one? In this case it grants some portion of
its resources to anyone it wishes.
well, if one container can give away resources to
another one, and can maintain that context, as well
as inspect tasks and resources inside that context,
then the structure is hierarchical in my book :)
Take a look at this from another angle:
You have a child container, which was granted 1/2 of your resources.
But since parent consumed 3/4 of it, child will never be able to get
his 1/2 portion. And child will be unable to find out the reason for
the resource allocation denials.
well, this is called overbooking, and if you want
to allow that, you have to accept the consequences
(doesn't mean I'm against it :)
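The overbooking case from the example can be worked through as simple arithmetic. This is only a sketch under the assumption that each level has a limit and a usage figure and that the tightest level along the chain decides; `headroom` is a hypothetical helper, not an existing interface.

```python
from fractions import Fraction


def headroom(chain):
    """Return what is still allocatable at the end of `chain`.

    `chain` lists (limit, used) pairs from the root down to the child;
    `used` at each level includes everything charged by its descendants.
    """
    return min(limit - used for limit, used in chain)


parent = (Fraction(1), Fraction(3, 4))   # parent consumed 3/4 of the total
child = (Fraction(1, 2), Fraction(0))    # child was granted 1/2, uses nothing yet

child_headroom = headroom([parent, child])
print(child_headroom)   # prints 1/4
```

The child sees a limit of 1/2 but can only ever get 1/4, and nothing in its own accounting explains why; that is the consequence of overbooking that has to be accepted or reported through some interface.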
(check out ckrm as a proof of concept, and example
how it should not be done :)
let's be more friendly to each other :)
I'm not being 'unfriendly' here, we all agree that
the basic idea of ckrm was really great, but the
implementation and complexity makes it somewhat
unusable ...
And no one ever told me why we need hierarchy at all. No _real_
use cases. But it's ok.
there are many use cases here, first, the VPS within
a VPS (of course, not the most useful one, but the
most obvious one), then there are all kinds of test,
build and security scenarios which can benefit from
hierarchical structures and 'delegated' administrative
power (just think web space management)
If you are talking about management, then see my prev paragraph.
Rights can be granted. Can you provide some other example, what do you
want from hierarchy?
see my reply to your paragraph :)
- Eric wants to introduce name spaces, but totally forgets how much
they are tied with each other. IPC refers to netlinks, network
refers to proc and sysctl, etc. There are only rare cases where you will
be able to use namespaces without full OpenVZ/vserver containers.
well, we already visited the following:
- filesystem namespaces (works quite fine completely
independent of all others)
it is tightly interconnected with unix sockets, proc, sysfs, ipc, and
I'm sure something else :)
well, for procfs and sysfs that does not come as a
surprise, as they _are_ filesystems after
all, named sockets do belong there too, for ipc
I do not see the direct connection as long as
the elements are not represented by files ...
Herbert, Eric, I'm not against namespaces. Actually OpenVZ doesn't care
whether we have single container or namespaces, I'm just trying to
show you that they are not as separate as you are
trying to think of them.
I guess all of the involved parties _know_ that
they are somewhat interconnected, and as I said
several times now, I think we should try to split
them and separate 'em wherever possible, sometimes
even by changing existing semantics and structures
- pid spaces (again they are quite fine without any
other namespace)
only if we remove all these pid uses from fown, netlinks etc.
again, some cleanup is probably required ...
- resource spaces (another one which makes perfect
sense to have without the others)
which one? give me an example please.
- cpu resources (scheduler)
the fact that some things like proc or netlink is tied
to networking and ipc IMHO is just a sign that those
interfaces need revisiting and proper isolation or
virtualization ...
it needs virtualization, really. But being virtualized they are still
tied to the subsystems they were.
- How long these namespaces live? And which management interface
each of them will have?
well, the lifetime of such a space is very likely to
be at least the time of the users, including all
created sockets, file-handles, ipc resources, etc ...
So if you have a socket in TCP_FIN_WAIT1 state, which can live long
time, what do you do with it?
IMHO you have two options there:
- zap the sockets when you destroy the context
(a machine which is rebooted will do the same :)
- keep them hanging around for some days, until the
timeout is reached and they go away quite naturally
note: this is something which does not even have to
be a design decision right now, as the zapping can
be done later quite easily ...
Full example: the process dies, but network space is still alive due
to such a socket. You won't be able to reuse the address:port until it
has died. I'm curious how you propose to handle similar issues in
separate namespaces?
see above, btw, the network space would only be
partially alive, i.e. no new sockets could be created
within it, or it might even be persistent and you can
'handle' the sockets after a restart as if they were
on a 'normal' machine ...
(btw, this issue is quite common on a real machine, no?)
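To make the two options and the 'partially alive' space concrete, here is a rough sketch. `NetSpace`, its fields and the time units are all hypothetical; this is not how any existing implementation tracks sockets, just the lifetime rule spelled out.

```python
class NetSpace:
    """Hypothetical network space that outlives its tasks while sockets remain."""

    def __init__(self):
        self.tasks = 0
        self.sockets = {}   # socket name -> expiry time (arbitrary units)

    def add_socket(self, name, expires_at):
        if self.tasks == 0:
            # A space without tasks is only partially alive: nothing new.
            raise RuntimeError("no new sockets in a space without tasks")
        self.sockets[name] = expires_at

    def alive(self):
        return self.tasks > 0 or bool(self.sockets)

    def zap(self):
        """Option 1: drop all sockets when the context is destroyed."""
        self.sockets.clear()

    def expire(self, now):
        """Option 2: let sockets go away once their timeout is reached."""
        self.sockets = {n: t for n, t in self.sockets.items() if t > now}


space = NetSpace()
space.tasks = 1
space.add_socket("fin_wait1", expires_at=10)
space.tasks = 0                       # the last process exits

lingering = space.alive()             # socket keeps the space around
space.expire(now=5)
still_there = space.alive()           # timeout not reached yet
space.expire(now=10)
gone_by_timeout = not space.alive()   # option 2 has run its course

other = NetSpace()
other.tasks = 1
other.add_socket("fin_wait1", expires_at=10)
other.tasks = 0
other.zap()
gone_by_zap = not other.alive()       # option 1, immediate
```

Either way the address:port becomes reusable once the space is no longer alive, and zapping can be added on top of the timeout behaviour later without changing the lifetime rule itself.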
Also as a continuation of this example, if all the processes exited,
how can you manage namespaces which leaked? where should you go to see
these sockets if /proc is tightly related to pspace on the task, but
there are no tasks?
that is where an administrative interface makes sense,
you can review and investigate those spaces and take
proper actions if you like ...
So I really hate that we concentrated on discussion of VPIDs,
while there are more general and global questions on the whole
virtualization itself.
well, I was not the one rolling out the 'perfect'
vpid solution ...
ha ha :) won't start flaming with you.
First of all, I don't think syscalls like
"do_something and exec" should be introduced.
Linux-VServer does not have any of those syscalls
and it works quite well, so why should we need such
syscalls?
My question is the same! Why?
because we don't need them?!
I have no problem at all to discuss a general plan
(hey I thought we were already doing so :) or move
to some other area (like networking :)
Yup. Would be nice to switch to networking, IPC or something like this.
best,
Herbert
Kirill