Re: [PATCH 10/30] cr: core stuff



The ability to stream the checkpoint image is, IMHO, invaluable.
It's the unix way (TM) of doing things; it makes the process pipe-able.

You can do many nice things when the checkpoint can be streamed: you
can compress, sign, encrypt etc on the fly without taking additional
diskspace. You can transfer over the network (e.g. for migration),
or store remotely without explicit file system support. You can easily
transform the stream from one c/r version to another etc.
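To make the "on the fly" point concrete, here is a minimal user-space sketch (Python, not part of the patch set). It assumes the image arrives as a plain byte stream; stream_filter() and the chunk size are hypothetical names chosen for illustration. It compresses and hashes the image chunk by chunk, so no intermediate file is needed:

```python
import hashlib
import io
import zlib


def stream_filter(src, dst, chunk_size=64 * 1024):
    """Compress src into dst chunk by chunk.

    Returns the SHA-256 hex digest of the uncompressed input, computed
    during the same pass, so the image never has to be stored in full.
    """
    compressor = zlib.compressobj()
    digest = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        dst.write(compressor.compress(chunk))
    # Emit whatever the compressor is still buffering.
    dst.write(compressor.flush())
    return digest.hexdigest()


# Tiny in-memory demonstration; a real pipeline would pass
# sys.stdin.buffer and sys.stdout.buffer instead.
image = io.BytesIO(b"task-record" * 1000)
out = io.BytesIO()
image_digest = stream_filter(image, out)
```

The same loop shape would apply to encryption or to sending the data over a socket: each stage only ever sees one chunk at a time.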

This should be a design principle. In my experience I never hit a wall
that forced me to "sacrifice" this decision.

sacrificed (read: child can ptrace parent)
Hmmm... if all tasks are created in user space, then this specific case
becomes a no-brainer!

No!

Actually yes :)


A ptraces B. Container is checkpointed.

Kernel realizes ptrace is going on. A and B in theory can have any
relationship.

Consequently, kernel doesn't know in which order to dump A and B.

And there is no such order:
*) A can be parent of B (you dump A, B),
*) A can be child of B (you want to dump B, A, but this conflicts with
->real_parent order)
*) A and B are just tasks (any order).

Current code does not support ptrace(), which has a multitude
of tidbit issues to solve during restart regardless.

However, creating tasks in userspace uses (and will use) only
"real" process relationships, not ptrace relationships, when it
comes to deciding on the fork/clone order.

Technically, that can be done in checkpoint (dumping the task tree)
or in restart-user-space (rearranging the data before fork/clone).
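As a rough illustration of the "rearranging the data before fork/clone" option, the sketch below (Python, user space) derives a creation order from ->real_parent links only. The pid-to-real-parent mapping and the fork_order() name are assumptions made for this example, not the format or API of the actual patches:

```python
from collections import defaultdict, deque


def fork_order(tasks, root_pid):
    """Return pids ordered so that every real parent precedes its children.

    tasks maps pid -> real parent pid. ptrace relationships are
    deliberately not part of the input, so they cannot affect the order.
    """
    children = defaultdict(list)
    for pid, real_parent in tasks.items():
        if pid != root_pid:
            children[real_parent].append(pid)

    order = []
    queue = deque([root_pid])
    while queue:
        pid = queue.popleft()
        order.append(pid)
        # Sorted only to make the result deterministic.
        queue.extend(sorted(children[pid]))
    return order


# pid 1 is the container init. pid 3 is a child of pid 2 and ptraces it;
# the ptrace link is simply absent from the input.
tasks = {1: 0, 2: 1, 3: 2, 4: 1}
order = fork_order(tasks, root_pid=1)
```

Restoring the ptrace relationship itself would then be a separate step once all tasks exist.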


I'm showing that the whole issue can be avoided:

If the issue can be avoided, then why would you need to sacrifice
the stream-ability of the checkpoint image?

*) all tasks are simply created regardless of who is parent of whom
(see kernel_thread())
*) Every task_struct image among other things contains references to
->real_parent and ->parent.
*) After every task is created it's time to change references:
**) look up who is ->real_parent, change ->real_parent _by hand_,
not with some "correct clone(2)" order.
**) look up who is ->parent, change ->parent.
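The steps above can be modelled in user space roughly like this. It is only a sketch: the Task class, the dict-based image records and restore_tasks() are made up for illustration, and the real thing would operate on task_struct inside the kernel:

```python
class Task:
    """Stand-in for a restored task; only the two parent links matter here."""

    def __init__(self, objref):
        self.objref = objref
        self.real_parent = None
        self.parent = None


def restore_tasks(images):
    """images: dicts with 'objref', 'real_parent', 'parent' (ids or None)."""
    # Pass 1: create every task, regardless of who is parent of whom.
    by_ref = {img["objref"]: Task(img["objref"]) for img in images}

    # Pass 2: patch ->real_parent and ->parent by hand using the saved ids.
    for img in images:
        task = by_ref[img["objref"]]
        if img["real_parent"] is not None:
            task.real_parent = by_ref[img["real_parent"]]
        if img["parent"] is not None:
            task.parent = by_ref[img["parent"]]
    return by_ref


# A (objref 2) is a child of B (objref 1) and also ptraces B,
# so B's ->parent points at A. The child is listed first on purpose.
images = [
    {"objref": 2, "real_parent": 1, "parent": 1},
    {"objref": 1, "real_parent": None, "parent": 2},
]
restored = restore_tasks(images)
```

Because nothing is linked until every object exists, the order of the records in the image stops mattering.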

You're probably escaping all of this with object numbers?

(Will be) escaping this by arranging to fork/clone in the proper order.

task_struct and reparenting is just an example.

There is another loop:

struct user_struct => struct user_namespace => struct user_namespace::creator

Before the actual dump, each struct user_struct gets a unique id (objref, whatever)
and is simply dumped regardless of order.

The image of struct user_namespace contains the id of the creator user and is dumped.

On restart:
restart user_ns
restart user
lookup object by creator id
if found, rewrite ->creator
if not found, restore creator user, and rewrite ->creator.

So, yes, if the object number is dumped to disk, you get streamability in
the presence of loops.
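A possible way to picture this for a strictly sequential reader is sketched below. Note that it is a variation on the restart steps above, not the same thing: instead of restoring the creator immediately when it is not found, the fixup is deferred until the creator's record shows up in the stream. The record layout (objref, kind, referenced objref) and all names are assumptions for the example:

```python
# Which pointer field each kind of object carries in this toy model.
FIELD = {"user": "user_ns", "user_ns": "creator"}


def restore_stream(records):
    """Restore objects from (objref, kind, ref) records in arbitrary order."""
    objects = {}   # objref -> restored object (a plain dict here)
    pending = {}   # objref not seen yet -> [(object, field)] waiting for it

    for objref, kind, ref in records:
        obj = {"kind": kind, FIELD[kind]: None}
        objects[objref] = obj

        if ref in objects:
            # Referenced object already restored: rewrite the pointer now.
            obj[FIELD[kind]] = objects[ref]
        else:
            # Not restored yet: remember to rewrite it later.
            pending.setdefault(ref, []).append((obj, FIELD[kind]))

        # Anyone who was waiting for this object can be patched now.
        for waiter, field in pending.pop(objref, []):
            waiter[field] = obj

    if pending:
        raise ValueError("dangling object ids: %s" % sorted(pending))
    return objects


# The loop from above: user 1 -> user_ns 10 -> creator user 1.
records = [(10, "user_ns", 1), (1, "user", 10)]
restored_objs = restore_stream(records)
```

The reader never seeks, so this works on a pipe; the cost is the table of pending fixups held in memory until the stream ends.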

Clever. Just needs a way to quickly look up file position by object id.
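One simple form such a lookup could take is an index built while the image is written, as in the sketch below. The 8-byte record header and the function names are invented for this example, and it assumes a seekable file, so it would not work on a pipe:

```python
import io
import struct

# Hypothetical record header: object id and payload length, little-endian.
HEADER = struct.Struct("<II")


def write_records(out, records):
    """Write (objref, payload) records; return objref -> file offset."""
    index = {}
    for objref, payload in records:
        index[objref] = out.tell()
        out.write(HEADER.pack(objref, len(payload)))
        out.write(payload)
    return index


def read_record(src, index, objref):
    """Seek straight to the record for objref and return its payload."""
    src.seek(index[objref])
    found_ref, length = HEADER.unpack(src.read(HEADER.size))
    if found_ref != objref:
        raise ValueError("index points at the wrong record")
    return src.read(length)


buf = io.BytesIO()
offsets = write_records(buf, [(1, b"user"), (10, b"user_ns")])
payload = read_record(buf, offsets, 10)
```

The index itself would still have to be stored somewhere the reader can find it, for example at a fixed position or at the end of the file.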

BTW, this is why OpenVZ code has a "section" concept.
I hoped it wouldn't be needed.