Re: [RFC v6][PATCH 0/9] Kernel based checkpoint/restart
- From: Oren Laadan <orenl@xxxxxxxxxxxxxxx>
- Date: Mon, 13 Oct 2008 12:12:43 -0400
On Thu, 2008-10-09 at 15:44 +0200, Ingo Molnar wrote:
there might be races as well, especially with proxy state - and
current->flags updates are not serialized.
So maybe it should be a completely separate flag after all? Stick it
into the end of task_struct perhaps.

* Dave Hansen <dave@xxxxxxxxxxxxxxxxxx> wrote:
What do you mean by proxy state? nsproxy?

Ingo Molnar wrote:
it's a concept: one task installing some state into another task (which
state must be restored after a checkpoint event), while that other task
is running. Such as a pi-futex state for example.
So a task can acquire state not just by its own doing, but via some
other task too.

Cedric Le Goater wrote:
thinking aloud,
hmm, that's rather complex, because we have to take into account the
kernel stack, no? This is what Andrey was trying to solve in his patchset
back in September:
http://lkml.org/lkml/2008/9/3/96
the restart phase simulates a clone and switch_to to (not) restore the kernel
stack, right?
the self checkpoint and self restore syscalls, like Oren is proposing, are
simpler but they require the process's cooperation to be triggered. we could
imagine doing that in a special signal handler which would allow us to jump
into the right task context.
This description is not accurate:
For checkpoint, both implementations use an "external" task to read the state
from other tasks. (In my implementation that "other" task can be self).
For restart, both implementations expect the restarting process to restore its
own state. They differ in that Andrey's patchset also creates that process,
while mine (at the moment) relies on the existing (self) task.
In other words, neither requires any cooperation on the part of the
checkpointed tasks, and both require cooperation on the part of the restarting
tasks (the latter is easy since we create and fully control these tasks).
Cedric Le Goater wrote:
I don't have any preference, but looking at the code of the different patchsets
there are some tricky areas and I'm wondering which path is easier, safer,
and more portable.
I am wondering which path is preferable: create the processes in kernel space
(like Andrey's patch does) or in user space (like Zap does). In the mini-summit
we agreed in favor of kernel space, but I can still see arguments why user space
may be better. (Note: I refer strictly to the creation of the processes during
restart, not to how their state is restored.)
Any thoughts?
Oren.