Re: Hibernate a single process (aka checkpointing)

From: David Konerding (dek_at_compbio.berkeley.edu)
Date: 12/01/03


Date: Mon, 1 Dec 2003 15:58:32 +0000 (UTC)

In article <c3492661.0311302332.5ce9c412@posting.google.com>, Israel Hsu wrote:
> Kasper Dupont <kasperd@daimi.au.dk> wrote in message news:<3FC88A0A.17C3BB0B@daimi.au.dk>...
>> Wayne Hayes wrote:
>> >
>> > I've had this idea for a long time and it's just such a wonderful idea
>> > I'm surprised it hasn't been done yet.
>>
>> Has been asked and answered before, I don't know why
>> I haven't added it to the FAQ yet. Well, if anybody
>> can find a good answer from an archive I will add it.
>
> A good starting point is "libckpt" (Plank, Univ. of Tennessee) - a
> user-level library for checkpointing a single process. It did deal
> with file pointers, but not other things like sockets.

There's also a checkpoint implementation from NERSC, though it focuses on MPI
jobs and works by integrating with LAM.

http://www.nersc.gov/research/ftg/checkpoint/

Although a lot of people in this newsgroup dismiss the idea of a checkpoint library
"because it's hard", the reality is that it has already been attempted on UNIX
(successfully, but on expensive hardware), and it is being attempted on Linux as well.
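
To make the "file pointers but not sockets" point above concrete, here is a
minimal application-level sketch in Python. It is not libckpt's API and not the
NERSC code; the names checkpoint() and restore() are made up for illustration.
It assumes the program cooperates by handing over its own state and that the
input is a regular file that still exists at restore time. A file offset can be
saved and replayed with a seek, which is roughly why regular files are the easy
case; a socket has no equivalent, because the peer's state lives outside the
process.

```python
import os
import pickle


def checkpoint(path, state, fh):
    """Save application state plus the open file's name and current offset.

    Hypothetical helper: 'state' must be picklable and 'fh' must be a
    regular file opened in binary mode.
    """
    record = {"state": state, "file": fh.name, "offset": fh.tell()}
    tmp = path + ".tmp"
    with open(tmp, "wb") as out:
        pickle.dump(record, out)
        out.flush()
        os.fsync(out.fileno())
    # Rename last, so a crash mid-write never leaves a truncated checkpoint.
    os.replace(tmp, path)


def restore(path):
    """Reload the saved state and reopen the file at the saved offset."""
    with open(path, "rb") as src:
        record = pickle.load(src)
    fh = open(record["file"], "rb")
    fh.seek(record["offset"])
    return record["state"], fh
```

A long-running job would call checkpoint() every so often and call restore() on
startup if a checkpoint file exists. A transparent checkpointer has to capture
the same information (plus memory, registers, and signal state) without the
program's help, which is where most of the difficulty comes from.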