checkpoint-restart (was Re: suspending programs)

From: michael (linux_at_networkingnewsletter.org.uk)
Date: 04/19/05

    To: debian user <debian-user@lists.debian.org>
    Date: Tue, 19 Apr 2005 12:10:50 +0100
    
    

    On Tue, 2005-04-19 at 03:32 -0700, Alvin Oga wrote:
    > hi ya
    >
    > On Tue, 19 Apr 2005, roberto wrote:
    > > Hello,
    > > i usually run large simulations, and sometimes the power goes down
    > > suddenly, i don't know why.
    >
    > > I'd like to restart my programs just at the same point where they have been
    > > suspended.
    >
    > are you saving the state of the simulation? if not... it's impossible to
    > continue from where you crashed
    >

    I've done a quick apt-cache search (sarge apt sources) for something
    like IRIX's `cpr`:

         IRIX Checkpoint and Restart (CPR) offers a set of user-transparent
         software management tools, allowing system administrators, operators,
         and users with suitable privileges to suspend a job or a set of jobs
         in mid-execution, and restart them later on. The jobs may be running
         on a single machine or on an array of networking connected machines.
         CPR may be used to enhance system availability, provide load and
         resource control or balancing, and to facilitate simulation or
         modeling.

    which he could use, e.g. by checkpointing every N hours and then
    restarting from the latest checkpoint once the power is back on. In this
    case the checkpoint writes the process's memory image to disk. The
    alternative is for Roberto to do the checkpointing within the simulation
    itself (e.g. write the state variables to a file) and knock up a quick
    restart routine. For any decent/large sim code, I'd guess these already
    exist...
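
    A minimal sketch of the second option (checkpointing inside the
    simulation), in Python purely for illustration. The file name, the
    state variables "step" and "x", and the one-line "simulation step" are
    all made up here; a real code would save its own state and would
    probably checkpoint on elapsed time rather than on a step count. The
    state is written to a temporary file and then renamed over the old
    checkpoint, so a power cut in the middle of a write should leave the
    previous checkpoint intact.

```python
import json
import os
import tempfile

CHECKPOINT = "sim_checkpoint.json"  # hypothetical file name


def save_checkpoint(state, path=CHECKPOINT):
    """Write state to a temp file, then rename it over the old checkpoint."""
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before the rename
    os.replace(tmp, path)     # swap the new checkpoint in under the final name


def load_checkpoint(path=CHECKPOINT):
    """Return the saved state, or None if there is no checkpoint yet."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


def run(n_steps, checkpoint_every, path=CHECKPOINT):
    """Run (or resume) the toy simulation up to n_steps."""
    # Restart routine: pick up from the last checkpoint if one exists.
    state = load_checkpoint(path) or {"step": 0, "x": 0.0}
    while state["step"] < n_steps:
        state["x"] += 0.5          # stand-in for one real simulation step
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    return state
```

    Calling run() again after a crash resumes from the last saved step, so
    only the work done since that checkpoint has to be repeated.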

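    It's hugely unlikely you'd be able to restart at the exact point where
    the power went, since anything computed after the last checkpoint is
    lost, but cpr would mean you don't lose everything. (Don't forget
    there's a large cost associated with writing lots of data to disk, which
    limits how often it's worth checkpointing.)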

    -- 
    Michael Bane
    Atmospheric Physics Group
    University of Manchester
    
