Re: Debian Raid Crash Repair

From: Henrique de Moraes Holschuh (hmh_at_debian.org)
Date: 11/14/05

  • Next message: Daniel Nilsson: "Re: Compile problems - how do I know what source to install?"
    Date: Mon, 14 Nov 2005 10:27:01 -0200
    To: debian-user@lists.debian.org
    
    

    On Mon, 14 Nov 2005, Siju George wrote:
    > On 11/14/05, Alvin Oga <aoga@mail.linux-consulting.com> wrote:
    > > On Mon, 14 Nov 2005, Siju George wrote:
    > > > I had a mirror on sarge with 2 disks. One of them failed now. I had
    > > > given an option for 1 spare disk while configuring Raid. Could some
    > > > one please tell me what I should do to Place a new disk and recreate
    > > > the mirror?? Should I manually partition the new disk or is there a

    Write a boot sector both to hdc and to a floppy (or other removable
    media), just in case.

    Add new disk, removing the failed one from the system.

    Now, what you do depends on how you want the new disk to be used...

    If you want to expand the RAID:

      Boot single user. Kill udevd if necessary.

      Partition new disk. Create a new RAID array in degraded mode using mdadm.

      Move data to new RAID array (creating filesystems and lvm volumes as
      needed, don't forget the swap partition!). Edit data in new RAID array to
      refer to md1 instead of md0 if you are using kernel autorun (AFAIK it is
      non-trivial to get it to go to md0 without booting a live-cd system and
      renumbering the minor device number).

      Remove old *working disk*, store it somewhere as the valuable backup of
      all your data that it is :-) Add the second "new" disk, boot, make sure
      everything is correct, partition the second "new" disk and hotadd to the
      array. Rerun LILO. All done.
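
    The "create a new RAID array in degraded mode" step could look roughly
    like the sketch below. The names are assumptions for illustration only:
    /dev/hda1 stands for the RAID partition on the new disk and /dev/md1 for
    the new array. "missing" is mdadm's placeholder for a component that is
    not there yet. The script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: print the command by default, run it only if DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '+ %s\n' "$*"
    fi
}

NEW_PART=${NEW_PART:-/dev/hda1}   # assumed: RAID partition on the new disk
NEW_MD=${NEW_MD:-/dev/md1}        # assumed: name of the new array

create_degraded_mirror() {
    # Two-slot RAID1 with one real component; "missing" keeps the second
    # slot free for the disk that is hot-added later.
    run mdadm --create "$NEW_MD" --level=1 --raid-devices=2 "$NEW_PART" missing
}

create_degraded_mirror
```

    Printing first makes it easy to double-check the device names before
    anything is written to a disk.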

    If you will use the new extra space in another way (a second RAID array,
    perhaps?):

      Partition and hotadd new disk to the array. Rerun LILO. All done.
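
    A minimal sketch of the hot-add path, assuming the existing array is
    /dev/md0 and the freshly created RAID partition on the new disk is
    /dev/hda1 (both names are examples, not taken from the original setup).
    It prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: print the command by default, run it only if DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '+ %s\n' "$*"
    fi
}

ARRAY=${ARRAY:-/dev/md0}          # assumed: the existing degraded array
NEW_PART=${NEW_PART:-/dev/hda1}   # assumed: RAID partition on the new disk

hotadd_and_reinstall_loader() {
    run mdadm "$ARRAY" --add "$NEW_PART"   # md starts rebuilding onto it
    run cat /proc/mdstat                   # shows the rebuild progress
    run lilo                               # rewrite the boot loader
}

hotadd_and_reinstall_loader
```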

    > > > command that I can run after connecting the disk so that the Raid
    > > > Partitions will be created automatically and the rest of the space in
    > > > the hard disk be freely available? I would like to place an 80 GB disk
    > > > instead of a 40 GB one.
    > >
    > > - it would be pointless to use a new 80gb disk instead of a 40gb disk
    > > - the other 40gb is sorta wasted and unused

    Only if you want to let them go to waste. And maybe the 80gb disks are
    cheaper than 40gb ones where he lives? Anyway, it does not have to be
    pointless at all.

    > > - if your system crashed:
    > > - why did it crash

    Was the swap over RAID1 too? If it was *not*, we have a damn good reason
    for the box to crash.
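
    One way to check this is to look at /proc/swaps for swap areas that are
    not on an md device. The sketch assumes md devices are named /dev/md*;
    swap on LVM on top of md would need a further look. The sample file is
    made up for illustration and is not output from the machine in question.

```shell
#!/bin/sh
# List active swap areas that are NOT directly on an md device.
# $1: file in /proc/swaps format (defaults to /proc/swaps itself).
swap_not_on_md() {
    awk 'NR > 1 && $1 !~ /^\/dev\/md/ { print $1 }' "${1:-/proc/swaps}"
}

# Illustrative sample: one swap area on a bare partition, one on md.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Filename                                Type            Size    Used    Priority
/dev/hda2                               partition       497972  0       -1
/dev/md1                                partition       497972  0       -2
EOF
swap_not_on_md "$sample"   # prints /dev/hda2
rm -f "$sample"
```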

    As for the boot, use a properly configured LILO. It does the right thing
    for RAID arrays if your BIOS isn't braindead, and the system will be able
    to boot from hdc if hda goes missing in action.

    Since LILO *has* the bad brain disease of writing crap to the first sector
    of a partition unless told to do its job right, here's what one needs (for
    completeness):
            boot=/dev/<raid device>
            raid-extra-boot=mbr-only
    That keeps the LILO crap where it belongs: the MBR, and *only* there.
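
    A quick sanity check for those two settings might look like this. It is
    only a sketch: it greps for the two lines and does not parse lilo.conf
    properly, and /dev/md0 in the sample is an assumed device name.

```shell
#!/bin/sh
# Return 0 if the config sets boot= to an md device and also sets
# raid-extra-boot=mbr-only; return non-zero otherwise.
# $1: path to the config (defaults to /etc/lilo.conf).
lilo_raid_ok() {
    conf=${1:-/etc/lilo.conf}
    grep -Eq '^[[:space:]]*boot[[:space:]]*=[[:space:]]*/dev/md[0-9]+' "$conf" &&
    grep -Eq '^[[:space:]]*raid-extra-boot[[:space:]]*=[[:space:]]*mbr-only' "$conf"
}

# Illustrative sample config fragment.
sample=$(mktemp)
cat > "$sample" <<'EOF'
boot=/dev/md0
raid-extra-boot=mbr-only
EOF
lilo_raid_ok "$sample" && echo "lilo.conf has both RAID settings"
rm -f "$sample"
```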

    Other useful hints:

    You can re-read a partition table using fdisk like this:

       fdisk <device>
       w
       q

    This doesn't change the partition table, and it forces a reread as long
    as the kernel doesn't have a partition on that disk locked for some
    reason (used as / device, or mounted, or in an active lvm vg or md
    device...).
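
    Where typing into fdisk is inconvenient (in a script, say), a possible
    alternative is blockdev --rereadpt from util-linux, which asks the kernel
    for the same re-read without rewriting the table. Note that this swaps in
    a different tool from the fdisk session above; /dev/hda is only an
    example name, and the sketch prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: print the command by default, run it only if DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '+ %s\n' "$*"
    fi
}

# Ask the kernel to re-read the partition table of a whole-disk device.
# $1: disk device, e.g. /dev/hda (example name).
reread_ptable() {
    run blockdev --rereadpt "$1"
}

reread_ptable /dev/hda
```

    Like the fdisk method, this is refused while any partition on the disk
    is in use.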

    You can duplicate a DOS-like partition table doing this:

       dd if=/dev/<source disk> of=/dev/<dest. disk> bs=512 count=1
       fdisk <dest disk>
       w
       q
       
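    This works only *IF* you have no extended partitions, because logical
    partitions are described outside the first sector and are not copied.
    Warning: this also copies the MBR part of the boot loader, but not the
    rest of it, so reinstall the loader (LILO, grub, etc) on the new disk
    after you do this.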

    Do *NOT* do this if you use a partition table that uses UUIDs.
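
    The dd step can be tried on scratch image files before touching real
    disks. In the sketch below two temporary files stand in for the source
    and destination disks (an assumption for illustration), and
    conv=notrunc is added because the targets are regular files: without
    it dd would truncate the destination image to one sector.

```shell
#!/bin/sh
# Copy the first 512-byte sector (boot code plus DOS partition table)
# from one disk or image to another, leaving everything else untouched.
# $1: source, $2: destination.
copy_first_sector() {
    dd if="$1" of="$2" bs=512 count=1 conv=notrunc 2>/dev/null
}

# Two 4-sector stand-in "disks": random data versus zeros.
src=$(mktemp)
dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=512 count=4 2>/dev/null
dd if=/dev/zero    of="$dst" bs=512 count=4 2>/dev/null

copy_first_sector "$src" "$dst"

# Sector 0 should now match on both images.
if [ "$(head -c 512 "$src" | cksum)" = "$(head -c 512 "$dst" | cksum)" ]; then
    echo "first sector copied"
fi
rm -f "$src" "$dst"
```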

    > Bad sectors on "hda" was the reason for the crash. Now the Server is

    Bad sectors in a md component do not cause a crash. md just drops the
    component from the active pool of the array.
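
    To see whether md has dropped a component, /proc/mdstat can be scanned
    for a "_" in the status brackets. The sample below is hand-written in
    the usual mdstat layout for illustration; it is not output from the
    server discussed here.

```shell
#!/bin/sh
# Print the names of md arrays that have a missing or failed component.
# $1: file in /proc/mdstat format (defaults to /proc/mdstat).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/     { name = $1 }     # remember the current array
        /\[[U_]*_[U_]*\]/ { print name }    # "_" marks an empty slot
    ' "${1:-/proc/mdstat}"
}

# Illustrative sample: hda1 marked failed, array running on hdc1 only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0](F)
      39061952 blocks [2/1] [_U]
unused devices: <none>
EOF
degraded_arrays "$sample"   # prints md0
rm -f "$sample"
```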

    > Is there a quick way to partition it the same way as the disk I am
    > replacing?? Again I am replacing the 40 GB with the 80 GB one.

    See above.

    > > - cat /proc/mdstat to see what is doing or not doing
    > > - if its syncing .. leave it alone .. do not power off,
    > > or add new files, unless you like to be on the bleeding
    > > edge and test that the raid stuff is working "right"

    If you cannot trust this, you cannot trust the RAID. Modifying a running
    md RAID array while it is syncing *IS* supposed to be safe and to work
    right. If it doesn't, your kernel is crap and you cannot trust its md at
    all... and you had better find that out sooner rather than later.

    AFAIK, for writes past the current resync cursor, the md device skips
    the component devices that are being rebuilt (it writes only to the rest
    of the components), and it writes anything behind the resync cursor to
    all component devices, including the one(s) being rebuilt.

    You can test the RAID sync, you know. Once the resync has finished, just
    compare the two md component devices, and ignore differences in the last
    128KiB (the md superblock).
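
    A rough sketch of such a comparison follows. It assumes both components
    have the same size and compares checksums of everything except the last
    128KiB; cksum is a weak check, so treat a match as a hint rather than
    proof. The demo uses two scratch files as stand-ins for the component
    devices, and comparing a live, mounted array can show transient
    differences.

```shell
#!/bin/sh
# Size in bytes of a block device or a regular file.
dev_size() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"
    else
        wc -c < "$1" | tr -d ' '
    fi
}

# Return 0 if two components match, ignoring the last 128 KiB.
# $1, $2: component devices (or image files) of equal size.
compare_components() {
    limit=$(( $(dev_size "$1") - 131072 ))
    [ "$limit" -gt 0 ] || return 2
    a=$(head -c "$limit" "$1" | cksum)
    b=$(head -c "$limit" "$2" | cksum)
    [ "$a" = "$b" ]
}

# Demo: two 256 KiB images that differ only in the last 128 KiB.
c1=$(mktemp)
c2=$(mktemp)
dd if=/dev/zero of="$c1" bs=1024 count=256 2>/dev/null
dd if=/dev/zero bs=1024 count=128 2>/dev/null > "$c2"
dd if=/dev/zero bs=1024 count=128 2>/dev/null | tr '\0' '\377' >> "$c2"
compare_components "$c1" "$c2" && echo "components match"
rm -f "$c1" "$c2"
```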

    -- 
      "One disk to rule them all, One disk to find them. One disk to bring
      them all and in the darkness grind them. In the Land of Redmond
      where the shadows lie." -- The Silicon Valley Tarot
      Henrique Holschuh
    
