Re: software raid with pending failure



Aaron Bliss wrote:
Hi everyone,
I'm running Red Hat ES 5 with several RAID 1 partitions set up. It looks like /dev/sdb is getting ready to fail. I noticed the following in the logwatch report:

/dev/sdb - 29 Time(s)
1 offline uncorrectable sectors detected

So, to deal with the failing drive, I marked each /dev/sdbX partition that was participating in a RAID 1 array as failed with mdadm, and then removed each /dev/sdbX with mdadm. So I was running the OS from /dev/sda only. So far so good.

I then took the box down and unplugged the device that I believed was /dev/sdb, but the box wouldn't boot. It just sat at the grub prompt. So I thought maybe the box was seeing the other drive as /dev/sdb. I turned the box back off, plugged the first drive back in, and unplugged the other drive, and the box still wouldn't boot. I got to the grub splash screen, but the box just kept resetting itself. So I plugged that drive back in, and the box booted up fine.

So I'm now working with what I believe to be a good drive and a soon-to-fail drive. A few questions:

1. How do I identify which hard drive is /dev/sda and which is /dev/sdb?
2. Why wasn't I able to boot with a single drive (assuming that at least one of them is good)?
3. How do I go about replacing the bad drive?

Thanks for your help. Below is a printout of /proc/mdstat before failing and removing /dev/sdb from the mirrors (all RAID partitions were set up during the install of the operating system).

Have you tried using mdadm to remove the bad drive, then replacing it and using mdadm to add the replacement drive back into your RAID config (after creating the Linux RAID partitions on the new drive, of course)?
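
As a rough sketch of that sequence, something like the following might do. It only prints the commands rather than running them. The array and partition names (/dev/md0 mirroring /dev/sda1 and /dev/sdb1) are assumptions for illustration, since the mdstat output was snipped from the thread, so substitute whatever /proc/mdstat shows on your box. The smartctl step is there because the serial number it reports can be matched against the label on the physical drive.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them.
# ASSUMPTIONS (not from the thread): /dev/md0 mirrors /dev/sda1 and /dev/sdb1,
# /dev/sda is the healthy disk and /dev/sdb is the failing one.

GOOD_DISK=/dev/sda
BAD_DISK=/dev/sdb

# Print the commands that take one partition out of its mirror.
plan_remove() {   # usage: plan_remove <md-device> <partition>
    echo "mdadm --manage $1 --fail $2"
    echo "mdadm --manage $1 --remove $2"
}

# Print the command that adds one partition back into its mirror.
plan_add() {      # usage: plan_add <md-device> <partition>
    echo "mdadm --manage $1 --add $2"
}

echo "# 1. Note the serial number so the right physical disk gets pulled"
echo "smartctl -i $BAD_DISK"
echo "# 2. Fail and remove each $BAD_DISK member (repeat for every array)"
plan_remove /dev/md0 "${BAD_DISK}1"
echo "# 3. Power off, swap the disk, boot, then copy the partition layout"
echo "#    (this overwrites the partition table on $BAD_DISK)"
echo "sfdisk -d $GOOD_DISK | sfdisk $BAD_DISK"
echo "# 4. Add the new partitions back (repeat for every array), watch the resync"
plan_add /dev/md0 "${BAD_DISK}1"
echo "cat /proc/mdstat"
```

Check the printed commands against your real layout before running any of them by hand, and only do the sfdisk copy once you are sure which disk is the new, empty one.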

<snip>

Regards,

J

--
redhat-list mailing list
unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
https://www.redhat.com/mailman/listinfo/redhat-list


