Re: software raid: replace failing HD
From: bombadil (me_at_privacy.net)
Date: 02/06/04
- In reply to: John-Paul Stewart: "Re: software raid: replace failing HD"
- Next in thread: John-Paul Stewart: "Re: software raid: replace failing HD"
- Reply: John-Paul Stewart: "Re: software raid: replace failing HD"
Date: Fri, 06 Feb 2004 11:43:41 +0100
John-Paul Stewart wrote:
> Are they 80-pin SCA drives in a hot swap chassis? If not, then I
> wouldn't recommend you try hot swapping normal SCSI drives (50- or
> 68-pin SCSI connector with separate power connector).
Here are the specifications of the disk in question:
<http://www.support.gateway.com/s/Servers/COMPO/HARDDRV/Seagate/5502564/5502564sp2.shtml>
I am not sure whether it is hot swappable, so it will probably be better
to shut down the server anyway.
> If the hardware is indeed hot swappable, then make sure the RAID
> software has marked the failed drive as failed ('man mdadm' for info, in
> particular the "--fail" and "--remove" parameters), pull it out and put
> in the new one. Then partition the drive as needed and add it to the
> array ('mdadm --add /dev/md0 /dev/sda1' for example) and let the
> software RAID resync itself.
Here are the kernel messages about the failing drive:
Feb 1 11:28:02 myservername kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 8000002
Feb 1 11:28:02 myservername kernel: Info fld=0x6f7454, Current sd08:15: sense key Hardware Error
Feb 1 11:28:02 myservername kernel: Additional sense indicates Mechanical positioning error
Feb 1 11:28:02 myservername kernel: I/O error: dev 08:15, sector 26768
Feb 1 11:28:02 myservername kernel: raid1: Disk failure on sdb5, disabling device.
Feb 1 11:28:02 myservername kernel: ^IOperation continuing on 1 devices
Feb 1 11:28:02 myservername kernel: md: updating md3 RAID superblock on device
Feb 1 11:28:02 myservername kernel: md: (skipping faulty sdb5 )
Feb 1 11:28:02 myservername kernel: md: sda5 [events: 0000007e]<6>(write) sda5's sb offset: 14137088
Feb 1 11:28:02 myservername kernel: md: recovery thread got woken up ...
Feb 1 11:28:02 myservername kernel: md3: no spare disk to reconstruct array! -- continuing in degraded mode
Here is the current raid status:
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 sdb1[1] sda1[0]
      48064 blocks [2/2] [UU]
md0 : active raid1 sdb2[1] sda2[0]
      3068288 blocks [2/2] [UU]
md1 : active raid1 sdb3[1] sda3[0]
      522048 blocks [2/2] [UU]
md3 : active raid1 sdb5[1](F) sda5[0]
      14137088 blocks [2/1] [U_]
unused devices: <none>
Only one partition (sdb5) has been marked as failed and disabled in its
raid array (md3).
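
For checking this without reading the status by eye, here is a minimal
sketch that lists every member flagged (F). It assumes the 2.4-style
/proc/mdstat layout shown above; the function name failed_members is
made up for illustration.

```shell
#!/bin/sh
# Print "array member" for every device flagged (F) in mdstat-style text.
# Reads /proc/mdstat unless another file is given as the first argument.
failed_members() {
    awk '$2 == ":" && $3 == "active" {
        # Fields 5..NF are the member devices, e.g. sdb5[1](F)
        for (i = 5; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                dev = $i
                sub(/\[[0-9]+\]\(F\)$/, "", dev)   # strip "[1](F)"
                print $1, dev
            }
    }' "${1:-/proc/mdstat}"
}

# Usage: failed_members            (reads /proc/mdstat)
#        failed_members saved.txt  (reads a saved copy)
```

Fed the status above, it should print the single line "md3 sdb5".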
If I understand correctly, I should mark the other partitions (sdb1, sdb2
and sdb3) on the failing disk (sdb) as failed and remove them from their
raid arrays (md2, md0 and md1).
Then I shut down the server, replace the disk, start the server up again,
partition the new disk exactly like the other one (what is the best way
to do that?), add the new partitions to the raid arrays and let the
arrays resync themselves.
Is that right?
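
A rough sketch of that sequence follows, written as a dry run that only
prints the commands so they can be reviewed first. It is not a tested
procedure. The assumptions are mine and not confirmed anywhere in this
thread: mdadm and sfdisk are installed, the replacement disk shows up as
/dev/sdb again, and copying the layout with "sfdisk -d" is acceptable
(on a bigger disk the extra space should simply stay unpartitioned). The
function names are made up for illustration.

```shell
#!/bin/sh
# Dry run: prints the commands instead of executing them.
# array:partition pairs, read off the raid status in this post
PAIRS="md2:sdb1 md0:sdb2 md1:sdb3 md3:sdb5"

before_shutdown() {
    for pair in $PAIRS; do
        md=${pair%%:*}
        part=${pair#*:}
        # sdb5 is already flagged (F), so it only needs removing
        [ "$part" = sdb5 ] || echo "mdadm /dev/$md --fail /dev/$part"
        echo "mdadm /dev/$md --remove /dev/$part"
    done
}

after_swap() {
    # Assumption: copy the partition layout from the surviving disk
    echo "sfdisk -d /dev/sda | sfdisk /dev/sdb"
    for pair in $PAIRS; do
        md=${pair%%:*}
        part=${pair#*:}
        echo "mdadm /dev/$md --add /dev/$part"
    done
    # Watch the resync progress
    echo "cat /proc/mdstat"
}

before_shutdown
echo "# ... shut down, replace the disk, boot ..."
after_swap
```

Once the printed commands look right they would be run as root, one at a
time, checking /proc/mdstat between steps.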
What is the advantage of using mdadm over raidtools?
BTW, the new disk will probably be bigger (32GB instead of 18GB) than
the current one, though with the same RPM and the same SCSI technology.
Is there any problem with that?
Thanks in advance.