software RAID detects extra (failed) drive; why?

From: Robert Nagle (idiotprogrammer_at_yahoo.com)
Date: 01/30/04

  • Next message: Henrik Carlqvist: "Re: Can't compile dpt_i2o Driver for AMD 64 (Redhat AS 3),Urgent!!! Post 2"
    Date: 30 Jan 2004 10:58:14 -0800
    
    

    I'm having a software RAID problem. Several problems, actually.

    When trying to reconstruct a RAID 5 array that showed a bad hard
    drive, I couldn't boot. I thought the reason might be that I needed
    redundant hard drives on my primary controller, which I didn't have.
    I decided to start again from scratch and make new file systems
    (mkreiserfs) for all the partitions on two of the three hard drives
    (the third one was probably defective).

    My setup: 3 hard drives, all 60 GB
    md1  /boot  on ext3
    md3  /      on reiserfs
    md4  /home  on reiserfs

    I used mdadm to create these arrays, and all three seem to have been
    created successfully and are mountable. (I haven't tested whether they
    can rebuild, though; I'm still working from the live CD environment
    before going to chroot.)

    Here's what I'm noticing:
    RAID5 conf printout messages: on the root command line, every once in
    a while I'll see raid log gobbledygook, something like this (the full
    log is below):

    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part3
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part3
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
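For anyone trying to read the printout above, here is a minimal Python sketch that splits it into fields. The field meanings (rd = raid disks, wd = working disks, fd = failed disks; s = spare, o = operational, n = number, us = used slot) are an assumption based on a reading of the 2.4-era raid5 driver, not something the log states, and `parse_conf_printout` is a hypothetical helper name.

```python
import re

# Field meanings are assumptions taken from a reading of the 2.4-era
# raid5 driver; the log itself does not spell them out.
HEADER_RE = re.compile(r"rd:(\d+)\s+wd:(\d+)\s+fd:(\d+)")
DISK_RE = re.compile(
    r"disk\s+(\d+),\s*s:(\d+),\s*o:(\d+),\s*n:(\d+)\s+rd:(\d+)\s+us:(\d+)\s+dev:(.+)"
)


def parse_conf_printout(text):
    """Return (summary, disks) for one 'RAID5 conf printout' block."""
    summary = None
    disks = []
    for raw in text.splitlines():
        line = raw.strip()
        disk = DISK_RE.search(line)
        if disk:
            slot, spare, oper, number, raid_disk, used, dev = disk.groups()
            disks.append({
                "slot": int(slot),
                "spare": bool(int(spare)),
                "operational": bool(int(oper)),
                "number": int(number),
                "raid_disk": int(raid_disk),
                "used_slot": bool(int(used)),
                "dev": dev,
            })
            continue
        header = HEADER_RE.search(line)
        if header:
            rd, wd, fd = (int(v) for v in header.groups())
            summary = {"raid_disks": rd, "working_disks": wd, "failed_disks": fd}
    return summary, disks


# The printout quoted in the post.
SAMPLE = """\
RAID5 conf printout:
 --- rd:3 wd:2 fd:1
 disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part3
 disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part3
 disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
"""

summary, disks = parse_conf_printout(SAMPLE)
missing = [d["slot"] for d in disks if not d["operational"]]
print(summary)
print("non-operational slots:", missing)
```

Read this way, the printout says slot 2 has no operational device behind it at that moment, which fits the "resyncing spare disk ... to replace failed disk" lines in the full log.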

    When I query the arrays using mdadm, md1 shows normal output, while
    both md3 and md4 (the RAID 5 reiserfs arrays, built from three hard
    drives on hda3, hdb3, hdg3 and hda4, hdb4, hdg4) show one more device
    than I actually have. Each also shows one device as failed:
       Raid Devices : 3
      Total Devices : 4
    Preferred Minor : 3

    What's going on here? First, does intermittent command line chatter
    automatically imply a serious problem? Second, why would mdadm detect
    one more device than is actually there?
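The mismatch can be spotted mechanically. Below is a rough Python sketch that reads the "Key : value" summary lines of the mdadm output and reports where the counters disagree. The helper names `parse_detail_summary` and `check_counts` are hypothetical, and the sample is the md3 summary copied from the output further down; this only flags the inconsistency, it does not explain it.

```python
def parse_detail_summary(text):
    """Collect 'Key : value' lines from an mdadm detail listing into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_counts(fields):
    """Return human-readable notes about suspicious device counters."""
    raid = int(fields["Raid Devices"])
    total = int(fields["Total Devices"])
    working = int(fields["Working Devices"])
    failed = int(fields["Failed Devices"])
    notes = []
    if total > raid:
        notes.append(f"Total Devices ({total}) exceeds Raid Devices ({raid})")
    if failed > 0:
        notes.append(f"{failed} device(s) marked failed")
    if total == working + failed:
        notes.append("Total Devices equals Working + Failed")
    return notes


# md3 summary lines as they appear in the post.
MD3 = """\
   Raid Devices : 3
  Total Devices : 4
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0
"""

fields = parse_detail_summary(MD3)
notes = check_counts(fields)
for note in notes:
    print(note)
```

For md3 and md4 the extra entry in Total Devices is exactly the one failed device, while all three real partitions are listed as "active sync".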

    I'm guessing that it might be related to the fact that two of the
    hard drives had RAID arrays previously on them. Anyone agree? How
    could I verify/correct this?

    rj

    **********************************************
    /dev/md1:
            Version : 00.90.00
      Creation Time : Thu Jan 29 23:42:46 2004
         Raid Level : raid1
         Array Size : 96256 (94.00 MiB 98.57 MB)
        Device Size : 96256 (94.00 MiB 98.57 MB)
       Raid Devices : 3
      Total Devices : 3
    Preferred Minor : 1
        Persistence : Superblock is persistent

        Update Time : Thu Jan 29 23:42:46 2004
              State : dirty, no-errors
     Active Devices : 3
    Working Devices : 3
     Failed Devices : 0
      Spare Devices : 0

        Number   Major   Minor   RaidDevice   State
           0       3       1         0        active sync   /dev/ide/host0/bus0/target0/lun0/part1
           1       3      65         1        active sync   /dev/ide/host0/bus0/target1/lun0/part1
           2      34       1         2        active sync   /dev/ide/host2/bus1/target0/lun0/part1
               UUID : 443f705c:d8077816:472d852b:64f98318
             Events : 0.1
    /dev/md3:
            Version : 00.90.00
      Creation Time : Thu Jan 29 23:43:07 2004
         Raid Level : raid5
         Array Size : 48821376 (46.56 GiB 49.99 GB)
        Device Size : 24410688 (23.28 GiB 24.100 GB)
       Raid Devices : 3
      Total Devices : 4
    Preferred Minor : 3
        Persistence : Superblock is persistent

        Update Time : Fri Jan 30 00:03:19 2004
              State : dirty, no-errors
     Active Devices : 3
    Working Devices : 3
     Failed Devices : 1
      Spare Devices : 0

             Layout : left-symmetric
         Chunk Size : 64K

        Number   Major   Minor   RaidDevice   State
           0       3       3         0        active sync   /dev/ide/host0/bus0/target0/lun0/part3
           1       3      67         1        active sync   /dev/ide/host0/bus0/target1/lun0/part3
           2      34       3         2        active sync   /dev/ide/host2/bus1/target0/lun0/part3
               UUID : 29b24560:22b0189a:720ef946:db3090cc
             Events : 0.2
    /dev/md4:
            Version : 00.90.00
      Creation Time : Thu Jan 29 23:43:28 2004
         Raid Level : raid5
         Array Size : 66251904 (63.18 GiB 67.84 GB)
        Device Size : 33125952 (31.59 GiB 33.92 GB)
       Raid Devices : 3
      Total Devices : 4
    Preferred Minor : 4
        Persistence : Superblock is persistent

        Update Time : Fri Jan 30 00:31:15 2004
              State : dirty, no-errors
     Active Devices : 3
    Working Devices : 3
     Failed Devices : 1
      Spare Devices : 0

             Layout : left-symmetric
         Chunk Size : 64K

        Number   Major   Minor   RaidDevice   State
           0       3       4         0        active sync   /dev/ide/host0/bus0/target0/lun0/part4
           1       3      68         1        active sync   /dev/ide/host0/bus0/target1/lun0/part4
           2      34       4         2        active sync   /dev/ide/host2/bus1/target0/lun0/part4
               UUID : 576dae1b:a61d715c:a8145ab8:66456d0d
             Events : 0.2
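As a sanity check on the listings above, the reported sizes can be compared against the usual capacity rules: a RAID 1 array is as large as one member, and an n-disk RAID 5 array holds n-1 members' worth of data. The sketch below assumes the Array Size and Device Size figures are in KiB blocks; `expected_array_size` is a hypothetical helper.

```python
def expected_array_size(level, raid_devices, device_size_kib):
    """Usable array size in KiB for the two RAID levels used in the post."""
    if level == "raid1":
        # Every member holds a full copy of the data.
        return device_size_kib
    if level == "raid5":
        # One member's worth of space goes to parity.
        return (raid_devices - 1) * device_size_kib
    raise ValueError(f"unsupported level: {level}")


# (name, level, raid devices, device size, reported array size), from the post.
ARRAYS = [
    ("md1", "raid1", 3, 96256, 96256),
    ("md3", "raid5", 3, 24410688, 48821376),
    ("md4", "raid5", 3, 33125952, 66251904),
]

results = {}
for name, level, count, device_size, reported in ARRAYS:
    expected = expected_array_size(level, count, device_size)
    results[name] = (expected == reported)
    print(f"{name}: expected {expected} KiB, reported {reported} KiB")
```

All three reported sizes match a three-member layout, so the capacity figures themselves give no sign of a fourth device.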

    md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
    md: bind<ide/host0/bus0/target0/lun0/part1,1>
    md: bind<ide/host0/bus0/target1/lun0/part1,2>
    md: bind<ide/host2/bus1/target0/lun0/part1,3>
    md: ide/host2/bus1/target0/lun0/part1's event counter: 00000000
    md: ide/host0/bus0/target1/lun0/part1's event counter: 00000000
    md: ide/host0/bus0/target0/lun0/part1's event counter: 00000000
    md: raid0 personality registered as nr 2
    md1: max total readahead window set to 768k
    md1: 3 data-disks, max readahead per data-disk: 256k
    raid0: looking at ide/host0/bus0/target0/lun0/part1
    raid0: comparing ide/host0/bus0/target0/lun0/part1(97664) with
    ide/host0/bus0/target0/lun0/part1(97664)
    raid0: END
    raid0: ==> UNIQUE
    raid0: 1 zones
    raid0: looking at ide/host0/bus0/target1/lun0/part1
    raid0: comparing ide/host0/bus0/target1/lun0/part1(96256) with
    ide/host0/bus0/target0/lun0/part1(97664)
    raid0: NOT EQUAL
    raid0: comparing ide/host0/bus0/target1/lun0/part1(96256) with
    ide/host0/bus0/target1/lun0/part1(96256)
    raid0: END
    raid0: ==> UNIQUE
    raid0: 2 zones
    raid0: looking at ide/host2/bus1/target0/lun0/part1
    raid0: comparing ide/host2/bus1/target0/lun0/part1(97664) with
    ide/host0/bus0/target0/lun0/part1(97664)
    raid0: EQUAL
    raid0: FINAL 2 zones
    raid0: zone 0
    raid0: checking ide/host0/bus0/target0/lun0/part1 ... contained as
    device 0
      (97664) is smallest!.
    raid0: checking ide/host0/bus0/target1/lun0/part1 ... contained as
    device 1
      (96256) is smallest!.
    raid0: checking ide/host2/bus1/target0/lun0/part1 ... contained as
    device 2
    raid0: zone->nb_dev: 3, size: 288768
    raid0: current zone offset: 96256
    raid0: zone 1
    raid0: checking ide/host0/bus0/target0/lun0/part1 ... contained as
    device 0
      (97664) is smallest!.
    raid0: checking ide/host0/bus0/target1/lun0/part1 ... nope.
    raid0: checking ide/host2/bus1/target0/lun0/part1 ... contained as
    device 1
    raid0: zone->nb_dev: 2, size: 2816
    raid0: current zone offset: 97664
    raid0: done.
    raid0 : md_size is 291584 blocks.
    raid0 : conf->smallest->size is 2816 blocks.
    raid0 : nb_zone is 104.
    raid0 : Allocating 832 bytes for hash.
    md: updating md1 RAID superblock on device
    md: ide/host2/bus1/target0/lun0/part1 [events: 00000001]<6>(write)
    ide/host2/bus1/target0/lun0/part1's sb offset: 97664
    md: ide/host0/bus0/target1/lun0/part1 [events: 00000001]<6>(write)
    ide/host0/bus0/target1/lun0/part1's sb offset: 96256
    md: ide/host0/bus0/target0/lun0/part1 [events: 00000001]<6>(write)
    ide/host0/bus0/target0/lun0/part1's sb offset: 97664
    md: array md1 already exists!
    md: marking sb clean...
    md: updating md1 RAID superblock on device
    md: ide/host2/bus1/target0/lun0/part1 [events: 00000002]<6>(write)
    ide/host2/bus1/target0/lun0/part1's sb offset: 97664
    md: ide/host0/bus0/target1/lun0/part1 [events: 00000002]<6>(write)
    ide/host0/bus0/target1/lun0/part1's sb offset: 96256
    md: ide/host0/bus0/target0/lun0/part1 [events: 00000002]<6>(write)
    ide/host0/bus0/target0/lun0/part1's sb offset: 97664
    md: md1 stopped.
    md: unbind<ide/host2/bus1/target0/lun0/part1,2>
    md: export_rdev(ide/host2/bus1/target0/lun0/part1)
    md: unbind<ide/host0/bus0/target1/lun0/part1,1>
    md: export_rdev(ide/host0/bus0/target1/lun0/part1)
    md: unbind<ide/host0/bus0/target0/lun0/part1,0>
    md: export_rdev(ide/host0/bus0/target0/lun0/part1)
    md: bind<ide/host0/bus0/target0/lun0/part1,1>
    md: bind<ide/host0/bus0/target1/lun0/part1,2>
    md: bind<ide/host2/bus1/target0/lun0/part1,3>
    md: ide/host2/bus1/target0/lun0/part1's event counter: 00000000
    md: ide/host0/bus0/target1/lun0/part1's event counter: 00000000
    md: ide/host0/bus0/target0/lun0/part1's event counter: 00000000
    md: md1: raid array is not clean -- starting background reconstruction
    md: RAID level 1 does not need chunksize! Continuing anyway.
    md: raid1 personality registered as nr 3
    md1: max total readahead window set to 124k
    md1: 1 data-disks, max readahead per data-disk: 124k
    raid1: device ide/host2/bus1/target0/lun0/part1 operational as mirror
    2
    raid1: device ide/host0/bus0/target1/lun0/part1 operational as mirror
    1
    raid1: device ide/host0/bus0/target0/lun0/part1 operational as mirror
    0
    raid1: raid set md1 not clean; reconstructing mirrors
    raid1: raid set md1 active with 3 out of 3 mirrors
    md: updating md1 RAID superblock on device
    md: ide/host2/bus1/target0/lun0/part1 [events: 00000001]<6>(write)
    ide/host2/bus1/target0/lun0/part1's sb offset: 97664
    md: syncing RAID array md1
    md: minimum _guaranteed_ reconstruction speed: 100 KB/sec/disc.
    md: using maximum available idle IO bandwith (but not more than 100000
    KB/sec) for reconstruction.
    md: using 124k window, over a total of 96256 blocks.
    md: ide/host0/bus0/target1/lun0/part1 [events: 00000001]<6>(write)
    ide/host0/bus0/target1/lun0/part1's sb offset: 96256
    md: ide/host0/bus0/target0/lun0/part1 [events: 00000001]<6>(write)
    ide/host0/bus0/target0/lun0/part1's sb offset: 97664
    md: md1: sync done.
    md: bind<ide/host0/bus0/target0/lun0/part3,1>
    md: bind<ide/host0/bus0/target1/lun0/part3,2>
    md: bind<ide/host2/bus1/target0/lun0/part3,3>
    md: ide/host2/bus1/target0/lun0/part3's event counter: 00000000
    md: ide/host0/bus0/target1/lun0/part3's event counter: 00000000
    md: ide/host0/bus0/target0/lun0/part3's event counter: 00000000
    raid5: measuring checksumming speed
       8regs : 1845.200 MB/sec
       32regs : 1156.800 MB/sec
       pII_mmx : 2826.400 MB/sec
       p5_mmx : 3610.800 MB/sec
    raid5: using function: p5_mmx (3610.800 MB/sec)
    md: raid5 personality registered as nr 4
    md3: max total readahead window set to 512k
    md3: 2 data-disks, max readahead per data-disk: 256k
    raid5: spare disk ide/host2/bus1/target0/lun0/part3
    raid5: device ide/host0/bus0/target1/lun0/part3 operational as raid
    disk 1
    raid5: device ide/host0/bus0/target0/lun0/part3 operational as raid
    disk 0
    raid5: md3, not all disks are operational -- trying to recover array
    raid5: allocated 3284kB for md3
    raid5: raid level 5 set md3 active with 2 out of 3 devices, algorithm
    2
    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part3
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part3
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part3
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part3
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    md: updating md3 RAID superblock on device
    md: ide/host2/bus1/target0/lun0/part3 [events: 00000001]<6>(write)
    ide/host2/bus1/target0/lun0/part3's sb offset: 24414144
    md: recovery thread got woken up ...
    md3: resyncing spare disk ide/host2/bus1/target0/lun0/part3 to replace
    failed disk
    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part3
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part3
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part3
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part3
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    md: syncing RAID array md3
    md: minimum _guaranteed_ reconstruction speed: 100 KB/sec/disc.
    md: using maximum available idle IO bandwith (but not more than 100000
    KB/sec) for reconstruction.
    md: using 124k window, over a total of 24410688 blocks.
    md: ide/host0/bus0/target1/lun0/part3 [events: 00000001]<6>(write)
    ide/host0/bus0/target1/lun0/part3's sb offset: 24410688
    md: ide/host0/bus0/target0/lun0/part3 [events: 00000001]<6>(write)
    ide/host0/bus0/target0/lun0/part3's sb offset: 24414144
    md: bind<ide/host0/bus0/target0/lun0/part4,1>
    md: bind<ide/host0/bus0/target1/lun0/part4,2>
    md: bind<ide/host2/bus1/target0/lun0/part4,3>
    md: ide/host2/bus1/target0/lun0/part4's event counter: 00000000
    md: ide/host0/bus0/target1/lun0/part4's event counter: 00000000
    md: ide/host0/bus0/target0/lun0/part4's event counter: 00000000
    md4: max total readahead window set to 512k
    md4: 2 data-disks, max readahead per data-disk: 256k
    raid5: spare disk ide/host2/bus1/target0/lun0/part4
    raid5: device ide/host0/bus0/target1/lun0/part4 operational as raid
    disk 1
    raid5: device ide/host0/bus0/target0/lun0/part4 operational as raid
    disk 0
    raid5: md4, not all disks are operational -- trying to recover array
    raid5: allocated 3284kB for md4
    raid5: raid level 5 set md4 active with 2 out of 3 devices, algorithm
    2
    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part4
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part4
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part4
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part4
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    md: updating md4 RAID superblock on device
    md: ide/host2/bus1/target0/lun0/part4 [events: 00000001]<6>(write)
    ide/host2/bus1/target0/lun0/part4's sb offset: 33126848
    md: ide/host0/bus0/target1/lun0/part4 [events: 00000001]<6>(write)
    ide/host0/bus0/target1/lun0/part4's sb offset: 33125952
    md: ide/host0/bus0/target0/lun0/part4 [events: 00000001]<6>(write)
    ide/host0/bus0/target0/lun0/part4's sb offset: 33126848
    md: md3: sync done.
    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part3
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part3
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    RAID5 conf printout:
     --- rd:3 wd:3 fd:0
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part3
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part3
     disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part3
    md: updating md3 RAID superblock on device
    md: ide/host2/bus1/target0/lun0/part3 [events: 00000002]<6>(write)
    ide/host2/bus1/target0/lun0/part3's sb offset: 24414144
    md: ide/host0/bus0/target1/lun0/part3 [events: 00000002]<6>(write)
    ide/host0/bus0/target1/lun0/part3's sb offset: 24410688
    md: ide/host0/bus0/target0/lun0/part3 [events: 00000002]<6>(write)
    ide/host0/bus0/target0/lun0/part3's sb offset: 24414144
    md4: resyncing spare disk ide/host2/bus1/target0/lun0/part4 to replace
    failed disk
    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part4
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part4
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part4
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part4
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    md: syncing RAID array md4
    md: minimum _guaranteed_ reconstruction speed: 100 KB/sec/disc.
    md: using maximum available idle IO bandwith (but not more than 100000
    KB/sec) for reconstruction.
    md: using 124k window, over a total of 33125952 blocks.
    md: md4: sync done.
    RAID5 conf printout:
     --- rd:3 wd:2 fd:1
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part4
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part4
     disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
    RAID5 conf printout:
     --- rd:3 wd:3 fd:0
     disk 0, s:0, o:1, n:0 rd:0 us:1 dev:ide/host0/bus0/target0/lun0/part4
     disk 1, s:0, o:1, n:1 rd:1 us:1 dev:ide/host0/bus0/target1/lun0/part4
     disk 2, s:0, o:1, n:2 rd:2 us:1 dev:ide/host2/bus1/target0/lun0/part4
    md: updating md4 RAID superblock on device
    md: ide/host2/bus1/target0/lun0/part4 [events: 00000002]<6>(write)
    ide/host2/bus1/target0/lun0/part4's sb offset: 33126848
    md: ide/host0/bus0/target1/lun0/part4 [events: 00000002]<6>(write)
    ide/host0/bus0/target1/lun0/part4's sb offset: 33125952
    md: ide/host0/bus0/target0/lun0/part4 [events: 00000002]<6>(write)
    ide/host0/bus0/target0/lun0/part4's sb offset: 33126848
    md: recovery thread finished ...
    md: recovery thread got woken up ...
    md: recovery thread finished ...
    eth0: no IPv6 routers present

