Re: Too much hard drives failing




Albert, thanks a lot for your answer.

The only log I can trace back now is
Device: /dev/sda, ATA error count increased from 11605 to 11610
because it's stored in our Request Tracker.
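
For anyone who wants to track how fast that counter grows, here is a minimal Python sketch that pulls the numbers out of a smartd warning like the one above. The regular expression is an assumption based on that single sample line, and the function name is only illustrative; it is not part of smartd.

```python
import re

# Pattern guessed from the one sample line quoted above (an assumption,
# not an official description of the smartd message format).
ATA_ERR_RE = re.compile(
    r"Device: (?P<dev>\S+), ATA error count increased "
    r"from (?P<old>\d+) to (?P<new>\d+)"
)

def parse_ata_error_line(line):
    """Return (device, old_count, new_count, delta), or None if no match."""
    m = ATA_ERR_RE.search(line)
    if m is None:
        return None
    old, new = int(m.group("old")), int(m.group("new"))
    return m.group("dev"), old, new, new - old

sample = "Device: /dev/sda, ATA error count increased from 11605 to 11610"
print(parse_ata_error_line(sample))
```

Running it over all the stored tickets would show whether the error count climbs steadily or in bursts.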

I also have some notes about
ATA: abnormal status 0xD0 on port 0xE407

Jan 26 06:12:29 ns kernel: ata2: command 0x25 timeout, stat 0xd0 host_stat 0x21
Jan 26 06:12:29 ns kernel: ata2: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
Jan 26 06:12:29 ns kernel: ata2: status=0xd0 { Busy }
Jan 26 06:12:29 ns kernel: SCSI disk error : host 3 channel 0 id 0 lun 0 return code = 8000002
Jan 26 06:12:29 ns kernel: Current sd08:10: sns = 70 b
Jan 26 06:12:29 ns kernel: ASC=47 ASCQ= 0
Jan 26 06:12:29 ns kernel: Raw sense data:0x70 0x00 0x0b 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x47 0x00 0x00 0x00 0x00 0x00
Jan 26 06:12:29 ns kernel: I/O error: dev 08:10, sector 0
Jan 26 06:12:29 ns kernel: ATA: abnormal status 0xD0 on port 0xE407
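
To make those numbers easier to read, here is a small Python sketch that decodes the ATA status byte and the raw sense data from the log above. The bit names follow the older ATA specifications, and the lookup tables contain only the values seen in this log; the function names are illustrative. Note that the log itself says the SCSI sense values were translated from the ATA status, so the ASC/ASCQ text should probably not be taken literally.

```python
# ATA status register bits, highest bit first (names from older ATA specs).
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data
    0x02: "IDX",   # index
    0x01: "ERR",   # error
}

# Tiny excerpts of the SCSI tables: only the values that appear in the log.
SENSE_KEYS = {0x0B: "ABORTED COMMAND"}
ASC_ASCQ = {(0x47, 0x00): "SCSI parity error"}

def decode_ata_status(status):
    """List the names of the bits set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS.items() if status & bit]

def decode_fixed_sense(raw):
    """Return (sense key, ASC, ASCQ) from fixed-format sense data."""
    if raw[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return raw[2] & 0x0F, raw[12], raw[13]

# Raw sense bytes copied from the kernel log above.
log_bytes = ("0x70 0x00 0x0b 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 "
             "0x47 0x00 0x00 0x00 0x00 0x00")
raw_sense = bytes(int(tok, 16) for tok in log_bytes.split())

key, asc, ascq = decode_fixed_sense(raw_sense)
print(decode_ata_status(0xD0))
print(SENSE_KEYS.get(key, hex(key)), "/", ASC_ASCQ.get((asc, ascq), "unknown"))
```

While BSY is set the other status bits are not meaningful, which would explain why the kernel prints only "{ Busy }" for 0xd0.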

This is taken from one of those broken disks, which is still attached to
the second port of one of the servers. I left it there only to try to
figure this out.

I tried replacing the cables, etc., but the disks are really broken.
After replacing them it works, apparently with no further errors.

Yes, I can remember seeing different error messages.

If you need more logs, I can only wait for it to happen again.

How did you deal with the 3ware firmware issue?

Thanks a lot for the help

Pablo



On Fri, 2007-01-26 at 04:57 +0000, Albert Graham wrote:
Hi Pablo,

What kind of failures are these? Hardware, or disk corruption/software?

I have about 40 SM servers (with SATA2, SG 500GB etc., running FC5), also
using Reiser 3. I also had failures, which I eventually traced to 3ware
controller firmware; however, I have not had any hardware failures.


Thanks.
Albert.



Pablo Povarchik wrote:
Hello there

I'm starting here because I really don't know which would be the best
place to look for help. If this is not the correct list, please advise,
and if you can recommend a mailing list that is right for this, please
let me know.

That said, let's get to the point:

We have recently added 20 servers to our little farm, 7 of which have
had hard failures on their disks (SATA, Seagate, good brand-new SuperMicro
boxes).

The fact is that these failures started coming up right after we decided
to move to reiserfs.

Can 7 out of 20 hard drives be defective (yes, of course, but what is
the probability of this, as a percentage)?
Can this be related to reiserfs in any way?
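
On the probability question, a rough back-of-the-envelope sketch in Python, assuming drive failures are independent and every drive fails with the same probability p over the period in question. The values of p below are purely illustrative assumptions, not measured failure rates for any vendor, and `math.comb` needs Python 3.8 or later.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): at least k failures among n drives."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical per-drive failure probabilities, for illustration only.
for p in (0.02, 0.05, 0.10):
    print(f"p={p:.2f}: P(at least 7 of 20 fail) = {prob_at_least(7, 20, p):.2e}")
```

Under these assumptions the result stays well below 1% even with a per-drive probability as high as 10%, so 7 independent failures out of 20 would be very unlikely. That suggests looking for something the machines have in common rather than for bad luck.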

SATA2
Seagate
SuperMicro
Fedora Core 5

Any help will be more than appreciated


Thanks a lot




--
Pablo Povarchik - Level Next Ltd
CEO

Managed hosting services, the core of our work

+--------- Web Hosting - Dedicated Servers - Colocation (AS41578) ----------+
| info@xxxxxxxxxxxxxx - http://www.futurahost.com/ - (+39) 0461 592710
| Special! Get a Full Cabinet + 10Mbps full burst for only EUR 1,099 per month
| in our London (UK) facilities. Availability also in Fremont CA, USA.
+---------------------------------------------------------------------------+



