Re: SCSI Error Troubleshooting



On 2006-10-22, Michael Nguyen <michaeln@xxxxxxxxxxxxx> wrote:

So, one of my servers has fallen ill and I'm not sure what's going on. The
server gets in states where it has trouble with disk access. Basically it
works, but things slow down to the point where the server can't perform its
assigned duties. dmesg gets filled with errors like this:

I/O error: dev 08:11, sector 927989768
I/O error: dev 08:11, sector 928251912
I/O error: dev 08:11, sector 928514056

So, this is clearly /dev/sdb1 which is a partition on a connected MSA1500 (a
fibre connected storage array). The problem is that the MSA1500 says that
everything is fine and the disks are all green and happy. What would be the
best way to find out what's wrong here? Can I map those sectors to a
particular disk somehow? What would be the next step in determining what's
wrong?

These dmesg errors are the only errors I've been able to find. They don't
appear on bootup but appear at some later point when the disk is being used.
Any help/suggestions would be appreciated.
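
As a quick sanity check on the device mapping above, here is a minimal sketch
that decodes the "dev 08:11" field. It assumes the kernel prints major:minor in
hex and that the disk uses the classic static SCSI numbering (major 8, 16 minors
per disk). The function name decode_scsi_dev is made up for illustration.

```python
def decode_scsi_dev(dev: str) -> str:
    """Translate 'MAJOR:MINOR' (hex) from an I/O error line into /dev/sdXN.

    Assumption: major 8 is the first SCSI disk major, each disk owns
    16 minors, and minor % 16 == 0 means the whole disk.
    """
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != 8:
        raise ValueError(f"major {major} is not the first SCSI disk major (8)")
    disk = chr(ord("a") + minor // 16)   # 0 -> sda, 1 -> sdb, ...
    part = minor % 16                    # 0 -> whole disk, N -> partition N
    return f"/dev/sd{disk}{part if part else ''}"


print(decode_scsi_dev("08:11"))  # /dev/sdb1
```

Minor 0x11 is 17, which is disk index 1 (sdb) and partition 1, so this agrees
with the /dev/sdb1 reading in the original question.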

Do you have smartmontools installed? If not, pick them up from
http://smartmontools.sourceforge.net/. You can get a fair amount of
information from those tools:

Device: SEAGATE SX4234514 Version: 9E21
Serial number: LC244585
Device type: disk
(pass2:ahc0:0:2:0): MODE SENSE(06). CDB: 1a 0 19 0 40 0
(pass2:ahc0:0:2:0): CAM Status: SCSI Status Error
(pass2:ahc0:0:2:0): SCSI Status: Check Condition
(pass2:ahc0:0:2:0): ILLEGAL REQUEST asc:24,0
(pass2:ahc0:0:2:0): Invalid field in CDB field replaceable unit: 1: Command byte 2 is invalid
Local Time is: Sun Oct 22 12:50:00 2006 CDT
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK
Vendor (Seagate) cache information
Blocks sent to initiator = 1327944188
Blocks received from initiator = 56391246
Blocks read from cache and sent to initiator = 16943769
Number of read and write commands whose size <= segment size = 4504293
Number of read and write commands whose size > segment size = 11767
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 60809.47
number of minutes until next internal SMART test = 120

Error counter log:
          Errors Corrected by           Total   Correction     Gigabytes    Total
              EEC          rereads/    errors   algorithm      processed    uncorrected
          fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:     15550        0         0     15550      17999     851216.541          31
write:        0        0         0      1602       1602         28.872           0

Non-medium error count: 0
(pass2:ahc0:0:2:0): LOG SENSE. CDB: 4d 0 47 0 0 0 0 0 4 0
(pass2:ahc0:0:2:0): CAM Status: SCSI Status Error
(pass2:ahc0:0:2:0): SCSI Status: Check Condition
(pass2:ahc0:0:2:0): ILLEGAL REQUEST asc:24,0
(pass2:ahc0:0:2:0): Invalid field in CDB field replaceable unit: a: Command byte 2 bit 5 is invalid

Error Events logging not supported

[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
Device does not support Self Test logging
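
The part of that output most worth watching is the error counter log. As a
rough sketch (not part of smartmontools), the read and write rows can be pulled
apart like this, with each row's trailing uncorrected-error count joined onto
the same line. The field names are labels invented here for the seven columns,
and treating any nonzero uncorrected count as worth a closer look is an
assumption, not a vendor threshold.

```python
# Hypothetical labels for the seven numeric columns of the error counter log.
FIELDS = (
    "ecc_fast", "ecc_delayed", "rereads_rewrites",
    "total_corrected", "correction_invocations",
    "gigabytes_processed", "total_uncorrected",
)


def parse_counter_row(row: str) -> dict:
    """Parse one 'read:' or 'write:' row into a dict of numbers."""
    label, _, rest = row.partition(":")
    values = rest.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    parsed = {"op": label.strip()}
    for name, value in zip(FIELDS, values):
        # Only the gigabytes column is fractional; the rest are counts.
        parsed[name] = float(value) if name == "gigabytes_processed" else int(value)
    return parsed


rows = [
    "read: 15550 0 0 15550 17999 851216.541 31",
    "write: 0 0 0 1602 1602 28.872 0",
]
for row in map(parse_counter_row, rows):
    # Assumption: any uncorrected error is worth investigating.
    status = "needs attention" if row["total_uncorrected"] > 0 else "ok"
    print(f'{row["op"]}: {row["total_uncorrected"]} uncorrected errors ({status})')
```

Corrected errors are routine on a drive with this many hours on it. The
uncorrected column is the one that tends to line up with I/O errors seen by the
OS.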



Depending on your HBA, you may be able to run a non-destructive media
check/block remapping when you boot up and before your OS loads. Adaptec
HBAs do this under the Ctrl-A menu at boot time.

--

John (john@xxxxxxxxxxx)