VIA SATA Raid needs a long time to recover from suspend

From: Phillip Susi (psusi_at_cfl.rr.com)
Date: 11/16/05

    Date:	Tue, 15 Nov 2005 22:37:58 -0500
    To: linux-kernel@vger.kernel.org
    
    

    I have been debugging a power management problem for a few days now, and
    I believe I have finally solved the problem. Because it involved
    patching the kernel, I felt I should share the fix here in hopes that it
    can be improved and/or integrated into future kernels. Right now I am
    running 2.6.14.2 on amd64, compiled myself, with the ubuntu breezy amd64
    distribution.

    First I'll state the fix. It involved changing two lines in
    include/linux/libata.h:

    static inline u8 ata_busy_wait(struct ata_port *ap, unsigned int bits,
                       unsigned int max)
    {
        u8 status;

        do {
            udelay(100);    /* changed to 100 from 10 */
            status = ata_chk_status(ap);
            max--;
        } while ((status & bits) && (max > 0));

        return status;
    }

    and:

    static inline u8 ata_wait_idle(struct ata_port *ap)
    {
        /* max changed to 10000 from 1000 */
        u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ, 10000);

        if (status & (ATA_BUSY | ATA_DRQ)) {
            unsigned long l = ap->ioaddr.status_addr;
            printk(KERN_WARNING
                   "ATA: abnormal status 0x%X on port 0x%lX\n",
                   status, l);
        }

        return status;
    }

    The problem seems to be that my VIA SATA raid controller requires more
    time to recover from being suspended. It looks like the code in
    sata_via.c restores the task file after a resume, then calls
    ata_wait_idle to wait for the busy bit to clear. The problem was that
    this function timed out before the busy bit cleared, resulting in
    messages like this:

    ATA: abnormal status 0x80 on port 0xE007
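
The timeout behind that message can be illustrated with a small user-space
model of the polling loop. This is only a sketch under stated assumptions:
sim_port, sim_chk_status and sim_busy_wait are hypothetical stand-ins, not
libata functions, and the simulated clock replaces udelay(). Status 0x80 is
the BSY bit, which is what the warning above reports.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_ATA_BUSY 0x80u   /* BSY bit of the ATA status register */
#define SIM_ATA_DRDY 0x40u   /* DRDY bit, used here as the idle status */
#define SIM_ATA_DRQ  0x08u   /* DRQ bit */

/* Hypothetical controller model: reports BSY until ready_at_us of
 * simulated time has elapsed since the resume. */
struct sim_port {
    unsigned long now_us;       /* simulated time since resume */
    unsigned long ready_at_us;  /* assumed recovery time, not a measured value */
};

static uint8_t sim_chk_status(const struct sim_port *p)
{
    return p->now_us >= p->ready_at_us ? SIM_ATA_DRDY : SIM_ATA_BUSY;
}

/* Same loop shape as the ata_busy_wait quoted in this message, with the
 * delay made a parameter and the clock simulated. */
static uint8_t sim_busy_wait(struct sim_port *p, uint8_t bits,
                             unsigned int delay_us, unsigned int max)
{
    uint8_t status;

    assert(max > 0);
    do {
        p->now_us += delay_us;          /* stands in for udelay(delay_us) */
        status = sim_chk_status(p);
        max--;
    } while ((status & bits) && (max > 0));

    return status;
}
```

If the port is assumed to need 500 ms to recover (ready_at_us = 500000, an
invented figure), polling with a 10 us delay and max = 1000 gives up after
10 ms of simulated time with BSY still set, while a 100 us delay and
max = 10000 returns as soon as BSY clears.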

    Then if there was an IO request made immediately after resuming, it
    would time out and fail, because it was issued before the hardware was
    ready. Changing the timeout resolved this. I tried changing the udelay
    line and the ata_busy_wait line to increase the timeout, and it did not
    seem to matter which I changed, as long as the total timeout was
    increased by a factor of 100 (from about 10 ms to about 1 second).
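
The arithmetic behind the total timeout is simply the per-poll delay times
the maximum number of polls. A minimal sketch follows; wait_budget_us is a
hypothetical helper written for this illustration, not a kernel function, and
it ignores the time spent reading the status register.

```c
#include <assert.h>
#include <limits.h>

/* Worst-case time spent delaying in a poll loop that waits delay_us
 * between status reads and gives up after max_polls reads. */
static unsigned long wait_budget_us(unsigned int delay_us,
                                    unsigned int max_polls)
{
    /* Guard against overflow of the product. */
    assert(max_polls == 0 || delay_us <= ULONG_MAX / max_polls);
    return (unsigned long)delay_us * max_polls;
}
```

With the original values, wait_budget_us(10, 1000) is 10000 us (10 ms). With
the patched values, wait_budget_us(100, 10000) is 1000000 us (1 second). The
same 1 second results from (1000, 1000) or (10, 100000), which fits the
observation that it does not matter which of the two numbers is raised.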

    Since increasing the maximum timeout, suspend and hibernate work great
    for me. While I was tracking down this bug, I may have exposed another
    one, which I will mention in passing. As I said before, if an IO request
    was made immediately after a resume (before the busy bit finally
    cleared), it would time out and fail. It seemed the kernel filled the
    buffer cache for the requested block with garbage rather than retrying
    the read. It seems to me that at some point, the read should have been
    retried. The symptoms of this were:

    1) When suspend.sh called resume.sh immediately after the echo mem >
    /sys/power/state line, then on resume, the read would fail in a block in
    the reiserfs tree that was required to look up the resume.sh file. This
    caused reiserfs to complain about errors in the node, and the script
    failed to execute. Further attempts to touch the script, even with ls
    -al /etc/acpi/resume.sh, failed with EPERM. I would think that at worst,
    this should fail with EIO or something, not EPERM.

    2) At one point I tried running echo mem > /sys/power/state ; df. After
    the resume, the IO read failed when trying to load df, and I got an
    error message saying the kernel could not execute the binary file.
    Further attempts to run df failed also. Other IO at this point was fine.

    This leads me to think that when the IO failed, rather than inform the
    calling code of the failure, for example, with an EIO status, the buffer
    cache got filled with junk, and this should not happen. Either the
    operation should succeed, and the correct data be returned, or it should
    fail, and the caller should be informed of the failure, and not given
    incorrect data.
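
The behaviour argued for here (retry, then either return correct data or
report EIO, never hand back junk) can be sketched in user space. This is a
simplified model under stated assumptions, not how the kernel block layer or
buffer cache is implemented: cached_block, raw_read_fn, read_block_checked
and flaky_read are all hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 512

/* Hypothetical low-level read: 0 on success, negative errno on failure. */
typedef int (*raw_read_fn)(void *ctx, unsigned char *dst, size_t len);

struct cached_block {
    unsigned char data[SKETCH_BLOCK_SIZE];
    int uptodate;   /* 1 only when data holds the result of a good read */
};

/* Retry the read up to max_tries times. On success the block is filled
 * and marked up to date; on failure the cached data is left untouched,
 * the block is marked not up to date, and -EIO is returned. */
static int read_block_checked(struct cached_block *blk, raw_read_fn raw,
                              void *ctx, int max_tries)
{
    unsigned char tmp[SKETCH_BLOCK_SIZE];
    int tries;

    assert(blk && raw && max_tries > 0);
    for (tries = 0; tries < max_tries; tries++) {
        if (raw(ctx, tmp, sizeof(tmp)) == 0) {
            memcpy(blk->data, tmp, sizeof(tmp));
            blk->uptodate = 1;
            return 0;
        }
    }
    blk->uptodate = 0;
    return -EIO;
}

/* Test device: fails as many times as *ctx says, then returns 0xAB bytes. */
static int flaky_read(void *ctx, unsigned char *dst, size_t len)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return -EIO;
    }
    memset(dst, 0xAB, len);
    return 0;
}
```

With one simulated failure and three tries, read_block_checked succeeds on
the second attempt. With five simulated failures and three tries, it returns
-EIO and the caller's block is never marked up to date.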

    When the first IO immediately following the resume failed, I got these
    messages:

    [ 32.013538] ata1: command 0x35 timeout, stat 0x50 host_stat 0x1
    [ 32.045510] ata2: command 0x35 timeout, stat 0x50 host_stat 0x1

    As long as no IO was requested immediately after the resume (i.e. if I
    ran echo mem > /sys/power/state on an otherwise idle system, rather than
    using suspend.sh), these errors did not happen; only the abnormal
    status messages did.

    For reference, my system is configured as follows:

    Motherboard: Asus K8V Deluxe
    CPU: AMD Athlon 64 3200+
    RAM: 1 GB of Corsair low latency pc3200 ddr sdram
    Video: ATI Radeon 9800 Pro with a Samsung 930B 19 inch LCD display
    Disks: 2 WD 36 gig SATA 10,000 rpm raptors in a raid0 configuration on
    the via sata raid controller
    Partitions:

    /dev/mapper/via_hfciifae1: 40 gig winxp NTFS partition
    /dev/mapper/via_hfciifae3: 10 gig experimental partition
    /dev/mapper/via_hfciifae5: 50 meg ext2 /boot partition
    /dev/mapper/via_hfciifae6: 1 gig swap partition
    /dev/mapper/via_hfciifae7: 22 gig reiserfs root partition

    If anyone has any suggestions of further tests I can perform to narrow
    down the problem, or a better solution for it, you have my full
    cooperation. If this fix seems acceptable, then I hope it can be merged
    in the next kernel release.

    PS> Please CC me on any replies, as I am not subscribed to this list


