Re: libata in 2.4.24?

From: bill davidsen (davidsen_at_tmr.com)
Date: 12/02/03

  • Next message: Mike Fedyk: "Re: libata in 2.4.24?"
    To: linux-kernel@vger.kernel.org
    Date:	2 Dec 2003 22:34:20 GMT
    
    

    In article <877k1f9e1g.fsf@stark.dyndns.tv>,
    Greg Stark <gsstark@mit.edu> wrote:
    | Jeff Garzik <jgarzik@pobox.com> writes:

    | > > This doesn't happen with SCSI disks where multiple requests can be pending so
    | > > there's no urgency to reporting a false success. The request doesn't complete
    | > > until the write hits disk. As a result SCSI disks are reliable for database
    | > > operation and IDE disks aren't unless write caching is disabled.
    | >
    | > This is not really true.
    | >
    | > Regardless of TCQ, if the OS driver has not issued a FLUSH CACHE (IDE)
    | > or SYNCHRONIZE CACHE (SCSI), then the data is not guaranteed to be on
    | > the disk media. Plain and simple.
    |
    | That doesn't agree with people's experience. People seem to find that SCSI
    | drives never cache writes. This sort of makes sense since there's just not
    | much reason to report a write success before the write can be performed.
    | There's no performance advantage as long as more requests can be queued up.

    I hope you mean the drives don't report completion until the data is on
    the platter, clearly the data is cached in the drive until it can be
    written.
    |
    |
    | > If fsync(2) returns without a flush-cache, then your data is not
    | > guaranteed to be on the disk. And as you noted, flush-cache destroys
    | > performance.
    |
    | It's my understanding that it doesn't. There was some discussion in the past
    | month about making the drivers issue syncs for journalled filesystems, but
    | even then the idea of adding it to fsync or O_SYNC files wasn't the
    | motivation.

    With O_SYNC files there is the possibility of having a don't cache bit
    in the packet to the drive, even with write caching. With fsync I don't
    see any way to do it after the fact for only some of the data in the
    drive cache. That's just an observation.

    Clearly with a completion status coming back after actual completion
    O_SYNC or fsync reduce to "wait for the ack from the drive."
    |
    |
    | > There are three levels:
    | >
    | > a) Data is successfully transferred to the controller/drive queue (TCQ).
    | > b) Data is successfully transferred to the drive's internal buffers.
    | > c) The drive successfully transfers data to the media.
    |
    | Only the third is of interest to Postgres or other databases. In fact, I
    | suspect only the third is of interest to other systems that are supposed to be
    | reliable like MTAs etc. I think Wietse and others would be shocked if they
    | were told fsync wasn't guaranteed to have waited until the writes had actually
    | hit the media.

    I think for reliability fsync has to flush cache, regardless of the
    performance hit. I think a drive would be unusably slow if you did it
    after each O_SYNC write, so that's probably not practical. Clearly the
    best solution is a full SCSI implementation over PATA/SATA, but that
    would eliminate some of the justification for SCSI devices at premium
    prices.

    -- 
    bill davidsen <davidsen@tmr.com>
      CTO, TMR Associates, Inc
    Doing interesting things with little computers since 1979.
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at  http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at  http://www.tux.org/lkml/
    

  • Next message: Mike Fedyk: "Re: libata in 2.4.24?"

    Relevant Pages

    • Re: Tips sought for optimizing 10.4.10s performance
      ... upgrade to disks with those speeds. ... Most desktop drives have been 7200 RPM, ... 7200 RPM disks have only had 2MB of cache. ... a CPU upgrade, or do I just get a new Mac. ...
      (comp.sys.mac.system)
    • Re: MicroVAX 3500 questions
      ... DSSI nor SCSI, ... It's an inconvenience but the system will boot without it. ... etc are disks ... The RA7x drives are about the size of an RD54. ...
      (comp.sys.dec)
    • Re: 3B2 Disks
      ... 2 MFM drives on a custom controller. ... But a friend had a 3B2 which had a SCSI interface, ... tablets around for erasing disks. ... But -- the 3B2 was the porting base for unix for some time. ...
      (comp.sys.3b1)
    • Re: How do I find equipment (drives of various types) for my 6-year-old Linux system?
      ... > I would have thought that an 80-pin SCSI drive wouldn't work on my system! ... If you currently have 68-pin drives, ... Or has IDE improved in that area? ... SCSI's big advantage is that you can connect multiple disks more easily ...
      (comp.os.linux.hardware)
    • SUMMARY SCSI Disk Errors - sense key: not ready
      ... Possible reasons for the disk problems included high humidity, power supply ... problems, dying motor, poor cabling and a problem with the SCSI interface to ... I suspect the drives' motors were slowly dying; I replaced both disks one ...
      (SunManagers)