flush cache range proposal (was Re: ide errors in 7-rc1-mm1 and later)

From: Jeff Garzik (jgarzik_at_pobox.com)
Date: 06/10/04

  • Next message: Steve Lee: "APIC error on CPU1: 00(02) && APIC error on CPU0: 00(02)"
    Date:	Thu, 10 Jun 2004 13:50:05 -0400
    To: linux-kernel@vger.kernel.org
    
    

    Eric D. Mudama wrote:
    > On Thu, Jun 10 at 8:11, Jens Axboe wrote:
    >
    >> On Thu, Jun 10 2004, Bartlomiej Zolnierkiewicz wrote:
    >>
    >>>
    >>> /me just thinks loudly
    >>>
    >>> 'linear range' FLUSH CACHE seems so easy to implement that I always
    >>> wondered
    >>> why FLUSH CACHE command doesn't make any use of LBA address and number
    >>> of sectors.
    >>
    >>
    >> Indeed, that would be very helpful as well.
    >
    >
    > Neat idea... so you send us a LBA and a block count, and we return
    > good status if that region is flushed.
    >
    > Each command can specify a 32MiB region, assuming a device with
    > 512-byte LBAs.
    >
    > Propose an exact implementation and an opcode...

    Ok, I'll give it a shot:

    1) IDENTIFY DEVICE, Word 206, Command set/feature supported

    bit 15: shall be cleared to zero
    bit 14: shall be set to one
    bits 13:1: reserved
    bits 0: 1 == flush cache (range) supported

    Word 206:

    If bit 0 is set to one, the mandatory FLUSH CACHE and FLUSH CACHE EXT
    commands (if implemented) support the RANGE bit, and user-supplied LBA
    and sector count specifying the limits of the cache flush. This bit
    merely identifies the presence of this feature. Use word 207, bit 0, to
    determine if the feature is enabled.

    2) IDENTIFY DEVICE, Word 207, Command set/feature enabled

    bit 15: shall be cleared to zero
    bit 14: shall be set to one
    bits 13:1: reserved
    bits 0: 1 == flush cache (range) enabled

    Word 206:

    If bit 0 is set to one, the mandatory FLUSH CACHE and FLUSH CACHE EXT
    commands (if implemented) support the RANGE bit, and user-supplied LBA
    and sector count specifying the limits of the cache flush.

    3) Modify FLUSH CACHE (E7h) as follows:

    Inputs:
    -------

    Features: bit 0 (RANGE)
    Sector Count: sector count
    LBA Low: LBA(7:0)
    LBA Mid: LBA(15:8)
    LBA High: LBA(23:16)

    Features -
    If the RANGE bit is set, the cache flush operation shall be considered
    to be limited to the region specified in Sector Count / LBA registers.
    If the RANGE bit is not set, or the implementation does not support the
    RANGE bit, then the FLUSH CACHE operation shall flush the entire cache.

    Sector Count -
    Maximum number of sectors to be flushed from the cache. A value of 00h
    specified that 256 sectors are to be flushed.

    LBA Low / Mid / High -
    An LBA starting address for the flush. Register contents as specified
    in READ DMA command, and other commands.

    Normal outputs:
    ---------------
    Unchanged.

    Error outputs:
    --------------
    Error register -

    RANGE (bit 0) shall be set to one, if RANGE bit was specified in
    Features register when the command was submitted, indicating this was a
    range-based flush cache.

    Description
    -----------
    If RANGE bit is set to one, the flush cache operation at a minimum shall
    flush the specified range of sectors specified by LBA / Sector Count.
    An implementation may choose to flush more than the specified range, up
    to an entire cache flush in a compatible or "no op" implementation.

    If no data within the specified LBA range exists in cache to be flushed,
    that shall not be considered an error.

    4) Modify FLUSH CACHE EXT (EAh) as follows:

    Inputs:
    -------

    Features:
            curr bit 0 (RANGE)
            prev na
    Sector Count:
            curr sector count(7:0)
            prev sector count(15:8)
    LBA Low:
            curr LBA(7:0)
            prev LBA(31:24)
    LBA Mid:
            curr LBA(15:8)
            prev LBA(39:32)
    LBA High:
            curr LBA(23:16)
            prev LBA(47:40)

    Features -
    If the RANGE bit is set, the cache flush operation shall be considered
    to be limited to the region specified in Sector Count / LBA registers.
    If the RANGE bit is not set, or the implementation does not support the
    RANGE bit, then the FLUSH CACHE operation shall flush the entire cache.

    Sector Count -
    Maximum number of sectors to be flushed from the cache. A value of
    0000h specified that 65,536 sectors are to be flushed.

    LBA Low / Mid / High -
    An LBA starting address for the flush. Register contents as specified
    in READ DMA command, and other commands.

    Normal outputs:
    ---------------
    Unchanged.

    Error outputs:
    --------------
    Error register -

    RANGE (bit 0) shall be set to one, if RANGE bit was specified in
    Features register when the command was submitted, indicating this was a
    range-based flush cache.

    Description
    -----------
    If RANGE bit is set to one, the flush cache operation at a minimum shall
    flush the specified range of sectors specified by LBA / Sector Count.
    An implementation may choose to flush more than the specified range, up
    to an entire cache flush in a compatible or "no op" implementation.

    If no data within the specified LBA range exists in cache to be flushed,
    that shall not be considered an error.

    </proposal>

    Comments requested.

    We need to KISS, if this proposal has any hope getting accepted.

    People interested in filesystem journalling, barriers and such please
    review.

    Once people on lkml are happy, I'll write this up in a PDF in T13 form,
    and propose it on the T13 list (making sure everyone involved is
    properly credited, of course).

            Jeff

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Steve Lee: "APIC error on CPU1: 00(02) && APIC error on CPU0: 00(02)"

    Relevant Pages

    • Re: range-based cache flushing (was Re: Linux 2.6.29)
      ... will flush the data for all of its partitions from the write cache.... ... SCSI'S SYNCHRONIZE CACHE command already accepts an ... How difficult would it be to pass the "lower-bound" LBA to SYNCHRONIZE ...
      (Linux-Kernel)
    • Re: range-based cache flushing (was Re: Linux 2.6.29)
      ... will flush the data for all of its partitions from the write cache.... ... And I bet we could convince T13 to add FLUSH CACHE RANGE, ... How difficult would it be to pass the "lower-bound" LBA to SYNCHRONIZE ... barriers to the device. ...
      (Linux-Kernel)
    • Re: range-based cache flushing (was Re: Linux 2.6.29)
      ... When you issue an fsyncon a disk with multiple partitions, you will flush the data for all of its partitions from the write cache.... ... And I bet we could convince T13 to add FLUSH CACHE RANGE, if we could demonstrate clear benefit. ... How difficult would it be to pass the "lower-bound" LBA to SYNCHRONIZE CACHE, where "lower bound" is defined as the lowest sector in the range of sectors to be flushed? ... It seems extremely unlikely that sync-cache speed would _decrease_: for flush-everything firmwares, the sync-cache speed would remain unchanged. ...
      (Linux-Kernel)
    • range-based cache flushing (was Re: Linux 2.6.29)
      ... will flush the data for all of its partitions from the write cache.... ... SCSI'S SYNCHRONIZE CACHE command already accepts an ... How difficult would it be to pass the "lower-bound" LBA to SYNCHRONIZE CACHE, where "lower bound" is defined as the lowest sector in the range of sectors to be flushed? ...
      (Linux-Kernel)
    • Re: [PATCH 5/7] vfs: Add wbcflush sysfs knob to disable storage device writeback cache flushes
      ... we never need to flush. ... I completely agree that the suggested knob, for disabling cache flush ... Patch #4 adds mandatory cache flush to fsync() (based on earlier Jeff's ...
      (Linux-Kernel)