Re: [PATCH] fix readahead breakage for sequential after random reads

From: Ram Pai (linuxram_at_us.ibm.com)
Date: 08/04/04

  • Next message: Hugh Dickins: "Re: [2.6.8-rc2-mm2] oops report (crashes on booting)"
    To: Andrew Morton <akpm@osdl.org>
    Date:	04 Aug 2004 10:38:37 -0700
    
    
    

    On Mon, 2004-07-26 at 21:18, Ram Pai wrote:
    > On Mon, 2004-07-26 at 17:08, Andrew Morton wrote:
    > > Ram Pai <linuxram@us.ibm.com> wrote:
    > > >
    > > > Andrew,
    > > > Yes the patch fixes a valid bug.
    > > >
    > >
    > > Please don't top-post :(
    > > > RP
    > > >
    > > > On Mon, 2004-07-26 at 16:29, Andrew Morton wrote:
    > > > > Miklos Szeredi <miklos@szeredi.hu> wrote:
    > > > > >
    > > > > > Current readahead logic is broken when a random read pattern is
    > > > > > followed by a long sequential read. The cause is that on a window
    > > > > > miss ra->next_size is set to ra->average, but ra->average is only
    > > > > > updated at the end of a sequence, so window size will remain 1 until
    > > > > > the end of the sequential read.
    > > > > >
    > > > > > This patch fixes this by taking the current sequence length into
    > > > > > account (code taken from towards end of page_cache_readahead()), and
    > > > > > also setting ra->average to a decent value in handle_ra_miss() when
    > > > > > sequential access is detected.
    > > > >
    > > > > Thanks. Do you have any performance testing results from this patch?
    > > > >
    > > > Ram Pai <linuxram@us.ibm.com> wrote:
    > > >
    > > > Andrew,
    > > > Yes the patch fixes a valid bug.
    > >
    > > Fine, but the readahead code is performance-sensitive, and it takes quite
    > > some time for any regressions to be discovered. So I'm going to need to
    > > either sit on this patch for a very long time, or extensively test it
    > > myself, or await convincing test results from someone else.
    > >
    > > Can you help with that?
    >
    > yes I will run all my standard testsuites before we take this patch.
    > (DSS workload, iozone, sysbench). I will get back with some results
    > sooon. Probably by the end of this week.

    Ok I have enclosed the results. The summary is: there is no significant
    improvement or decrease in performance of (DSS workload, iozone,
    sysbench) The increase or decrease is in the margin of errors.

    I have also enclosed a patch that partially backs off Miklos's fix.
    Shane Shrybman correctly pointed out that the real fix is to set
    ra->average value to max/2 when we move from readahead-off mode to
    readahead-on mode. The other part of Miklos's fix becomes irrelevent.

     
    Sorry it took some time to get back on this. Its almost automated so
    turnaround time should be quick now-on-wards.

    RP

    
    

    --- linux-2.6.8-rc3/mm/readahead.c 2004-08-03 14:26:46.000000000 -0700
    +++ mypatchdir/mm/readahead.c 2004-08-04 17:08:47.665196424 -0700
    @@ -468,11 +468,7 @@ do_io:
                               * pages shall be accessed in the next
                               * current window.
                               */
    - average = ra->average;
    - if (ra->serial_cnt > average)
    - average = (ra->serial_cnt + ra->average + 1) / 2;
    -
    - ra->next_size = min(average , (unsigned long)max);
    + ra->next_size = min(ra->average , (unsigned long)max);
                     }
                     ra->start = offset;
                     ra->size = ra->next_size;

    
    

                            DSS WORKLOAD
                            ----------------

    ---------------------------------------------------------------------
    | 2.6.8-rc2 | miklos_patch | miklos_mod_patch |
    | | | |
    ----------------------------------------------------------------------
    | | | |
    | X | 0.9001% increase| 0.73649% increase |
    | | w.r.t X | w.r.t X |
    | | | |
    ---------------------------------------------------------------------

                            iozone
                            -------
                    iozone -c -t1 -s 4096m -r 128k -w
    --------------------------------------------------------------------
    | | 2.6.8-rc2 | miklos_patch |miklos_mod_patch |
    ---------------------------------------------------------------------
    | | | | |
    |Sequential | 29773.95KB/sec | 29777.38KB/sec| 29968.74KB/sec |
    | | | | |
    |reverse read | 17905.65KB/sec | 17897.85KB/sec| 17726.55KB/sec |
    | | | | |
    |stride read | 19927.43KB/sec | 19778.90KB/sec| 19625.75KB/sec |
    | | | | |
    |random read | 18454.49KB/sec | 18384.28KB/sec| 18666.47KB/sec |
    | | | | |
    |random mix | 19434.58KB/sec | 19480.43KB/sec| 18755.05KB/sec |
    | | | | |
    |pread | 29763.56KB/sec | 29780.05KB/sec| 29961.16KB/sec |
    | | | | |
    --------------------------------------------------------------------

                            iozone NFS
                            -------
                    iozone -c -t1 -s 4096m -r 128k -w

    ---------------------------------------------------------------------------
    | | 2.6.8-rc2 | miklos_patch | miklos_mod_patch|
    |---------------|---------------|-----------------------|-----------------|
    | | | | |
    |Sequential | 2272.77KB/sec | 2266.45KB/sec | 2269.98KB/sec |
    | | | | |
    |reverse read | 3078.14KB/sec | 3066.73KB/sec | 3079.87KB/sec |
    | | | | |
    |stride read | 3253.49KB/sec | 3220.70KB/sec | 3264.41KB/sec |
    | | | | |
    |random read | 5225.44KB/sec | 5231.68KB/sec | 5274.32KB/sec |
    | | | | |
    |random mix | 8389.64KB/sec | 8458.40KB/sec | 7722.72KB/sec |
    | | | | |
    |pread | 2264.64KB/sec | 2264.35KB/sec | 2256.15KB/sec |
    --------------------------------------------------------------------------

                            sysbench
                            --------

    sysbench --num-threads=256 --test=fileio --file-total-size=2800M --file-test-mode=rndrw run

    -------------------------------------------------------------------------
    | | 2.6.8-rc2 | miklos_patch |miklos_mod_patch|
    -------------------------------------------------------------------------
    | | | | |
    | Throughput | 2.745MB/sec | 2.721MB/sec | 2.607MB/sec |
    |---------------|-----------------------|---------------|----------------|
    | | | | |
    | | | | |
    |Requests/sec | 175.99 | 174.40MB/sec | 167.08MB/sec |
    |---------------|-----------------------|---------------|----------------|
    | | | | |
    | | | | |
    |Execution time | 56.8213s | 57.3400s | 59.8498s |
    --------------------------------------------------------------------------

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Hugh Dickins: "Re: [2.6.8-rc2-mm2] oops report (crashes on booting)"