Re: Higher than expected disk write(2) latency



Martin Sustrik <sustrik@xxxxxxxxxx> writes:

Hi Roger,

Fair enough. That explains the behaviour. Would AIO help here? If
we are able to enqueue the next write before the first one has
finished, the disk can start writing it immediately without waiting
for a revolution.

If you could get them queued at the disk level, you would need to
watch whether the disk can queue requests at all (and whether every
controller and driver supports it), how many requests the disk can
queue, and how large each one can be. If they aren't queued at the
disk, there is a chance that the machine cannot get the data to the
disk fast enough for that next sector.

I have always avoided fully sync operations, as things *ALWAYS* got
really, really slow because of all the requirements needed to make
sure the data always got to disk correctly ahead of an unexpected
crash. With the type of applications I typically dealt with, if the
machine crashed, the data being output at the time was known to be
incomplete and generally useless, so the jobs were rerun.

Depending on your application you could always get a small fast
solid state device (no seek or RPM issues), and use it to keep a
journal that could be replayed on an unexpected crash...and then
just use various syncs to force things to disk at various points.
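
As a rough illustration of "syncs at various points", the sketch below
appends records to a journal file and calls fdatasync() only after every
few records, instead of opening the file with O_SYNC. The function name
journal_append, the record format and the sync interval are assumptions
made up for this example; replaying the journal after a crash is not shown.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: append `count` records to the journal at `path`
 * and force them to stable storage after every `sync_every` records.
 * Returns the number of fdatasync() calls made, or -1 on error.
 */
static int journal_append(const char *path, const char *const *records,
                          int count, int sync_every)
{
    assert(sync_every > 0);

    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -1;

    int syncs = 0;
    for (int i = 0; i < count; i++) {
        size_t len = strlen(records[i]);
        if (write(fd, records[i], len) != (ssize_t)len) {
            close(fd);
            return -1;
        }
        /* Checkpoint: pay the sync cost once per batch, not per write. */
        if ((i + 1) % sync_every == 0) {
            if (fdatasync(fd) != 0) {
                close(fd);
                return -1;
            }
            syncs++;
        }
    }

    /* Flush a trailing partial batch so nothing is left unsynced. */
    if (count % sync_every != 0) {
        if (fdatasync(fd) != 0) {
            close(fd);
            return -1;
        }
        syncs++;
    }

    if (close(fd) != 0)
        return -1;
    return syncs;
}
```

A larger sync_every trades durability for latency: records written since
the last checkpoint can be lost in a crash.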

We've tried AIO and the results are quite disappointing. If you open
the file with O_SYNC, the latencies are the same as with sync I/O -
each write takes 8.3ms (7200rpm disk).
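
For reference, the 8.3ms figure is what one full platter revolution takes
on a 7200rpm drive: a revolution lasts 60000/rpm milliseconds. A minimal
sketch of that arithmetic (the function name is made up for illustration):

```c
#include <assert.h>
#include <stdio.h>

/* Time for one full platter revolution, in milliseconds. */
static double ms_per_revolution(double rpm)
{
    assert(rpm > 0.0);
    return 60000.0 / rpm; /* 60000 ms per minute / revolutions per minute */
}

/* Print the revolution time for a given spindle speed. */
static void print_revolution_time(double rpm)
{
    printf("%.0f rpm: %.2f ms per revolution\n", rpm, ms_per_revolution(rpm));
}
```

If every synchronous write just misses its target sector, each one waits
roughly this long before the sector comes around again.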

I thought you were doing I/O to the underlying block device. If so,
there's no need to open with O_SYNC. You do, however, need to open the
device with O_DIRECT and align your buffers (and buffer lengths)
properly.
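
A minimal sketch of the alignment requirement, assuming a 512-byte
logical block size (many devices need 4096; check the device). The helper
names round_up, open_direct and write_aligned are hypothetical, and the
payload is zero-padded up to the aligned length:

```c
#define _GNU_SOURCE /* exposes O_DIRECT in <fcntl.h> with glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Round len up to a multiple of align (align must be a power of two). */
static size_t round_up(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/* Open a block device (or file) for writes that bypass the page cache. */
static int open_direct(const char *path)
{
    return open(path, O_WRONLY | O_DIRECT);
}

/*
 * Copy `len` bytes into an aligned, zero-padded buffer and write it at
 * offset 0. Buffer address, length and offset are all multiples of
 * `align`, as O_DIRECT expects.
 * Returns the number of bytes written (the padded length) or -1.
 */
static ssize_t write_aligned(int fd, const void *data, size_t len,
                             size_t align)
{
    size_t padded = round_up(len, align);
    void *buf = NULL;

    if (padded == 0 || posix_memalign(&buf, align, padded) != 0)
        return -1;
    memset(buf, 0, padded);
    memcpy(buf, data, len);

    ssize_t n = pwrite(fd, buf, padded, 0);
    free(buf);
    return n;
}
```

Typical use would be fd = open_direct(device_path) followed by
write_aligned(fd, data, len, 512). An unaligned buffer or length on an
O_DIRECT descriptor usually fails with EINVAL.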

Which AIO interface are you using, libaio or librt? How many I/Os are
you queueing to the device? You may want to take a look at aio-stress.c
as a way to test your device (this uses libaio, the in-kernel AIO
interface).
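
To make "how many I/Os are you queueing" concrete, here is a rough sketch
that submits several block-sized writes in a single io_submit call and
then waits for all of them. It uses the raw system calls that libaio
wraps, so it builds without -laio; with libaio the equivalent calls are
io_setup, io_prep_pwrite, io_submit and io_getevents. The function name
queue_writes, the 4096-byte block size and the depth limit are
assumptions for illustration only.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define BLOCK 4096
#define MAX_DEPTH 64

/*
 * Queue `depth` consecutive block-sized writes with one submit call.
 * Pass O_DIRECT in extra_flags for real measurements; without it the
 * kernel may complete the writes synchronously inside io_submit.
 * Returns the number of writes that completed in full, or -1 on error.
 */
static int queue_writes(const char *path, int extra_flags, int depth)
{
    struct iocb cbs[MAX_DEPTH];
    struct iocb *ptrs[MAX_DEPTH];
    struct io_event events[MAX_DEPTH];
    aio_context_t ctx = 0;
    void *buf = NULL;
    int done = -1;

    if (depth < 1 || depth > MAX_DEPTH)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | extra_flags, 0600);
    if (fd < 0)
        return -1;
    if (posix_memalign(&buf, BLOCK, (size_t)depth * BLOCK) != 0) {
        close(fd);
        return -1;
    }
    memset(buf, 'x', (size_t)depth * BLOCK);

    if (syscall(SYS_io_setup, depth, &ctx) == 0) {
        for (int i = 0; i < depth; i++) {
            memset(&cbs[i], 0, sizeof cbs[i]);
            cbs[i].aio_lio_opcode = IOCB_CMD_PWRITE;
            cbs[i].aio_fildes = fd;
            cbs[i].aio_buf = (uint64_t)(uintptr_t)((char *)buf + (size_t)i * BLOCK);
            cbs[i].aio_nbytes = BLOCK;
            cbs[i].aio_offset = (int64_t)i * BLOCK;
            ptrs[i] = &cbs[i];
        }

        /* All requests are handed to the kernel before any is waited on. */
        long submitted = syscall(SYS_io_submit, ctx, (long)depth, ptrs);
        if (submitted == depth) {
            long got = syscall(SYS_io_getevents, ctx, (long)depth,
                               (long)depth, events, NULL);
            done = 0;
            for (long i = 0; i < got; i++)
                if (events[i].res == BLOCK)
                    done++;
        }
        /* io_destroy waits for anything still in flight. */
        syscall(SYS_io_destroy, ctx);
    }

    free(buf);
    close(fd);
    return done;
}
```

Whether the requests actually overlap at the disk still depends on the
drive and controller queueing them, as discussed earlier in the thread.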

Cheers,

Jeff


