Re: why I told dd command to write but it reads in first?

From: Doug Gale (dgaleSPAMTHISYOUSCUM_at_mailexcite.com)
Date: 06/19/04


Date: Sat, 19 Jun 2004 00:14:30 GMT


"Josef Moellers" <josef.moellers@fujitsu-siemens.com> wrote in message
news:c91suc$3c3$1@nntp.fujitsu-siemens.com...
tcs wrote:
>> Greetings!
>> I am using `dd` command to measure raw disk write speed.
>> I did "dd if=/tmp/test.bin of=/dev/sda1"
>>
>> According to my device driver, the `dd` command reads in the whole
>> /dev/sda1 first and then writes to it. Shouldn't dd just write to
>> /dev/sda1? The even weirder thing is that dd reads in as many sectors
>> as the size of the /tmp/test.bin file. Can someone tell me why?
>
> dd's block size is 512 bytes by default. Linux's buffer size is 4k by
> default, and Linux uses the buffer cache by default.
> When dd starts writing, the first 512 bytes arrive in the buffer cache
> and the buffer cache must assume that this will be an update of the
> first 512 bytes of the first buffer, so it must read the first buffer
> and modify the first 512 bytes. This will happen with each 4k buffer
> that dd writes to.

> --
> Josef Möllers (Pinguinpfleger bei FSC)
> If failure had no penalty success would not be a prize
> -- T. Pratchett
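
A rough way to see the effect Josef describes is to count how many 4k buffers have to be read before they can be modified. The Python sketch below is only a toy model of that description, not kernel code. The function name and its parameters are made up for illustration, and it assumes a buffer is read the first time a write touches it unless that one write covers the whole buffer.

```python
def buffers_read_before_write(total_bytes, write_size=512, buffer_size=4096):
    """Count buffers that must be read from disk before being modified.

    Simplified model (an assumption, not the real buffer cache): a buffer
    is read the first time a write touches it, unless that single write
    covers the whole buffer.
    """
    cached = set()   # indices of buffers already in the cache
    reads = 0
    offset = 0
    while offset < total_bytes:
        length = min(write_size, total_bytes - offset)
        first = offset // buffer_size
        last = (offset + length - 1) // buffer_size
        for index in range(first, last + 1):
            if index in cached:
                continue
            buf_start = index * buffer_size
            buf_end = buf_start + buffer_size
            covers_whole = offset <= buf_start and offset + length >= buf_end
            if not covers_whole:
                reads += 1   # read-modify-write: fetch the old contents first
            cached.add(index)
        offset += length
    return reads


one_mib = 1024 * 1024
print(buffers_read_before_write(one_mib, write_size=512))    # 256
print(buffers_read_before_write(one_mib, write_size=4096))   # 0
```

In this model a 1 MiB file written 512 bytes at a time causes 256 buffer reads, which is 1 MiB of reading, the same amount as the file size. That matches what tcs saw. With 4096-byte writes aligned to the buffers, the model needs no reads at all.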

Processors behave the same way, with 64-byte cache lines in place of 4KB
blocks and individual bytes in place of 512-byte sectors. You could say,
using disk terminology and a little imagination, that a modern x86 CPU has a
64-byte block size and a 1-byte sector size.

You can't write just one byte to memory on a modern processor: it first
reads the whole cache line, then modifies the byte in the cache. However,
really modern processors use "write combining". A write combining buffer is
like a special cache line that optimizes sequential writes. If you write to
ALL the bytes in the write combining buffer, the processor does not bother
with the read and just writes over what is in main memory. If you DON'T
write all the bytes in the buffer, it must read the line being modified and
then merge the bytes in the (partially filled) write combining buffer into
that cache line, as it would if there were no write combining.
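
To make the full-versus-partial distinction concrete, here is a toy Python model of a single write combining buffer. It follows only the description above. The class and method names are invented for illustration, and real hardware has details (several buffers, flush triggers, memory types) that this ignores.

```python
LINE_SIZE = 64  # bytes per cache line, the figure used in this post


class WriteCombiningBuffer:
    """Toy model of one write combining buffer covering one 64-byte line."""

    def __init__(self):
        self.data = bytearray(LINE_SIZE)
        self.written = [False] * LINE_SIZE   # which bytes hold new data

    def write(self, offset, payload):
        """Store payload bytes at offset and mark them as written."""
        for i, byte in enumerate(payload):
            self.data[offset + i] = byte
            self.written[offset + i] = True

    def flush(self, memory_line):
        """Merge into memory_line; return True if the old line had to be read."""
        if all(self.written):
            memory_line[:] = self.data       # full line: overwrite, no read
            return False
        for i in range(LINE_SIZE):           # partial: old bytes survive
            if self.written[i]:
                memory_line[i] = self.data[i]
        return True


line = bytearray(b"\xff" * LINE_SIZE)

full = WriteCombiningBuffer()
full.write(0, bytes(LINE_SIZE))              # all 64 bytes written
print(full.flush(line))                      # False

partial = WriteCombiningBuffer()
partial.write(0, b"\x01\x02")                # only 2 of 64 bytes written
print(partial.flush(line))                   # True
```

The return value of flush() stands in for "did the old line have to be read": False when every byte was written, True when the buffer was only partially filled.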

Why doesn't Linux use a "write combining" algorithm for the sectors in a
block? It could have a special buffer that caches sequential sector writes,
allowing it to avoid unnecessary reads entirely.
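
As a sketch of what I mean, and not a description of anything Linux actually does, the Python below counts buffer reads if whole-sector writes were combined per 4KB buffer and the read were deferred until a buffer is flushed while still incomplete. The function name and the single-buffer policy are assumptions made for the example.

```python
SECTORS_PER_BUFFER = 8   # one 4KB buffer holds eight 512-byte sectors


def reads_with_sector_combining(sector_numbers):
    """Count buffer reads when whole-sector writes are combined first.

    sector_numbers: sectors written, in order, each write one full sector.
    Hypothetical policy: one combining buffer at a time; its old contents
    are read only if it is flushed with some sectors still unwritten.
    """
    reads = 0
    current = None        # index of the buffer being combined
    filled = set()        # sector slots already written in that buffer

    def flush():
        nonlocal reads
        if current is not None and len(filled) < SECTORS_PER_BUFFER:
            reads += 1    # partially filled: must merge with data on disk

    for sector in sector_numbers:
        index, slot = divmod(sector, SECTORS_PER_BUFFER)
        if index != current:
            flush()                       # moving on: settle the old buffer
            current, filled = index, set()
        filled.add(slot)
    flush()                               # settle the last buffer
    return reads


print(reads_with_sector_combining(range(16)))   # 0
print(reads_with_sector_combining(range(20)))   # 1
```

Sixteen sequential sectors fill two buffers completely, so no reads are needed. Twenty sectors leave the third buffer half filled, so only that one buffer has to be read.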

If you want a really clear explanation of write combining, see the AMD
Athlon Optimization Guide from the AMD website.

Doug Gale


