Re: [patch 0/4] [RFC] Another proportional weight IO controller



On Fri, Nov 07, 2008 at 11:31:44AM +0100, Peter Zijlstra wrote:
> On Fri, 2008-11-07 at 11:41 +1100, Dave Chinner wrote:
> > On Thu, Nov 06, 2008 at 06:11:27PM +0100, Peter Zijlstra wrote:
> > > On Thu, 2008-11-06 at 11:57 -0500, Rik van Riel wrote:
> > > > Peter Zijlstra wrote:
> > > >
> > > > > The only real issue I can see is with linear volumes, but
> > > > > those are stupid anyway - none of the gains but all the
> > > > > risks.
> > > >
> > > > Linear volumes may well be the most common ones.
> > > >
> > > > People start out with the filesystems at a certain size,
> > > > increasing onto a second (new) disk later, when more space
> > > > is required.
> > >
> > > Are they aware of how risky linear volumes are? I would
> > > discourage anyone from using them.
> >
> > In what way are they risky?
>
> You lose all your data when one disk dies, so your MTBF decreases
> with the number of disks in your linear span. And you get none of
> the benefits from having multiple disks, like extra speed from
> striping, or redundancy from RAID.

Fmeh. Step back and think for a moment. How does every major
distro build redundant root drives?

Yeah, they build a mirror and then put LVM on top of the mirror
to partition it. Each partition is a *linear volume*, but
no single disk failure is going to lose data because it's
been put on top of a mirror.

IOWs, reliability of linear volumes is only an issue if you don't
build redundancy into your storage stack. Just like RAID0, a single
disk failure will lose data. So, most people use linear volumes on
top of RAID1 or RAID5 to avoid such a single disk failure problem.
People do the same thing with RAID0 - it's what RAID10 and RAID50
do....
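
To put rough numbers on that, here is a minimal sketch. It assumes
(my simplification, not something measured in this thread) that every
disk fails independently with the same probability over some fixed
period; the function names are made up for illustration.

```python
def survive_unprotected(p_fail, ndisks):
    """Probability that a linear volume or RAID0 stripe built directly
    on ndisks bare disks loses nothing: every disk must survive."""
    return (1.0 - p_fail) ** ndisks


def survive_on_mirrors(p_fail, npairs):
    """Probability that a linear volume or stripe built on npairs RAID1
    pairs loses nothing: a pair is lost only if both halves fail."""
    pair_survives = 1.0 - p_fail ** 2
    return pair_survives ** npairs


if __name__ == "__main__":
    p = 0.05  # assumed per-disk failure probability, purely illustrative
    for n in (1, 2, 4, 8):
        print(n, survive_unprotected(p, n), survive_on_mirrors(p, n))
```

The unprotected figure is the same whether the disks are concatenated
or striped, and it drops as the span grows. Putting the same layout on
top of mirrors keeps it close to 1 under these assumptions.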

Also, linear volume performance scalability is on a different axis
to striping. Striping improves bandwidth, but each disk in a stripe
tends to make the same head movements. Hence striping improves
sequential throughput but only provides limited iops scalability.
Effectively, striping only improves throughput while the disks are
not seeking a lot. Add a few parallel I/O streams, and a stripe will
start to slow down as each disk seeks between streams. i.e. disks
in stripes cannot be considered to be able to operate independently.

Linear volumes create independent regions within the address space -
the regions can seek independently when under concurrent I/O and
hence iops scalability is much greater. Aggregate bandwidth is the
same as striping, it's just that a single stream is limited in
throughput. If you want to improve single stream throughput,
you stripe before you concatenate.
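
The difference in address mapping can be sketched like this. It is a
simplified model with equal-sized disks and a fixed chunk size, not
how md or dm actually implement it; all names and sizes are
hypothetical.

```python
def linear_map(block, disk_blocks, ndisks):
    """Concatenation: disk 0 holds the first disk_blocks blocks,
    disk 1 the next disk_blocks, and so on."""
    if not 0 <= block < disk_blocks * ndisks:
        raise ValueError("block out of range")
    return block // disk_blocks, block % disk_blocks


def stripe_map(block, chunk_blocks, ndisks):
    """Striping: consecutive chunks rotate round-robin over the disks."""
    chunk = block // chunk_blocks
    disk = chunk % ndisks
    offset = (chunk // ndisks) * chunk_blocks + block % chunk_blocks
    return disk, offset


def disks_touched(mapper, start, length):
    """Which disks a sequential read of `length` blocks has to visit."""
    return sorted({mapper(b)[0] for b in range(start, start + length)})


if __name__ == "__main__":
    linear = lambda b: linear_map(b, 1000, 4)   # 4 disks, 1000 blocks each
    stripe = lambda b: stripe_map(b, 16, 4)     # 4 disks, 16-block chunks
    for start in (0, 2000):                     # two concurrent streams
        print(start, disks_touched(linear, start, 64),
              disks_touched(stripe, start, 64))
```

In this model each 64-block stream stays on one disk of the linear
volume, so the two streams land on different disks and do not fight
over a head. On the stripe, both streams touch all four disks, so
every disk ends up seeking between them.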

That's why people create layered storage systems like this:

	linear volume
	    |->stripe
	        |-> md RAID5
	            |-> disk
	            |-> disk
	            |-> disk
	            |-> disk
	            |-> disk
	        |-> md RAID5
	            |-> disk
	            |-> disk
	            |-> disk
	            |-> disk
	            |-> disk
	    |->stripe
	        |-> md RAID5
	        ......
	    |->stripe
	        ......

What you then need is a filesystem that can spread the load over
such a layout. Let's use, for argument's sake, XFS and tell it the
geometry of the RAID5 luns that make up the volume so that its
allocation is all nicely aligned. Then we match the allocation
group size to the size of each independent part of the linear
volume. Now when XFS spreads its inodes and data over multiple
AGs, it's spreading the load across disks that can operate
concurrently....
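
Why matching the AG size matters can be shown with a toy model. This
is not XFS code; it just treats the volume as a flat range of blocks
cut into equal-sized members, and the function name and sizes are
made up.

```python
def members_per_ag(ag_size, member_size, nmembers):
    """For each allocation group, list the members of the linear volume
    that its block range overlaps. All sizes are in the same unit."""
    total = member_size * nmembers
    result = []
    start = 0
    while start < total:
        end = min(start + ag_size, total)
        first = start // member_size
        last = (end - 1) // member_size
        result.append(list(range(first, last + 1)))
        start = end
    return result


if __name__ == "__main__":
    # AG size equal to the member size: one AG per independent member.
    print(members_per_ag(100, 100, 4))
    # Mismatched AG size: AGs straddle member boundaries.
    print(members_per_ag(150, 100, 3))
```

With matched sizes every AG sits on exactly one member, so work that
is spread over different AGs goes to different sets of spindles. With
a mismatch, some AGs span two members and load on neighbouring AGs
can collide on the same disks.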

Effectively, linear volumes are about as dangerous as striping.
If you don't build in redundancy at a level below the linear
volume or stripe, then you lose when something fails.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx


