Re: [ANNOUNCE] Ramback: faster than a speeding bullet



On Sat, Mar 15, 2008 at 01:17:13PM -0800, Daniel Phillips wrote:
On Saturday 15 March 2008 13:59, Willy Tarreau wrote:
On Thu, Mar 13, 2008 at 11:14:39AM -0800, Daniel Phillips wrote:
On Thursday 13 March 2008 06:22, Alan Cox wrote:
...Ext3 cannot recover well from massive loss of intermediate
writes. It isn't a normal failure mode and there isn't sufficient fs
metadata robustness for this. A log structured backing store would deal
with that but all you apparently want to do is scream FUD at anyone who
doesn't agree with you.

Scream is an exaggeration, and FUD only applies to somebody who
consistently overlooks the primary proposition in this design: that the
battery backed power supply, computer hardware and Linux are reliable
enough to entrust your data to them. I say this is practical, you say
it is impossible, I say FUD.

All you are proposing is that nobody can entrust their data to any
hardware. Good point. There is no absolute reliability, only degrees
of it.

Many raid controllers now have battery backed writeback cache, which
is exactly the same reliability proposition as ramback, on a smaller
scale. Do you refuse to entrust your corporate data to such
controllers?

RAID controllers do not have half a terabyte of RAM.

And? Either you have battery backed ram with critical data in it or
you do not. Exactly how much makes little difference to the question.

It completely changes the method to power it and the time the data may
remain in RAM. The Smart 3200 I have right here simply has lithium
batteries directly connected to the static RAM chips. Very low risk of
power failure. The way you presented your work shows that it relies on a UPS
to sustain the PC's power supply, which in turn keeps the PC alive,
which in turn tries not to reboot to keep its RAM consistent. That leaves
a lot of places for a failure to happen.

Don't get me wrong, I still think your project has a lot of uses. But
you have to admit that there are huge differences between using it in
an appliance with battery-backed RAM which is able to recover data after
a system crash, power outage or anything, and the average Joe's PC set up
as an NFS server for the company with a cheap UPS to try not to lose the
data should a power outage occur.

I think it could get major adoption with ordered writes.

Also, you are always
invited to choose between speed (write back) and reliability (write through).

As is the case with ramback. Just echo 1 >/proc/driver/ramback/<name>.
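
The speed versus reliability choice discussed here can be illustrated with a
toy model. This is only a sketch: the RamCache class, its write_through flag
and the dict standing in for the disk are hypothetical names chosen for
illustration, and nothing here describes how ramback itself is implemented
or what the value written to /proc selects.

```python
class RamCache:
    """Toy RAM cache in front of a slower backing store (a plain dict here)."""

    def __init__(self, backing, write_through=False):
        self.ram = {}            # block number -> data held in RAM
        self.dirty = set()       # blocks not yet copied to the backing store
        self.backing = backing
        self.write_through = write_through

    def write(self, block, data):
        self.ram[block] = data
        if self.write_through:
            # Reliability: the backing store is updated before we return.
            self.backing[block] = data
        else:
            # Speed: only RAM is updated; the block is flushed later.
            self.dirty.add(block)

    def flush(self):
        """Copy every dirty block to the backing store."""
        for block in sorted(self.dirty):
            self.backing[block] = self.ram[block]
        self.dirty.clear()

    def lost_on_crash(self):
        """Blocks whose latest contents exist only in RAM right now."""
        return sorted(self.dirty)
```

In write-back mode lost_on_crash() is non-empty between a write and the next
flush, which is exactly the window this thread is arguing about.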

Also, please note that the problem here is not related to the number of
nines of availability. This number only counts the ratio between uptime
and downtime. What we are facing is more of an MTBF problem, where the
consequences of a failure are hard to predict.
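
To make the distinction concrete, here is a small worked example using the
standard steady-state formula availability = MTBF / (MTBF + MTTR). The two
systems and their figures are made up for illustration and do not come from
any measurement in this thread.

```python
HOURS_PER_YEAR = 24 * 365


def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is up (the 'number of nines')."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def failures_per_year(mtbf_hours, mttr_hours):
    """Expected number of failure/repair cycles in one year."""
    return HOURS_PER_YEAR / (mtbf_hours + mttr_hours)


# Hypothetical systems: (MTBF in hours, MTTR in hours).
frequent = (999.0, 1.0)    # fails often, repaired quickly
rare = (9990.0, 10.0)      # fails rarely, repaired slowly

for name, (mtbf, mttr) in (("frequent", frequent), ("rare", rare)):
    print(name, availability(mtbf, mttr), failures_per_year(mtbf, mttr))
```

Both systems have the same availability, but the first one fails ten times as
often. If each failure can lose whatever sits in RAM, the failure count
matters far more than the uptime ratio.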

That is why I keep recommending that a ramback setup be replicated or
mirrored, which people in this thread keep glossing over. When
replicated or mirrored, you still get the microsecond-level transaction
times, and you get the safety too.

I agree, but in this case, you should present it this way. You have been
insisting too much on the average PC's reliability, the fact that no kernel
ever crashed for you, etc... So you are demonstrating that your product is
good provided that everything goes perfectly. All people who have experienced
software or hardware problems in the past (i.e. almost everyone here) will not
trust your code because it relies on prerequisites they know they do not
have.

Then there is a big class of applications where the data on the ramdisk
can be reconstructed, it is just a pain and reduces uptime. These are
potential ramback users, and in fact I will be one of those, using it
on my kernel hacking partition.

What I'm thinking is that, since storage technologies are moving
towards SSD (and I think 2008 will be the year of SSD), you should
implement ordered writes (I did not say write-through), since there
is no seek time on those devices. That way you will have the speed of
RAM with the reliability of a properly synced FS. If your system
crashes once a week, it will not be a problem anymore.
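
A minimal sketch of what ordered writes could mean in this context: writes
complete at RAM speed, but reach the backing device in exactly the order they
were issued, so after a crash the device holds a consistent prefix of the
write history. The class and method names are hypothetical and a dict stands
in for the SSD; this is not ramback code.

```python
class OrderedWritebackCache:
    """Toy write-back cache that flushes writes in submission order."""

    def __init__(self, backing):
        self.ram = {}          # latest contents of every written block
        self.backing = backing # dict standing in for the disk or SSD
        self.log = []          # pending (block, data) pairs, oldest first

    def write(self, block, data):
        # Returns as soon as RAM is updated; the device write is deferred.
        self.ram[block] = data
        self.log.append((block, data))

    def read(self, block):
        # RAM always holds the newest data; fall back to the device.
        return self.ram.get(block, self.backing.get(block))

    def flush(self, max_writes=None):
        """Replay up to max_writes pending writes, oldest first."""
        pending = len(self.log)
        n = pending if max_writes is None else min(max_writes, pending)
        for block, data in self.log[:n]:
            self.backing[block] = data
        del self.log[:n]
        return n
```

Note that this sketch replays every write, including overwrites of the same
block, instead of coalescing them. That extra device traffic is the
performance cost of ordering, and it hurts much less on a device without
seek time.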

There will be a whole bunch of patches from me that are SSD oriented,
over time. The fact is, enterprise scale ramdisks are here now, while
enterprise scale flash is not. Getting close, but not here. And flash
does not approach the write performance of RAM, not now and probably
not ever.

My goal is not to replace RAM with flash, but disk with flash. You are
against ordered writes for a performance reason. Use SSD instead of
hard drives and it will be as fast as sequential writes. Also, when
you say that enterprise scale flash is not there, I don't agree. You
can already afford hundreds of gigs of flash in 3.5" form factor. A
1.6 TB SSD has even been presented at CES2008, with sales announced
for Q3. So clearly this will replace your hard drives soon, very soon.
Even if it costs $5k, that's a very acceptable solution to replace a
disk in a RAM-speed appliance.

Willy



