Re: Filesystem corrupts after power failure



On Thursday 31 August 2006 16:32, Markvr stood up and addressed the masses
in /comp.os.linux.misc/ as follows...:

I've seen several linux boxes get their file system corrupted after a
sudden power failure. Usually the partitions and the data is still
there, but for whatever reason it won't mount and boot correctly and
I've had to end up rebuilding the box from scratch which is a pain in
the arse as they aren't always on site.

This is normal because the kernel keeps certain file handles open and holds
recently written data in memory buffers during normal operation. On an
unexpected shutdown due to a sudden loss of power, those buffers are never
synchronized with the disk and the open files are never closed cleanly.
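
To make the buffering concrete, here is a minimal Python sketch of how a
single application can ask for its own data to be pushed out of those buffers.
The function name durable_append and the demo file are made up for
illustration. Note that fsync() only covers the one file it is called on, and
a drive with a volatile write cache may still lose the data.

```python
import os
import tempfile


def durable_append(path, text):
    """Append text to path and return only after fsync() has completed.

    Hypothetical helper for illustration; not part of any library.
    """
    with open(path, "a", encoding="utf-8") as f:
        f.write(text)
        f.flush()             # Python's own buffer -> kernel buffers
        os.fsync(f.fileno())  # kernel buffers -> storage device


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.log")
    durable_append(demo, "first line\n")
    durable_append(demo, "second line\n")
    with open(demo, encoding="utf-8") as f:
        print(f.read(), end="")
```

Without the flush() and fsync() calls, both lines could sit in memory for
some time after the function returns, which is exactly the window in which a
power cut hurts.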

I'm not a linux expert, but it seems that sometimes the GRUB gets
stuffed, or the partitions lose their labels or whatever so they won't
mount properly.

GRUB requires a filesystem to read from, as opposed to LILO, which uses raw
block addresses recorded at install time to find its second-stage loader and
the kernels. Therefore, if the filesystem is corrupted, GRUB is clueless.

Linux seems to handle power failures far more poorly than Win 2K/XP, as
I've known very few Windows boxes to corrupt this much after a power failure.

That's because Windows already corrupts itself during normal operation. It
takes a power loss to obtain the same effect in GNU/Linux.

I've tried using the sync option in fstab but it slows the box down too
much (a simple file copy goes from 7sec to 1min23sec).

That all depends on what file you are trying to copy, and how your system is
laid out. A good server administrator also splits several parts of the UNIX
directory tree off onto separate partitions and keeps several of those
mounted read-only during normal operation.
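
A quick way to check which filesystems on a box really are mounted read-only
is to read the options column of /proc/mounts. The sketch below does that in
Python; the function names and the sample table (device names, layout) are
invented for illustration and do not describe any particular machine.

```python
def mount_flags(table_text):
    """Map each mount point to its set of mount options.

    Expects /proc/mounts or fstab style lines:
    device, mount point, fs type, comma-separated options, ...
    """
    flags = {}
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue  # skip blank, short, and comment lines
        flags[fields[1]] = set(fields[3].split(","))
    return flags


def read_only_mounts(table_text):
    """Return the sorted mount points that carry the 'ro' option."""
    return sorted(mp for mp, opts in mount_flags(table_text).items()
                  if "ro" in opts)


# Hypothetical layout, only to show the idea.
SAMPLE = """\
/dev/md0 / ext3 rw,sync 0 0
/dev/md1 /boot ext3 ro 0 0
/dev/md2 /usr ext3 ro 0 0
/dev/md3 /var ext3 rw 0 0
"""

if __name__ == "__main__":
    print("sample:", read_only_mounts(SAMPLE))
    try:
        with open("/proc/mounts") as f:
            print("this box:", read_only_mounts(f.read()))
    except OSError:
        print("no /proc/mounts here")
```

A filesystem that shows up as read-only has no pending writes to lose when
the power goes, which is the point of the layout described above.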

We tend to use software RAID; does that have issues with power failures?

Possibly more so than a non-RAID set-up.

At the moment we're using ext3, is there a more reliable filesystem that
will handle power failures better?

No filesystem, however advanced, is really designed to cope with power
outages, but I think /ext3/ may still be the most reliable in that respect.
SGI's XFS is also a very good filesystem, but where power outages are
concerned it has a weakness: XFS caches aggressively and only commits the
data to the physical disk at the last moment, to provide the highest
possible speed and efficiency.

Therefore a power failure will make XFS lose all your most recent data, and
often not-so-recent data as well; how much depends on the kernel and XFS
driver version.
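
For files an application cannot afford to lose, a common defence that does
not depend on the filesystem's caching policy is to write a temporary file,
fsync() it, and only then rename it over the original. The following is a
minimal Python sketch of that pattern, assuming a POSIX system; atomic_write
is a made-up name, and the temporary file is created with mkstemp()'s
restrictive default permissions.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace path with data (bytes) via temp file, fsync, and rename.

    After a crash the file should hold either the old or the new content,
    not a truncated mix. Hypothetical helper for illustration.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # new content is on disk before the rename
        os.replace(tmp, path)      # atomic within one filesystem
    except BaseException:
        try:
            os.unlink(tmp)         # do not leave the temp file behind
        except FileNotFoundError:
            pass
        raise
    # Sync the directory too, so the rename itself survives a power cut.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "settings.conf")
    atomic_write(target, b"version=1\n")
    atomic_write(target, b"version=2\n")
    with open(target, "rb") as f:
        print(f.read().decode(), end="")
```

This protects one file at a time and costs an fsync() per update, so it
suits configuration and state files rather than bulk data.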

Would using "sync" for the /boot and / partitions stop it from
corrupting the tables?

It is always advised to use /sync/ on the root filesystem for safety
reasons. As */boot*, */usr* and */opt* should be mounted read-only during
normal operation, /sync/ is irrelevant there, but may be preferred
over /async/ whenever those filesystems do need to be written to.
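
To get a feel for what synchronous writes cost on a given box before
changing any mount options, one can time the same write with and without
O_SYNC. This is only a rough Python sketch: O_SYNC on a single file
approximates, but is not the same as, mounting a whole filesystem with
/sync/, and the function name, chunk count and chunk size are arbitrary
choices for illustration.

```python
import os
import tempfile
import time

# O_SYNC is not available on every platform; fall back to no extra flag.
O_SYNC = getattr(os, "O_SYNC", 0)


def timed_write(path, chunks, chunk_size, extra_flags=0):
    """Write chunks * chunk_size zero bytes to path; return elapsed seconds."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags
    buf = memoryview(b"\0" * chunk_size)
    start = time.perf_counter()
    fd = os.open(path, flags, 0o600)
    try:
        for _ in range(chunks):
            written = 0
            while written < chunk_size:   # os.write may write only part
                written += os.write(fd, buf[written:])
    finally:
        os.close(fd)
    return time.perf_counter() - start


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "probe.bin")
    buffered = timed_write(target, 256, 4096)
    synced = timed_write(target, 256, 4096, O_SYNC)
    print(f"buffered: {buffered:.4f}s  O_SYNC: {synced:.4f}s")
```

Run it on the filesystem in question rather than on a RAM-backed temporary
directory, otherwise the two timings will look much the same.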

These aren't massive enterprise systems, they're mainly firewalls and
small mail servers, so is ReiserFS or any of the others better?

/reiserfs/ is known to sometimes suffer severe filesystem corruption on
power loss, as opposed to XFS, which will only lose the data, but overall
it is very reliable.

Any advice is appreciated,

What you need is a UPS. Running a server without one is irresponsible. ;-)

--
With kind regards,

*Aragorn*
(registered GNU/Linux user #223157)