Re: Serious linux kernel bug ??

From: Måns Rullgård (mru_at_users.sourceforge.net)
Date: 08/30/03


Date: Sat, 30 Aug 2003 01:36:50 +0200

Wim Valcke <Wim.Valcke@skynet.be> writes:

> I work for a company that makes test equipment for the automotive
> industry. We have used Linux in this equipment for several years, and it
> has proven a very robust OS for industrial use. A couple of months ago
> we discovered a very weird thing. Our field engineers said that they
> could no longer start the software. After investigation, the problem
> turned out to be a corrupt binary. Very strange, as no process writes to
> its own binary! How did it become corrupt? We found out that the (random)
> file corruption occurs somewhere between a start and a stop of the
> software, so the files get corrupted while the system is running. So far
> all the files that went corrupt were files belonging to our software. All
> software is running under a normal user account, but files (belonging to
> the application) that were writable only by root became corrupt too! Also,
> the file access date of the corrupt file had not changed! Weeks later,
> during a software test, I couldn't start our software at all. After
> investigation I discovered that the libc++ libs had become completely
> corrupt. I checked the fs and saw numerous filesystem errors. How can a
> filesystem become corrupt on a running system? (No powerdown or unclean
> shutdown was done!)
> We suspect our software is responsible for this trouble (we do not see
> this behaviour with other applications running on the same Linux
> system), but how can a bunch of normal user processes create filesystem
> corruption? This must be a kernel bug!

It sounds like a typical failing disk to me. If, for whatever reason,
the heads have touched the platters, there will be small particles spread
over the disk surface. Whenever the heads read a location with debris
on it, the surface gets damaged there, possibly spreading even more
particles. After this has gone on for a while, you start noticing
errors. Unfortunately, the debris doesn't care about the permissions
of the files it's sitting on.

-- 
Måns Rullgård
mru@users.sf.net

