Re: interesting drive problem
From: m_f_h (no_one_at_example.com)
Date: Thu, 20 Nov 2003 04:50:23 -0500
Zap Radon wrote:
> I have a laptop drive that had a head-crash a while back. Not having the
> dough for a new drive, I simply partitioned around the bad spot and went
> on--at the time it was a fat32 setup with Win2k on it.
> Later I converted the usable parts to NTFS. Everything has worked fine
> I decided to take the plunge into Linux recently, and wiped the drive and
> attempted to set up RH 9, with a similar partitioning scheme. Ouch.
> First it simply wouldn't let me skip the bad area. Eventually I went back
> to windows to set up barrier partitions, and tried to put RH on either good
> partition. Then I attempted Mandrake, since its installer was more
> I've finally come to the conclusion that neither linux will tolerate any
> part of the drive--if it succeeds in installing, it has so many file errors
> that it won't run.
> Yet, with NTFS, I can't even detect any bad blocks, and it runs flawlessly
> (as flawlessly as windows ever runs, at least.)
> Anyone know what's going on?
Mandrake is a fork of Red Hat, and while they have grown apart in
many ways, I suspect that there is little difference in the way they
behave toward problem disks.
As I understand partitioning under Linux and Windows, there are
differences that may explain what is happening. I remember reading
somewhere that Linux rigidly enforces the rule that partitions begin
at the beginning of a cylinder, and Windows does not. So, you may
have thought you partitioned around a bad spot, but you may not have
Windows' FORMAT command is not a reliable test of the hard disk, nor
does it destroy much data. When Windows formats a drive, it tests
the surface for readability only over the cylinders that you
specify, and actually does write tests on only about 1% of that
area. Just because you are able to format all or part of a drive
does not mean that Windows has fixed the disk's problems; it just
means that when the bad spots are needed for file storage, only then
will Windows complain, and by then it may be too late to do anything
to save your data.
Note: The reason Windows users are told that formatting makes data
irretrievable is that the partition info area is changed, making
recovery impossible. What they are not told is that it is highly
likely that the data is still there, but that Windows can't find it.
Linux/Unix users know how easy it is to examine every byte on a
drive even when the partition table has been zeroed out. When you
get Linux up and running, try using "gpart" or typing in a console
window (as root)
dd if=/dev/hda | grep "Credit Card Number"
(insert your credit card number inside the quotes) and you will see
what I mean. Keep that in mind if you ever sell or give away any
kind of magnetic media. When I clean a disk, I zero out each and
every byte using Linux's badblocks program.
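If you want something a little more targeted than the dd/grep one-liner, here is a rough sketch. It assumes GNU grep (for the -a, -b and -o options), and find_leftover is just a name I made up for illustration. It prints the byte offset of every place the text still sits on a raw device or a disk image file.

```shell
#!/bin/sh
# find_leftover SOURCE PATTERN
# Hypothetical helper (not a standard tool): list the byte offsets at which
# PATTERN still occurs in a raw device or disk image.
# Assumes GNU grep: -a treats binary data as text, -b prints byte offsets,
# -o prints only the matching part, -F takes PATTERN as a literal string.
find_leftover() {
    src=$1
    pattern=$2
    grep -a -b -o -F -- "$pattern" "$src"
}

# Example (as root, read-only): find_leftover /dev/hda "1234 5678"
```

Each output line has the form offset:text. No output (and a non-zero exit status) means the text was not found on that device or image.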
Linux, on the other hand, imposes higher standards during
installation, on the "Better Let Them Know Now Than Later" theory,
and will refuse to install completely if it finds any bad spots that
can't be remapped. I suspect that the head crash you experienced may
have used up the pool of spare blocks that drives come with.
Based on my experience, Linux is less trusting of hard drives in
this condition than Windows is.
If you have not done so already, I suggest two things:
1. On many distros, there is an option to test the integrity of the
installation media: take advantage of that and test the CDs. If you get a
failure notice, don't assume it's the media; the problem could be a
failing CD drive. I once had a situation where the integrity test
insisted that the CD was bad, but the point of failure changed every
time I ran the test. After replacing the CD drive, all went well. In
your case, replacing a laptop CD drive may be out of your price
range; in that case I'd look into a network install, either by using
another computer's CD drive or by getting the install files from the
2. However you manage to get the install files to your laptop,
always, always select the option offered when partitioning the disk
to test for bad blocks. When Linux performs such a test, it builds
a list of bad blocks that the file system then avoids when storing
data, and this list may be of any size. Under Windows, bad block
info is stored in the FAT area, and there is a hard limit to the
number of bad blocks it will keep track of.
Keep in mind that the file system you select may be an issue. Ext2
and Ext3 file system creation allows for block checking at the time
of creation. Reiser file systems (version 3.x) do not check for bad
spots at the time of creation; rather, when bad spots are found,
they are noted in the file system's journal. I don't know how other
file systems, such as XFS, handle this issue.
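If the installer does not offer the bad block test, the same check can be done by hand from a rescue shell. The following is only a sketch under a few assumptions: the e2fsprogs tools (badblocks, mke2fs) are installed, the device name is a placeholder you must replace, and the run and check_and_format helpers are names invented here. By default it only prints the commands; it executes them when RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: print each command, and execute it
# only when the environment variable RUN is set to 1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# check_and_format DEVICE
# 1. badblocks -s -v : read-only surface scan with progress and verbose output
# 2. mke2fs -j -c    : create an ext3 file system (-j adds the journal) and
#                      have mke2fs check for bad blocks first (-c)
# WARNING: with RUN=1, step 2 destroys whatever is on DEVICE.
check_and_format() {
    dev=$1
    run badblocks -sv "$dev" || return 1
    run mke2fs -j -c "$dev"
}

# Example (prints the plan only): check_and_format /dev/hdaN
```

Printing first and running second is deliberate here: on a drive with known damage, it is worth reading the exact commands and the device name before letting anything write to the disk.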
Regardless of which Linux file system you use, a message is written
to the /var/log/messages file saying that a bad block has been found
and whether corrective action was taken.
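A quick way to look for such messages is to grep the log. This is a minimal sketch: the search patterns are my guess at typical wording, not an exhaustive list, and scan_disk_errors is a made-up name. The exact text of the messages depends on your kernel and drivers.

```shell
#!/bin/sh
# scan_disk_errors [LOGFILE]
# Hypothetical helper: print log lines that look like disk trouble.
# Defaults to /var/log/messages. The patterns below are assumptions;
# adjust them to whatever your kernel actually logs.
scan_disk_errors() {
    log=${1:-/var/log/messages}
    grep -i -E 'I/O error|bad block' "$log"
}

# Example (as root): scan_disk_errors /var/log/messages
```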
Don't dismiss the possibility that your problem may not be with the
hard disk at all. Everything said above about how stringently
Windows and Linux test hard disks also applies to RAM.
Windows relies on BIOS to tell it if the RAM is OK, and BIOS tests
of RAM are just "Are You There" type tests. Linux is more rigorous;
it actually runs several timing tests to figure out something called
BogoMIPS. As a result, if Linux boots and runs, that's a good sign
that the RAM is good. When Windows boots, all you can say for sure
is that it hasn't encountered a bad RAM spot YET.
Before spending any more time on the installation process, download
MEMTEST86 v. 3.0, install it on a floppy, then boot off that floppy.
MEMTEST86 tests x86 RAM using 11 different tests and is considered
by many to be one of the most reliable in situ RAM testers available. The
only way to test RAM better is in a stand-alone RAM testing machine.
I once bought RAM that I could not boot with and that MEMTEST
failed. So I returned the chip to the merchant. When I got no
response, I called them and they gave me what sounded like a canned
answer, "The RAM tests OK here. No replacement." When I told them
that their chip failed MEMTEST, I got my replacement chip in three
On the other hand, another merchant sent me a chip that failed one
of the more obscure of MEMTEST's tests, but I could boot and run with
it for weeks at a time. The merchant told me that he had mixed results
with MEMTEST and was not willing to take back the chip unless his
stand-alone tester indicated that the chip was bad. I kept the chip
and have not had any regrets.
If MEMTEST finds bad RAM, it'll also tell you exactly where the bad
spots are. Armed with that info, you may be able to live with the
problem. Do a Google search for "Linux BootPrompt-HOWTO" and in that
document look for the discussion on mem=exactmap. As you might
expect (by now), there is no comparable way to do this under Windows.
Finally, check out http://www.linux-laptop.net/.