Re: Setup problems - fdisk, overlapping sectors?

From: Enrique Perez-Terron (enrio_at_online.no)
Date: 10/02/05


Date: Sun, 02 Oct 2005 04:38:53 +0200

On Sun, 02 Oct 2005 02:10:51 +0200, <matthewt@gmail.com> wrote:

> It's a small appliance type machine running a Via processor. I have
> built several of these w/ 20gig drives - no problems...

Since you fail to quote the earlier messages I have to gather the
information from three messages.

Ummm...

You have built several machines of the same type,
     o Appliance type
     o running a Via processor
     o with 20 GB drives, except two have 40 GB drives instead

The 40 GB drives are Samsung 2.5".

The two with 40 GB drives behave differently.

You cannot boot those two from CD without setting ide=nodma on the
kernel boot command line, do I get it right? (What are the symptoms
when you don't supply ide=nodma? Can we infer that the disk interface
is traditional IDE?)

You use fdisk, meaning you have booted some kernel (with ide=nodma)
and are running fdisk under that kernel.

You set up one or more partitions, then hit "w" and return.

Fdisk queries the kernel about the number of bytes per sector
and the total capacity in bytes of the disk, and uses llseek()
and read() to read the second and the last sector of the disk.

In my test case, the disk had no extended partition.

Fdisk uses llseek() and write() to write a 512-byte block to
the start of the disk, calls sync(), waits for two seconds,
calls an ioctl() to make the kernel reread the partition table,
and repeats sync()-sleep()-ioctl(), sleeps for another four seconds,
and exits.
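For anyone who wants to replay that write sequence by hand, here is a
rough Python sketch of it. It is not fdisk's code, only my reading of
the sequence described above. The function name write_mbr and the
pause argument are made up for the illustration, the 512-byte sector
size is assumed rather than queried from the kernel, and the
sync-sleep-ioctl step is done once instead of twice. BLKRRPART is the
Linux ioctl for rereading the partition table; the sketch skips it
when the target is not a block device, so it can be tried on a plain
file first.

```python
import fcntl
import os
import stat
import time

BLKRRPART = 0x125F  # _IO(0x12, 95) in <linux/fs.h>: reread partition table
SECTOR_SIZE = 512   # assumption; fdisk asks the kernel for the real value


def write_mbr(path, sector, pause=2.0):
    """Write one sector at offset 0, sync, wait, then request a reread.

    Returns True if the reread ioctl was issued, False if `path` is not
    a block device (for example a plain file used for a dry run).
    """
    if len(sector) != SECTOR_SIZE:
        raise ValueError("sector must be exactly %d bytes" % SECTOR_SIZE)
    fd = os.open(path, os.O_RDWR)
    try:
        os.lseek(fd, 0, os.SEEK_SET)          # start of the disk
        written = os.write(fd, sector)
        if written != SECTOR_SIZE:
            raise OSError("short write: %d bytes" % written)
        os.sync()                             # flush dirty buffers
        time.sleep(pause)
        if not stat.S_ISBLK(os.fstat(fd).st_mode):
            return False                      # nothing to reread on a file
        fcntl.ioctl(fd, BLKRRPART)            # needs root on a real disk
        return True
    finally:
        os.close(fd)
```

Running it against a scratch file, then against the real device, at
least tells you whether the bare sequence misbehaves without fdisk in
the picture.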

If you changed any logical partitions fdisk probably called llseek
to reach the proper sector, read the sector, modified the memory
copy of the sector, called llseek() again to access the same sector
again, then wrote the modified sector back. Repeat for each partition
modified. Or perhaps for all logical partitions.

Now you restart fdisk, but the partition data do not correspond to
the data written.

The exact nature of the differences between what was written and what
was subsequently read is perhaps not so important as noting that the
contents of one or more sectors have changed.

Just for the record:
     o The number of partitions is the same... as before? As written?
       I guess the disks had empty partition tables when you started,
       so you mean the same as written. If so, it means the sectors
       have been written.
     o Some partition(s) have changed type to NTFS (were they ext3?).
     o Partition starts (first sectors) overlap... overlap what? The
       end of the previous partition?
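To pin down "changed type" and "overlap" precisely, the four primary
entries can be decoded straight from a 512-byte dump of the first
sector. A minimal sketch follows, assuming the classic MBR layout
(table at byte 446, four 16-byte entries, 0x55 0xAA signature at byte
510). The function names and the short type-name table are my own and
only cover a few common types. Logical partitions inside an extended
partition are not followed.

```python
import struct

TABLE_OFFSET = 446   # start of the primary partition table in the MBR
ENTRY_SIZE = 16
TYPE_NAMES = {       # deliberately incomplete
    0x05: "extended",
    0x07: "HPFS/NTFS",
    0x82: "Linux swap",
    0x83: "Linux",
}


def parse_partition_table(sector):
    """Return the non-empty primary entries of a 512-byte MBR."""
    if len(sector) < 512:
        raise ValueError("need a full 512-byte sector")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for i in range(4):
        raw = sector[TABLE_OFFSET + i * ENTRY_SIZE:
                     TABLE_OFFSET + (i + 1) * ENTRY_SIZE]
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped),
        # first sector (LBA), sector count; all little-endian
        status, ptype, start, count = struct.unpack("<B3xB3xII", raw)
        if ptype == 0:
            continue  # unused slot
        entries.append({
            "index": i + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "type_name": TYPE_NAMES.get(ptype, "unknown"),
            "start": start,
            "count": count,
        })
    return entries


def find_overlaps(entries):
    """Return (index, index) pairs whose sector ranges intersect."""
    clashes = []
    for i, a in enumerate(entries):
        for b in entries[i + 1:]:
            if (a["start"] < b["start"] + b["count"]
                    and b["start"] < a["start"] + a["count"]):
                clashes.append((a["index"], b["index"]))
    return clashes
```

Feeding it the sector as written and the sector as read back shows
exactly which fields differ, instead of relying on how fdisk chooses
to present them.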

We have some options.

Did the write hit the disk? Or did the kernel keep it in
its buffercache?

Block device reads and writes go through the buffercache.
Sync should force out everything. Is there any possibility that
something delayed the write so much that the call to reread the
partition table clobbered unwritten buffercache blocks with the
old data from the disk?

I don't know if sync() waits until the data are actually safely on
the platter. I *believe* it does.
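One way to probe the buffercache question from user space is to write
a sector, force it out, ask the kernel to drop its cached copy, and
read it back. The sketch below is only that, a sketch: it uses fsync()
on the one descriptor rather than the global sync(), and
posix_fadvise(POSIX_FADV_DONTNEED) is merely a hint, so a matching
readback does not prove the data came from the platter (nor does it
say anything about the drive's own write cache). The function name
write_and_verify is hypothetical.

```python
import os


def write_and_verify(path, offset, data):
    """Write `data` at `offset`, flush, drop cached pages, read back.

    Returns True if the bytes read back equal the bytes written.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, data)
        os.fsync(fd)  # push this file's dirty buffers to the device
        if hasattr(os, "posix_fadvise"):
            try:
                # Hint only: ask the kernel to forget its cached pages
                # for the whole file (length 0 means "to the end").
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass  # advisory call; carry on if it is refused
        os.lseek(fd, offset, os.SEEK_SET)
        readback = os.read(fd, len(data))
    finally:
        os.close(fd)
    return readback == data
```

If this returns False on the 40 GB machines and True on the 20 GB
ones, the problem sits below fdisk, whatever it is.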

Do some hardware conditions corrupt the data on the way to the disk?
Do some hardware conditions corrupt the data being read from the disk?

Are there kernel bugs that give timing errors or other corrupting
errors when using ide=nodma? Are there memory errors?

Could there be electrical problems, with the disks drawing too much
current and making the power drop below what the ram likes?
Do you have any means of observing that?

You probably have mostly readonly filesystems as long as you
are running from a CD. But /dev is a tmpfs, so if you have some
memory to spare, you can hold short files there. Can you
do

   dd bs=512 count=1 if=/dev/hda of=/dev/mbr-before   # /dev/hda, or whatever your disk is

then run fdisk, hexdump the partition table from the advanced menu,
partition the disk as it should be, hexdump again, save the
dumps to /dev/something, and finally repeat the dd command, but
with of=/dev/mbr-after? Then run od -t x1 on /dev/mbr-before and
/dev/mbr-after and see if there is any pattern in the differences.
I cannot imagine you can do much more from software except debug
the kernel.
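Comparing two od listings by eye is tedious, so here is a small
Python sketch that lists the differing bytes of two 512-byte dumps,
together with the XOR of old and new value (a single stuck or flipped
bit shows up as a power of two) and the part of the MBR the byte
falls in. The region boundaries assume the classic MBR layout, and
the names region and diff_sectors are mine.

```python
def region(offset):
    """Name the part of a classic MBR that a byte offset falls in."""
    if offset < 446:
        return "boot code"
    if offset < 510:
        return "partition entry %d" % ((offset - 446) // 16 + 1)
    return "signature"


def diff_sectors(before, after):
    """List (offset, old, new, old XOR new, region) for differing bytes."""
    if len(before) != len(after):
        raise ValueError("dumps differ in length")
    return [
        (i, a, b, a ^ b, region(i))
        for i, (a, b) in enumerate(zip(before, after))
        if a != b
    ]
```

Read /dev/mbr-before and /dev/mbr-after in binary mode, pass the two
byte strings to diff_sectors(), and print each tuple. Differences
confined to the partition entries are what fdisk should have
produced; anything in the boot code or signature is suspicious.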

The rest would be fairly lowlevel properties of the disks and the
chips on the appliance.

-Enrique


