Re: Linux community software-update-anarchy polemic

From: P.T. Breuer (ptb_at_oboe.it.uc3m.es)
Date: 01/18/04


Date: Sun, 18 Jan 2004 16:30:22 GMT

Anonymous Coward <acoward@mail.ru> wrote:
> ptb@oboe.it.uc3m.es (P.T. Breuer) wrote in message news:<p7ibub.m5u.ln@news.it.uc3m.es>...
> > Then use something else. Parted is not exactly a standard partition
> > editor, and will not have seen anywhere near the usage over the years
> > as fdisk, which has been in use on linux since day one a decade ago (and
> > which has forever been deprecated in favor of cfdisk, which I hate).

> I'd never heard of parted before using Knoppix, and I used it only
> because it's what knoppix-installer presented to me.

It's a partition resizer and manager, rather than a partition editor.
That is, it works on the contents of a partition on disk, rather than
(only) on the partition table (in the first 512 bytes of the disk,
mainly, but also in sundry references scattered at points in other
partition records on disks).

  NAME
         GNU Parted - a partition manipulation program

You are free to look it up :-).
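
For anyone who has not looked at those first 512 bytes, here is a minimal
sketch of reading the four primary entries of a DOS-style partition table.
The offsets (table at byte 446, 16 bytes per entry, 0x55 0xAA signature at
byte 510) are the standard layout; the function names and the demo sector
are made up for illustration, and logical partitions inside an extended
partition (the "sundry references" elsewhere on disk) are not followed.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # the four primary entries start here
ENTRY_SIZE = 16
SIGNATURE = b"\x55\xaa"   # last two bytes of the sector

def parse_mbr(sector):
    """Return the non-empty primary partition entries of a DOS-style MBR."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != SIGNATURE:
        raise ValueError("not a DOS partition table sector")
    entries = []
    for i in range(4):
        begin = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[begin:begin + ENTRY_SIZE]
        # status, CHS start (3 bytes), type, CHS end (3 bytes),
        # LBA start and sector count (both little-endian 32-bit)
        status, _chs_start, ptype, _chs_end, lba_start, count = struct.unpack(
            "<B3sB3sII", raw)
        if ptype != 0:  # type 0 marks an unused slot
            entries.append({"index": i + 1, "bootable": status == 0x80,
                            "type": ptype, "start": lba_start,
                            "sectors": count})
    return entries

def make_demo_sector():
    """Build a made-up sector holding one bootable Linux (0x83) partition."""
    sector = bytearray(SECTOR_SIZE)
    entry = struct.pack("<B3sB3sII", 0x80, b"\x00\x00\x00", 0x83,
                        b"\x00\x00\x00", 63, 204800)
    sector[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = entry
    sector[510:512] = SIGNATURE
    return bytes(sector)

if __name__ == "__main__":
    for part in parse_mbr(make_demo_sector()):
        print(part)
```

On a real machine one would feed it the first sector of the disk device
instead of the demo sector, which needs read permission on that device.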

> fdisk has been
> what I've always used for partitioning in the past. I'd never heard of
> cfdisk before you mentioned it.

It's what the author of fdisk has been telling everyone to use for ages,
because he refuses to "fix" fdisk. Pfaugh ... fdisk has nothing hugely
wrong with it in particular - its bugs are known, and I prefer that to a
utility whose bugs are unknown.

> past couple years, and assumed that if Knoppix was presenting parted
> to me, it was because it had become the standard tool, obsoleting

Assuming that somebody else knows better is always a mistake. I just
made that one with the shoe-mender. I handed him a perfectly good set
of shoes for a re-sole, assuming he knows best, and they came back with
the old sole and heel ripped off and a new, darker pair glued and
hammered on, with glue that does not stick. The sole is completely
unsuitable, being far stiffer than the leather of the rest of the shoe,
causing the join to tear immediately. I have never seen such a messup.
I am deeply angry. I wanted him to add a piece of plastic on to the
bottom, nothing more! He's ruined a wonderful pair of shoes with
completely incompetent behaviour.

> fdisk. But Knoppix's (and Linux distributions' in general) usage of
> known-buggy software, especially without warning the user, was the

One does not know of a bug until one discovers it. If you have found
one, tell the author. I don't know of any bugs in parted (I know
of some horrible misfeatures). The man page says:

  REPORTING BUGS
         Report bugs to <bug-parted@gnu.org>

> main point of my original post anyway. (Simply stamping the entire
> distribution with "use at your own risk" is not the correct level of

Of course it is. Your responsibility entirely.

> granularity.) It shouldn't be necessary for me to know that
> partitioning tools other than the distribution's default even exist.

I'm afraid that everyone knows that, and necessary is as necessity
feels. I think the choice of parted as a partitioning tool is rather
daring and innovative, myself. Very interesting. I imagine it's
because it's a gnu program that Mr. Knopper chose to use it. I think it's
a good idea - it'll get more use and thus more bug reports.

> A brief bit of research shows "fdisk is a buggy program that does
> fuzzy things - usually it happens to produce reasonable results. Its

Sure - it works pretty near perfectly, in other words.

> single advantage is that it has some support for BSD disk labels and
> other non-DOS partition tables. Avoid it if you can." at

The author has always said that. He's always been nuts. Sure, it makes
choices that are pretty near standard the whole time - I don't call
that buggy! I call that "solid". It can also mess the partition table
when one of the partitions straddles a 1024 cyl boundary (as I recall).
Shrug. Don't do that then. That's no problem.
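
For background on why 1024 is a magic number at all: the cylinder field
in a DOS partition entry is only 10 bits wide, so cylinders above 1023
cannot be written down in CHS form. The sketch below shows the standard
3-byte packing. Whether this is exactly what trips fdisk up is my
assumption, not something established above, and the function names are
invented for the example.

```python
def pack_chs(cylinder, head, sector):
    """Pack a CHS address into the 3-byte form used in a DOS partition entry."""
    if not 0 <= cylinder <= 1023:
        raise ValueError("cylinder does not fit in 10 bits")
    if not (0 <= head <= 255 and 1 <= sector <= 63):
        raise ValueError("head or sector out of range")
    # byte 0: head
    # byte 1: sector in bits 0-5, top two cylinder bits in bits 6-7
    # byte 2: low eight cylinder bits
    return bytes([head,
                  (((cylinder >> 8) & 0x03) << 6) | sector,
                  cylinder & 0xFF])

def unpack_chs(raw):
    """Inverse of pack_chs: return (cylinder, head, sector)."""
    head, b1, b2 = raw
    return ((b1 & 0xC0) << 2) | b2, head, b1 & 0x3F
```

Cylinder 1023 round-trips; cylinder 1024 is rejected, because there is
nowhere to put the eleventh bit.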

> http://www.die.net/doc/linux/man/man8/fdisk.8.html Not encouraging.
> The same page says "cfdisk is a beautiful program that has strict
> requirements on the partition tables it accepts, and produces high

In other words it's almost completely useless in practice.

> quality partition tables. Use it if you can." Why do you hate cfdisk?

Because it has an incomprehensible interface, for one thing. A
sort of silly text-mode gui. Fdisk, on the other hand, has a perfectly
comprehensible interface, into which I type my instructions and it
types back "yes, master", or something to that effect.

> > Choose your utility according to the match between its bugs and your
> > bugs.

> In general, without expending excessive time on research, I don't know
> what the bugs are, or whether they even exist. That's what I was

Then nobody can help you, because bugs are ubiquitous.

> complaining about. A standard, universally used mechanism for marking,
> after release, released software as buggy could prevent problems such

Buggy is often in the eye of the beholder, and it really depends on what
you apply the app to. Parted works for most everyone else - it is not
"buggy" for them. Fdisk is perfectly fine for me. On occasion I will
use parted to see what its opinion of life the universe and so on is
like. Usually it refuses to work and I then spend some time with tar
and sfdisk (not fdisk, nor cfdisk), trimming things until its
pernickety idea of what reality should be like matches my partition
table.

Shrug.

> as Knoppix shipping with a known-buggy parted 1.6.4, as I discussed in
> my original post.

What's really buggy about it? Not working for you might not be
considered a real big bug! All programs have bugs, after all. Yours,
mine, everyone's. So?

> > > > In any case, it's not a good idea to have One Humungous
> > > > Partition, despite what the Micros**t fans tell you.
> > > Even if it's presently good practical advice to have multiple smaller
> > > partitions, this is due to the way the software is currently written;
> >
> > Nonsense. Are you completely unaware of disks which break,
> Having multiple disks addresses this problem.

It doesn't, unless you have one disk per "partition". Which disk will
be your readonly disk? Which disk will be your variable data high
frequency change disk? Which disk will be your low frequency change
disk? Which disk will bear your log files? Which disk will bear your
mail spool? Which disk will be formatted with a journalling file
system? Which disk will be running sync? Which disk will be running
noatime? Etc.

> My comment was in the
> context of multiple partitions on a single disk. (As a side note, with
> regards to multiple disks, although presently popular filesystems do
> require multiple partitions if multiple disks are used, since those
> filesystems' partitions can't span disks,

Partitions can span disks perfectly easily - this is called "logical
volume management", and linux has had it forever. The idea is that you
use a kernel logical layer above the real disks which presents a single
logical area to be split up into "partitions" which may themselves span
several real disks (but you don't know).

The horror of this idea is that when you lose a real disk, you lose an
unknown part of your file system, possibly several file systems. It
makes your system more vulnerable, not less! And you never know just
where the thing was physically, so you try fixing it!

Eccccccch. Arrrrrrrrgh.. I *hate* lvm. A broken lvm is a nightmare.
Several of them simultaneously.
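
To make the failure mode concrete, here is a toy model (not how any real
lvm stores its metadata) in which each logical volume is a list of
segments on real disks. The layout and all names are invented. Losing
one disk damages every volume that has a segment on it.

```python
def affected_volumes(layout, failed_disk):
    """Return the volumes that have at least one segment on failed_disk.

    layout maps a volume name to a list of (disk, extent_count) segments.
    """
    return sorted(name for name, segments in layout.items()
                  if any(disk == failed_disk for disk, _ in segments))

# Hypothetical layout: "var" and "home" each span two real disks.
layout = {
    "root": [("disk0", 100)],
    "var":  [("disk0", 50), ("disk1", 200)],
    "home": [("disk1", 300), ("disk2", 400)],
}

if __name__ == "__main__":
    for disk in ("disk0", "disk1", "disk2"):
        print(disk, "->", affected_volumes(layout, disk))
```

With plain partitions the same question has a trivial answer, because
each file system sits on exactly one disk.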

> this also is due merely to
> the present design of the filesystems, and isn't a fundamental rule.

That's lack of experience talking and sounding big! :-).

> But this also is irrelevant to the issues which I reported in my
> original post.)

[snip, I think]

> > software
> > which goes wild,

> The kernel can hose anything it wants, and partition barriers can't

But it doesn't. The kernel does not initiate writes to disk. It's a
service, not a client. The clients are what do the wrecking - the
kernel merely says "OK". And the kernel works fine - bugs are found
quickly, because everyone uses the service. Applications OTOH, as you
have discovered, can have many more bugs, because they are not used all
the time, and have much less exercise in general! Your parted is used
only one time in the lifetime of every installation. The kernel is used
every single moment by every single installation.

> stop it.

Anything that caused the kernel to overstep a partition barrier would
have had to overwrite the section in the block subsystem that checks
each request for conformance to the registered partition boundaries. If
it overwrote that section of code, all your access would be dead. You
would not be able to do anything.
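
The kind of check I mean can be sketched like this. It is a toy model
with invented names, not kernel code: a request arrives relative to the
partition, and is either translated to an absolute sector or refused.

```python
class Partition:
    def __init__(self, start, length):
        self.start = start      # first absolute sector of the partition
        self.length = length    # size of the partition in sectors

def remap(partition, rel_sector, count):
    """Translate a partition-relative request to an absolute start sector.

    Raises ValueError if any part of the request falls outside the
    partition, so nothing beyond the boundary is ever touched.
    """
    if rel_sector < 0 or count <= 0 or rel_sector + count > partition.length:
        raise ValueError("request crosses the partition boundary")
    return partition.start + rel_sector
```

In this model a wrong value of `start` or `length` would redirect every
request while the check still passes, so the check is only as good as
the table it reads.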

If a cosmic ray flipped the right bit there, you would get inversion -
the kernel could only write outside the partition :-). But it wouldn't,
because requests would be aimed at the partition. You might, if you
were lucky, get a bit changed in the partition table image in memory,
so that the partition offset was registered wrong. THAT would produce
the effect you want.

But that bit would have to be flipped quite precisely. It's a difficult
shot - it's much more likely that the cosmic ray will flip a bit in
some application instead!

> Outside the kernel, intrapartition barriers (filesystem
> security attributes) are the standard means of protection.

I have no idea .. are you suggesting making a directory hierarchy
readonly? How would you then mark it noatime and sync? Or journalled?
The reason some things are partition rather than file attributes is
because it is unreasonable to engineer it any other way! It would mean
coalescing the ideas of file system and file to a much greater extent.
The result would be phenomenally inefficient. Imagine having to look up
the target of every write before you could decide if you should journal
it or not! How would you deal with races between mkdir and write? Etc.

> > users who have no social responsibility,

> Again, intra-, not inter-partition barriers address this.

I'm afraid it's more subtle. If /tmp is on the same file system as root,
then even if every user is quotaed, they can still get together and
together fill /tmp, thus filling the root file system. If /tmp
(/var/tmp) is on the same file system as /var/spool, then they can stop
mail by the same mechanism. And if they are so quotaed they can't even
do it as a group, they can mail each other stuff continuously just to
stop mail, AND to fill /var, maybe stopping logging. Etc etc etc.
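
If you want to know whether two directories can starve each other in
this way, a quick test is to compare the device ids that stat reports.
This is a sketch with an invented function name. It assumes an ordinary
setup; bind mounts and some newer file systems can report device ids
that do not map one-to-one onto partitions.

```python
import os

def same_filesystem(path_a, path_b):
    """True if both paths report the same device id, i.e. share a file system."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev

if __name__ == "__main__":
    for other in ("/tmp", "/var"):
        if os.path.exists(other):
            print("/ and", other, "share a file system:",
                  same_filesystem("/", other))
```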

> > and the
> > myriad of other damages that make partitioning a simple matter of
> > common sense?
> Examples? If you're referring to filesystem corruption due to kernel

There are generally two or three random corruptions on the file system
every day in any set of 20 or so machines. If that system is the root
file system, either the machine breaks and throws a hissy fit at next
fsck, possibly deciding to clear the root inode, or the corruption goes
unnoticed until things get even more interesting, or it goes blooie
with "odd effects" immediately.

I have learned by long experience to keep the root file system small
and duplicate it elsewhere on disk. That way you can always boot
and log in easily when /var/run and /var/adm decide to disappear,
thanks to gamma ray absorption, because the error will not be on the
root file system.

I have also learned through experience to keep /usr readonly. Things
get so much less corrupted when there is nothing being written anywhere
near them on the disk.

And then there are the bouncing screaming disk head crashes, that tear
off strips at the same area every cylinder for about 60MB worth (says he
remembering the last time he had to quarantine a disk zone, by hand,
after recovering what he could through 8 hours of manually attended
fsck, with the heads painfully retrying every unreadable sector 10
times while I noted down the sector number for later reference).

> malfunction, remember that the kernel can hose anything it wants on

It can, but it doesn't. I've explained why.

> any partition. If you're referring to filesystem corruption due to
> power loss, partition barriers can't provide any protection (or a

They do provide it. For one thing, if you are not writing there, then a
missed write cannot be a problem. Once written - always readable.
Read-only partitions are a godsend. Aaaaaaaah. Bliss.

> level of performance at a given level of protection for that matter)
> that can't be provided by a single partition with a filesystem which
> uses sufficiently deterministic write ordering (e.g. a journaling
> filesystem).

Journalled filesystems are among the most fragile in practice to
hardware (not software) corruption. I have learned never to journal
/var, because the high i/o at the journal seems to wear a hole in the
disk under it! I have learned to always have the journal external to
the partition so that when it breaks, I can move it to another 8MB spot
on the disk and nobody is hurt. You try quarantining an 8MB unwritable
unreadable hole in the middle of a partition!

And then there is the issue of creeping corruption, which goes
unchecked.

> > > the software could be changed so that one large partition would not be
> >
> > Hilariously ignorant of real life, cosmic rays,

> Redundancy and ECC codes protect against small random errors.

But they don't - ecc memory is not generally used and is expensive
(therefore not bought). Redundancy requires a separate partition.
And then you get the issue of copying and replicating the errors, if
you are talking about raid!

> Inter-partition redundancy provides no protection that can't be
> provided intra-partition.

Oh yes it does. Try turning off metadata read time updates per
directory. Yes - even a readonly directory on a standard writable file
system will have the time it was last read continuously written into it.
And that's only one example. See above for more and more and more ..

> > software bugs,
> See above.

You "see above"!

> > maladministration, human error,
> As root, rm -rf / can cross partition barriers.

Nope. Partitions are mounted readonly, so the rm will fail at the
boundary.
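
Whether a given mount really is readonly can be checked from the mount
flags that statvfs returns. A minimal sketch, Unix only, with an
invented function name:

```python
import os

def is_readonly_mount(path):
    """True if the file system holding path is mounted read-only (Unix only)."""
    return bool(os.statvfs(path).f_flag & os.ST_RDONLY)

if __name__ == "__main__":
    for path in ("/", "/usr"):
        if os.path.exists(path):
            print(path, "readonly:", is_readonly_mount(path))
```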

> > and a host of other things that happen
> > with probability one over a surprisingly short interval of time.

> More examples?

Don't you have enough already? People adding a service and forgetting to
add a stanza to logrotate their logs every day, so that the log fills
up the partition after a week, and one thousand lusers start
telephoning your office to say they can't log into X windows. Umm ...
warez trying to fill ~ftp/upload to bursting, umm ... the monitor
software running unattended for two years as root finally gets to the
point at which its history files are so large that the real users on the
system have no room to do anything, and start complaining that they
are "losing mail" (yo! well read the message that says "mail file
corrupted, mail not written, do not quit editor until saved" then).

Etc. etc. etc. etc. etc. etc. etc. etc.
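
A partition that is filling up can at least be noticed before the
telephone rings. Here is a small sketch of a check one might run from
cron. The 90% threshold and the function names are arbitrary choices
for illustration.

```python
import shutil

def fill_fraction(path):
    """Fraction (0.0 to 1.0) of the file system holding path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:        # pseudo file systems can report zero size
        return 0.0
    return usage.used / usage.total

def nearly_full(path, threshold=0.9):
    """True once the file system holding path is at or above threshold."""
    return fill_fraction(path) >= threshold

if __name__ == "__main__":
    for path in ("/", "/var", "/tmp"):
        try:
            print(path, "%.0f%% used" % (100 * fill_fraction(path)),
                  "WARNING" if nearly_full(path) else "")
        except OSError:
            pass  # path missing or not readable on this machine
```

It only tells you anything per partition, of course. With One Humungous
Partition every path gives the same number.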

> > > any less reliable than multiple smaller ones. But in any case this is
> > > not relevant to the issues which I reported.
> >
> > It's relevant to us [a]ssessing your expertise,
> True, but this is irrelevant to the issues which I reported.

It is. What to you seems a big deal may not be a big deal to someone
else. I pointed out that you can use something else to make partitions
(I'd never trust any installer to do such a thing for me!). You may say
that you are just acting like a dumb monkey and reporting what things
look like to a dumb monkey. And I'd say that's true. Yes - it's great
that you can speak up about it. Most dumb monkeys can't. That's a real
service you are performing, and I mean it! It's useful to know what
things look like to a dumb monkey.

But to someone with more experience or skill, it can look rather like
"oi! whenever I open this cupboard door the handle pokes me in the
eye". Well, Don't Do That Then (TM). Stand to the left next time. Or
ask somebody else to open it. Don't scream about it - maybe it's not an
issue worth screaming about. Maybe only 0.0001% of the population
manage to stick themselves in the eye with that cupboard door in the
first place. Maybe the cupboard door said "use at your own risk" on it!

> > and thus the weight that
> > should be placed on anything you say.

> False. If you were to discover one day that your random number
> generator had just output a polynomial-time factoring algorithm, would
> you ignore the algorithm due to the source?

I would ignore the report, especially if the source proved in its
introduction or discussion that it were not cognizant of the issues, nor
of other people's perceptions of them. Thanks - but I receive enough
such missives as it is. I can distinguish "crank" perfectly easily.
A random number generator would provide no convincing argument at
all, and hence would be ignored directly.

> Your sense of my expertise
> can reasonably determine whether you bother to read what I write, but
> after you read what I write, my expertise and your sense of it become
> irrelevant, with only that which was written remaining relevant.

This is not quite true. Issues are being discussed, and you must show
that you understand the background to them, and other people's
arguments, before asserting your own opinions and arguments.

> Anyway, please remember that my comment was only that a single
> partition can be just as reliable as multiple ones if the software is
> so written, not that contemporary software actually is so written. And

Unfortunately, what you mean is "if all software is so written", not "if
software is so written". Partitions are only a virtuality and it is
precisely software (in the form of the kernel) that forces their
observance. Indeed, a "partition" is precisely a region of the disk
that the kernel treats in a particular uniform way, imposing the
boundaries on applications. So what you say is tautologous if by
"software" you include the kernel. But if you do not include it, then
it becomes false, because "ALL" software must then be written to observe
the boundaries that the kernel no longer imposes.

> please remember that it's completely irrelevant to the issues raised
> in my original post, issues which no one has yet addressed.

Because I don't think they deserve the name "issues". You had a problem
with parted? Well, tell the author. That's the end of it.

If you want to raise an issue, an appropriate one might be how one can
deliver feedback to authors automatically, and spread an awareness of
the development state of the software to potential users.

Peter


