Re: amd64 sata_nv (massive) memory corruption



2008/8/5 Alan Cox <alan@xxxxxxxxxxxxxxxxxxx>:

>> I'm game. Care to guide me through? So: on every write, this
>> new device mapper module computes a checksum and stores
>> it somewhere. On every read, it computes a checksum and
>> compares to the stored value. Easy enough I guess.
>>
>> Several hard parts:
>> -- where to store the checksums?
>
> That is the million dollar question - plus you can argue it is the fs
> that should do it. There is stuff crawling through the standards world to
> provide a small per block additional info area on disk sectors.

My objection to fs-layer checksums (e.g. in some user-space
file system) is that it doesn't leverage the extra info that RAID
has. If a block is bad, RAID can probably fetch another one
that is good. You can't do this at the file-system level.

I assume I can layer device-mappers anywhere, right?
Layering one *underneath* md-raid would allow it to
reject/discard bad blocks, and then let the raid layer
try to find a good block somewhere else.
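
To make the layering idea concrete, here is a minimal user-space sketch, not kernel code. It assumes a toy in-memory "disk" that keeps a CRC32 beside each sector, and a stand-in for the RAID-1 layer that simply tries the next mirror when a read fails. The names ChecksummedDisk and mirror_read are hypothetical, and CRC32 is just a convenient placeholder checksum.

```python
import zlib

SECTOR = 512


class ChecksummedDisk:
    """Toy in-memory disk that stores a CRC32 for every sector written."""

    def __init__(self):
        self.data = {}  # sector number -> payload bytes
        self.sums = {}  # sector number -> CRC32 recorded at write time

    def write(self, sector, payload):
        # On every write: compute the checksum and store it alongside the data.
        self.data[sector] = payload
        self.sums[sector] = zlib.crc32(payload)

    def read(self, sector):
        # On every read: recompute and compare; report an I/O error on mismatch.
        payload = self.data[sector]
        if zlib.crc32(payload) != self.sums[sector]:
            raise OSError(f"checksum mismatch on sector {sector}")
        return payload


def mirror_read(disks, sector):
    """Stand-in for the RAID-1 layer: return the first copy that reads cleanly."""
    for disk in disks:
        try:
            return disk.read(sector)
        except OSError:
            continue  # this mirror rejected the block, try the next one
    raise OSError(f"no good copy of sector {sector}")
```

The point of the sketch is only the ordering: because the checksum layer sits below the mirror, a corrupted block surfaces as a read error, which is something the layer above already knows how to route around.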

I assume that a device mapper can alter the number
of blocks-in to the number of blocks-out; that it doesn't
have to be 1-1. Then for every 10 sectors of data, it
would use 11 sectors of storage, one holding the
checksum. I'm very naive about how the block layer
works, so I don't know what snags there might be.
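
The 10-in, 11-out arithmetic can be sketched as follows. This is only one possible layout, assumed for illustration: each group of 11 physical sectors holds 10 data sectors followed by 1 checksum sector. The names map_sector and physical_size are hypothetical, and nothing here reflects how the block layer actually remaps requests.

```python
DATA_PER_GROUP = 10               # data sectors per group (from the text above)
GROUP_SIZE = DATA_PER_GROUP + 1   # plus one sector holding the group's checksums


def map_sector(logical):
    """Return (physical data sector, physical checksum sector) for a logical sector."""
    group, offset = divmod(logical, DATA_PER_GROUP)
    base = group * GROUP_SIZE
    return base + offset, base + DATA_PER_GROUP


def physical_size(logical_sectors):
    """Physical sectors needed to store the given number of logical sectors."""
    groups = -(-logical_sectors // DATA_PER_GROUP)  # ceiling division
    return groups * GROUP_SIZE
```

With this layout, logical sector 10 lands on physical sector 11 and its checksum lives in physical sector 21. One obvious snag is that a single-sector write now touches two physical sectors, and the two updates are not atomic.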

The downside of this is that the disk wouldn't be
naively readable unless the specific mapper module
was in place -- so one would need a superblock of
some sort indicating the type of checksumming used,
etc. Is there any "standardized" way of managing
superblocks for use by the device mapper? I guess
the encrypting dm has to store meta-information
somewhere, too, specifying what kind of encryption
was used. I'll look at that.
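
As a rough idea of what such a superblock might carry, here is a sketch of a made-up on-disk header. The magic string, field layout and algorithm numbering are all invented for illustration and do not correspond to any existing device-mapper format.

```python
import struct

# Hypothetical layout: 8-byte magic, 16-bit version, 16-bit checksum
# algorithm id, 32-bit count of data sectors per checksum group.
SB_FORMAT = "<8sHHI"
SB_MAGIC = b"DMCSUM\0\0"          # invented magic value
SB_ALGORITHMS = {1: "crc32"}      # invented algorithm numbering
SECTOR = 512


def pack_superblock(version, algorithm, data_per_group):
    """Build a one-sector superblock, zero-padded to 512 bytes."""
    raw = struct.pack(SB_FORMAT, SB_MAGIC, version, algorithm, data_per_group)
    return raw.ljust(SECTOR, b"\0")


def parse_superblock(sector):
    """Decode a superblock sector, refusing anything without the magic."""
    magic, version, algorithm, data_per_group = struct.unpack_from(SB_FORMAT, sector)
    if magic != SB_MAGIC:
        raise ValueError("not a checksum-mapper superblock")
    return {
        "version": version,
        "algorithm": SB_ALGORITHMS.get(algorithm, "unknown"),
        "data_per_group": data_per_group,
    }
```

The magic check is what would keep a disk from being mistaken for a plain one when the mapper module is present, which is the failure mode described above.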

> Yes. If you can figure out where to keep the checksums without ruining
> performance

Heh. Unlikely. The act of checksumming will impact
performance. It should end up similar to the impact
from encryption (maybe not quite as bad), or comparable
to raid-5 (which computes various kinds of parity).

> (and of course if there isn't one lurking in device mapper
> world not yet submitted).

I'm googling, but I don't see anything. However, I now see,
for the first time, pending work for 2.6.27 for a field in bio
called "blk_integrity". I cannot figure out if this work requires
special-whiz-bang disk drives to be purchased.

Also, it seems to be limited to 8 bytes of checksums per 512
byte block? This is reasonable for checksumming, I guess,
but one could get even fancier and run ECC-type sums, if
one could store, say, an additional 50 bytes for every 512
bytes. I'm cc'ing Martin Petersen, the developer, for
comments.
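
For a sense of scale, the space overhead of the options mentioned in this mail works out as below. The helper name overhead is hypothetical; the figures are just the ratios of extra bytes to data bytes for 8 bytes per 512, 50 bytes per 512, and one extra sector per 10 data sectors.

```python
from fractions import Fraction


def overhead(extra_bytes, data_bytes=512):
    """Extra storage as an exact fraction of the data it protects."""
    return Fraction(extra_bytes, data_bytes)


options = {
    "8 bytes per 512-byte block": overhead(8),
    "50 bytes per 512-byte block": overhead(50),
    "1 checksum sector per 10 data sectors": overhead(512, 10 * 512),
}

for label, ratio in options.items():
    print(f"{label}: {ratio} of the data size")
```

So the 8-byte scheme costs 1/64 of the data size, the 50-byte ECC-style scheme 25/256 (a little under a tenth), and the 11-for-10 sector layout exactly 1/10.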


--linas