Re: tarball won't extract

From: Noi (
Date: 03/22/04

Date: Mon, 22 Mar 2004 07:19:38 GMT

On Sun, 21 Mar 2004 19:40:25 -0800, Rick Boucher thoughtfully wrote:

>> Have you tried gzip -df files.tar.gz to create the tar file
>> and then tar -xvf files.tar ?
> gzip -df files.tar.gz results in
> gzip: files.tar.gz: invalid compressed data--format violated
> zcat -d files.tar.gz
> zcat: oolfiles.tar.gz: invalid compressed data--format
> violated
> I might add that the original drive with the data is dead, as in it
> will not power up. I am only working with the ftp'd tarball, which was
> to be the backup in case the hd failed.
> Rick

Well, try googling "invalid compressed data--format violated";
there are a few answers. Here's one:

                Recovering a damaged .gz file

Here is how you can try to recover your data if it is extremely
valuable to you. Do this only if it's worth it, because it will
certainly take a lot of time, and it is not even guaranteed to get
back all the data correctly. First, make copies of all the files you
still have, to avoid deleting them by mistake. Then work only on the
copies. You will also have to patch the gzip sources.

You must hope that all the bad sectors are somewhat grouped
together. You can recover the portion before the bad sectors, and you
may be able to recover some data after all bad sectors. You can't
recover data between those bad sectors, unless they are very far apart
from each other.

To recover the portion before the bad sectors, just do:

  gunzip < damaged.gz > part1

gunzip will stop when it sees the bad data. All data in the file "part1"
is guaranteed to be correct, but of course the rest will be missing.
If the file "damaged" is a .tar file, you can recover some files with:

  gunzip < damaged.tar.gz | tar xvf -

gunzip and tar will complain at some point, but tar may have recovered
some files already.
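
For anyone who would rather script the first step, here is a minimal
Python sketch of the same idea, using the standard zlib module instead
of gunzip. The function name salvage_prefix and the 4096-byte chunk size
are my own choices, not anything from gzip. It feeds the file in small
chunks because, when zlib raises an error, the output of the chunk being
decoded is lost.

```python
import zlib

def salvage_prefix(damaged: bytes, chunk_size: int = 4096) -> bytes:
    """Decompress a gzip stream until the data runs out or turns invalid."""
    d = zlib.decompressobj(wbits=31)  # 31 = expect a gzip header and trailer
    out = bytearray()
    for i in range(0, len(damaged), chunk_size):
        try:
            out += d.decompress(damaged[i:i + chunk_size])
        except zlib.error:
            break  # format violation: keep what was decoded before this chunk
    return bytes(out)
```

Something like open("part1", "wb").write(salvage_prefix(open("damaged.gz",
"rb").read())) would then play the role of the gunzip command above.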

Now let's try to recover something after the bad sectors. You first
have to find the boundary of the first undamaged compression block
after the damaged portion. The boundary is bit aligned. To find the
damaged portion, add the line
            fprintf(stderr, "bytes_in %ld\n", bytes_in);
just before the existing call
            error("invalid compressed data--format violated");

in unzip.c. Then round the reported bytes_in value up to the next disk
block boundary
and create a new .gz file by concatenating a valid .gz header and
the data believed to be undamaged. Then run "gzip -t" repeatedly
on the new .gz file, removing from 1 bit to 8*64K bits from the
start of the compressed data portion, until you get a CRC error
instead of a "format violated" error. At this point do

  gunzip < damaged.gz > damaged

The gzip CRC will always fail because you will miss some 'history',
but after some time, the history effect will be reduced and you might
be able to recover part of the data. You will have no guarantee that
the data will be correct except by manual inspection.
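
The search described above can be sketched in Python, with some
substitutions that are mine and not part of the original recipe.
Instead of patching gzip and calling "gzip -t", this uses zlib's raw
deflate decoder, so no .gz header is needed. zlib normally rejects
back-references that reach before the start of the data, so a 32 KiB
dictionary of zero bytes stands in for the missing history. "Decodes to
the final block without a format error" is used as a rough equivalent of
"CRC error instead of format violated". The names round_up, shift_bits,
try_inflate and scan, and the 512-byte block size, are all assumptions.

```python
import zlib
from typing import Iterator, Optional, Tuple

def round_up(n: int, block: int = 512) -> int:
    """Round n up to a multiple of block (512 is an assumed sector size)."""
    return -(-n // block) * block

def shift_bits(data: bytes, k: int) -> bytes:
    """Drop the first k bits of a stream packed LSB-first, as deflate packs it."""
    value = int.from_bytes(data, "little") >> k
    return value.to_bytes((len(data) * 8 - k + 7) // 8, "little")

def try_inflate(tail: bytes) -> Optional[bytes]:
    """Return decoded bytes if tail parses as raw deflate to its final block."""
    # Zero-filled placeholder history: references into the lost part of the
    # file decode to zeros instead of raising an error.
    d = zlib.decompressobj(wbits=-15, zdict=bytes(32768))
    try:
        out = d.decompress(tail)
    except zlib.error:
        return None  # corresponds to the "format violated" case
    return out if d.eof else None

def scan(damaged: bytes, bytes_in: int, max_bits: int = 8 * 65536,
         block: int = 512) -> Iterator[Tuple[int, bytes]]:
    """Yield (bit_offset, decoded) for every offset that decodes cleanly."""
    tail = damaged[round_up(bytes_in, block):]
    for off in range(min(max_bits, len(tail) * 8)):
        out = try_inflate(shift_bits(tail[off // 8:], off % 8))
        if out is not None:
            yield off, out
```

This is slow on a big file, and a wrong offset can occasionally decode
without complaint, so every hit still needs checking by eye. It also
gives up if a second damaged area follows, since the decoder then never
reaches a final block.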

To get a valid .gz header, look at the file algorithm.doc in the gzip
distribution, or just copy the header from any valid .gz file. The
header ends at the zero terminated file name. To speed up the search
for a block header, the first 3 bits should be 0,0,1 (starting from
least significant bit) so that when aligned on a byte boundary you get
(first_byte & 7) == 4. So you only have to test about 1/8 of all
possible bit alignments. Of course if your block was not byte aligned
you have to bit-shift the entire file.
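
Both hints in that paragraph are easy to put into a few lines of Python.
The sketch below finds where a gzip header ends, following the field
layout in RFC 1952 (it also skips the optional extra, comment and header
CRC fields, which the text above does not mention), and lists the byte
offsets whose low three bits are 0,0,1. The function names are made up
for this example.

```python
from typing import Iterator

FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10  # FLG bits, RFC 1952

def gzip_header_length(data: bytes) -> int:
    """Return the number of bytes taken by the gzip header at the start of data."""
    if data[:3] != b"\x1f\x8b\x08":
        raise ValueError("not a gzip header with the deflate method")
    flags = data[3]
    pos = 10  # magic, method, flags, mtime, extra flags, OS
    if flags & FEXTRA:
        pos += 2 + int.from_bytes(data[pos:pos + 2], "little")
    if flags & FNAME:
        pos = data.index(b"\x00", pos) + 1  # zero-terminated file name
    if flags & FCOMMENT:
        pos = data.index(b"\x00", pos) + 1  # zero-terminated comment
    if flags & FHCRC:
        pos += 2
    return pos

def looks_like_block_start(first_byte: int) -> bool:
    """True if the low bits are 0,0,1 from the LSB: a non-final dynamic block."""
    return (first_byte & 7) == 4

def candidate_offsets(data: bytes, start: int = 0) -> Iterator[int]:
    """Yield byte offsets from start onward that pass the 3-bit test."""
    for i in range(start, len(data)):
        if looks_like_block_start(data[i]):
            yield i
```

With a known-good file, good[:gzip_header_length(good)] gives a header
to paste in front of the salvaged data. The 3-bit test only covers the
byte-aligned case and only dynamic, non-final blocks, so it is a filter
to cut down the work, not proof that a block starts there.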

As you can see, all this is not a trivial task, so you should attempt
it only if your data is very valuable. gzip 2.0 will have a new
blocksize option, making it easy to recover all undamaged blocks
after the damaged portion.

Jean-loup Gailly
