
Re: archivemail screwed up (some of) my backups



Hi,

steve wrote:
> gzip: mail_archive.gz: invalid compressed data--crc error

It seems that
  gunzip <file.gz
is more tolerant than
  gunzip file.gz

(Most probably because it does not peek ahead to the end of the stream.
 With <file.gz it would well be able to perform random access, but not
 with cat file.gz | gunzip.)


I tested with a human-readable text in file x:

  gzip <x >xx.gz

which yields more than 1000 bytes. Then I damaged it:

  dd if=/dev/zero bs=1 count=1 of=xx.gz seek=1000 conv=notrunc

and tried to uncompress

  gunzip xx.gz

which yields

  gzip: xx.gz: invalid compressed data--crc error
  gzip: xx.gz: invalid compressed data--length error

On the other hand

   gunzip <xx.gz >xx

yields the beginning of the human-readable text and starts to
show garbled text after about 3000 bytes of uncompressed output.
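
The experiment can be repeated with a script along these lines. It is
only a sketch: all file names are made up, the test text is generated
with awk, and the byte at offset 1000 is flipped rather than zeroed so
that the file is certain to change. How much output survives depends on
the gzip version and on where the damaged byte lands.

```shell
#!/bin/sh
# Scratch directory; every file name below is made up for this sketch.
work=$(mktemp -d) && cd "$work" || exit 1

# Generate text that compresses to well over 1000 bytes.
awk 'BEGIN { for (i = 0; i < 5000; i++) print "line", i, (i * 7919) % 10007 }' > x
gzip -c x > xx.gz

# Flip the byte at offset 1000 (instead of zeroing it, so the file is
# guaranteed to change even if that byte was already zero).
old=$(dd if=xx.gz bs=1 skip=1000 count=1 2>/dev/null | od -An -tu1 | tr -d '[:space:]')
new=$(( (old + 1) % 256 ))
printf "\\$(printf '%03o' "$new")" |
  dd of=xx.gz bs=1 seek=1000 count=1 conv=notrunc 2>/dev/null

# Decompress from stdin and keep whatever gunzip manages to write.
status=0
gunzip < xx.gz > xx.out 2> xx.err || status=$?

echo "gunzip exit status: $status"
cat xx.err
# With the original at hand, cmp reports where the output stops matching.
cmp x xx.out || true
```

In a real recovery there is no original to compare against, so the cmp
step is only useful for getting a feeling for how far the damage reaches.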

So depending on where the bad spot is in your .gz file, you
might be able to retrieve some of it. After the bad spot, there
is little hope of retrieving more valid data.
The main problem is to locate the byte address where the output
begins to be invalid. That is not hard with human-readable text, but
quite a challenge with binary data stored as base64 or uuencode.
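
For plain ASCII mail, a crude first guess at the bad spot might be the
first byte that does not look like text. The helper below is a heuristic
sketch, not a reliable detector: first_nontext_offset is a made-up name,
it assumes the mail is 7-bit ASCII, and garbled output can consist of
printable characters for a while, so the real damage may start earlier
than the reported offset.

```shell
# first_nontext_offset FILE
# Prints the 0-based offset of the first byte that is not printable ASCII,
# tab, LF or CR. Prints nothing if every byte looks like text.
first_nontext_offset() {
  od -An -v -tu1 "$1" | awk '
    {
      for (i = 1; i <= NF; i++) {
        b = $i + 0
        if (!(b == 9 || b == 10 || b == 13 || (b >= 32 && b <= 126))) {
          print n + 0
          exit
        }
        n++
      }
    }'
}
```

Called as first_nontext_offset xx on the recovered file, it gives a
place to start looking by eye, for instance with dd skip= or a pager.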

-------------------------------------------------------------------

> should I buy a couple of
> handkerchiefs and start a period of mourning for my lost messages?

A few handkerchiefs seem appropriate.

But you should not blame the archiver program before you have
ruled out that your storage system spoiled the data,
especially since unreliable storage would be a much more severe
problem than a buggy archiver.
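
One cheap first check, sketched under the assumption that the archives
sit together in one directory, is to run gzip -t over every .gz file
there. check_archives is a made-up helper name. If files that the
archiver never touched fail too, that would point at the storage
rather than at the archiver.

```shell
# check_archives DIR
# Prints the name of every *.gz file in DIR that fails gzip's integrity test.
check_archives() {
  for f in "$1"/*.gz; do
    [ -e "$f" ] || continue          # the glob matched nothing
    gzip -t "$f" 2>/dev/null || echo "$f"
  done
}
```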

(Further: gzip compression is normally done via zlib, which is
 supposed to do it right. So if the archiver can produce some
 .gz files that uncompress cleanly, then it probably does its part
 of the job correctly.)


Have a nice day :)

Thomas

