
Re: tar oddness



In foo.debian-user, you wrote:
> Hi all,
> 
> I was pondering on the problem of splitting output over multiple
> mountable media, and discovered the following:
> 
> $ tar czf foo elmfract
> $ tar czf - elmfract > bar
> $ ls -l foo bar
> -rw-r--r--   1 tgakem   users       10240 Mar 29 16:59 bar
> -rw-r--r--   1 tgakem   users        7950 Mar 29 16:59 foo
> 
> The file bar is a valid archive, but contains trailing garbage.  Any
> idea how this comes about?  It does not happen if the z flag to tar is
> not present.
> 
> I noticed this when I wrote a small program that takes input from stdin
> (through a pipe), and writes this to different files on different
> volumes of a mountable medium (say, floppies).

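When writing to a tape (or a pipe, as with "-f -"), the compressed output
gets padded out to a full record. tar's default record size is 10K (a
blocking factor of 20, i.e. 20 x 512-byte blocks). That padding is the
trailing garbage you are seeing; the archive itself is still valid.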

You can see the same effect with a file that compresses to just over 10K.
I have one here called blah.

$ tar czf foo blah
$ tar czf - blah > bar
$ ls -l foo bar
-rw-rw-r--   1 mblevin  mblevin     20480 Mar 29 10:34 bar
-rw-rw-r--   1 mblevin  mblevin     11406 Mar 29 10:34 foo
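
Both listings are consistent with the compressed size being rounded up to
the next multiple of 10240. A quick sketch of that arithmetic follows; the
name padded_size is made up for illustration, and RECORD_SIZE assumes the
default blocking factor of 20:

```shell
# Round a byte count up to the next multiple of tar's default record size.
# Assumption: default blocking factor of 20, so 20 * 512 = 10240 bytes.
RECORD_SIZE=10240

padded_size() {
    # $1 = compressed size in bytes
    echo $(( ( ($1 + RECORD_SIZE - 1) / RECORD_SIZE ) * RECORD_SIZE ))
}

padded_size 7950     # prints 10240 (the size of the first bar)
padded_size 11406    # prints 20480 (the size of the second bar)
```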

If you want tar to be efficient for small files, you need to lower the
default blocking factor (the -b option, counted in 512-byte blocks)... but
you must also remember to give the same value when you untar...
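
Something along these lines should do it. This is only a sketch, assuming
GNU tar; pack_small and list_small are names I made up, and how much the
output actually shrinks will depend on your tar version:

```shell
# Hypothetical helpers; -b sets the blocking factor in 512-byte blocks,
# so "-b 1" means a 512-byte record instead of the default 10240.

pack_small() {
    # $1 = output archive, remaining arguments = files to archive
    out=$1
    shift
    tar -c -z -b 1 -f - "$@" > "$out"
}

list_small() {
    # Read the archive back with the same blocking factor used to create it.
    tar -t -z -b 1 -f "$1"
}
```

For example, "pack_small bar blah" followed by "ls -l bar" lets you compare
the result against the padded sizes above.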

This is documented in the man pages for gzip (the padding) and tar (the
block size).

-Mitch

