
Re: Determining the usefulness of compression



* Charlie Reiman (creiman@kefta.com) [021204 18:26]:
> > -----Original Message-----
> > From: Rob VanFleet [mailto:rvf@linux.wku.edu]
> > Sent: Wednesday, December 04, 2002 5:40 PM
> >
> > I am writing a script that will compress certain files passed to it
> > (well, that's a part of the script) and I was wondering if there was a
> > simple way to determine if a file is worth compressing or not.  I know
> > that with some very small files, compression actually increases the file
> > size.  Should I just look at the file size and only compress if over a
> > certain size or is there a more efficient method?

> Not really. The best (indeed, only 100% accurate) way to determine if a file
> is compressible is to compress it. That doesn't mean you can't use some good
> heuristics. Good ones are:
> 
> filename suffixes (never bother with gz, tgz, bz2, zip, jpg, jpeg, gif, z,
> Z, mpg, mpeg, avi, wav, mov....)

Huh? Some AVIs and nearly all WAVs hold uncompressed data, and will
benefit enormously from compression.  The theory here is correct, though:
don't try to compress already-compressed data; it won't get any smaller
and may even grow slightly.

> compress a sample of the file (first and last 4k blocks maybe?)
> test for magic numbers that indicate compressed types
> Don't bother for files less than 1k
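The heuristics quoted above (size floor, magic numbers, compressing a
sample) could be combined roughly as follows. This is only a sketch: the
names worth_compressing, MIN_SIZE, SAMPLE and RATIO_CUTOFF are made up for
illustration, the 0.9 cutoff is an arbitrary assumption, and the magic
number list is deliberately short.

```python
import zlib

# The 1 KiB floor and 4 KiB sample size come from the thread; the 0.9
# ratio cutoff is an assumption chosen only for illustration.
MIN_SIZE = 1024
SAMPLE = 4096
RATIO_CUTOFF = 0.9

# Leading signatures of a few common already-compressed formats.
MAGIC = (
    b"\x1f\x8b",      # gzip
    b"\x1f\x9d",      # compress (.Z)
    b"BZh",           # bzip2
    b"PK\x03\x04",    # zip
    b"\xff\xd8\xff",  # JPEG
    b"GIF8",          # GIF
)


def worth_compressing(data: bytes) -> bool:
    """Guess whether compressing `data` is likely to save space."""
    # Very small inputs rarely shrink once format overhead is added.
    if len(data) < MIN_SIZE:
        return False
    # Skip anything that starts with a known compressed-format signature.
    if data.startswith(MAGIC):
        return False
    # Compress the first and last blocks (or everything, if it is small).
    if len(data) <= 2 * SAMPLE:
        sample = data
    else:
        sample = data[:SAMPLE] + data[-SAMPLE:]
    return len(zlib.compress(sample)) < RATIO_CUTOFF * len(sample)
```

Testing content rather than the suffix also covers the WAV case: an
uncompressed file passes the sample test whatever it is called.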

Or tar them up and compress the whole thing.  If you have a lot of
little text files, or a lot of little wavs, you'll save a lot of space
with a compressed tarball.
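To illustrate the tarball point, here is a small sketch that packs many
little files into one gzip-compressed tar archive in memory and compares
that with gzipping each file separately. The helper names make_tarball
and individually_compressed_size are hypothetical, and the in-memory
dict of files only stands in for real files on disk.

```python
import gzip
import io
import tarfile


def make_tarball(files: dict) -> bytes:
    """Pack {name: content} pairs into one gzip-compressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def individually_compressed_size(files: dict) -> int:
    """Total size if every file is gzipped on its own."""
    return sum(len(gzip.compress(content)) for content in files.values())


if __name__ == "__main__":
    # Two hundred small, similar text files (made-up sample data).
    text = b"I was wondering if there was a simple way to determine if a file is worth compressing or not.\n"
    files = {"note%d.txt" % i: b"file %d: " % i + text for i in range(200)}
    print("gzipped one by one:", individually_compressed_size(files))
    print("compressed tarball:", len(make_tarball(files)))
```

Each separate gzip file pays its own header and trailer and cannot reuse
text seen in the other files; the single archive shares one compression
stream across all of them.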

good times,
Vineet
-- 
http://www.doorstop.net/
-- 
						--Nick Moffitt
A: No.
Q: Should I include quotations after my reply?
