
RE: Determining the usefulness of compression




> -----Original Message-----
> From: Rob VanFleet [mailto:rvf@linux.wku.edu]
> Sent: Wednesday, December 04, 2002 5:40 PM
> To: debian-user@lists.debian.org
> Subject: Determining the usefulness of compression
>
>
> I am writing a script that will compress certain files passed to it
> (well, that's a part of the script) and I was wondering if there was a
> simple way to determine if a file is worth compressing or not.  I know
> that with some very small files, compression actually increases the file
> size.  Should I just look at the file size and only compress if over a
> certain size or is there a more efficient method?
>
> Thanks,
> Rob

Not really. The best (indeed, the only 100% accurate) way to determine whether
a file is compressible is to compress it and compare the sizes. That doesn't
mean you can't use some good heuristics first, though. Good ones are:

Check the filename suffix (never bother with gz, tgz, bz2, zip, jpg, jpeg,
  gif, z, Z, mpg, mpeg, avi, wav, mov, ...)
Compress a sample of the file (the first and last 4k blocks, maybe?)
Test for magic numbers that indicate compressed types
Don't bother with files smaller than 1k
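
As a rough sketch only, the heuristics above could be combined in Python
something like this. The function name worth_compressing, the list of magic
numbers, and the 0.9 ratio cutoff are my own assumptions for illustration;
the suffix list, the 1k minimum, and the 4k sample size come from the
suggestions above. Tune all of them for your data.

```python
import os
import zlib

# Suffixes from the list above, lowercased (so "z" also covers "Z").
SKIP_SUFFIXES = {"gz", "tgz", "bz2", "zip", "jpg", "jpeg", "gif", "z",
                 "mpg", "mpeg", "avi", "wav", "mov"}

# A few well-known signatures; this selection is an assumption, not complete.
MAGIC_NUMBERS = (
    b"\x1f\x8b",      # gzip
    b"BZh",           # bzip2
    b"PK\x03\x04",    # zip
    b"\x1f\x9d",      # compress (.Z)
    b"\xff\xd8\xff",  # JPEG
    b"GIF8",          # GIF
)

MIN_SIZE = 1024     # "don't bother" threshold: 1k
SAMPLE_SIZE = 4096  # one 4k block from each end of the file
MAX_RATIO = 0.9     # hypothetical cutoff: sample must shrink by at least 10%


def worth_compressing(path):
    """Guess whether compressing `path` is likely to save space."""
    size = os.path.getsize(path)
    if size < MIN_SIZE:
        return False

    suffix = os.path.splitext(path)[1].lstrip(".").lower()
    if suffix in SKIP_SUFFIXES:
        return False

    with open(path, "rb") as f:
        head = f.read(SAMPLE_SIZE)
        if head.startswith(MAGIC_NUMBERS):
            return False
        # Read the last block, without overlapping the first one.
        f.seek(max(size - SAMPLE_SIZE, len(head)))
        tail = f.read(SAMPLE_SIZE)

    # Trial-compress only the sample, not the whole file.
    sample = head + tail
    return len(zlib.compress(sample)) < MAX_RATIO * len(sample)
```

The cheap checks (size, suffix, magic number) run first, so the sample is
only compressed when nothing else has already ruled the file out.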

It all depends on the kind of data you will be working with.
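
If the files are not huge, the fully accurate approach (compress it and see)
is also easy to script. A minimal sketch, assuming gzip is the compressor and
that the original file should be replaced only when the result is smaller;
compress_if_smaller is a made-up name for this example:

```python
import gzip
import os
import shutil


def compress_if_smaller(path):
    """Gzip `path` to `path + ".gz"` and keep whichever file is smaller.

    Returns the path of the file that was kept.
    """
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    if os.path.getsize(gz_path) < os.path.getsize(path):
        os.remove(path)      # compression helped: keep the .gz
        return gz_path

    os.remove(gz_path)       # compression did not help: keep the original
    return path
```

This costs one full compression pass per file, which is why the cheaper
heuristics are worth running first on large batches.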





