Re: Field compression



On Wed, Jul 19, 2000 at 01:39:18PM +0200, Robert Bihlmeyer wrote:
> 
> What you've essentially done is replaced the plain-text available file
> with a binary format understood only by special tools (no more grep
> '^Package:'). I don't think the 2 % saving is worth that, especially
> when the gzip -> bzip2 move gives 11 % improvement.
> 
> If going binary, you could just as well go the whole way - this will
> save much more: numbers like size and MD5sum no longer need to be
> represented inefficiently, package references (in Depends-like fields)
> simply contain an offset or id instead of the package name, etc.

I don't think Edward's intention was to make a new binary file
format.  

If we took this idea of "customized zipping" for that particular
file, rather than going all the way to bzip2, we could achieve a good
share of the gzip->bzip2 improvement without incurring all of the
costs that come with bzip2.

What if we just had a small script on everyone's system, so that the
file is not only gunzipped, but also run through the script?

The end result is still the same old file (no new binary formats),
but the combination of gzip and the magical Ed script would give us
some compression gains without as much of the slow-down and memory
usage that bzip2 has.
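
To make the idea concrete, here is a rough sketch of what such a
script pair might look like. This is not Edward's actual script: the
field list, the one-byte tokens, and the names shrink/expand/pack/
unpack are all made up for illustration, and it assumes no line in
the file already starts with one of the token bytes.

```python
import gzip

# Hypothetical token table: each common field name maps to one
# control byte. The field list and byte values are assumptions.
FIELDS = ["Package", "Version", "Depends", "Description", "MD5sum", "Size"]
ENCODE = {name: chr(1 + i) for i, name in enumerate(FIELDS)}
DECODE = {tok: name for name, tok in ENCODE.items()}


def shrink(text):
    """Replace a known 'Field: ' prefix on each line with its token."""
    out = []
    for line in text.split("\n"):
        name, sep, rest = line.partition(": ")
        if sep and name in ENCODE:
            line = ENCODE[name] + rest
        out.append(line)
    return "\n".join(out)


def expand(text):
    """Inverse of shrink(): turn a leading token back into 'Field: '."""
    out = []
    for line in text.split("\n"):
        if line and line[0] in DECODE:
            line = DECODE[line[0]] + ": " + line[1:]
        out.append(line)
    return "\n".join(out)


def pack(text):
    """Server side: run the transform, then gzip as usual."""
    return gzip.compress(shrink(text).encode("utf-8"))


def unpack(blob):
    """Client side: gunzip, then undo the transform."""
    return expand(gzip.decompress(blob).decode("utf-8"))
```

The client ends up with the same plain-text file it has today, so
grep '^Package:' keeps working. Whether the transform saves anything
worthwhile on top of gzip would have to be measured on the real file.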

As Edward said, we don't know how much benefit we can squeeze out of
it before we start to incur more time and memory penalties, but it's
an interesting experiment.

+-----------------------------+--------------------------------+
| Pete Lypkie      Developer  | Telephone: 688-9137 Ext. 117   | 
| Stormix Technologies Inc.   | Encrypted email preferred.     |
| http://www.stormix.com/     | see http://www.keyserver.net/  |
+-----------------------------+--------------------------------+
Opinions expressed in this email are not necessarily those of my employer.
