Re: Let's shrink Packages.xz
Jeff Epler wrote:
> First, I tried encoding the various digests as base64 or base93, rather
> than hex. In each case, the file grew in size; base93 was the worst.
Are you sure you performed this calculation correctly?
"ASCII hex" encodes 4 bits as 8 (or 7, but really 8), since each ASCII
character carries one nibble of the digest; that's a 100% increase (factor
of 2) over the bare digest (or over a "raw mapping" of 8 bits of digest to
an 8-bit character set).
base64 encodes 6 bits as 8; that should only be a 33.3% increase (factor
of about 1.33).
I've never heard of base93, but I found a reference that I think
describes what you mean. This should be even more efficient than
base64, as should any binary-to-ASCII mapping of higher radix.
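To make the arithmetic above concrete, here is a rough Python sketch of the expansion factors. It only covers the uncompressed size of the encoded digest and says nothing about how the result behaves once xz has been run over it. The function name expansion_factor and the sample input are my own, and the base93 figure assumes an ideal radix-93 packing, which a real implementation may not reach.

```python
import base64
import hashlib
import math


def expansion_factor(radix: int) -> float:
    """Output characters (8 bits each) per raw input byte for an ideal radix-N encoding."""
    return 8 / math.log2(radix)


# Theoretical overhead before any compression is applied.
for name, radix in [("hex", 16), ("base64", 64), ("base93", 93)]:
    print(f"{name:7s} factor {expansion_factor(radix):.3f}")

# Measured lengths for one SHA-256 digest (32 raw bytes).
# The input string is an arbitrary placeholder, not real package data.
digest = hashlib.sha256(b"example package contents").digest()
hex_len = len(digest.hex())
b64_len = len(base64.b64encode(digest))  # includes '=' padding
print(len(digest), hex_len, b64_len)
```

The measured base64 length comes out slightly above the ideal 4/3 factor because the encoder pads its output to a multiple of four characters.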
What are we looking for in an encoding? I'm guessing this needs to be
printable, suitable for human consumption (or at least "copy/paste" /
"consumption via text editor"), and "7-bit compat"?
Is this even up for debate? The community at large ("computer users"),
Debian included, seems to have standardized on "message digests as ASCII