
Re: Please test gzip -9n - related to dpkg with multiarch support



Wouter Verhelst wrote:
> On Tue, Feb 07, 2012 at 10:04:04PM +0000, Neil Williams wrote:
> > Maybe the way to solve this properly is to remove compression from the
> > uniqueness check - compare the contents of the file in memory after
> > decompression. Yes, it will take longer but it is only needed when the
> > md5sum (which already exists) doesn't match.
> 
> Actually, I think the real way to fix this properly is to not compress
> files in the package at all.
> 
> The contents.tar.gz is already a .tar.gz, which means it's compressed.

s/contents/data/

> Doubly-compressing files hardly ever nets a benefit, so we're not
> compressing files for the benefit of our mirrors.
> 
> The only reason why we compress files in /usr/share/doc is so that that
> directory doesn't waste too much space. If that is the case, I think it
> makes much more sense for files to be packaged inside .debs
> uncompressed, and (optionally) for dpkg to compress them on the fly
> should the system administrator request it. It would then make much more
> sense for dpkg to consider the contents of the file, rather than the
> on-disk representation, and not cause this kind of issues.

I agree with this entirely.  Doing this would actually save *more* space
in the .deb files, since it allows gzip (or xz, or whatever compresses
the data.tar) to see the contents of multiple files at once.  It also
allows the administrator to set local policies for compression to cover
cases like the one you mentioned below.  Those local policies would also
allow the use of compression formats other than .gz, as well as deciding
to leave files uncompressed due to the use of a filesystem with built-in
compression.
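To illustrate the space argument, here is a rough sketch in Python rather than a measurement of any real package. It builds a set of made-up doc files that share some boilerplate, then compares a gzipped tar of pre-gzipped members against a gzipped tar of plain members. The file names, the generated contents, and the helper names (make_docs, tar_bytes, archive_sizes) are all assumptions for the example; real savings depend on how much the files in a package actually have in common.

```python
import gzip
import io
import random
import tarfile


def make_docs(count=20, seed=0):
    """Build hypothetical doc files: shared boilerplate plus a unique paragraph."""
    rng = random.Random(seed)
    vocab = ["w%03d" % i for i in range(200)]
    boilerplate = " ".join(rng.choice(vocab) for _ in range(300))
    docs = {}
    for i in range(count):
        unique = " ".join(rng.choice(vocab) for _ in range(50))
        name = "usr/share/doc/pkg/file%02d.txt" % i  # made-up path
        docs[name] = (boilerplate + "\n" + unique + "\n").encode()
    return docs


def tar_bytes(files):
    """Pack a {name: bytes} mapping into an uncompressed tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def archive_sizes(docs):
    """Return (bytes with pre-gzipped members, bytes with plain members)."""
    pre = {name + ".gz": gzip.compress(data, 9, mtime=0)
           for name, data in docs.items()}
    doubly = gzip.compress(tar_bytes(pre), 9, mtime=0)
    solid = gzip.compress(tar_bytes(docs), 9, mtime=0)
    return len(doubly), len(solid)


if __name__ == "__main__":
    doubly, solid = archive_sizes(make_docs())
    print("members gzipped first:", doubly, "bytes")
    print("members left plain:   ", solid, "bytes")
```

With plain members the outer compressor can reuse the shared boilerplate across files; with pre-gzipped members it only sees already-compressed streams.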

It wouldn't work in all cases, since sometimes the package requires a
compressed file in a certain location, but it should work for just about
all files in /usr/share/doc.

The only downside that I can see: packages couldn't refer to a
particular file under /usr/share/doc/$package/ by path, because those
packages wouldn't know how the administrator might choose to compress
their files.  Given the policy of not depending on files under
/usr/share/doc/ to function, at most this will result in manpages and
similar referencing paths that then need a .gz or .xz appended, and that
doesn't seem like a big deal; people will cope and tools can learn to
check for compressed variants.
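As a sketch of what "check for compressed variants" could look like in a tool, here is a small Python helper. Nothing like this is specified by dpkg or policy; the suffix list and the function names (find_variant, read_doc) are assumptions for illustration. Given the path a package refers to, it looks for the plain file first, then for .gz, .xz and .bz2 variants, and reads whichever one exists.

```python
import bz2
import gzip
import lzma
import os

# Suffix -> opener. The empty suffix means the file is stored uncompressed.
# The set of suffixes is an assumption; a local policy could pick others.
OPENERS = {"": open, ".gz": gzip.open, ".xz": lzma.open, ".bz2": bz2.open}


def find_variant(path):
    """Return the first existing variant of path, or None if there is none."""
    for suffix in OPENERS:
        if os.path.exists(path + suffix):
            return path + suffix
    return None


def read_doc(path):
    """Read the file named by path, however the administrator compressed it."""
    variant = find_variant(path)
    if variant is None:
        raise FileNotFoundError(path)
    suffix = variant[len(path):]
    with OPENERS[suffix](variant, "rb") as fh:
        return fh.read()
```

A caller would pass the uncompressed name, for example read_doc("/usr/share/doc/foo/README"), and never needs to know which suffix is on disk.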

> As an additional benefit, this will also allow those among us (like me)
> who hate having to use 'gunzip -c /usr/share/doc/foo/bar.pdf.gz >
> /tmp/bar.pdf; xpdf /tmp/bar.pdf' in order to be able to read some
> documentation, to just request that files are not compressed.

Try "zrun" from the moreutils package:

    zrun xpdf /usr/share/doc/foo/bar.pdf.gz

Or use evince, which can handle compressed files directly.

- Josh Triplett

