
Bug#289096: tetex-doc: pdf files should not be gzipped



Sanjoy Mahajan <sanjoy@mrao.cam.ac.uk> wrote:

> Package: tetex-doc
> Version: 2.0.2c-3
> Severity: minor
>
> PDF is usually already compressed (often a PDF file is roughly the
> same size as the gzipped postscript), so gzipping a PDF file usually
> does not save much space.

This is not true; see for example:

frank@sid:~$ ls -l src/tetex-base-2.99.7.20041225-beta/doc/latex/hyperref/manual.pdf 
-rw-r--r--  1 frank frank 342982 Dec 20 19:38 src/tetex-base-2.99.7.20041225-beta/doc/latex/hyperref/manual.pdf
frank@sid:~$ ls -l /usr/share/texmf/doc/latex/hyperref/manual.pdf.gz 
-rw-r--r--  1 root root 258953 Dec 20 19:38 /usr/share/texmf/doc/latex/hyperref/manual.pdf.gz

This is a reduction of 24% in size, and in total:

frank@sid:~$ du -hs src/tetex-base-2.99.7.20041225-beta/doc/
99M     src/tetex-base-2.99.7.20041225-beta/doc/
frank@sid:~$ du -hs /usr/share/doc/texmf/
66M     /usr/share/doc/texmf/
frank@sid:~$ 

Overall this saves as much as one third (though that figure covers more
than just the pdf files, of course).
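
For anyone who wants to check the arithmetic, here is a minimal sketch
using plain shell integer arithmetic. The function name percent_saved is
made up for illustration; the numbers are the ones from the listings
above.

```shell
# percent_saved ORIGINAL COMPRESSED
# Prints the space saved by compression as a whole percentage (rounded down).
percent_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

percent_saved 342982 258953   # the hyperref manual, sizes in bytes
percent_saved 99 66           # the whole doc tree, sizes in MB as shown by du
```

This prints 24 for the manual and 33 for the complete tree.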

> Instead it inconveniences the user because
> neither acroread (v5.0) nor xpdf (v3.00) opens compressed pdf files
> (gv, i.e. ghostscript, is fine though).  The user therefore either
> must ask the sysadmin to uncompress the files in place in
> /usr/share/doc/texmf/ or must uncompress it into a working directory.

Why don't you just use texdoc? It will uncompress *.gz and *.bz2 files
on the fly.
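
If you want the same convenience with a viewer of your own choice, the
idea can be sketched in a few lines of shell. This is only an
illustration of the principle, not how texdoc itself is implemented, and
view_gz is a hypothetical helper name.

```shell
# Hypothetical helper, not part of teTeX: open a gzipped document with a
# viewer that cannot read .gz files itself.
# Usage: view_gz VIEWER FILE.gz
view_gz() {
    viewer=$1
    file=$2
    # Strip only the .gz suffix so the viewer still sees e.g. manual.pdf.
    base=$(basename "$file" .gz)
    tmpdir=$(mktemp -d) || return 1
    # Uncompress into a private temporary directory; the original is untouched.
    gzip -dc "$file" > "$tmpdir/$base" || { rm -rf "$tmpdir"; return 1; }
    "$viewer" "$tmpdir/$base"
    status=$?
    # Clean up once the viewer has exited.
    rm -rf "$tmpdir"
    return $status
}

# Example call (commented out so that nothing is launched by accident):
# view_gz xpdf /usr/share/texmf/doc/latex/hyperref/manual.pdf.gz
```

The temporary copy is removed when the viewer exits, so no sysadmin and
no permanent working copy are needed.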

> Therefore could all the .pdf files be stored uncompressed?
>
> $ dpkg -L tetex-doc | grep '\.pdf\.gz$' | wc -l
>
> says there are 62 gzipped pdf files in tetex-doc.  

In the upcoming version 3.0 there will be 103, which makes compressing
them even more desirable.

> says that their total compressed size is about 73% of the uncompressed
> total.  It's a saving, but I don't think it's worth the hassle to the
> user.

Ah, thanks, that shows that the hyperref manual is quite average
here. I'd say that for a 70-MByte hulk, 27% space saving is worth quite
some hassle. And there's really not much hassle in saying "texdoc
hyperref/manual" instead of "xpdf
/usr/shTABdoTAB/texmf/laTABhyTABeTAB/mTAB &". And hyperref is even a bad
example, because it uses such a generic name - usually you just type the
package name.
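
A total like the 73% quoted above can be obtained roughly as follows.
This is a sketch under my own assumptions, not the command from the
original report: gz_ratio is a made-up name, and it expects one .gz file
name per line on standard input.

```shell
# gz_ratio: read .gz file names from stdin, one per line, and print the
# total compressed size as a whole percentage of the total uncompressed size.
gz_ratio() {
    comp=0
    orig=0
    while IFS= read -r f; do
        # Size on disk of the compressed file.
        comp=$((comp + $(wc -c < "$f")))
        # Size of the contents after uncompressing to a pipe.
        orig=$((orig + $(gzip -dc "$f" | wc -c)))
    done
    # Nothing to report if there was no input (avoids dividing by zero).
    [ "$orig" -gt 0 ] || return 1
    echo $((comp * 100 / orig))
}

# Example call on a Debian system (commented out, needs dpkg and tetex-doc):
# dpkg -L tetex-doc | grep '\.pdf\.gz$' | gz_ratio
```

Swapping the grep pattern for '\.ps\.gz$' gives the corresponding figure
for the postscript files.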

> Whereas running the same command for the .ps.gz files says that the
> compressed ps files take up, in total, about 40% of the space of the
> uncompressed ones.  That seems like a worthwhile saving, especially
> since it causes no inconvenience because the main ps viewer (gv and/or
> ghostscript) happily opens .ps.gz files.  (Ditto for .dvi.gz files and
> xdvi.)

We should consider filing wishlist bugs against xpdf (and
acroread-debian-files if you like; acroread itself is not in Debian,
AFAIR) to enable them to open compressed files.

> Apologies if this issue is a Debian packaging policy decision that I
> should discuss on a different list (let me know which one).  Most of
> the .pdf.gz files are from tetex-doc; the other major contributing
> packages are doc-debian, apt-howto-*, and mit-scheme.

Policy 12.3 is somewhat unclear here. It says that "additional
documentation" should be installed "at the discretion of the package
maintainer" (which means it can be put outside of
/usr/share/doc/$package, e.g. for online help), and that "Text
documentation" should be gzipped unless it is small. I read this as
"it's up to the maintainer to decide about compression of non-ASCII
documentation".

Therefore I think we are the right people to complain to, but I don't
buy your argument. I think the way documentation can be accessed in
teTeX is already quite convenient, and if something needs to be
improved, it's the ability of xpdf to read compressed files.

Kind regards, Frank
-- 
Frank Küster
Inst. f. Biochemie der Univ. Zürich
Debian Developer



