Re: End of Documentation Discussion
Christian Schwartz wrote:
> So I'll make another proposal. This is meant to be a "compromise" that
> everyone here should be able to accept. I will _NOT_ accept simple
> "objections" this times. If you can't live with this proposal, you'll have
> to present another formulation of a paragraph or of the whole text.
OK, I'll take you up on that. The rationale for each point is given in
brackets. It is meant as an explanation of that point, not as part of the
proposal itself.
1) The default format for online documentation is HTML. A web browser (lynx)
and a very small web server (boa) will be in the core distribution, marked
[A web server adds a lot of flexibility. Boa adds very little overhead. Try
it and only then say something. If they are part of the core distribution, just
like man is now, then we would have gained a lot of consistency.
HTML is a standard. Its use is growing. It is not Unix-specific. It allows
documents to be distributed among many servers very easily, and even lets
local documents be integrated with others stored directly on www.debian.org
(like security updates to programs). It is the way of the future.]
2) Documents which can be converted on-the-fly to HTML will be installed in
their original format. This allows users to produce nice-looking hardcopies,
while at the same time keeping consistency with the rest of the documents.
Currently, the list of formats to be installed in original form includes
texinfo, man, plain text and sgml. It does not include TeX, since the
converter is not ready for prime time.
[In all cases mentioned, the overhead of on-the-fly conversion is acceptable
enough, just a little slower than formatting man pages. Users should be able
to convert docs to their preferred format without forcing everyone
to carry a multiplicity of formats. Printed copies are best produced from
the original format. HTML is preferred for online consultation. The original
format guarantees both uses.]
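To make points 2 and 3 more concrete, here is a rough Python sketch of how a
front end might decide whether a document is converted on the fly. It is only
an illustration, not how dwww or any existing converter works. The file
extensions, the function names and the trivial plain-text converter are all my
own assumptions; only the list of formats comes from point 2.

```python
import html
from pathlib import Path

# Formats from point 2 that are installed in original form and converted
# on the fly. The extensions are assumptions made for this sketch.
ON_THE_FLY = {
    ".texi": "texinfo",
    ".txt": "plain text",
    ".sgml": "sgml",
}


def source_format(filename: str):
    """Guess the source format from the extension, or None if unknown."""
    suffix = Path(filename).suffix.lower()
    # Assumption: man pages end in a section digit, e.g. "ls.1".
    if len(suffix) == 2 and suffix[1].isdigit():
        return "man"
    return ON_THE_FLY.get(suffix)


def handling(filename: str) -> str:
    """Say how a document would be shipped under points 2 and 3."""
    if source_format(filename) is not None:
        return "original only, converted on the fly"
    return "original plus pre-processed HTML"


def text_to_html(title: str, body: str) -> str:
    """Hypothetical converter for the simplest case, plain text."""
    return (
        "<html><head><title>" + html.escape(title) + "</title></head>"
        "<body><pre>" + html.escape(body) + "</pre></body></html>"
    )
```

With this table, handling("ls.1") and handling("README.txt") fall under
point 2, while handling("paper.tex") falls under point 3.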
3) Documents in markup format for which no on-the-fly conversion is available
will be included in both pre-processed HTML and original format. This is a
last resort measure.
[Original format should always be included. Reasons: 1) To produce printed
copies. 2) Because I hate Ghostscript and Xdvi. I prefer reading the
markup directly (and I am not alone). 3) Because users might want to
process the documents automatically (search engines, for example).
HTML should be included so that the documents are cleanly integrated with
the rest of the documentation and for serving them to remote (W95) systems if
necessary via http.]
4) Documents originally in binary format (PS, DVI, PDF, MS-WORD) for
which no conversion is possible should be packaged separately. A file
explaining how to get the documentation (including which programs
the user will need: ghostscript, xdvi, MS Word) and a brief summary
of the document should be included in the binary package in HTML format, or
a convertible one.
[Binary documents are useful mainly for printing and they usually have a
huge size to information ratio. I hate storing junk in my systems. Online
viewing of binary documents is awful. I hate downloading a 1MB file just to
find out it does not answer my questions. Since developers must read the
document anyway, they could make a brief summary. Sometimes that would be
enough to decide whether it is worth downloading the full document or not.
Binary docs cannot be integrated with the rest, so they should be discouraged.
Authors should be encouraged to give out the document in the original format
in which they wrote it, which is seldom a binary one, except for MS Word.]
5) The man program should be marked optional. When a user types "man something"
and man is not installed, lynx would automatically be invoked and it would
present the HTML-converted man page the user requested.
[This is almost as fast as original man but much more powerful. As it has
little overhead, it can replace the man program. But if a user still thinks
he can't stand the small overhead, man can optionally be installed.
Man depends on groff. That's a huge overhead in both size and speed.
Nowadays no one writes groff documents other than man pages. However, groff
is needed for printing man pages, but it is a bloated solution for online
6) The info program should be marked optional. When it is installed, it would
compile the texinfo files and place the output in /usr/info. When it is
deinstalled, it would erase the /usr/info directory. There will be a hook
in dwww to register texinfo pages so that the info directory is kept always
[The preferred online way of viewing texinfo files is through the texinfo to
HTML on-the-fly converter. Info fans who prefer the crappy info interface
should still be able to install info files, but without imposing them on
everyone. The info format is awful. Texinfo is nicer. Texinfo->HTML is
optimal. Emacs fans can use the w3 mode for viewing texinfo files. Or they
can install info and use the info mode if they want. For other people, just
texinfo is enough.]
7) A default searching/indexing engine should be chosen. It would be
marked standard, but not important. Caching would be an option too.
[This point needs more elaboration. This would make the HTML format show its
real power and bring down the biggest objection raised by info fans.]
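As a starting point for that elaboration, here is a minimal Python sketch of
the kind of index such an engine could keep. It is not a proposal for a
particular program: the function names and the sample documents are made up
for illustration, and a real engine would also need ranking and the caching
mentioned in point 7.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents containing every query word."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical sample data, only to show the calling convention.
docs = {
    "boa": "a very small web server",
    "lynx": "a text mode web browser",
}
index = build_index(docs)
```

Here search(index, "web") finds both documents and search(index, "web server")
finds only "boa".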
I hope my suggestion is clearly explained, but don't hesitate to ask for further
A few comments about your proposal:
> Thus, every documentation that is available in a format which can be
> converted into HTML, should be converted, with the exception of manual
> pages (they can be converted via dwww at run-time) and source code
On-the-fly conversion is fine for everything except TeX, it seems.
No need for multiplicity of formats. Simplicity is better.
> In case of converted HTML documentation, the files with original mark up
> format should not be provided, unless they are considered as "example
> documents" for the mark up language.
I strongly object to this. Original markup files are necessary for:
1) Printing documents
2) Processing them automatically
3) Converting to other formats needed locally
They are also useful for online viewing with the help of on-the-fly conversion.
Shipping only post-processed formats severely limits users' freedom.
> Packages that contain programs with GNU info manuals, should provide these
> in HTML _and_ in GNU info format. The HTML files should be stored in
> the directory
Texinfo is much better than info. Leave info to the info fans and provide
texinfo for the rest of us mortals. Texinfo might be enough in most cases,
with the help of on-the-fly conversion.
> All documentation related files will be kept in the "main binary package"
> if they do not exceed 500 kbytes installed size together. (Of course,
> documentation-only packages are not covered by this rule.)
OK, but see my point 4 above. A file explaining to users how to get the
documentation and a brief summary should always be provided in the
> One questions remains: Is it possible to browse "html.gz" files _without_
> a CGI script with the usual HTML browsers (Netscape, lynx)? If so, we'll
> make it policy to gzip all html files and to adopt the references. If not,
> we'll have to install all html files gezipped--or add a cgi capable web
> server to the base system.
Please, please. Try boa before deciding a server is too much overhead.
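On the "html.gz" question itself, one possibility is for the server to send
the compressed file unchanged and label it with the standard HTTP
Content-Encoding header, so that no CGI script is involved. The Python sketch
below only illustrates that idea. It is not boa's configuration, the document
root and the class and function names are assumptions, and whether Netscape
and lynx decompress such responses would still have to be tested.

```python
import http.server
from pathlib import Path

# Assumed document root for this sketch.
DOC_ROOT = Path("/usr/doc")


def headers_for(name: str):
    """Return (content_type, content_encoding) for a file name."""
    if name.endswith(".html.gz"):
        # Send the gzipped bytes as they are and let the browser unpack them.
        return ("text/html", "gzip")
    if name.endswith(".html"):
        return ("text/html", None)
    return ("application/octet-stream", None)


class GzipDocHandler(http.server.BaseHTTPRequestHandler):
    """Serve files under DOC_ROOT, marking .html.gz as gzip-encoded HTML."""

    def do_GET(self):
        root = DOC_ROOT.resolve()
        # Query strings are ignored in this sketch.
        relative = self.path.split("?", 1)[0].lstrip("/")
        target = (root / relative).resolve()
        # Refuse anything outside the document root or not a regular file.
        if root not in target.parents or not target.is_file():
            self.send_error(404)
            return
        content_type, encoding = headers_for(target.name)
        data = target.read_bytes()
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        if encoding:
            self.send_header("Content-Encoding", encoding)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)
```

To try it, something like
http.server.HTTPServer(("127.0.0.1", 8000), GzipDocHandler).serve_forever()
would start it on a local port. References inside the documents would still
have to point at the ".html.gz" names.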