
Re: a unified documentation interface without overkill



I was off-line for some time, so my response is a bit out of sync.
My apologies for that.

On Thu, 25 Jan 1996, Bruce Perens wrote:

> From: Evert-Jan Couperus <couperus@reflex.simplex.nl>
> > I think the complexity in producing a good consistent set of documents is 
> > highly underestimated. It is of a higher order than maintaining a 
> > software distribution like Debian.
> 
> I don't think we're on the same wavelength :-) .

Possibly, but I think that is the result of my failing to make my intentions 
clear. I was deliberately a bit vague about the actual implementation details 
in order to avoid narrowing down the design too soon.

> This is a recipie for failure. We are not mounting some tremendous
> documentation project here. We simply want to present all of the available
> documentation using one interface, with one top-level index. This should
> be do-able in a month of evenings by one programmer. Once we have that,
> we will have the leisure to start out on more ambitious projects if we feel
> they are necessary.

Very true. However, my experience with documentation projects is:
1. "It's easy. We only want..., nothing fancy..."
2. "It works!"
3. "Well, just don't touch ... We'll fix that later."
4. After fixing a lot of minor points it really works.
5. "Wouldn't it be great if we add ...?"
6. A couple of iterations later the system barely runs and is buggy.

I think it would be a pity if Debian solved this problem for the software
part but repeated this life cycle for the other parts. By the time we feel
the need for better documentation management, the Debian bashing may
already have started (I am extrapolating from SLS and Slackware here). 

[ omitted a lecture of mine about document inconsistencies ]
> > navigational elements (i.e. hyperlinks, browsing sequences, indexes, 
[ ... ]
> Guaranteed consistency is not one of our goals at this time.  We'd like
> the reader to have some chance of finding the documentation and reading
> it without having to be a Make/TeX/Roff/Lout/Latex/etc. guru.

I think you are saying that the navigation through the documentation should 
be more consistent and thereby easier. I suppose we do not disagree on this 
one.

> > Besides that you will need to agree on a set of keywords to 
> > be used by the authors.
> 
> If you really wanted keyword search, why not simply use an inverted index
> a la "refer" and then every word in every document is a keyword and the whole
> procedure is automatic. I'm not convinced that we want keyword search, though.

Your top-level index is a special case of keyword indexing (not
searching, that's just a related form of navigation). If you feel that
your packages form a logical grouping of information you can use the
package names and base/devel/net/text/../ as keywords and use them for
building the index. When implementing that special case you should try to
keep the general indexing rules in your code separate from your particular
choice of keywords. 
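
A minimal sketch of that separation, in Python. The package data, the
function names and the two keyword rules are assumptions made up for
illustration, not part of any existing Debian tool. The general rule
(group packages by keyword) lives in one function; the special choice of
keywords is passed in, so it can be swapped without touching the rest.

```python
from collections import defaultdict

# Hypothetical package data: name -> (section, hand-assigned keywords).
PACKAGES = {
    "gcc": ("devel", ["compiler"]),
    "make": ("devel", ["build"]),
    "lynx": ("net", ["browser", "html"]),
}


def build_keyword_index(packages, keyword_rule):
    """General rule: map every keyword to the sorted packages that carry it."""
    index = defaultdict(set)
    for name, (section, extra) in packages.items():
        for keyword in keyword_rule(name, section, extra):
            index[keyword].add(name)
    return {keyword: sorted(names) for keyword, names in index.items()}


def section_only(name, section, extra):
    """Special choice of keywords: the section alone yields the top-level index."""
    return [section]


def section_and_keywords(name, section, extra):
    """A richer choice that could be adopted later: section plus keywords."""
    return [section, *extra]


top_level = build_keyword_index(PACKAGES, section_only)
richer = build_keyword_index(PACKAGES, section_and_keywords)
```

Moving from the top-level index to a finer one then means writing a new
rule function, not rewriting the index builder.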

About the inverted index. If you want keyword based indexing you have two
options:
1) assign keywords by hand (the author or debianiser),
2) automagically.
To do it right you need a set of rules to keep the result of 1 consistent, 
and AI techniques to keep the output of 2 semantically consistent.
The point I was trying to make is: *if* you want a *solid* keyword index, IMO 
you should opt for 1, without forgetting to agree upon a set of rules for 
the authors.
Option 2 without advanced techniques adds essentially nothing to the good 
old find&grep method.
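
To illustrate option 2 in its plain form, here is a small sketch of an
automatic inverted index in Python. The document texts and the function
names are invented for the example. Every word becomes a keyword, and a
lookup only answers "which files contain this exact word", which is the
same answer grep gives, just precomputed.

```python
import re
from collections import defaultdict

# Hypothetical document texts keyed by file name.
DOCS = {
    "gcc.txt": "The compiler accepts many options",
    "make.txt": "Make reads a Makefile and runs the compiler",
}


def build_inverted_index(docs):
    """Map every lower-cased word to the set of files that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def lookup(index, word):
    """Return the sorted file names containing the word (literal match only)."""
    return sorted(index.get(word.lower(), ()))


index = build_inverted_index(DOCS)
```

Nothing here knows that "compiler" and "gcc" are related; that semantic
link is what hand-assigned keywords or more advanced techniques would
have to supply.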

> > I think we should concentrate on:
> > 1) what kind of information do we need,
> > 2) how do we keep the maintainance distributed without sacrificing 
> >    coherence,
> > 3) how can we use the existing documents as much as possible and yet 
> >    integrate them in one meta-document,
> > 4) how do we keep the use of resources, both human and electronical, 
> > low?
> 
> I don't think we can afford your standards.

I know, it's hard :-)
I said that because I got the feeling that a lot of implementation 
details were being exchanged without a shared idea of what was being 
implemented and what the constraints should be. I tried to broaden the 
view so that we can discuss design issues and their consequences for the 
implementation instead of discussing implementations with a lot of 
hidden design decisions.

> > We should not do major rewrites, just add the necessary primitives 
> > needed for a better navigation.
> 
> We should not add primitives. We should not alter the documentation at all
> except to run it through an automatic program to translate its format
> when we present it. We should construct a top-level index.

By primitives I do not mean changing the document contents or something
like that. By a primitive I mean something that facilitates the indexing
you are talking about, but does not dictate the actual implementation. 

For example, you can add a short (optional?) record like the ones in the
Packages file or the *.deb files, maybe styled after the Linux Software
Map records. These can be short and extensible, and another application
can extract information from them for indexing. 
If you want an HTML document as the top-level index for all installed
packages, you can have a script build that index from the names of the
packages as a post-install action (like install-info) or on the fly in a
CGI script. As long as the number of packages is small and there is no
need for a hierarchy deeper than or different from base, system, net,
etc., you can just hardwire it into the implementation. But as soon as
your needs change you have to rewrite your code to accommodate that.
Adding those document records, on the other hand, gives the debianisers
the freedom to: 
1 choose the indexing separate from the packaging hierarchy,
2 add more items per package by adding more document records.

Another primitive is adding a line "Keywords:" to the Packages file 
records.
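
A rough sketch of how such document records could drive the top-level
HTML index, in Python. The record layout and every field name in it
(Document, Package, Title, Index, File), as well as the file paths, are
assumptions for illustration; no such format has been agreed upon. The
"Index" field places a document in the hierarchy independently of the
package's own section, and a package can carry several records.

```python
import html
from collections import defaultdict

# Hypothetical document records, one stanza per document.
RECORDS = """\
Document: gcc-manual
Package: gcc
Title: Using the GNU C Compiler
Index: devel
File: /usr/doc/gcc/gcc.html

Document: lynx-guide
Package: lynx
Title: Lynx User Guide
Index: net
File: /usr/doc/lynx/guide.html
"""


def parse_records(text):
    """Split blank-line separated stanzas into dicts of field -> value."""
    records = []
    for stanza in text.strip().split("\n\n"):
        record = {}
        for line in stanza.splitlines():
            field, _, value = line.partition(":")
            record[field.strip()] = value.strip()
        records.append(record)
    return records


def render_index(records):
    """Build one HTML page with a heading per 'Index' value."""
    by_heading = defaultdict(list)
    for record in records:
        by_heading[record["Index"]].append(record)
    lines = ["<html><body><h1>Documentation</h1>"]
    for heading in sorted(by_heading):
        lines.append(f"<h2>{html.escape(heading)}</h2><ul>")
        for record in by_heading[heading]:
            href = html.escape("file:" + record["File"], quote=True)
            title = html.escape(record["Title"])
            lines.append(f'<li><a href="{href}">{title}</a></li>')
        lines.append("</ul>")
    lines.append("</body></html>")
    return "\n".join(lines)


page = render_index(parse_records(RECORDS))
```

The same two functions could run as a post-install step that writes the
page to disk, or inside a CGI script that prints it on request.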

> > For package maintainers that should mean 
> > defining the place in the hierarchy of the meta-index as well as giving 
> > keywords for the keyword network.
> 
> This sounds very complicated :-) .

Yes, it is! It is almost as demanding as making the Package record :-)

> > Maybe new keywords are permitted as long as they are inherited from a more
> > abstract one.
> 
> The concept of "keyword administration" is probably outside of the scope of
> our project.

At the moment I agree. But I also think you should take a more advanced
indexing scheme into account when designing and implementing the automatic
index generation. Doing it now should take no time at all; doing it later
could take a lot of time. 

> > I think we should look at other solutions before using yet another daemon
> > like httpd.
> 
> An HTTPd is only necessary if you want translation at run-time. If you
> sacrifice disk space by having pre-translated files in place, you only
> need an HTML browser using the "file:" URL, you don't need a server. In
> any case, an HTML server is cheap compared to the alternatives.

I disagree. At the moment the cheapest solution is viewer dispatching.
Furthermore, I would like to dispatch to a viewer for info, ?roff and 
PostScript, and convert the others to HTML during installation. That is 
easy to add if the right implementation is chosen, and very hard otherwise.
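
Viewer dispatching can be sketched in a few lines of Python. The choice
of viewers (info, man, ghostview, lynx), the extension rules and the
function names below are assumptions for the example; compressed files
and missing viewers are ignored. The function only builds the command
line and does not run it.

```python
import os
import re

# Hypothetical dispatch table: document type -> viewer command prefix.
VIEWERS = {
    "info": ["info", "-f"],
    "roff": ["man", "-l"],
    "postscript": ["ghostview"],
    "html": ["lynx"],
}


def document_type(path):
    """Guess the document type from the file extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".info":
        return "info"
    if ext == ".ps":
        return "postscript"
    if re.fullmatch(r"\.[1-9][a-z]*", ext):
        return "roff"  # manual page sections such as .1 or .3x
    # Everything else is assumed to have been converted to HTML at install.
    return "html"


def viewer_command(path):
    """Return the argument list that would open the document."""
    return VIEWERS[document_type(path)] + [path]
```

A front end would pass the returned list to something like
`subprocess.run`. Adding a new format means one more table entry and one
more rule, with no daemon involved.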

> > Working at a firm specialising in "information disclosure" I see too much
> > documentation projects fail because of a lack of analysis and design.
> > Even small ones can suffer from it.
> 
> I'd rather have it Tuesday than have it perfect. I think after we've
> satisfied the basic goal of having some way to read documentation using
> one tool and one overall index we can spend as much time as we wish on
> doing it right.

I can have it ready by Monday morning and perfect :-)

A good design does not imply using more time; on the contrary. It's about
choosing a scalable and flexible solution without mixing too many
implementation details into the design. 


Evert-Jan.

