
Re: Re^2: document registration policy needing to be written

Marco.Budde@hqsys.antar.com (Marco Budde) writes:
> CS> would need the approval of all developers--something which is
> CS> very unlikely to happen since people didn't had a possibility to
> CS> talk about the design of dhelp first.
> ? That's wrong. I've asked several times for suggestions. If the
> people don't send comments, I can't improve the system. And it's no
> problem for me, to change for example the format of .dhelp or rename
> the file to some other name.

Well, I don't like the pseudo-SGML format.  It should be real SGML, or
not; I agree there.

> CS> (For example, I would have voted against the
> CS> SGML-like registry files.)
> Ok, maybe the SGML-like file is not the best idea, but I don't like the
> format of the description field in control. I've suggested several times
> to develop one file for dhelp, dwww & co. But I've never get an answer.

There is an effort to develop one document registration file format.
That effort is called 'doc-base'.  More below.

> CS> isn't even close to be ready!) dhelp's style also has its
> CS> advantages. But in order to get doc-base supported by policy,
> CS> the long way is necessary.
> I don't think so and I don't see any differences in the styles.

Three major differences: 

* doc-base uses /usr/share/doc-base for package placement of
  registration files; dhelp uses /usr/doc/<package>.  I'm not sure
  which technique is better nor why.

* doc-base uses RFC-822 style field format; dhelp uses pseudo-SGML.  I 
  am sure that doc-base's format is superior.

* doc-base tries to be a generic document registration format; dhelp
  tries to just register and organize HTML files.  

> CS> (Again, let me stress that I don't say/think that dhelp's
> CS> procedure isn't good! But we can't follow this procedure with
> CS> doc-base.)
> Once again, I've never said, that we shouldn't change parts of dhelp. In
> fact dhelp was only one idea at the beginning. But in the current version
> dhelp is a well working system and I don't understand, why we need
> doc-base for the HTML files.
> Instead we should develop *one* .dhelp/.dwww-index/doc-base file and one
> <directory> structure. If this is done, I'll change dhelp to support this
> standard.

You seem to be contradicting yourself.  Either one file format and
registration system, or one for HTML and one for everything else.  As
you already know, I'm for one registration system, which allows other
document *presentation* systems (e.g., dwww, dhelp, whatever else) to
hook into it.

The only reasonable complaint you have about doc-base is that
'install-docs' is written in Perl.  Please, this is a minor
implementation issue.  It could be ported to C at any point.  At this
point, since there is no policy, and the ideas are flying around
pretty quickly, the advantages of a quick-n-dirty language such as
Perl outweigh the performance issues.  Remember Don Knuth's edict:
premature optimization is always to be avoided.

> CS> Comments are appreciated!
> We should discuss with all maintainers, if the maintainers should
> ship the converted documents or if the users has to convert them. As
> owner of an old notebook (486SL) I don't like the second solution
> and I don't see any advantage.

Well, you have a point, although I do see the advantage to allowing
users to generate packages at install time, or after install time.
For instance, if I ship PS files, do I ship them A4?  Letter?  4up?
No one PS file will really allow everyone to print the way they want.

I would actually prefer to back-burner the conversion questions until
(a) I've read the 1997 discussions on conversion, and (b) we have a
strong, solid consensus on document registration.

> CS> Yes, these paths look ok.
> No, this will not work.

Why not?  What is the benefit of your system of putting .dhelp files
under /usr/doc/<pkg>?  One of the requirements of dhelp_parse is that
the file should not be modified while it is in the database, or else
it cannot ever be unregistered.  A file kept out of harm's way (in
/usr/share/doc-base, e.g.) is much less likely to be mucked with by a
curious user.

> CS> Note, that with the old /usr/doc/<pkg>/.dhelp files it was
> CS> necessary to register all different docs (doc-ids) in a single
> CS> .dhelp file. If dhelp can parse other file names than .dhelp,
> CS> it's possible to have a ".dhelp" file for each doc--not only
> CS> package.
> That's slow. Please keep speed in mind. Where's the advantage of several
> small files?

I don't see any advantage; for me, optimization will take a back seat
to convenience for the package maintainers.  It will be easiest for
package maintainers to have one file per package (as with the menu
system, which is pretty well supported).

> Ok, I agree that we could rename
>   <filename>
>   <linkname>
>   <dtitle>
> to something like
>   filename:
>   linkname:
>   dtitle:
> But once again, that syntax of description in control is very bad. I've
> got troubles with the control format in some of my packages. This was the
> reason for my pseudo-SGML.

Why bad?  Please give reasons why you think it's bad, Marco.
Why good?  Because:

* already understood by our target users, i.e., pkg maintainers;
  it's identical to the dpkg control file format
* Internet standard (RFC 822).  Any format which is not based on an
  open standard is utterly unacceptable, IMO.
* easily legible and parsable
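
To give a feel for "easily parsable", here is a minimal sketch in
Python.  It is not the install-docs code and not a proposal for the
implementation language; it only assumes one 'name: value' pair per
line.  The field names are the ones Marco suggested above; the values
and the function name parse_fields are made up for illustration.

```python
def parse_fields(text):
    """Parse 'name: value' lines into a dict (field names lowercased)."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("not a field line: %r" % line)
        fields[name.strip().lower()] = value.strip()
    return fields


# Hypothetical record: the values are invented for this example.
sample = """\
filename: index.html
linkname: Example Manual
dtitle: Example package documentation
"""

record = parse_fields(sample)
print(record["linkname"])
```

The same dozen lines of logic could be written in C or Perl just as
well, which is the point: the format does not dictate the parser.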

> CS> No, nsgmls is no option.
> I agree 100%.

Ok, ok ;)

> CS> Remember, that doc-base will run on every system
> And remember that it is called by a lot of packages. That's the reason,
> why a Perl script in no option, too.

Trivial implementation detail, and not even a foregone conclusion in
my book either.  See above.

> CS> far. I've written a general purpose parser of dpkg-style control
> CS> files in Perl--if you want, you could update this part in the
> CS> doc-base source.

> No Perl, please!

Marco, you sound like a skipping record. ;)

> CS>  .
> This should be <p>.

Hmm.  This is one area where we diverge from RFC822.  '<p>' is an
interesting suggestion; however my problem is that since HTML 3.2, <p>
is a container, not just a separator, so it's rather bad usage.
Interesting idea, though.

> ? Must there blanks between the beginning and Line2/3? I thing, you can  
> read my pseudo-SGML easier than this one, if the file is produced by a  
> converter like sgml2dhelp.pl.

Read RFC822.
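
The short version: in RFC 822 a line that begins with a space or tab
continues the previous field, so yes, the leading blank is required.
The lone " ." line standing for an empty line is the dpkg-style
addition discussed above, not part of RFC 822.  A rough sketch of
both rules in Python follows; the field names Title and Abstract and
the function name parse_folded are assumptions for illustration only.

```python
def parse_folded(text):
    """Parse fields whose values may continue on indented lines."""
    fields = {}
    current = None  # name of the field being continued, if any
    for line in text.splitlines():
        if line[:1] in (" ", "\t"):
            # Continuation line: it belongs to the previous field.
            if current is None:
                raise ValueError("continuation line before any field")
            body = line.strip()
            # A lone '.' stands for an empty line (paragraph break).
            fields[current].append("" if body == "." else body)
        elif line.strip():
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError("not a field line: %r" % line)
            current = name.strip().lower()
            fields[current] = [value.strip()]
    return {name: "\n".join(lines) for name, lines in fields.items()}


# Hypothetical record: field names and text are invented.
sample = (
    "Title: Example manual\n"
    "Abstract: Line1 of the description\n"
    " Line2, indented by one blank\n"
    " .\n"
    " Line3 starts a new paragraph\n"
)

doc = parse_folded(sample)
print(doc["abstract"])
```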

> But if all others like this format, I could use it, that is not the  
> problem.

You should.  dhelp only consults its registration file for the
purposes of adding it to its proprietary (and fast) indexing scheme,
in /var/lib/dhelp/dbase.  If we could modify dhelp to read
/usr/share/doc-base/<file> directly, that would be *far* better.

> CS>   - doesn't need additional software other than perl-base (included in
> CS>     the base system)
> I will *not* support this format, if I need perl. It's no problem to
> write a parser in C, in fact all programs can use the same code for
> the parser.

What does a format have to do with a particular parser's
implementation?  I don't understand why any "format" would require

To summarize:

* Debian should have a single document registration system which is
  equipped to be used for system-wide document registration to enable
  display systems to give structured display and searching for these

* The document registration system should not be involved in any
  issues of document display.  That's why dhelp is not in a position
  to take over the function of doc-base, dhelp2dwww notwithstanding.

* It would be *extremely nice* if dhelp could support doc-base files
  directly rather than having us create a separate file (with no
  additional information) for dhelp.

* Document registration files should not be placed under areas of
  package control, i.e., /usr/doc/<pkg>, but rather in a shared
  directory especially for this purpose.

Marco, please address these larger issues, please argue for your

.....A. P. Harris...apharris@onShore.com...<URL:http://www.onShore.com/>
