Re: document registration policy needing to be written
Christian Schwarz <firstname.lastname@example.org> writes:
> On 10 Apr 1998, Adam P. Harris wrote:
> Thanks for the excellent summary!! This is exactly what we need now. It
> would be great if you could update the summary as the discussion evolves.
Yes, I was planning on putting this into the doc-base SGML file.
Actually I think I shall make it separate, keeping in mind that we
might want to specify it (slink) as a debian documentation
"subpolicy". If we can beef up my statement here, we should have
something we can bring to <debian-devel> or <debian-policy>, and see
where it goes ;). Well, some day... we still have a long way to go.
Notwithstanding that, and my ultimate goal of hashing out a debian
documentation policy for the 21st century, I'm also extremely
concerned with stabilizing and providing a decent (it might be too
late for good) basic infrastructure for hamm (no new features, just
stabilize current practice).
Christian, it seems that we agree on a lot, so I'm only going to
quote/respond to issues where I have more questions or reservations.
> I disagree. I still think that registering documents to install-docs does
> make sense, even if dwww and dhelp share a common format:
The first thing I want to state is that I fully agree we should have a
small, thin, very simple "put my document into the registry" package.
I feel this package should grow out of doc-base, and that it should
*not* be coupled with any particular presentation or conversion system
(dwww, dhelp, magic-doc-convert). OTOH, I'm also trying to build
bridges to Marco Budde, to bring him on board. I feel a little
tension between dhelp and doc-base, as if he doesn't feel doc-base has
any right to be. ;)
> * Even if you say dhelp/dwww will handle only HTML while doc-base will
> handle all other formats, doc-base is required: there are currently
> 3 different HTML formats that have been requested by the users during
> the last doc policy discussion:
Ah, I need to read this discussion. Was it on <debian-doc> or
> Of course, if you think install-docs would get too large if it does the
> registry and format conversion, you could split the script into a
> package-registry frontend and a conversion backend.
Yes, I think registry should be separate, but I'm pretty flexible.
Until I get a more concrete idea of what the conversion infrastructure
will look like, it's too early to decide. Our architecting here
should not rule out pinching off the registration system.
> Are there any specific reasons you don't want to have the registry code in
> 1. Marco indicated in a private mail discussion with myself yesterday,
> that dhelp does have the functionality to search for .dhelp files in an
> arbitrary directory. With that, it might be possible for doc-base to put
> all .dhelp files in /var/lib/doc-base/dhelp/<doc-id> and run dhelp's
> registry program over these files. (Having to touch /usr/doc/<package> is
> less optimal since if doc-base has a bug, people might end up with an
> installation where a lot of `unknown' files are lying around where neither
> doc-base nor dpkg knows about them.)
This is excellent! I was just thinking about how I want to have
the '.dhelp' files that doc-base installs placed somewhere other than
> 2. When talking about filesystem structure, I'd suggest we check out the
> new paths that will be required with FHS. (Debian will switch to FHS
> soon.) Moving `registry' directories like /var/lib/dpkg is nearly
> impossible (we'll not move this directory, for example) but this would be
> required if we aimed 100% compliance with FHS (we'll not do). Therefore,
> I'd suggest we use FHS paths right from the start.
Agreed. I think I'll move stuff now out of /usr/doc/<pkg>/.dhelp for
doc-base installed packages. I guess I wish I could just put it into
/var/state/doc-base/dhelp-gen/<docid> rather than
/var/state/doc-base/dhelp-gen/<docid>/.dhelp, but oh well.
[Technical diversion: doc-base has a listing of, for each docid,
whether we were registered to dhelp or dwww. Given that, I should be
able to safely unlink the old /usr/doc/<pkg>/.dhelp files that
doc-base created and move them over. A pretty crucial bug in doc-base
right now (only last HTML file in control file is registered) is
another good reason to reconstruct our .dhelp files anyway. Actually
I think I'm going to add a flag to install-docs to refresh/reinstall
all installed document ids. Comments requested.]
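A rough sketch of the cleanup step described in the diversion above, in Python purely for illustration. The shape of the listing (docid mapped to a package name plus the set of systems it was registered to) and the function name are assumptions of mine, not doc-base's actual data structures. It only unlinks files for docids that doc-base itself registered to dhelp, and returns the docids whose .dhelp files then need to be regenerated in the new location.

```python
import os


def remove_old_dhelp_files(registered, old_root):
    """Unlink old <old_root>/<pkg>/.dhelp files that doc-base created.

    `registered` is a hypothetical mapping:
        docid -> (package name, set of systems the docid was registered to)
    Only docids registered to dhelp are touched, so .dhelp files that
    doc-base did not create are left alone.
    Returns the sorted list of docids whose old file was removed.
    """
    removed = []
    for docid, (pkg, systems) in sorted(registered.items()):
        if "dhelp" not in systems:
            continue
        old_file = os.path.join(old_root, pkg, ".dhelp")
        if os.path.isfile(old_file):
            os.unlink(old_file)
            removed.append(docid)
    return removed
```

The returned list is what a refresh/reinstall flag would then walk to rebuild each document's .dhelp file.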
> 3. We'd also have to watch out which files we put below /usr and which
> below /var. As a rule of thumb, everything which is modified at installation
> time _only_, can go into /usr. At run-time dynamically generated and
> modified files must go into /var. Putting any dynamically generated files
> into /var is also a good option since it would simplify the `purge'
> process of doc-base. (That's important, see also #1 above.)
Yes, that would be /var/state/doc-base or some such, from my reading
> > * file format should be standardized, we should whip up a DTD and
> > make it true SGML; this will assist in format validation and
> > standardize file parsing
> Oh, do you want to change doc-base's registry file format into a SGML
Yes, for slink, not for hamm.
> I wouldn't like that for the following reasons:
> * The format has to be supported by the package maintainers (only), so
> we should try to make life as easy as possible for them. The `dpkg style'
> control file syntax doc-base uses until now should be known to any
> developer already.
I'm not ruling out backwards compatibility. As for the dpkg control
file, I'm lobbying (perhaps unwisely) to get that put into SGML also!
> * AFAICS, we don't need SGML `functionality' in the registry files.
Why not? Wouldn't it be nice to be able to use 'nsgmls' to validate
our control file at package build time, to make sure we're ok? I
could also see it being nice to have a script that automatically transforms
document control files into valid HTML or some printed report.
> * Parsing SGML files is a lot more work and would require more CPU time
> at installation time, than to parse the simple dpkg control files.
Yes, this is the crux, for me. I actually don't rule out using either
(a) a simple perl module wrapping around nsgmls in conjunction with a
DTD, or (b) writing a perl module to parse SGML down to simple data
structures (list of hashes comes to mind) on its own in such a way
that it has 98% SGML (or XML) coverage. There may (should!) be a std
CPAN Perl module for this, but I haven't found it yet.
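To make option (b) a bit more concrete, here is a minimal sketch of the "list of hashes" idea. It is in Python rather than Perl purely for illustration, and it leans on the standard xml.etree module, so it only handles well-formed XML (not full SGML, and no DTD validation). The element and attribute names (docbase, document, title, format, type, language, index) and the sample paths are made up for this example, not a proposed format.

```python
import xml.etree.ElementTree as ET

# Hypothetical control file; none of these names are an agreed format.
SAMPLE = """\
<docbase>
  <!-- comments are legal here -->
  <document id="example-manual">
    <title>Example Manual</title>
    <format type="html" language="de" index="/usr/doc/example/index.html"/>
    <format type="text" language="en" index="/usr/doc/example/manual.txt"/>
  </document>
</docbase>
"""


def parse_registry(text):
    """Reduce the markup to a list of plain dicts, one per document."""
    root = ET.fromstring(text)
    docs = []
    for doc in root.findall("document"):
        docs.append({
            "id": doc.get("id"),
            "title": doc.findtext("title", default=""),
            # Each <format> element becomes a dict of its attributes.
            "formats": [dict(fmt.attrib) for fmt in doc.findall("format")],
        })
    return docs
```

The point is that the rest of the registration code would only ever see the plain dicts, so fields and attributes could be added without touching the parsing layer.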
Benefits to SGML for control files you may not have thought of:
* decouple parsing system from the particular format. I.e., we can
add fields without having to also mess with the parsing engine.
* allow features not allowed in control file type format, i.e.,
comments, multidimensional fields (attributes, i.e.,
'language="de"'), looping, cross-referencing within the control
I'm pretty flexible, however. Our current (RFC 822-style) parsing is
fine; if we stick with it, we oughta beef it up and make sure
continuation lines are accepted everywhere, document a comment field,
etc. I haven't dug into that side too deeply; I know dpkg has some
problems with continuation lines in some fields.
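For comparison, a beefed-up RFC 822-style parser need not be large. The sketch below (Python, illustration only) accepts continuation lines after any field and skips '#' comment lines. The '#' convention and the lower-casing of field names are assumptions of mine, not current doc-base or dpkg behaviour.

```python
def parse_control(text):
    """Parse 'Field: value' stanzas separated by blank lines.

    A line starting with a space or tab continues the previous field,
    whichever field that is. A line starting with '#' is skipped
    (an assumed comment convention). Returns a list of dicts, one per
    stanza, keyed by lower-cased field name.
    """
    stanzas, current, last = [], {}, None
    for raw in text.splitlines():
        if raw.startswith("#"):
            continue
        if not raw.strip():
            # A blank line ends the current stanza.
            if current:
                stanzas.append(current)
            current, last = {}, None
            continue
        if raw[0] in " \t":
            if last is None:
                raise ValueError("continuation line without a field: %r" % raw)
            current[last] += "\n" + raw.strip()
            continue
        name, sep, value = raw.partition(":")
        if not sep:
            raise ValueError("expected 'Field: value', got %r" % raw)
        last = name.strip().lower()
        current[last] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas
```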
My guess is that when we start to reach consensus on the featureset
for slink's document mgmt system, the right choice will be obvious.
> * Just using an SGML-like file syntax but being picky about where line breaks
> and spaces may appear (that's how dhelp behaves now) is even worse,
> since the SGML-like syntax makes the maintainers think doc-base doesn't
> care about spaces--a mistake. (This happened to me with dhelp.)
> > * adopt the menu hierarchy as a standard documentation hierarchy (de
> > facto; make it official)
> > TODO if so:
> > * beef up a little, cf my Bug#20936
> > * consider how this hierarchy might integrate or not with language
> > specifiers. dhelp uses
> Yes, that's an important point. Note, that people suggested before that we
> use the same menu structure which is also used for the `application
> menus'. However, I don't think this structure works well with
There are a number of divergences. I would like to get Joost involved
here so we can resolve this issue for the hamm release if possible.
FYI, I'm planning on adding some basic document hierarchy validation
to doc-base (will only emit warnings for unknown/unregistered
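The kind of warn-only check I have in mind could look roughly like this (a Python sketch for illustration; the section names in KNOWN_SECTIONS and the function name are placeholders, not the real menu hierarchy or doc-base code):

```python
# Placeholder section names; the real list would come from the menu hierarchy.
KNOWN_SECTIONS = {"Apps", "Apps/Editors", "Games", "System"}


def section_warnings(documents, known=KNOWN_SECTIONS):
    """Return warning strings for documents whose section is not known.

    `documents` is a list of (docid, section) pairs. Nothing is rejected:
    the caller just prints the warnings and carries on registering.
    """
    messages = []
    for docid, section in documents:
        if section not in known:
            messages.append(
                "warning: %s: unknown section '%s'" % (docid, section)
            )
    return messages
```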
.....A. P. Harris...apharris@onShore.com...<URL:http://www.onShore.com/>