
d-i manual: xml status report



Hi.

As the release of sarge is nearing, I wrote something like a status
report about its install manual at
http://www.debian.cz/~kurem/status.html
[Thanks to Denis Barbier for reading the first draft]

I'd like your comments on the technical details (for example: which
processing tools are available on Debian machines? How non-free is fop?)

Here is the relevant part from the URL above:

Technical side of the beast

   As you may have noticed, the new install manual for sarge is
   written in the XML version of DocBook, which is preferred to the
   old SGML DocBook. This switch had to be made at some point,
   because SGML DocBook will no longer be supported by its authors
   in the future. The switch involves several things:

    1. We have to use proper xml tags instead of wild sgml
       shortcuts.
    2. We have no more marked sections.
    3. We have to choose some new build system.

  1. Proper xml tags
  ==================

   Converting parts of the old b-f manual to the new one was done
   partly semi-automatically and partly with a lot of handwork. I
   can say that the parts of the d-i manual written so far are valid
   XML DocBook. (At least xsltproc doesn't complain and I can build
   HTML and PDF out of it; more about that later.) For newcomers,
   there is a nice introduction in
   debian-installer/doc/manual/cheatsheet.xml.

  2. No more marked sections
  ==========================

   Marked sections allowed us to branch the actual text depending
   on the architecture we are building for, as well as on other
   conditions, like:

 Use <![ %s390; [ tapes ]]>
 <![ %supports-floppy-boot; [ floppies ]]>
 to boot the system.

   These marked sections can be rewritten using the profiling
   feature of XML DocBook:

 Use <phrase arch="s390">tapes</phrase>
 <phrase condition="supports-floppy-boot">floppies</phrase>
 to boot the system.
 
   This has also been done for the parts written so far, and it
   works fine as far as I can tell.
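
   To illustrate what the profiling step does with the example
   above, here is a minimal Python sketch. It is not how the DocBook
   XSL stylesheets implement profiling; it only mimics the effect on
   the arch and condition attributes. The function names, and the
   assumption that several values in one attribute are separated by
   ";", are mine.

```python
import xml.etree.ElementTree as ET

# The profiled example from the text, wrapped in a <para> for parsing.
EXAMPLE = (
    '<para>Use <phrase arch="s390">tapes</phrase>\n'
    '<phrase condition="supports-floppy-boot">floppies</phrase>\n'
    'to boot the system.</para>'
)


def _matches(elem, attr, active):
    """True if elem lacks the attribute or names at least one active value."""
    value = elem.get(attr)
    if value is None:
        return True
    # Assumption: multiple values are separated by ";".
    return any(v in active for v in value.split(";"))


def profile(elem, arch, conditions=()):
    """Remove, in place, every descendant that does not apply to this build."""
    prev = None
    for child in list(elem):
        if _matches(child, "arch", {arch}) and _matches(
            child, "condition", set(conditions)
        ):
            profile(child, arch, conditions)
            prev = child
        else:
            # Keep the text that followed the dropped element.
            tail = child.tail or ""
            if prev is None:
                elem.text = (elem.text or "") + tail
            else:
                prev.tail = (prev.tail or "") + tail
            elem.remove(child)
    return elem


def render(source, arch, conditions=()):
    """Profile an XML string and return its text with whitespace collapsed."""
    root = profile(ET.fromstring(source), arch, conditions)
    return " ".join("".join(root.itertext()).split())


print(render(EXAMPLE, "s390"))
print(render(EXAMPLE, "i386", ["supports-floppy-boot"]))
```

   The first call keeps only the s390 phrase, the second only the
   floppy phrase; elements without either attribute are always kept.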

   The other thing is that marked sections are used not only in
   the text, but also in the "metadata" definitions (the infamous
   *.ent files), which are much trickier to get right. I rewrote
   these definitions so that they work, but the source looks quite
   ugly in places. This also means that there is no place for
   language-specific entities in the *.ent files. These entities can
   be (re)defined at the top of the install.XX.xml file, if the
   translators so desire.

   And the last issue with profiling is the fact that some entities
   can't be profiled, because they end up inside an XML attribute
   such as url:

 <ulink url="some-url/&architecture;">

   which would expand to something forbidden like

 <ulink url="some-url/<phrase arch='i386'>i386</phrase>
                      <phrase arch='m68k'>m68k</phrase>">

   After some grepping through the sources, it seems that the only
   non-profilable entities are &architecture;, &langext; and
   &downloadable-file;. (I didn't mention &releasename;, but it is
   a non-issue, because it always holds a single value common to
   all arches; I suppose everybody is anxiously awaiting sarge,
   right?) All non-profilable entities have to be handled by the
   build system, for example by shuffling symlinks or by rewriting
   the content of some dynamic file before each arch build (as done
   in my proof of concept build system).
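
   The "rewrite a dynamic file before each arch build" approach
   could look roughly like the Python sketch below. This is an
   illustration only, not the code from my build scripts: the file
   name dynamic.ent, the function names and the sample value given
   to &langext; are hypothetical.

```python
from pathlib import Path


def entity_declarations(values):
    """Return general entity declarations, one per line, sorted by name."""
    lines = []
    for name, value in sorted(values.items()):
        # Refuse characters that would need escaping inside the literal.
        if any(ch in value for ch in '"&%<'):
            raise ValueError(f"unsafe value for entity {name}: {value!r}")
        lines.append(f'<!ENTITY {name} "{value}">')
    return "\n".join(lines) + "\n"


def write_dynamic_entities(path, arch, lang):
    """Rewrite the per-build entity file (hypothetical name) for one build."""
    text = entity_declarations({"architecture": arch, "langext": lang})
    Path(path).write_text(text, encoding="utf-8")
    return text


# Declarations that would be written before an m68k build of the
# Czech manual; "cs" is only a sample value for &langext;.
print(entity_declarations({"architecture": "m68k", "langext": "cs"}), end="")
```

   Because the values are fixed before the XML is parsed, they can
   be used safely inside attributes such as url.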

  3. Build system
  ===============

   Because of point 1 we have to use another set of tools to get
   .html, .txt, or .pdf output. Because of point 2 we need another
   way to pass profiling parameters to the processing tools.

   Let me start with the latter. Because we've lost marked sections
   and conditional branching (<!ENTITY % blah "IGNORE"> or
   <!ENTITY % blah "INCLUDE">), we need to organize the *.ent files
   in a slightly different way and push some information out of
   these files into the build script. (See my proof of concept
   below.)

   Back to the former item: the new toolchain. There are basically
   three options:

    1. Use XSL stylesheets and an XSLT processor (xsltproc, saxon)
       to get nice .html and .fo (Formatting Objects). FO can be
       transformed to various formats (.txt, .ps, .pdf) with an FO
       processor (fop, xmltex/passivetex).
    2. Use DSSSL stylesheets and (open)jade to get .html and .rtf.
       If we throw jadetex in, we can also get .pdf and .ps.
    3. Use something like docbook2latex, but I don't consider this
       a viable alternative.

   In general, the XSL way is more modern and is The Way To Go(TM);
   the glory of DSSSL is fading. On the other hand, saxon and fop
   are Java programs, so they depend on "non-free" Java (I don't
   know whether they work with gcj or kaffe). Xsltproc does its work
   marvelously, but the version in stable dies hard on profiling
   (you have to use the testing/unstable version). I also recommend
   using the XSL stylesheets from testing/unstable. I've heard some
   bad things about passivetex, but I can't confirm them myself,
   because it is not in stable and installation of the unstable
   version fails partway through. Fop has some quirks with accented
   characters, and its line layout is far from the TeX output we
   are used to.

  Conclusion for technical side:
  ==============================

   I wrote a proof of concept build system, which I use to verify
   my work on the document's XML structure. You can download it from
   http://www.debian.cz/~kurem/build.tar.gz.

   It consists of *.ent files updated according to point 2 and the
   file build.sh, which calls the script buildone.sh for each
   language and architecture to build. (Some management code could
   go here, like moving the just-built doc to some safer
   location...) Buildone.sh sets up profiling and calls the right
   tools (change the various variables inside to suit your needs).
   There are also three style-*.xsl files, which can be used to
   customize the output. Just grab the current
   debian-installer/doc/manual/en, drop it into the unpacked
   directory and run e.g. ./buildone.sh powerpc en.

   The toolchain used consists of xsltproc (.html and .fo) and fop
   (.pdf).
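
   As a rough picture of what one arch/language build boils down
   to with this toolchain, the Python sketch below only assembles
   the command lines; it does not run them and it is not the content
   of buildone.sh. The output directory, the file naming and the
   stylesheet names in the example call are hypothetical, and
   profile.arch only has an effect if the stylesheet uses the
   profiling support of the DocBook XSL stylesheets.

```python
def build_commands(arch, lang, source, html_style, fo_style, outdir="build"):
    """Return the argv lists for one build: HTML, FO, then PDF from the FO."""
    base = f"{outdir}/install.{lang}.{arch}"  # hypothetical naming scheme
    # Pass the target architecture to the stylesheet as a string parameter.
    profile = ["--stringparam", "profile.arch", arch]
    return [
        ["xsltproc", *profile, "-o", f"{base}.html", html_style, source],
        ["xsltproc", *profile, "-o", f"{base}.fo", fo_style, source],
        ["fop", "-fo", f"{base}.fo", "-pdf", f"{base}.pdf"],
    ]


# Example: a powerpc build of the English manual (paths and stylesheet
# names are placeholders, not the names used in the tarball).
for cmd in build_commands(
    "powerpc", "en", "en/install.en.xml", "style-html.xsl", "style-fo.xsl"
):
    print(" ".join(cmd))
```

   Each list could then be handed to subprocess.run(cmd, check=True)
   so that a failing step stops the build.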

   I'm not a DD, so I'd like to hear your opinion about this. I do
   understand we will need some DD to write a nice build system,
   which will be flexible and much more dynamic, so please judge the
   overall idea rather than my coding style.


-- 
Miroslav Kure


