
LSB PROPOSAL: SGML & XML written specification addendum




Enclosed is a proposal, submitted by Eric Bischoff, for the LSB regarding
SGML & XML. A more general proposal has been submitted to the Filesystem
Hierarchy Specification workgroup for adoption. It is proposed that the
enclosed detailed draft be adopted as an addendum to the LSB written
specification. A new Sourceforge CVS module would be created so that this
document is initially maintained separately from the ongoing API/ABI
written specification.

Introduction
------------

In a normalisation effort, about thirty people, including packagers
of some Linux distributions and developers of SGML-related tools such
as the SGML-Tools and DocBook-tools projects, discussed informally and
agreed on a series of recommendations that will be submitted as a draft
to the Linux Standard Base project. A reference implementation will also
be produced as part of the DocBook-tools project.

The drafting of this document started as an attempt to end the nightmare
of DocBook distributions, but it quickly turned out to be generic enough
to apply to any SGML or XML DTD. The reasons for all our choices are
explained in a separate document.

Following a list of definitions, you will find a set of recommendations:
R001 - SGML Directory layout
R002 - DocBook Directory layout
     (standard names for directories, their contents)
R003 - Open Catalogs usage for SGML
R004 - Open Catalogs usage for DocBook
     (for the centralized catalogs and for the individual catalogs)
R005 - Configuration files
     (other /etc/sgml files)
R006 - ISO-entities
     (file names and FPI declarations)
R007 - Packages
     (how to package this type of material)

We'd like to thank the following people who have participated intensively
in this normalisation effort:
     Camille Begnis (MandrakeSoft) <camille@mandrakesoft.com>
     Eric Bischoff (Caldera, KDE) <eric@caldera.de>
     Karl Eichwalder (SuSE) <ke@suse.de>
     Mark Galassi (DocBook-tools) <rosalia@lanl.gov>
     Jorge Godoy (Conectiva) <godoy@conectiva.com.br>
     Cees de Groot (SGML-tools) <cg@cdegroot.com>
     Jochem Huhmann <joh@revier.com>
     David Mason (RedHat, Gnome) <dcm@redhat.com>
     Manoj Srivastava (Debian) <srivasta@datasync.com>
     Norman Walsh (Sun, OASIS) <ndw@nwalsh.com>
and the many other people who helped with their own contributions.


Definitions
-----------

In the scope of this document, we will use the following terms:

SGML application:
     Any program used to view, edit, convert, process or apply any
     kind of treatment to a document written using an SGML or XML DTD
     (Document Type Definition). This includes command-line utilities
     as well as GUI-based applications.

SGML converter:
     An SGML application, or a part of a bigger SGML application,
     used to convert from a given SGML-based input format to a given
     output format.

frontend:
     The part of an SGML converter used to analyse the input format.

backend:
     The part of an SGML converter used to generate the output format.

helper:
     A stand-alone application used by an SGML converter to accomplish
     the conversion itself.

Style sheets:
     Declarations or scripts that define formatting during the
     conversion process. They can be written in any style sheet
     language: DSSSL, FOSIs, XSL, ...

Open Catalog:
     A set of directives defined by OASIS, mostly used for defining
     equivalences between FPIs (Formal Public Identifiers) and real
     file names (see TR9401:1997 on http://www.oasis-open.org).

Centralized catalog:
     An Open Catalog that includes only comments and CATALOG
     directives pointing to other catalogs (or DELEGATE directives
     if supported).

Super catalog:
     An Open Catalog pointing to all the centralized catalogs.

Package:
     A set of files assembled together for distribution. It includes
     RPMs, DEBs and any other kind of packaging system.


R001 - SGML Directory layout
----------------------------

/etc/sgml/
     Configuration files, including centralized catalogs.

     It includes:
     *.conf: generic configuration files
     sgml-docbook.cat, tei.cat, ...: DTD-specific centralized catalogs
     catalog: the super catalog
     ...

/usr/share/sgml/
     Architecture-independent files used by SGML applications: Open
     Catalogs (not the centralized ones), DTDs, entities, style sheets,
     and other declarative files, if any.

     It is organized into DTD-specific subdirectories:
     docbook/
     tei/
     html/
     ...

At least for the present, all XML documents are also SGML
documents, so it seems unnecessary to create /usr/share/xml and /etc/xml.


R002 - DocBook Directory layout
-------------------------------

This is the layout for a Jade-based or an Openjade-based system. DocBook
applications based on other parsers, or even any other SGML application,
can be based on this layout as well.

In /usr/share/sgml, the upper level directories identify the DTD that
is concerned. Things that are not DTD-specific go directly into
/usr/share/sgml under their own directory.

The lower level directories are package-related. They are
also version-numbered.

/usr/share/sgml/
     sgml-iso-entities-8879.1986/
     xml-iso-entities-8879.1986/
          (the ISO entities)
     jade-1.2.1/
     openjade-1.3/
     ...
          (architecture-independent files of the parsers
           and DSSSL engines)
     ...

/usr/share/sgml/docbook/
     sgml-dtd-3.1/
     sgml-dtd-4.0/
     xml-dtd-4.0/
          (the DocBook DTD)
     dsssl-stylesheets-1.54/
     xsl-stylesheets-1.12/
          (DSSSL and XSL style sheets for DocBook)
     kde-customization-0.1/
     gnome-customization-0.1/
     ldp-customization-0.1/
          (customized DTDs, entities and style sheets for
           the various projects)
     ...

(version number examples are arbitrary in this list)


R003 - Open Catalog usage for SGML
----------------------------------

Open Catalog files include:
- the individual catalogs provided with the DTDs, style sheets or entities
- the centralized catalogs used as the central source of information
  that is specific to docbook, tei, or any other DTD
- the super catalog that indirectly references all the available
  catalog files

The centralized catalog file names must end in .cat and reside in
/etc/sgml.  They contain only comments and CATALOG directives pointing
to the "real" catalogs, like:

     -- sample contents of /etc/sgml/foo-1.05.cat --
     CATALOG /usr/share/sgml/foo/xml-dtd-1.05/catalog
     CATALOG /usr/share/sgml/foo/xsl-stylesheets-0.1/catalog

One can use DELEGATE instead of CATALOG if this directive is known to
be supported.

The centralized catalogs are DTD-specific and can be version-numbered.

Here are examples of such centralized catalogs:
/etc/sgml/
     sgml-docbook.cat
     sgml-docbook-3.1.cat
     sgml-docbook-4.0.cat
     xml-docbook-4.0.cat

Version-less centralized catalogs may simply be symbolic links to the
latest version (or to any older version).
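As a minimal sketch of the symbolic-link approach, the following Python
function points a version-less centralized catalog at a versioned one. The
function name and the choice of a relative link target are assumptions made
for illustration; this proposal does not prescribe any tool for the task.

```python
import os


def link_versionless(etc_sgml, versioned_name, versionless_name):
    """Make <etc_sgml>/<versionless_name> a symlink to <versioned_name>.

    Hypothetical helper: the proposal only says that a version-less
    catalog may be a symbolic link, not how the link is created.
    """
    link_path = os.path.join(etc_sgml, versionless_name)
    # Replace an existing link so the catalog can be re-pointed later.
    if os.path.islink(link_path):
        os.remove(link_path)
    # A relative target keeps the link valid if the directory is relocated.
    os.symlink(versioned_name, link_path)
    return link_path
```

For instance, link_versionless("/etc/sgml", "sgml-docbook-4.0.cat",
"sgml-docbook.cat") would make sgml-docbook.cat follow the 4.0 catalog.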

/etc/sgml/catalog is the "super catalog". It contains CATALOG pointers
to all the centralized catalogs:

     -- sample contents of /etc/sgml/catalog --
     CATALOG /etc/sgml/sgml-docbook.cat
     CATALOG /etc/sgml/xhtml.cat
     CATALOG /etc/sgml/mathml.cat

One can use DELEGATE instead of CATALOG if this directive is known to
be supported.

The super catalog should not point to centralized catalogs that are merely
symbolic links, since their targets are already listed.
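To illustrate how the super catalog leads to the real catalogs, here is a
rough Python sketch that follows CATALOG directives recursively. It is not
a complete Open Catalog parser: it assumes one directive per line, strips
only simple "-- ... --" comments, ignores DELEGATE and every other
directive, and resolves relative paths against the directory of the
referencing catalog. The function names are hypothetical.

```python
import os
import re

COMMENT = re.compile(r"--.*?--", re.DOTALL)
DIRECTIVE = re.compile(r'^\s*CATALOG\s+(?:"([^"]*)"|(\S+))', re.MULTILINE)


def catalog_targets(path):
    """Return the paths named by the CATALOG directives in one file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        text = COMMENT.sub("", handle.read())
    base = os.path.dirname(path)
    # os.path.join leaves absolute targets untouched.
    return [os.path.join(base, quoted or bare)
            for quoted, bare in DIRECTIVE.findall(text)]


def walk_catalogs(path, seen=None):
    """List every catalog file reachable from `path`, each one once."""
    seen = set() if seen is None else seen
    # Resolving symlinks means a version-less link and its target
    # count as the same catalog.
    real = os.path.realpath(path)
    if real in seen or not os.path.isfile(real):
        return []
    seen.add(real)
    found = [real]
    for target in catalog_targets(real):
        found.extend(walk_catalogs(target, seen))
    return found
```

Calling walk_catalogs("/etc/sgml/catalog") on a system laid out as above
would return the super catalog, then each centralized catalog followed by
the ordinary catalogs it references.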

The users should be able to define their own centralized catalogs and
their own super catalog in their home directories:
     $HOME/.sgml-docbook.cat
     $HOME/.catalog
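A possible lookup order for the super catalog can be sketched in Python as
follows. The precedence (user catalog first, system catalog second) and the
function name are assumptions; this proposal only states that users should
be able to define their own catalogs.

```python
import os


def super_catalog_candidates(home=None, system="/etc/sgml/catalog"):
    """Return the existing super catalogs, user-level one first.

    Assumed precedence: $HOME/.catalog is consulted before the
    system-wide super catalog.
    """
    if home is None:
        home = os.path.expanduser("~")
    candidates = [os.path.join(home, ".catalog"), system]
    return [path for path in candidates if os.path.isfile(path)]
```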

SGML applications are not required to use centralized catalogs,
although their use is strongly encouraged: if other mechanisms allow
one to locate the real catalogs, they can be used as well. However,
distribution packagers should always take care to feed the right
entries into the super catalog and the centralized catalogs. The interface
of a script named "install-catalog" that performs these maintenance tasks
is as follows:

        install-catalog --add|--remove <centralized_catalog> <ordinary_catalog>

Example:

        install-catalog --add \
          /etc/sgml/sgml-docbook-3.1.cat \
          /usr/share/sgml/docbook/dsssl-stylesheets-1.54/catalog

The other catalogs should be placed in subdirectories of /usr/share/sgml.
They should all be named "catalog". They are the ones that do the real
work of mapping the FPIs to file names (among other tasks).
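The FPI-to-file mapping done by an ordinary catalog can be illustrated with
a small Python sketch that reads PUBLIC entries. It is deliberately
simplified: it assumes both the identifier and the file name are in double
quotes, strips only simple "-- ... --" comments, and ignores every other
directive. The function name is hypothetical, and the sample entry merely
combines the ISO identifier and file name used as examples in this
proposal.

```python
import posixpath
import re

COMMENT = re.compile(r"--.*?--", re.DOTALL)
PUBLIC = re.compile(r'^\s*PUBLIC\s+"([^"]*)"\s+"([^"]*)"', re.MULTILINE)


def public_map(catalog_text, base_dir):
    """Map each FPI in PUBLIC entries to a path under base_dir."""
    text = COMMENT.sub("", catalog_text)
    mapping = {}
    for fpi, filename in PUBLIC.findall(text):
        # Collapse runs of whitespace so lookups do not depend on spacing.
        key = " ".join(fpi.split())
        # The first entry for a given FPI wins in this sketch.
        mapping.setdefault(key, posixpath.join(base_dir, filename))
    return mapping


sample = '''-- ISO entities --
PUBLIC "ISO 8879:1986//ENTITIES Added Math Symbols: Arrow Relations//EN" "ISOamsa.ent"
'''
entities = public_map(sample, "/usr/share/sgml/sgml-iso-entities-8879.1986")
```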


R004 - Open Catalog usage for DocBook
-------------------------------------

This recommendation is merely a consequence of the preceding
recommendations.

For a Jade- or Openjade-based distribution of DocBook, we suggest the
following names. Again, other SGML or XML DTDs can be based on this
structure.

/etc/sgml/
     sgml-docbook.cat
     xml-docbook.cat
     sgml-docbook-3.0.cat
     sgml-docbook-3.1.cat
     sgml-docbook-4.0.cat
     xml-docbook-4.0.cat

/usr/share/sgml/sgml-iso-entities-8879.1986/catalog
/usr/share/sgml/xml-iso-entities-8879.1986/catalog

/usr/share/sgml/jade-1.2.1/catalog

/usr/share/sgml/openjade-1.0/catalog

/usr/share/sgml/docbook/sgml-dtd-3.0/catalog
/usr/share/sgml/docbook/sgml-dtd-3.1/catalog
/usr/share/sgml/docbook/sgml-dtd-4.0/catalog
/usr/share/sgml/docbook/xml-dtd-4.0/catalog

/usr/share/sgml/docbook/dsssl-stylesheets-1.54/catalog
/usr/share/sgml/docbook/xsl-stylesheets-1.12/catalog


R005 - Configuration files
--------------------------

Other configuration files may also reside in /etc/sgml, either
DTD-specific or application-specific. Their names should end in ".conf" and
they should follow the ordinary rules for files residing in /etc as defined
by the LSB. Users should be able to redefine them in their home directories.

Their syntax and purpose are not defined in this document.


R006 - ISO-entities
-------------------

The file names should be fixed to:
     ISOamsa.ent ISOamsb.ent ...

The identifiers should be fixed to:
     "ISO 8879:1986//ENTITIES Added Math Symbols: Arrow Relations//EN"

During the transition period, symbolic links and duplicate declarations will
be allowed as a means of preserving compatibility with previous naming
schemes.


R007 - Packages
---------------

C programs can be compiled with any version of a given compiler. SGML
documents, by contrast, cannot use just any version of a given DTD: they
need the corresponding DTD to reside on the same system, or at least to be
reachable. Each version of a given DTD may in turn require certain
versions of the style sheets.

This leads to an unusual situation where old DTDs and style sheets
should not be replaced during a package update.

We would like to make distribution packagers aware of the ways to achieve
this. They may choose to:
 - put the version number in the package name field
   (example: docbook-dtd-3.1-1.0.rpm)
 - leave the version number out of the package name and use a subpackage
   for each version

George Kraft IV
gk4@us.ibm.com



