
ASCII version of docreg spec



This is not a new version, just an ASCII version of this file for
people to look at.

Please, I will countenance no discussions about the docreg format
unless they are based on a thorough reading of (or at least an attempt
to read) this spec.

----------------------SNIP

                  Debian docreg File Format Specification
                  ---------------------------------------
                      Adam P. Harris <aph@debian.org>

0.1 Abstract
------------

     This document contains a specification of the Debian docreg File
     Format. A docreg file is used to register a particular piece of
     documentation on a particular system. This specification is meant to
     explain the syntax and semantics of docreg files, that is, their format
     and the meaning of their fields. 

0.2 Contents
------------

     1.        Function of docreg Files
     1.1.      Rationale 
     1.2.      Goals

     2.        The docreg File
     2.1.      Relationship Between docreg File and Document Metadata
     2.2.      docreg File Location
     2.3.      Brief Comment on the Document Store

     3.        docreg File Format
     3.1.      Augmented BNF Description
     3.2.      docreg Field Semantics

     4.        Examples of docreg files

     5.        Contributing to This Specification

0.3 Copyright Notice
--------------------

     Copyright ©1998 Adam P. Harris. 

     This specification is free software; you may redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2, or (at
     your option) any later version. 

     However, even though you are empowered to modify this specification,
     please do not do so; as a standard, it loses power if there are
     alternate versions of it available. Methods for centralized management
     and modification of this specification are outlined below. 

     This is distributed in the hope that it will be useful, but *without
     any warranty*; without even the implied warranty of merchantability or
     fitness for a particular purpose. See the GNU General Public License
     for more details. 

     A copy of the GNU General Public License is available as
     `/usr/doc/copyright/GPL' in the Debian GNU/Linux distribution or on
     the World Wide Web at `http://www.gnu.org/copyleft/gpl.html'. You can
     also obtain it by writing to the Free Software Foundation, Inc., 675
     Mass Ave, Cambridge, MA 02139, USA. 


-------------------------------------------------------------------------------


1. Function of docreg Files
---------------------------


1.1. Rationale 
---------------


1.2. Goals
----------


-------------------------------------------------------------------------------


2. The docreg File
------------------

     The docreg file is the file used by package maintainers to register
     documents into the Debian Document Registry. The doc-base packaging
     system (specifically the install-docs program) is responsible for
     processing the docreg file and adding the document's meta-information
     contained in the docreg file to the system's local Document Store. 


2.1. Relationship Between docreg File and Document Metadata
-----------------------------------------------------------

     Document metadata is all the information contained in the Debian
     Document Registry for a document. The composition of this metadata is
     directly related to the docreg file, since the docreg file is the sole
     transmitter of document metadata into the registry (via the
     `install-docs' program). While it is easy to confuse the document
     metadata with the docreg file, the two are distinct. 

     A docreg file may contain document-level metadata for any number of
     distinct document identifiers. Each document identifier may have any
     number of distinct document formats attached to it; each format
     carries format-level metadata which attaches to one and only one
     document. 

     From the other side, a single document (a unique document identifier)
     may be composed of data gathered from several docreg files. There is
     no strong relationship between the document's metadata and the docreg
     file which has supplied the metadata into the Registry. docreg files
     as such are simply chunks of document metadata which are manipulated
     by changes in the supplying package's state (i.e., installation,
     removal). This enables multiple packages, that is to say, multiple
     docreg files, to supply formats for a single unique document. 

     In summary, documents are globally unique entities. They may have
     attributes (referred to as document metadata) attached to them. They
     also may have formats attached to them. docreg files are the carriers
     of this data into and out of the local Document Registry. docreg
     files are attached to packages; document metadata and format
     information are attached to documents. 
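
     The relationship described above can be sketched in a few lines of
     Python. This is only an illustration under stated assumptions, not
     how install-docs is implemented: the `Registry' class, its method
     names, and the package and document names in the usage note are all
     hypothetical. It shows formats being keyed by document identifier
     while remembering which package supplied them, so that removing a
     package withdraws only that package's contribution.

```python
from collections import defaultdict


class Registry:
    """Toy in-memory registry (hypothetical, for illustration only).

    Documents are global; docreg data is tracked per supplying package.
    """

    def __init__(self):
        # document id -> {package name -> list of format records}
        self._formats = defaultdict(dict)

    def install(self, package, document_id, formats):
        """Record the formats one package's docreg file supplies for a document."""
        self._formats[document_id][package] = list(formats)

    def remove(self, package):
        """Drop everything a package supplied; forget documents with no supplier left."""
        for document_id in list(self._formats):
            self._formats[document_id].pop(package, None)
            if not self._formats[document_id]:
                del self._formats[document_id]

    def formats(self, document_id):
        """Return all formats known for a document, across all supplying packages."""
        suppliers = self._formats.get(document_id, {})
        return [fmt for pkg in sorted(suppliers) for fmt in suppliers[pkg]]
```

     For example, installing an English package and a German package that
     both name the document `linux-howto' leaves two formats attached to
     that one document; removing the German package leaves only the
     English format. The sketch does not address contention between
     packages that supply conflicting document-level metadata.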

     *Here the issue of contention resolution arises, unfortunately.
     Document metadata may be supplied in several distinct docreg files.
     How to resolve such contention remains an open question.* 

     Note that in the older (hamm) version of the doc-base package, the
     docreg file itself played the dual role of both the registry format
     and the document store. Each docreg file could have only one document
     identifier contained within it, and one and only one docreg file could
     describe a given document identifier. This scheme was judged
     inadequate because more than one package (and therefore more than one
     docreg file) may provide additional flavors of document formats for a
     single document identifier. For instance, a German HOWTO package may
     contain translations of a HOWTO document that was originally in
     English. Since a translation of a document is considered an attribute
     of a particular document format, this is a case where it is desirable
     to have two docreg files contain metadata on a single document
     identifier. 


2.2. docreg File Location
-------------------------

     docreg files are under package maintainer control; they are never
     altered by the Debian documentation system as a whole. The files
     should be installed and removed by the package itself. More details
     can be found in the `install-docs' manual. 


2.3. Brief Comment on the Document Store
----------------------------------------

     The Document Store, in `/var/state/doc-base/docstore', is a file
     containing the collected information about all documents currently on
     the system. This file is in the same format as the docreg files. 

     The Document Store file may also be processed by the doc-base system
     into a more optimized form, such as a Berkeley database file. This is
     yet to be determined. 


-------------------------------------------------------------------------------


3. docreg File Format
---------------------

     The format of the docreg file borrows from the Debian control file
     format, which in turn borrows from RFC 822. In general, fields are
     lines composed of a field name, a colon (`:'), and then the field
     data. Records are groups of fields; records are separated from one
     another by an empty line, or bounded by the top or bottom of the file. 
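
     As a rough illustration of the record and field layout just
     described, the following Python sketch splits text into records and
     each simple line into a name and a body. The function names are
     hypothetical, only completely empty lines are treated as record
     separators (an assumption), and folded (continued) lines are not
     handled here.

```python
def split_records(text):
    """Split docreg-style text into records.

    A record is a run of non-empty lines; records are separated by one or
    more empty lines, or bounded by the top or bottom of the text.
    """
    records, current = [], []
    for line in text.split("\n"):
        if line == "":
            if current:
                records.append(current)
                current = []
        else:
            current.append(line)
    if current:
        records.append(current)
    return records


def split_field(line):
    """Split a simple 'Name: data' line at the first colon."""
    name, sep, body = line.partition(":")
    if not sep:
        raise ValueError("no colon in field line: %r" % line)
    return name, body.strip()
```

     Given the text "Document: example-doc\nTitle: An Example\n\nFormat:
     html\n" (illustrative values only), split_records() yields two
     records, the first holding two field lines and the second holding
     one.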


3.1. Augmented BNF Description
------------------------------

     The following description uses augmented BNF as defined in RFC 822.
     This standard meta-format lets us define the docreg format without
     ambiguity. See also RFC 2068 for a description and example of
     augmented BNF. 

3.1.1. Basic Rules
------------------

     The following rules define fundamental building blocks used in the
     rest of this specification. 
     CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)
     ISOCHAR     =  <any ISO-8859-1 character>
     CTL         =  <any ASCII control           ; (  0- 37,  0.- 31.)
                     character and DEL>          ; (    177,     127.)
     LF          =  <ASCII LF, linefeed>         ; (     12,      10.)
     SPACE       =  <ASCII SP, space>            ; (     40,      32.)
     HTAB        =  <ASCII HT, horizontal-tab>   ; (     11,       9.)
     LWSP-char   =  SPACE / HTAB                 ; semantics = SPACE
     linear-white-space =  1*([LF] LWSP-char)  ; semantics = SPACE
                                               ; LF => folding
     specials    =  "(" / ")" / "<" / ">" / "@"
                 /  "," / ";" / ":" / "\" / <">
                 /  "." / "[" / "]"
     atom        =  1*<any CHAR except specials, SPACE and CTLs>
     text        =  <any CHAR, NOT including
                     CR or LF>
     ctext       =  <any CHAR excluding "(",
                     ")", "\" & CR, & including
                     linear-white-space>
     quoted-string = <"> *(qtext/quoted-pair) <">
     qtext       =  <any CHAR excepting <">,
                     "\" & CR, and including
                     linear-white-space>

3.1.2. Field Definitions
------------------------

     Field semantics are the same as those defined under "Header Field
     Definitions" in RFC 822 Section 3.1, except that rather than CRLF we
     use the standard Unix line separator, LF. Long header fields are
     likewise supported, as specified in RFC 822 Section 3.1.1. 

     The following is the BNF for docreg field syntax. 
               field       =  field-name ":" [ field-body ] LF
               field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">
               field-body  =  field-body-contents
                              [LF LWSP-char field-body]
               field-body-contents =
                             <the ASCII characters making up the field-body, as
                              defined in the following sections, and consisting
                              of combinations of atom, quoted-string, and
                              specials tokens, or else consisting of texts>

     `field-names' are not case-sensitive. 

     For clarifications on the way that fields are composed, refer to RFC
     822. 
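
     The two rules above that most often trip up a parser are folding (a
     line beginning with SPACE or HTAB continues the previous field) and
     the case-insensitivity of field names. A minimal Python sketch of
     both follows. The function name is hypothetical, and two choices are
     assumptions rather than anything this specification states: a fold
     is read as a single space, and a repeated field name silently
     overwrites the earlier value.

```python
def parse_fields(lines):
    """Parse one record's lines into a dict keyed by lowercased field name.

    A line starting with SPACE or HTAB is a continuation of the previous
    field; the fold is read as a single space (assumption).
    """
    fields = {}
    last = None
    for line in lines:
        if line[:1] in (" ", "\t"):
            if last is None:
                raise ValueError("continuation line before any field")
            fields[last] = (fields[last] + " " + line.strip()).strip()
        else:
            name, sep, body = line.partition(":")
            if not sep or not name:
                raise ValueError("not a field line: %r" % line)
            # Field names are not case-sensitive, so normalize them.
            last = name.lower()
            fields[last] = body.strip()
    return fields
```

     With this sketch, the lines "TITLE: A long" and "  title" produce a
     single entry whose key is `title' and whose value is "A long title".
     Validation of the characters allowed in a field name is left out.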

3.1.3. docreg Specification
---------------------------

               docreg-file =  identifier
                             *formats
          
               identifier  =  document-id        ; consider additional fields:
                              section            ; revision, date
                              title
                              abstract
                            [ author ]
          
               formats     =  format-type
                              location
                              language
                            [ title    ]
                            [ abstract ]
          
               category    =  <any defined document hierarchy category,
                               see DDH documentation>
               recognized-format = <any defined document format>
          
               document-id =  "Document" ":" #atom
               section     =  "Section"  ":" category [*(SPACE category)]
               title       =  "Title"    ":" #ISOCHAR
               abstract    =  "Abstract" ":" #ISOCHAR
               author      =  "Author"   ":" #ISOCHAR   ; consider RFC 'From'
                                                        ; addr?
          
               format-type =  "Format"   ":" recognized-format
               location    =  "Location" ":" <file location or URL>
               language    =  "Language" ":" <ISOXXX language specifier>
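
     The grammar above can be read as a checklist of required and
     optional fields. The Python sketch below checks already-parsed
     records against that checklist. It rests on assumptions not stated
     in the grammar: that the identifier and each format arrive as
     separate records, that field names have been lowercased, and that
     field order does not matter. All names in the code are hypothetical.

```python
# Required and optional fields, taken from the identifier and formats rules.
IDENTIFIER_REQUIRED = {"document", "section", "title", "abstract"}
IDENTIFIER_OPTIONAL = {"author"}
FORMAT_REQUIRED = {"format", "location", "language"}
FORMAT_OPTIONAL = {"title", "abstract"}


def check_record(fields, required, optional):
    """Return a list of problems for one record (dict keyed by field name)."""
    names = set(fields)
    problems = ["missing field: " + n for n in sorted(required - names)]
    problems += ["unexpected field: " + n
                 for n in sorted(names - required - optional)]
    return problems


def check_docreg(records):
    """Check a list of records: first the identifier, then zero or more formats."""
    if not records:
        return ["empty file: no identifier record"]
    problems = check_record(records[0], IDENTIFIER_REQUIRED, IDENTIFIER_OPTIONAL)
    for index, record in enumerate(records[1:], start=1):
        for problem in check_record(record, FORMAT_REQUIRED, FORMAT_OPTIONAL):
            problems.append("format record %d: %s" % (index, problem))
    return problems
```

     An empty result means the records satisfy the checklist. The sketch
     does not validate field values, for instance whether a section names
     a defined category or a format is a recognized one.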


3.2. docreg Field Semantics
---------------------------

3.2.1. Document Metadata Fields
-------------------------------

3.2.2. Document Format Fields
-----------------------------


-------------------------------------------------------------------------------


4. Examples of docreg files
---------------------------


-------------------------------------------------------------------------------


5. Contributing to This Specification
-------------------------------------


-------------------------------------------------------------------------------




