
[DEP-5] [patch] Syntax of the files.

On Thu, Aug 12, 2010 at 06:18:24PM -0700, Russ Allbery wrote:
> I would prefer to stick to a Debian control file format, since otherwise
> implementing DEP-5 aware checks in tools like Lintian is going to be more
> painful than it needs to be.

I will come back with my favorite deviation from the format only if I manage
to write a Lintian parser for it.

However, I think that it is important to specify the format anyway, because
although the DEP uses a syntax that is close to the Debian control file format
and to RFC 822, it differs from both. If we would like people to write DEP-5
files or parsers without extensive knowledge of our oral culture, we need to
write it down.

First, here is a list of some differences between the DEP and the other similar
formats:

 * RFC 822 and successors specify that lines end with a carriage return and a
   line feed (\r\n), whereas in DEP-5 a single line feed (\n) is expected.

 * In RFC 822 and successors, folding fields (see §2.2.3) does not preserve 
   newlines, whereas in Debian control files and DEP-5, identification of the 
   first line of the field body is crucial.

 * In RFC 822 and successors, there is a complex specification of inline
   comments, which are not expected in DEP-5.

 * Debian control files allow comment lines that start with a hash sign (#),
   but currently DEP-5 does not.

 * In Debian control files, a succession of two empty lines ends the
   machine-parseable record. I am not sure we want this for DEP-5…

 * Debian control files escape empty lines in a field body by replacing them
   with lines containing a single period (.), but this is not part of RFC 822
   and its successors.
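
To make the folding difference in the list above concrete, here is a minimal
Python sketch. The function names and the sample field bodies are made up for
this illustration and do not come from either specification, and the RFC 822
unfolding is reduced to its simplest form (removing the line break in front of
a continuation line).

```python
def unfold_rfc822(field_body):
    """Simplified RFC 822-style unfolding: drop each CRLF that is
    followed by a space or tab, so the line structure is lost."""
    return field_body.replace("\r\n ", " ").replace("\r\n\t", "\t")


def split_control_body(field_body):
    """Debian control-style reading: keep one entry per physical line,
    stripping the single leading space that marks a continuation."""
    lines = field_body.split("\n")
    return [lines[0]] + [l[1:] if l.startswith(" ") else l for l in lines[1:]]


# Hypothetical field bodies, made up for this illustration.
mail_body = "a long subject\r\n folded over two lines"
license_body = "GPL-2+\n This program is free software.\n See the GPL for details."

print(unfold_rfc822(mail_body))          # one logical line, no line break left
print(split_control_body(license_body))  # first line stays separable from the rest
```

In the second case the first line can still be told apart from the rest of the
field body, which is the property the list above calls crucial.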

I attached a patch as a proposition of a simple syntax description. We could
refer to the Debian Policy instead, but it contains specific instructions about
fields where folding is not allowed, which may be confusing, especially if we
would like to suggest that upstreams use DEP-5 for themselves. Also, it does
not mention the double empty line as a record terminator.
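
As a rough illustration of how little is needed to implement the rules written
in the patch, here is a minimal parser sketch in Python. It is only one possible
reading of those rules, not a reference implementation: the function name and
the sample input are hypothetical, only a strictly empty line is treated as a
separator, and comments and the double-empty-line terminator are deliberately
not handled.

```python
def parse_paragraphs(text):
    """Return a list of paragraphs, each a dict mapping field name to body."""
    paragraphs = []
    current = {}
    name = None  # name of the field that continuation lines belong to
    for line in text.split("\n"):
        if line == "":
            # An empty line closes the current group of fields.
            if current:
                paragraphs.append(current)
            current, name = {}, None
        elif line.startswith(" "):
            # A line feed followed by a space continues the field body.
            if name is None:
                raise ValueError("continuation line outside a field: %r" % line)
            continuation = line[1:]
            if continuation == ".":
                continuation = ""  # a lone period stands for an empty line
            current[name] += "\n" + continuation
        else:
            field, colon, body = line.partition(":")
            if not colon:
                raise ValueError("missing colon in line: %r" % line)
            name = field
            current[name] = body.lstrip()  # leading whitespace is ignored
    if current:
        paragraphs.append(current)
    return paragraphs


# Hypothetical input, made up for this illustration.
sample = (
    "Files: *\n"
    "License: GPL-2+\n"
    " This is free software.\n"
    " .\n"
    " Second part of the text.\n"
    "\n"
    "Files: debian/*\n"
    "License: GPL-2+\n"
)
for paragraph in parse_paragraphs(sample):
    print(paragraph)
```

Because continuation lines are joined with newlines rather than unfolded, the
first line of each field body stays available as `body.split("\n")[0]`.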

Lastly, for the sake of simplicity, I propose that we choose either ‘stanza’ or
‘paragraph’ and use that term exclusively.

Charles Plessy
Tsurumi, Kanagawa, Japan
From df8f12ebf619c5ec0b65274e632e399a900c58c3 Mon Sep 17 00:00:00 2001
From: Charles Plessy <plessy@debian.org>
Date: Sat, 14 Aug 2010 15:45:25 +0900
Subject: [PATCH]  # Written description of the DEP-5 file format.

 dep5.mdwn |   28 ++++++++++++++++++++++++----
 1 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/dep5.mdwn b/dep5.mdwn
index d12c3a3..ecbac0f 100644
--- a/dep5.mdwn
+++ b/dep5.mdwn
@@ -54,10 +54,7 @@ A user might want to have a way to avoid software with certain licenses
 they have a problem with, even if the licenses are DFSG-free. For
 example, the Affero GPL.
-# Compatibility and Human-Readability
-The file must be encoded as UTF-8 and strictly formatted as a superset
-of RFC2822 including significant newlines. Free-form text is not
+# Human-Readability
 The `debian/copyright` file must be machine-interpretable, yet
 human-readable, while communicating all mandated upstream information,
@@ -66,6 +63,29 @@ copyright notices and licensing details.
 For the sake of human-readability this proposal avoids any complex field
 names or syntax rules.
+# Syntax
+This file uses a syntax similar to Debian control files, in the spirit
+of the RFC 822.
+ * The file must be encoded as UTF-8.
+ * Fields are logical elements composed of a field name, followed by a colon,
+   followed by a field body.
+ * A field name is composed of printable characters, except colons.
+ * Whitespace leading the field body is ignored.
+ * Field bodies are terminated by a line feed character, except when it is
+   followed by a space.
+ * Empty lines in field bodies are escaped by adding a single period (.) to
+   them.
+ * Empty lines between fields delimitate groups of related fields, called
+   ‘stanza’ or ‘paragraphs’.
 # Implementation
 ## Sections
 ### Header Section (Once)
