
Re: Please review the copyright file of bwa.



On Wed, Oct 07, 2009 at 08:39:25AM +0200, Andreas Tille wrote:
> 
> I had a look and do not have any additions regarding the content.
> However I'm curious about the formatting.  If the copyright information
> should be expressed in RFC822 format you need to connect the paragraphs
> by a line containing "^ .$" to make sure it will be machine parseable.
> 
> Further comment:  If you replace your rules file content by
> 
> %:
> 	dh $@
> 
> You can spare the cdbs build dependency.

Hi Andreas,

thank you for the review.

While working on the machine-readable format for Debian copyright files, I
realised that there are several good reasons not to use the same format as for
Debian control files (which actually does not comply with RFC822). In
particular, it does not allow free-form comments, and is much more compact than
the ‘human-readable’ template from the dh-make package. Also, in my experience,
adding the inter-paragraph dots is tedious and error-prone, which makes the
format quite unfriendly.

In the experimental format I use for my packages and would like to propose for
DEP-5, the syntax is simplified:

 - Fields are composed of a name and a body, separated by a colon and optional
   spaces. Field bodies are ended by line terminators.

 - A field name is composed of printable characters, except colons.

 - The field body is composed of any character. Leading spaces of the body are
   ignored. To avoid problems with multi-line values, any line terminator must
   be escaped by following it with a space. The line that contains that space is
   called a continuation line.

 - Lines that are not continuation lines and do not start a new field are plain
   comments.

 - Fields are grouped in paragraphs that are separated by empty lines. The
   paragraphs are organised in a sequential order. Within a paragraph, the
   fields are not ordered. If the same field appears more than once in the same
   paragraph, the contents are added.

(http://git.debian.org/?p=users/plessy/license-summary.git;a=blob_plain;f=dep5.mdwn)

It is similar in principle to the Postfix configuration file syntax, except that
newlines are preserved and colons replace the equal signs:

http://www.postfix.org/postconf.5.html

Translated into Perl, it would give:

 - If a line matches /^(\w+)\s*:\s*(.*)$/, it starts a new field with name $1
   and content $2. If there is already such a field in the current paragraph
   (stanza), the contents are added.

 - If a line matches /^\s(.*)$/, it extends the current field with $1\n.

 - If a line matches /^$/, it finishes the current paragraph.

 - All other lines are free-form comments.
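
The rules above can be sketched as a line-by-line parser. This is only an
illustration, written in Python rather than Perl; the two patterns are the ones
given above, which Python's re module accepts unchanged. The function name
parse_copyright and the sample text are made up for the example. Three points
are assumptions, because the rules leave them open: repeated fields and
continuation lines are joined with a newline, a free-form comment ends the
current field, and a line starting with a space that follows no field is
treated as a comment. Note also that \w+ does not match hyphens, so a field
name containing one would be read as a comment.

```python
import re

# Patterns taken from the rules above.
FIELD = re.compile(r"^(\w+)\s*:\s*(.*)$")
CONTINUATION = re.compile(r"^\s(.*)$")

# Made-up sample input, for illustration only.
SAMPLE = "\n".join([
    "Format: experimental",
    "This line is a free-form comment.",
    "",
    "Files: *",
    "Copyright: 2009 Jane Doe",
    " 2008 John Roe",
    "License: GPL-3+",
])


def parse_copyright(text):
    """Return a list of paragraphs, each a dict of field name -> body.

    Free-form comments are dropped.
    """
    paragraphs = []
    current = {}        # fields of the paragraph being read
    last_field = None   # field that a continuation line would extend

    for line in text.splitlines():
        if line == "":
            # An empty line finishes the current paragraph.
            if current:
                paragraphs.append(current)
            current = {}
            last_field = None
            continue

        field = FIELD.match(line)
        if field:
            name, body = field.group(1), field.group(2)
            if name in current:
                # Assumption: "added" means appended on a new line.
                current[name] += "\n" + body
            else:
                current[name] = body
            last_field = name
            continue

        continuation = CONTINUATION.match(line)
        if continuation and last_field is not None:
            current[last_field] += "\n" + continuation.group(1)
            continue

        # Anything else is a free-form comment.
        # Assumption: a comment also ends the current field.
        last_field = None

    if current:
        paragraphs.append(current)
    return paragraphs


if __name__ == "__main__":
    for paragraph in parse_copyright(SAMPLE):
        print(paragraph)
```

The result has the same shape as a parsed Debian control file: an ordered list
of paragraphs, each one an unordered mapping from field names to bodies.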

I have not tried the above regular expressions yet… Also, in theory other
approaches could be used, such as “slurping” the whole file, escaping the
newlines followed by spaces, splitting it into an array and collapsing this
array into an array of hashes (which is what Debian control files are).
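
The “slurping” approach could look roughly like the sketch below, again in
Python rather than Perl and only as an illustration. The function name
parse_slurped is made up, and the choice of a NUL character as the temporary
escape for newlines is an assumption (it requires that the input contains no
NUL). Lines are assumed to end with "\n". A newline followed by a space is only
escaped when it does not directly follow an empty line, so that paragraph
breaks survive.

```python
import re

FIELD = re.compile(r"^(\w+)\s*:\s*(.*)$")
PLACEHOLDER = "\x00"  # assumption: never present in the input


def parse_slurped(text):
    """Parse the whole text at once into a list of dicts (one per paragraph)."""
    # Step 1: escape every newline that is followed by a space, so that a
    # multi-line field body becomes a single logical line.
    protected = re.sub(r"(?<=[^\n])\n ", PLACEHOLDER, text)

    paragraphs = []
    # Step 2: split into paragraphs on empty lines.
    for block in re.split(r"\n{2,}", protected.strip("\n")):
        fields = {}
        # Step 3: split each paragraph into logical lines and keep the fields.
        for line in block.split("\n"):
            match = FIELD.match(line)
            if not match:
                continue  # free-form comment
            name = match.group(1)
            # Step 4: restore the escaped newlines in the body.
            body = match.group(2).replace(PLACEHOLDER, "\n")
            if name in fields:
                # Assumption: "added" means appended on a new line.
                fields[name] += "\n" + body
            else:
                fields[name] = body
        if fields:
            paragraphs.append(fields)
    return paragraphs


if __name__ == "__main__":
    # Made-up input, for illustration only.
    print(parse_slurped("Files: *\nCopyright: 2009 Jane Doe\n 2008 John Roe"))
```

Compared with reading line by line, this version needs no state between lines,
at the cost of holding the whole file in memory, which is unlikely to matter
for a copyright file.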

I think that the changes I propose make the machine-readable Debian copyright
files much more human-readable. For another example, see libbio-scf-perl:

http://packages.debian.org/changelogs/pool/main/libb/libbio-scf-perl/current/copyright

About your other question: I tend to choose CDBS when I foresee that compilation
flags may have to be tuned, because it is more straightforward to do so with
CDBS in my opinion. In other cases (in particular Perl modules), I use dh 7.

I will upload bwa and review dicomscope soon.

Have a nice day,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan

