Why YAML is not a good choice for Debian control files.
On Fri, Jul 31, 2009 at 10:01:36AM -0400, Adrian Perez wrote:
> There's any plan of supporting another format - without breaking
> compatibility, I mean supporting - besides the RFC one?
> I think YAML would be a good one.
Hello Adrian,
I thought about YAML for machine-readable license summaries and came to the
conclusion that it is not suitable. I think that this is also true for Debian
control files, for the following reasons:
The “pseudo-RFC” format that Debian uses is organised in paragraphs, also
called ‘stanzas’, and often the first of them has a special role. YAML on the
other hand has concepts of scalars, sequences and mappings (in Perl, they would
be called scalars, arrays and hashes). First of all, if we want the first
paragraph of a Debian control file to have a special role, then the YAML must
be organised as a sequence of mappings. Here is YAML's example:
Example 2.4. Sequence of Mappings
(players’ statistics)
-
  name: Mark McGwire
  hr:   65
  avg:  0.278
-
  name: Sammy Sosa
  hr:   63
  avg:  0.288
(http://www.yaml.org/spec/1.2/spec.html#id2559116)
In a Debian control file, the extra “-” lines and the indentation they require
would reduce readability with no benefit.
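To make the comparison concrete, here is a minimal Python sketch of how the current format already yields a sequence of mappings without any extra markup: the empty line between paragraphs plays the role of YAML's “-”. The function name parse_stanzas is made up for this illustration, and the sketch ignores comment lines and other details of the real format.

```python
def parse_stanzas(text):
    """Return a list of dicts, one per paragraph (hypothetical helper)."""
    stanzas, current, last = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # An empty line closes the current paragraph.
            if current:
                stanzas.append(current)
            current, last = {}, None
        elif line[0] in " \t":
            # A line starting with whitespace continues the previous field.
            if last is None:
                raise ValueError("continuation line outside a field")
            current[last] += "\n" + line.strip()
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError("not a 'Field: value' line: " + line)
            last = name
            current[name] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


example = """\
Source: seaview
Priority: optional

Package: seaview
Architecture: any
"""

# The first paragraph keeps its special role simply by being first.
source, *binaries = parse_stanzas(example)
print(source["Source"], len(binaries))
```

The special role of the first paragraph comes from its position alone, so nothing has to be added to the file to express it.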
Second, the “pseudo-RFC” format delegates the management of folding to the
Debian Policy, while in YAML it has to be part of the markup: “|” and “>” are
used to denote when line breaks are significant or not:
name: Mark McGwire
accomplishment: >
  Mark set a major league
  home run record in 1998.
stats: |
  65 Home Runs
  0.278 Batting Average
(http://www.yaml.org/spec/1.2/spec.html#id2559996)
So basically, switching Debian control files to YAML would mean adding “-”, “|”
and “>” signs in precise locations, each of them being an opportunity for a
parsing error.
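To illustrate the folding point: with the current format, the file itself carries no folding marker, and a consumer only needs to know from the Policy which fields are folded. Below is a minimal Python sketch under that assumption. The set FOLDED_FIELDS is a small hand-picked subset chosen for the example and the function name logical_value is hypothetical; neither comes from dpkg or any existing library.

```python
# Illustrative subset only; the authoritative list lives in the Debian Policy.
FOLDED_FIELDS = {"Build-Depends", "Depends", "Recommends", "Uploaders"}


def logical_value(field, raw):
    """Interpret a raw field value according to the field's declared type."""
    if field in FOLDED_FIELDS:
        # Folded field: line breaks and runs of whitespace are not significant.
        return " ".join(raw.split())
    # Any other field: keep the line breaks exactly as written.
    return raw


raw = "debhelper ( >= 7 ), libfltk1.1-dev, libjpeg62-dev,\n libxext-dev, zlib1g-dev"
print(logical_value("Build-Depends", raw))
```

In YAML the same decision has to be written into every file with “>” or “|”, instead of being stated once in the specification.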
Here is a simple example based on a debian/control file for the seaview package:
Source: seaview
Section: non-free/science
Priority: optional
Maintainer: Debian-Med Packaging Team <debian-med-packaging@lists.alioth.debian.org>
DM-Upload-Allowed: yes
Uploaders: Charles Plessy <plessy@debian.org>
Build-Depends: debhelper ( >= 7 ), libfltk1.1-dev, libjpeg62-dev, libpng12-dev, libxft-dev,
 libxext-dev, zlib1g-dev
Standards-Version: 3.8.1
Vcs-Browser: http://svn.debian.org/wsvn/debian-med/trunk/packages/seaview/trunk/?rev=0&sc=0
Vcs-Svn: svn://svn.debian.org/svn/debian-med/trunk/packages/seaview/trunk/
Homepage: http://pbil.univ-lyon1.fr/software/seaview.html

Package: seaview
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}
Recommends: clustalw, muscle, phyml
Description: Multiplatform interface for sequence alignment and phylogeny
 SeaView reads and writes various file formats (NEXUS, MSF, CLUSTAL, FASTA,
 PHYLIP, MASE, Newick) of DNA and protein sequences and of phylogenetic trees.
 Alignments can be manually edited. It drives the programs Muscle or Clustal W
 for multiple sequence alignment, and also allows to use any external alignment
 algorithm able to read and write FASTA-formatted files.
 .
 It computes phylogenetic trees by parsimony using PHYLIP's dnapars/protpars
 algorithm, by distance with NJ or BioNJ algorithms on a variety of evolutionary
 distances, or by maximum likelihood using the program PhyML 3.0. SeaView draws
 phylogenetic trees on screen or PostScript files, and allows to download
 sequences from EMBL/GenBank/UniProt using the Internet.
Translated into YAML, it would be:
-
  Source: seaview
  Section: non-free/science
  Priority: optional
  Maintainer: Debian-Med Packaging Team <debian-med-packaging@lists.alioth.debian.org>
  DM-Upload-Allowed: yes
  Uploaders: Charles Plessy <plessy@debian.org>
  Build-Depends: >
    debhelper ( >= 7 ), libfltk1.1-dev, libjpeg62-dev, libpng12-dev, libxft-dev, libxext-dev,
    zlib1g-dev
  Standards-Version: 3.8.1
  Vcs-Browser: http://svn.debian.org/wsvn/debian-med/trunk/packages/seaview/trunk/?rev=0&sc=0
  Vcs-Svn: svn://svn.debian.org/svn/debian-med/trunk/packages/seaview/trunk/
  Homepage: http://pbil.univ-lyon1.fr/software/seaview.html
-
  Package: seaview
  Architecture: any
  Depends: ${shlibs:Depends}, ${misc:Depends}
  Recommends: clustalw, muscle, phyml
  Description: |
    Multiplatform interface for sequence alignment and phylogeny
    SeaView reads and writes various file formats (NEXUS, MSF, CLUSTAL, FASTA,
    PHYLIP, MASE, Newick) of DNA and protein sequences and of phylogenetic trees.
    Alignments can be manually edited. It drives the programs Muscle or Clustal W
    for multiple sequence alignment, and also allows to use any external alignment
    algorithm able to read and write FASTA-formatted files.
    It computes phylogenetic trees by parsimony using PHYLIP's dnapars/protpars
    algorithm, by distance with NJ or BioNJ algorithms on a variety of evolutionary
    distances, or by maximum likelihood using the program PhyML 3.0. SeaView draws
    phylogenetic trees on screen or PostScript files, and allows to download
    sequences from EMBL/GenBank/UniProt using the Internet.
Alternatively, the Description field could indicate with the markup that the
first line is the short description. Either:
Description:
  - Multiplatform interface for sequence alignment and phylogeny
  - |
    SeaView reads and writes various file formats (NEXUS, MSF, CLUSTAL, FASTA,
    PHYLIP, MASE, Newick) of DNA and protein sequences and of phylogenetic trees.
    Alignments can be manually edited. It drives the programs Muscle or Clustal W
    for multiple sequence alignment, and also allows to use any external alignment
    algorithm able to read and write FASTA-formatted files.
    It computes phylogenetic trees by parsimony using PHYLIP's dnapars/protpars
    algorithm, by distance with NJ or BioNJ algorithms on a variety of evolutionary
    distances, or by maximum likelihood using the program PhyML 3.0. SeaView draws
    phylogenetic trees on screen or PostScript files, and allows to download
    sequences from EMBL/GenBank/UniProt using the Internet.
or:
Description:
  Short: Multiplatform interface for sequence alignment and phylogeny
  Long: |
    SeaView reads and writes various file formats (NEXUS, MSF, CLUSTAL, FASTA,
    PHYLIP, MASE, Newick) of DNA and protein sequences and of phylogenetic trees.
    Alignments can be manually edited. It drives the programs Muscle or Clustal W
    for multiple sequence alignment, and also allows to use any external alignment
    algorithm able to read and write FASTA-formatted files.
    It computes phylogenetic trees by parsimony using PHYLIP's dnapars/protpars
    algorithm, by distance with NJ or BioNJ algorithms on a variety of evolutionary
    distances, or by maximum likelihood using the program PhyML 3.0. SeaView draws
    phylogenetic trees on screen or PostScript files, and allows to download
    sequences from EMBL/GenBank/UniProt using the Internet.
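For comparison, the short/long split that both variants spell out in markup can already be recovered from the current format by convention: the first line is the short description, and a line containing only “.” stands for an empty line. A minimal Python sketch of that convention follows. The function name split_description is invented for this example, the sample text is abridged, and the sketch does not treat lines indented by two or more spaces specially.

```python
def split_description(raw):
    """Split a raw Description value into its short and long parts."""
    short, _, rest = raw.partition("\n")
    long_lines = []
    for line in rest.splitlines():
        # Drop the single leading whitespace character of a continuation line.
        text = line[1:] if line[:1] in (" ", "\t") else line
        # A line holding only "." represents an empty line.
        long_lines.append("" if text.strip() == "." else text)
    return {"Short": short.strip(), "Long": "\n".join(long_lines)}


# Abridged sample text, for illustration only.
raw = (
    "Multiplatform interface for sequence alignment and phylogeny\n"
    " Alignments can be manually edited.\n"
    " .\n"
    " It computes phylogenetic trees by parsimony.\n"
)
parts = split_description(raw)
print(parts["Short"])
print(parts["Long"])
```

The returned dictionary has the same shape as the Short/Long mapping above, obtained without changing the file format.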
As you can see, in terms of human readability and writability, YAML brings no
advantage over the current format.
I like YAML a lot, so if I overlooked something that would make it more
suitable, please let me/us know!
Have a nice weekend,
--
Charles Plessy
Tsurumi, Kanagawa, Japan