
Re: [DEP5] Asking for common wisdom on new field(s): References*



On Wed, Nov 24, 2010 at 09:57:32PM -0500, Yaroslav Halchenko wrote:
> > I really think that this information should be in another file...
> 
> well, in general I do not mind; but yet 'another file' might introduce
> more cons than pros

This "another file" has probably been in use for more than a year and
has been discussed several times.  However, as far as I know it has not
been adopted outside Debian Med and Debian Science.
 
> > Would you be interested in this approach ?
> 
> Well, I would adopt any approach really which would be somewhat
> transparent and easy to use for us (debian maintainers):
>  
>  * easy to embed existing references
>  * not necessary to duplicate information across multiple files
>  * maintain ability for 2 liner debian/rules ;)

These are reasonable requirements and they are fulfilled.
 
> and for the users:
> 
>  * reference should be in 'ready to use' format and ideally readily
>    available (so no cp, or cut/paste necessary for each individual
>    reference)

IMHO the only "ready to use" format is BibTeX.  A conversion from yaml
to BibTeX will be necessary, but it is cheap and easy to do with a
script.

>  * so may be we could even compile for them easily the
>    "Debian upstream references" bibliography
> 
>  as a consequence, complete pipeline should avoid too much of
>  conversion, i.e.

Although a conversion is needed, I would not call it "too much".
 
>  COMMON_REFERENCE_FORMAT1 (used by upstream) -> UNIFIED_DEBIAN_FORMAT (used by us) -> COMMON_REFERENCE_FORMAT (used by users)
> 
>  especially if  there is an existing dominant 
>  COMMON_REFERENCE_FORMAT1 == COMMON_REFERENCE_FORMAT2
> 
>  should ideally be avoided

As far as I know there is no such thing as a common format shared by
different upstream sources.  Most references I have found were more or
less free-text information on web pages, sometimes in README files.
I have not yet found a BibTeX file inside an upstream source.  So IMHO
we rather have the situation:

  unstructured information (used by upstream) -> UNIFIED_DEBIAN_FORMAT (used by us) -> COMMON_REFERENCE_FORMAT (used by users)

So we definitely have

  unstructured information != COMMON_REFERENCE_FORMAT (used by users)
 
> > packages in our archive that contain it in debian/reference, debian/references
> > or debian/upstream-metadata.yaml.
> 
> please point me to the representative package so I could have a look,
> especially for debian/upstream-metadata.yaml in regards to the
> wishes stated above.

If you would like to have a look at the Debian Med SVN:

$ find trunk/packages -name upstream-metadata.yaml | sed 's?^trunk/packages/[R/]*\([^/]\+\)/.*?\1?' | sort | uniq | wc -l
56
$ find trunk/packages -name upstream-metadata.yaml | sed 's?^trunk/packages/[R/]*\([^/]\+\)/.*?\1?' | sort | uniq | head 
acedb
adun.app
alien-hunter
altree
autodocksuite
ball
bioperl
bitops
bwa
clustalw

So there are some examples in an area close to your field of work ...
 
> I have ran into samstools, but that one has bulk of things duplicated
> among control and upstream-metadata.yaml, and upstream-metadata.yaml and
> reference

To compare with the same set as above we have

$ find trunk/packages -name reference | sed 's?^trunk/packages/[R/]*\([^/]\+\)/.*?\1?' | sort | uniq | wc -l
18

which makes roughly three times as many upstream-metadata.yaml files as
reference files (56 vs. 18).  As far as I know the debian/reference file
was used by some maintainers before the suggestion
 
> > http://wiki.debian.org/UpstreamMetadata

was born, and it was never widely accepted (because it was not properly
published?).  In principle both files might fulfill the same purpose if
we are talking about references only.  In this case the BibTeX-formatted
reference file would be even better because it does not need further
conversion.

However, to be adopted by more Debian packages than just scientific
packages with references, and to become an accepted standard in Debian,
a focus on references alone is too narrow.  In this respect I think
upstream-metadata.yaml is the better choice.  Strictly speaking: if you
want to continue the discussion here on debian-project you should talk
about upstream-metadata.yaml.  If you want to talk about the reference
file you should rather move the discussion to the debian-science list,
because the general Debian maintainer / user will not be very
interested.

> In general  I like this idea, BUT unfortunately I do not see it being
> complete without avoiding duplication of information... unless
> automated...

That's the point.  If we dropped the reference files and provided a
script
   yaml2bibtex
which converts upstream-metadata.yaml to references (preferably in a
default location like
   /usr/share/references/<package>.bib
or something like this), and if we did this at package build time (or
in postinst??), the upstream-metadata.yaml approach would probably be
the most flexible idea.
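
A minimal sketch of what such a yaml2bibtex script could look like, in
Python.  It assumes the references are stored as flat, one-line
"Reference-<Field>: value" entries, which is a hypothetical layout (the
field names actually proposed on the wiki page may differ).  It also
always emits an @article entry keyed by the package name.  The function
names are made up for illustration, and the parser is deliberately not
a full YAML parser.

```python
# Sketch only: the "Reference-" prefix and the @article entry type are
# assumptions, not an agreed format.
PREFIX = "Reference-"


def parse_reference_fields(text):
    """Collect flat 'Reference-<Field>: value' lines into a dict.

    Only one-line key/value pairs are handled; everything else is ignored.
    """
    fields = {}
    for line in text.splitlines():
        # Split at the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key.startswith(PREFIX):
            fields[key[len(PREFIX):].lower()] = value.strip()
    return fields


def to_bibtex(package, fields):
    """Render the fields as a single BibTeX entry keyed by package name."""
    lines = ["@article{%s," % package]
    for name in sorted(fields):
        lines.append("  %s = {%s}," % (name, fields[name]))
    lines.append("}")
    return "\n".join(lines) + "\n"
```

At build time the output of to_bibtex() could be written to
<package>.bib and installed into the default location mentioned above.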
 
> Let me elaborate:  due to the historical evolution of Debian
> packaging we have already other files which one way or another do
> contain 'UpstreamMetadata' -- control, copyright, watch are the most
> "popular" ones.

Yes.  Duplication should definitely be avoided.  I'm not really happy
about keeping Homepage and watch inside upstream-metadata.yaml, unless
we *really* start accepting this file, replace debian/watch, move
Homepage from control to upstream-metadata.yaml, and make all tools
fetch the information from there.  Because I do not see this happening
in the foreseeable future I would leave this information out of
upstream-metadata.yaml until it is really accepted.

> With DEP5 copyright gets even closer to the content of
> upstream-metadata.yaml, (just use Maintainer for Contact, more vague
> Remark for Donation).  So what becomes left for upstream-metadata.yaml ?
> seems to be primarily 'Reference's, which I logically placed into
> existing copyright file (reasoning was included in original email why
> this file is imho appropriate).

IMHO the main problem is that we are not really able to rely on
machine-readable copyright files, and thus the bibliographic
information might simply be hidden there.  While I'm not fully convinced
that references really belong in debian/copyright, I think if we want to
have machine-readable bibliographic information *now*(ish) we need to
find another solution (which could be provided by
upstream-metadata.yaml).

As for your question about what is left other than references: IMHO
  http://wiki.debian.org/UpstreamMetadata
has some more information fields which are definitely of general
interest.
 
> so... may be there should/could be
> 
>  * minimalistic debian/upstream-metadata.yaml.in just extending
>    information from other files, not duplicating it

YES!!

>  * helper which generates a 'complete' debian/upstream-metadata.yaml
>    and gets Ok'ed by Joey Hess to become a part of debhelper ;-)

That's an interesting idea.
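
A rough sketch of how such a helper might work, in Python.  It assumes
that the only inherited field is Homepage from debian/control, that both
files can be read as flat one-line "Key: value" pairs (a simplification
of both formats), and that entries in the .in file take precedence.  The
function names and the precedence rule are hypothetical, not part of any
existing tool.

```python
def parse_flat_fields(text):
    """Parse one-line 'Key: value' pairs.

    Blank lines, comments and continuation lines (leading whitespace)
    are skipped; if a key occurs more than once, the last one wins.
    """
    fields = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith((" ", "\t", "#")):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def complete_metadata(control_text, metadata_in_text, inherited=("Homepage",)):
    """Generate the 'complete' metadata text from control plus the .in file."""
    control = parse_flat_fields(control_text)
    # Take only the explicitly inherited fields from debian/control.
    merged = {key: control[key] for key in inherited if key in control}
    # Assumption: what the maintainer wrote in the .in file overrides them.
    merged.update(parse_flat_fields(metadata_in_text))
    return "".join("%s: %s\n" % (k, v) for k, v in sorted(merged.items()))
```

With this split, the maintainer would only edit the minimal .in file, and
the generated file would never need to be kept in sync by hand.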
 
Kind regards

      Andreas. 

-- 
http://fam-tille.de

