
Parsing package filenames (was: Re: New ftp method for dselect)



"brian (b.c.) white" <bcwhite@bnr.ca> said:

> I looked into this more closely and it seems that most of the packages
> that once had dashes in the version stings are now gone.  If neither
> the version nor revision strings can have dashes, then counting "-"'s
> will break up the filename without having rename all the packages with
> "--" in them.
> 
> Exceptions:  (the ones I saw, anyway)
> 	stable/binary/net/bind-4.9.3-BETA24-1.deb
> 	debian-1.0/binary/net/bind-4.9.3-BETA26-2.deb

Also:
Package: cern-httpd
Package: elv-fmt
Package: auto-pgp
Package: linuxdoc-sgml
Package: ncurses-runtime
Package: electric-fence
Package: boot-floppies
Package: ax25-kernel-source
Package: gopher-client
Package: w3-el
Package: arrl-infoserv
Package: elv-vi
Package: emacs-el
Package: elisp-manual
Package: mt-st
Package: ax25-util
Package: mh-papers
Package: ncurses-developer
Package: pgp-i
Package: pgp-us
Package: wu-ftpd
Package: elv-ctags

Personally, I also think we'll be better off if we bite the bullet and
try to maintain as much backwards compatibility as we can with current
package naming usage than if we fall into a pattern of blowing off
backwards compatibility issues in the interest of implementor convenience.

I believe that the vast majority of packages currently follow these
conventions:

    filenames have the form <PKG>-<VER>-<REV>.<EXT>
    e.g.:                   ab-cd-1.23a-45678.tar.gz
    Field Separators:            -     -     .
    Field Contents:         ab-cd 1.23a 45678 tar.gz

    Reading from right to left:
    EXT is currently one of .deb, .changes, .tar.gz, .diff.gz.
        It might be restricted to alphanumerics and '.' chars.
    '.' currently separates REV from EXT
    REV typically is numeric only.  It might be restricted to
        alphanumerics.  There's been some talk of some packages
        not providing this field, but it might be made mandatory.
    '-' currently separates REV from VER
    VER typically contains numerics, but alphas and '.' chars
        are not uncommon.  I think there's only one case of a
        '-' in VER:  bind-4.9.3-BETA24-1.  The current convention,
        not always followed 100.00%, is that VER should track the
        upstream maintainer's version number.  We might relax this
        to the extent of restricting VER to not contain any '-' chars.
        '-' chars in upstream version numbers could be transliterated
        to '.' or '_'.
    '-' currently separates VER from PKG
    PKG typically contains alphanumerics.  In some cases, it contains
        '-' chars.  It's typically taken from the upstream source code
        author's name for the package, which could contain any
        printable chars.  This
        cannot be restricted without relaxing the (IMO sensible)
        convention that debian package names track the upstream package
        name.  (That convention isn't followed in all cases.  Exceptions
        generally occur when a debian package is a split or a join
        of upstream packages, and where multiple binary packages are
        produced from a single source package.)
        
Counting from the end towards the front of the typical package filename
string, the first '-' encountered separates REV from VER.  Take that as a
reference point.

 -  Counting to the right from that point, the first '.' encountered
    separates REV from EXT.  Whitespace to the right of EXT delimits it.
 -  Counting to the left from that point, the first '-' encountered
    separates PKG from VER.  Whitespace to the left of PKG delimits it.
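
The procedure above can be sketched in a few lines of Python.  This is
only an illustration under stated assumptions: the function name
split_package_filename is made up for this example, the input is a bare
filename with no directory part and no surrounding whitespace, and
neither VER nor REV contains a '-'.

```python
def split_package_filename(filename):
    """Split '<PKG>-<VER>-<REV>.<EXT>' into (pkg, ver, rev, ext).

    Hypothetical helper: assumes a bare filename and that neither
    VER nor REV contains a '-' character.
    """
    # Reference point: the last '-' in the name separates VER from REV.
    ref = filename.rfind("-")
    if ref < 0:
        raise ValueError("no '-' found: %r" % filename)

    # To the right of the reference point, the first '.' separates
    # REV from EXT (EXT itself may contain further '.' chars).
    dot = filename.find(".", ref + 1)
    if dot < 0:
        raise ValueError("no '.' after the revision: %r" % filename)

    # To the left of the reference point, the nearest '-' separates
    # PKG from VER (PKG itself may contain further '-' chars).
    sep = filename.rfind("-", 0, ref)
    if sep < 0:
        raise ValueError("no '-' between PKG and VER: %r" % filename)

    pkg = filename[:sep]
    ver = filename[sep + 1:ref]
    rev = filename[ref + 1:dot]
    ext = filename[dot + 1:]
    return pkg, ver, rev, ext
```

With the example name from the table, split_package_filename
("ab-cd-1.23a-45678.tar.gz") gives ("ab-cd", "1.23a", "45678", "tar.gz").
The one known exception splits wrongly: "bind-4.9.3-BETA24-1.deb" comes
out as PKG "bind-4.9.3" and VER "BETA24".  If the '-' in VER were
transliterated to '.', "bind-4.9.3.BETA24-1.deb" would split as intended.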

That's a bit messy, but maintaining backwards compatibility is often
messy.  Blowing off backwards compatibility whenever maintaining
it gets inconvenient is a tempting option, but not something which
should become a habit (IMHO).

