
Re: tf-random id weirdness



The regex form is (and has always been) [A-Za-z0-9-_.]+ (completely
opaque) and Cabal has never had any other guarantee about the structure
of this identifier.  Case in point, GHC 7.10 took advantage of this
flexibility to try to enforce a maximum length on IPIs (although we
backed out of this change).  If the opaqueness of these identifiers is
taken seriously, the only supported way of getting your hands on the
name of the package and its version is to have kept track of it from
the beginning (cf. ConfiguredId in cabal-install's source code).
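
To illustrate what "keeping track of it from the beginning" could look
like on the consumer side, here is a minimal sketch in Python.  The
names TrackedPackage, register and lookup are hypothetical and are not
taken from Cabal or cabal-install; the sketch only assumes that the tool
already knows the name and version at the moment it first sees the
installed package ID, and it never inspects the ID's contents.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TrackedPackage:
    """Hypothetical record: name and version stored next to the opaque ID."""
    name: str
    version: str
    installed_id: str  # treated as opaque, never parsed


# Maps an opaque installed package ID to the record created for it.
_registry: Dict[str, TrackedPackage] = {}


def register(name: str, version: str, installed_id: str) -> TrackedPackage:
    """Record the name and version at the point where both are still known."""
    pkg = TrackedPackage(name, version, installed_id)
    _registry[installed_id] = pkg
    return pkg


def lookup(installed_id: str) -> Optional[TrackedPackage]:
    """Recover name and version by lookup rather than by parsing the ID."""
    return _registry.get(installed_id)


register("tf-random", "0.5", "tf-random-0.5-4z8OJUaXC1FRNfrLPFWAD")
pkg = lookup("tf-random-0.5-4z8OJUaXC1FRNfrLPFWAD")
print(pkg.name, pkg.version)  # prints "tf-random 0.5"
```

With this approach the length or layout of the ID can change without
affecting the consumer.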

I looked at Dh_Haskell.sh and it seems like it would be
inconvenient if this truly were the case.  So it sounds like what
you would like instead is a guarantee that an installed package
identifier embeds the package name and version.  I suppose that
we could give this guarantee (although it would not hold for GHC 7.10!)

Supposing we gave that guarantee, an installed package identifier
could currently be matched by the following productions:

    $alphanum_minus_num
        ::= [A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*
    $package_name
        ::= $alphanum_minus_num(-$alphanum_minus_num)*
    $package_ver
        ::= [0-9]+(\.[0-9]+)*
    $package_id
        ::= $package_name-$package_ver
    $installed_package_id
        ::= $package_id(-$hash)?
    $hash
        ::= [A-Za-z0-9-_.]+
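
As a rough sketch, the productions above could be transcribed into a
Python regular expression as follows.  This is not code from Cabal or
from Dh_Haskell.sh; the function name split_installed_package_id is
hypothetical.  Two choices here are assumptions of the sketch: the name
pattern also accepts single-component names such as "base", and the
hyphen is moved to the end of the hash character class so that it is
read as a literal character.

```python
import re

# Each hyphen-separated component of a package name must contain a letter.
COMPONENT = r"[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*"
PACKAGE_NAME = rf"{COMPONENT}(?:-{COMPONENT})*"
PACKAGE_VER = r"[0-9]+(?:\.[0-9]+)*"
# Hyphen placed last in the class so it is a literal, not a range.
HASH = r"[A-Za-z0-9_.-]+"

INSTALLED_PACKAGE_ID = re.compile(
    rf"(?P<name>{PACKAGE_NAME})-(?P<version>{PACKAGE_VER})(?:-(?P<hash>{HASH}))?"
)


def split_installed_package_id(ipid):
    """Return (name, version, hash or None), or None if the ID does not match."""
    m = INSTALLED_PACKAGE_ID.fullmatch(ipid)
    if m is None:
        return None
    return m.group("name"), m.group("version"), m.group("hash")


print(split_installed_package_id("tf-random-0.5-4z8OJUaXC1FRNfrLPFWAD"))
# prints ('tf-random', '0.5', '4z8OJUaXC1FRNfrLPFWAD')
print(split_installed_package_id("base-4.9.0.0"))
# prints ('base', '4.9.0.0', None)
```

Note that the match places no constraint on the length of the hash.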

To actually give this guarantee, we would need to write this restriction
into Cabal, and it won't be enforced by GHC until the next major release.
Furthermore, some other changes to package identifier parsing would need
to be made (specifically, we currently accept any number of trailing
tags after version numbers, e.g. 0.1-alpha; additionally, we permit a
version number to be dropped).

But I would quite like it if these identifiers could be kept
opaque; with things like internal libraries and Backpack they
definitely may be in flux.  Perhaps it would be possible for
Dh_Haskell.sh to pass around package identifiers (no hashes)
rather than installed package IDs?  I don't know enough about
the script to say one way or another.

Edward

Excerpts from Joachim Breitner's message of 2016-07-02 03:55:04 -0400:
> Hi Edward,
> 
> we treat it as opaque, but we currently try to match on a precise,
> predictable length – this seems to be more reliable than just matching
> on any sequence of characters. It helps, like, you know, types  :-)
> 
> But we can easily adjust. You can help us by giving a definite
> description of what package IDs can look like nowadays, e.g. as a regex.
> 
> Greetings,
> Joachim
> 
> Am Freitag, den 01.07.2016, 19:46 -0400 schrieb Edward Z. Yang:
> > Yeah, we started compressing the IDs so that they take less
> > length.  Is there something we can do to make things easier
> > for packagers?  In general, these identifiers are supposed
> > to be treated as opaque.
> > 
> > Edward
> > 
> > Excerpts from Joachim Breitner's message of 2016-07-01 06:16:24 -0400:
> > > Hi Edward,
> > > 
> > > Am Freitag, den 01.07.2016, 09:54 +0000 schrieb Clint Adams:
> > > > When building tf-random with ghc 8, an id of
> > > > 
> > > > tf-random-0.5-4z8OJUaXC1FRNfrLPFWAD
> > > > 
> > > > is produced.  Since this is the wrong length, this breaks
> > > > Dh_Haskell.sh .
> > > > 
> > > > Can someone explain what's happening and what should be done
> > > > instead?
> > > 
> > > previously, we (Debian Haskell packagers) could rely on package hashes
> > > to be 32 characters. Has this changed with GHC-8 somehow?
> > > 
> > > Greetings,
> > > Joachim
> > > 
> > 
> > 