
Re: RFC: DEP-14: Recommended layout for Git packaging repositories



On Wed, Nov 19, 2014 at 03:41:12PM +0000, Ian Jackson wrote:
> Ron writes ("Re: RFC: DEP-14: Recommended layout for Git packaging repositories"):
> > As I explained in the earlier discussion with Henrique, there are more
> > things than just : and ~ which are perfectly legal in debian version
> > strings, but are illegal constructs in git refnames.
> 
> AFAICT[1], the exceptions are:
>  * No leading `.'
>  * No trailing `.'
>  * No trailing `.lock' (wtf?)
>  * No consecutive dots `..'

There are more rules than that, but yes, I think these are the only
ones that intersect with legal debian versions.
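
For illustration, the four cases listed above can be checked mechanically.
This is only a minimal sketch in Python: the function name
refname_problems is hypothetical, it assumes the version string is used
verbatim as a single component of a ref name, and it deliberately ignores
`:' and `~', which need handling of their own.

```python
def refname_problems(version):
    """Return which of the four rules quoted above a version string
    would break if used verbatim as one component of a git ref name."""
    problems = []
    if version.startswith("."):
        problems.append("leading dot")
    if version.endswith("."):
        problems.append("trailing dot")
    if version.endswith(".lock"):
        problems.append("trailing .lock")
    if ".." in version:
        problems.append("consecutive dots")
    return problems


if __name__ == "__main__":
    for v in ["1.0-1", "1.0.", "1..2-1", "1.0.lock"]:
        print(v, refname_problems(v))
```

An empty list means none of these four rules is hit; it does not mean the
name is a valid ref, since git applies further rules that this does not
cover.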

I'm not sure what you think is wtf about git wanting to ensure that
its lock files are unambiguously not refs though.


> I am not aware of any special handling of these cases by existing
> Debian tools.  I think git-buildpackage and dgit and probably most
> other tools would simply construct a git-illegal tag name, and bomb
> out when they can't create the tag.

Did you miss the part in the mail you replied to where I said that
gitpkg handles these, and that (I think) git-dpm handles them too?
gitpkg has handled them since the very beginning and sanitises
version strings anywhere that they are automatically turned into
a generated refname.

> I think existing version numbers which violate these rules are
> probably rare.

Yes, they are surely rare.  But when you have a tool like
git-debimport that was designed to be able to import every previous
version of a package that was ever released to build a complete
history of the package, it needs to be able to cope with what is
legal, not just what is common.


> We should IMO fix this by inventing a representation for these
> troublesome cases.  The following characters in git ref names
> generated from Debian version numbers remain entirely unused and
> available AFAICT:
>   ! = ,

There's a pretty good reason people avoid those for literal strings
that might be passed on a shell command line whenever possible ...


> Additionally we can use these if we are slightly careful:
>   @
> There are some obscure restrictions on `@' which aren't likely to
> affect us.
> 
> 
> I would suggest that we should do some simple replacement of
> troublesome `.' by (say) `!'.
> 
> Eg in Perl:
>   s/\.(?=$|-|\.|lock$)|^\./!/g;
> (We want to replace . preceding - so that we don't end up with
> revisionful numbers being treated differently to the corresponding
> revisionless substrings.)  Examples:
>   foo..bar            foo!.bar
>   foo.-bar            foo!-bar
>   .foo                !foo
>   bar.                bar!
> 
> But, ultimately, this is a bikeshed issue.
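
For anyone who wants to try the quoted substitution outside Perl, here is
a rough Python port. It is a sketch, not what any existing tool does: the
name mangle_dots is hypothetical, `!' is just the replacement character
suggested in the quoted text, and `:' and `~' are not touched at all.

```python
import re

# Same pattern as the quoted Perl s/// expression: a dot is replaced when
# it is followed by end-of-string, `-', another dot, or a final "lock",
# or when it is the very first character.
_TROUBLESOME_DOT = re.compile(r"\.(?=$|-|\.|lock$)|^\.")


def mangle_dots(version):
    """Replace the dots that git would reject with `!'."""
    return _TROUBLESOME_DOT.sub("!", version)


if __name__ == "__main__":
    for v in ["foo..bar", "foo.-bar", ".foo", "bar."]:
        print(v, mangle_dots(v))
```

Run on the four examples in the quoted text, this gives the same results
as shown there.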

Well, you've still skipped over the issue of why we'd even *want*
something this arcane, that pretty much nobody who tags things
by hand is ever going to follow, and what it's going to achieve
that isn't just a fragile castle made of sand.

Tags are convenience names and bookmarks for *humans*.  There are
far better ways to get the information that machines need robustly.
Trying to conflate them will just make something that is horrible
for both.


> I think the authors of DEP-14 and/or the maintainers of
> git-buildpackage should decide.
> 
> Guido, Raphael: please let us know what you decide.

If you do that, please rename this DEP to "recommendations for gbp".
If that's what you want, I'm happy to let you do whatever you please,
but if you want this to actually be a proper exploration of best
practices, then it actually needs some rigour to *answer* the hard
questions, and to not ignore the actual working solutions that tools
you don't use have already developed to deal with them successfully.

Just because you've never thought about these things before and were
unaware of them, doesn't mean that everyone else was ignorant of them
too.

If you're inventing convoluted solutions on the run, without clearly
identifying the problem they are supposed to solve, it's hard to
imagine that there aren't a whole host of other things you've probably
missed about this too which could also benefit from some more open
and detailed discussion.


You avoided the question and got angry the first time I asked it,
and didn't comment on the further discussion that Henrique and I
had about what was and wasn't possible with this.  If you see some
real problem that you think I'm missing and haven't so far addressed,
please tell us what the *problem* is, not what the first solution
you could think of was.  That way we can actually have a sensible
discussion on it, to arrive at some sort of consensus that covers
*all* the problems everyone here can see.  Not just the one your
one tool cares about.


> > To make this even more amusing, somewhere between git 1.7 (on squeeze)
> > and git 2.1, the rules for what were legal in git refs were tightened
> > (at least once) to exclude even more things.  I don't know offhand if
> > this is likely to happen again at any stage.
> 
> The differences are (things documented as permitted git ref names in
> 1.7 and documented as forbidden in 2.1.3):
>  * Consecutive slashes
>  * Whole ref name beginning or ending with a slash
>  * Whole ref name being the single string `@'
> 
> This seems like a documentation clarification, rather than an
> implementation change.  I doubt very much that any such ref names
> would have worked with 1.7.

Yes, that's possibly quite true.  I know I tightened up the filtering
in gitpkg to also handle the cases covered by the new rules, but I
didn't dig into what if anything changed in git itself, at least
partly because the changes didn't seem like something that would
actually intersect with a name one of the gitpkg tools would ever
need to mangle.

Given this I'm a little hesitant to make any strong assumption
that could paint us into a corner in the future though.
At least not when there's a trivial way to avoid that, as I
think we have available to us.
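
To make the three newly documented restrictions quoted above concrete,
here is a minimal Python sketch. The function name breaks_newer_rules is
hypothetical, and it covers only those three cases; `git check-ref-format'
remains the authoritative check for a full ref name.

```python
def breaks_newer_rules(refname):
    """True if a full ref name hits one of the three restrictions that
    are documented for git 2.1.3 but not for 1.7."""
    return (
        "//" in refname             # consecutive slashes
        or refname.startswith("/")  # whole name begins with a slash
        or refname.endswith("/")    # whole name ends with a slash
        or refname == "@"           # whole name is the single string `@'
    )


if __name__ == "__main__":
    for ref in ["debian/1.0-1", "debian//1.0-1", "/debian/1.0-1", "@"]:
        print(ref, breaks_newer_rules(ref))
```

Since neither `/' nor `@' is legal in a Debian version, a name built as a
fixed prefix plus a version string would only trip these through the
prefix itself.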


> > I'd also be a bit wary of changing existing tools to behave differently
> > (between different Debian releases that are still supported even) without
> > a very careful look at what might be affected if previous transformations
> > were no longer reproduced in the same way, or behaved differently depending
> > on exactly which chroot you ran them in.
> 
> At the moment we have a number of tools which do things differently.
> The effect is that it is difficult to rely on the information you get
> from the various repos.
> 
> Standardising this seems like an improvement.

I apologise if I'm somehow seeming thick to you over this, but an
improvement of *what*?

The information that you really need isn't, and as we're seeing,
pretty much _can't be_ in the tag names in any even remotely elegant
and robust way.  The tags are for humans and we shouldn't try to
usurp that from them.  If you want machine data, there are much better
ways to obtain that.

If we start with a problem description, we can talk about those much
more easily, without needing to get frustrated with each other :)  And
maybe together we'll see things that improve all our tools.

  Thanks in advance!
  Ron


