
Re: RFC: DEP-14: Recommended layout for Git packaging repositories



On Sun, Nov 16, 2014 at 02:03:23PM -0200, Henrique de Moraes Holschuh wrote:
> On Sun, 16 Nov 2014, Ron wrote:
> > On Sat, Nov 15, 2014 at 03:49:56PM -0200, Henrique de Moraes Holschuh wrote:
> > > On Sat, 15 Nov 2014, Raphael Hertzog wrote:
> > > > On Fri, 14 Nov 2014, Ian Jackson wrote:
> > > > > > What exactly is your use case you feel this is essential for?
> > > > > 
> > > > > I think this discussion is in danger of going round in circles.
> > > > > I'm going to leave it here and let Raphael get on with it.
> > > > 
> > > > I understand Ron's logic and there's certainly value in questioning the
> > > > need for something. But this is always a question of cost/value.
> > > > 
> > > > Even if we don't have an immediate need for this property, the problem
> > > > is that we can't always envision all the ways people will want to use
> > > > the repositories (isn't that what you were trying to tell me about
> > > > how upstream can use git?) and I'm pretty sure that dropping the epoch
> > > > will be annoying to someone at some point. And the cost of not dropping
> > > > the epoch is not very high.
> 
> ...
> 
> > If the answer to that is simply "it's relatively harmless, has minimal
> > cost, some things already do it, and it may be a useful visual cue to
> > human users, and that's all", then that's a perfectly good reason.
> 
> Agreed.  But there is a strong technical reason, too, on top of some
> technical advantages.
> 
> So let me state the strong technical reason (which should at least be
> documented in DEP-14, because this _is_ something people often gloss over):
> 
> Anything that doesn't store the full debian version (be it a git tag or a
> filename) will have a collision risk.  Consider a package that has versions
> 1:2.3.5-1 and 5:2.3.5-1.

Right, that was one of the first cases I mentioned.  When you upload
5:2.3.5-1, dak will reject it exactly because they will collide. :)

Having the epoch in the tag means you won't find that out until you
try to upload it.  Not having it would mean trying that would fail
early and you might investigate why and find that out before you've
overwritten some existing file on your own system or uploaded a
broken package.
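
To make the collision concrete, here is a minimal sketch, assuming the
usual package_version_arch.deb filename convention, in which the epoch
is left out of the filename.  The package name "foo" and the helper
names are made up for illustration only.

```python
def strip_epoch(version):
    """Drop a leading 'epoch:' from a Debian version, if there is one."""
    _, sep, rest = version.partition(":")
    return rest if sep else version


def deb_filename(package, version, arch):
    """Build a .deb filename; by convention the epoch is not part of it."""
    return f"{package}_{strip_epoch(version)}_{arch}.deb"


# "foo" and "amd64" are placeholder values, not taken from the thread.
old = deb_filename("foo", "1:2.3.5-1", "amd64")
new = deb_filename("foo", "5:2.3.5-1", "amd64")
print(old, new, old == new)
```

Both versions map to the same filename, which is the collision that
gets the second upload rejected.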

fwiw, I don't consider that a compelling reason alone to not include
the epoch in tags - but you appear to have just confirmed that not
many people know this, even when it's explicitly mentioned in a
thread they just replied to!  So I think *that* is probably worthy
of a mention if we do recommend including the epoch in tags :)


> > If the answer is instead something like "I want automated tools to be
> > able to assume they can perfectly reverse the mangling and assume some
> > semantic meaning to the text in the tag", then we're into Broken By
> > Design territory and we really need to look more closely at *exactly*
> 
> Well, we can have a perfect, reversible transformation between the debian
> version namespace and the debian git-tag version namespace, where "debian"
> means "a vendor that uses dpkg and the deb format".

No.  I don't believe you can.

Mangle this version in a way that's legal to git: "0:1.2:3..4-1~~2..3.lock"

Now try to explain to me how you can reverse that.

Yes, it's a corner case you've probably never seen, but the troublesome
parts of that are all legal debian versions.
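
A rough sketch of the problem, assuming a hypothetical mangling that
only swaps ':' for '%' and '~' for '_'.  The checker covers just a
subset of the git-check-ref-format rules, and collapse_dots() is one
naive repair picked for illustration, not something anyone in this
thread has proposed.

```python
import re


def naive_mangle(version):
    """Hypothetical mangling: replace only ':' and '~', which git forbids."""
    return version.replace(":", "%").replace("~", "_")


def refname_problems(name):
    """Report violations of a subset of the git refname rules."""
    problems = []
    if ".." in name:
        problems.append("contains '..'")
    if name.endswith(".lock"):
        problems.append("ends with '.lock'")
    if name.endswith("."):
        problems.append("ends with '.'")
    if re.search(r"[ ~^:?*\[\\]", name):
        problems.append("contains a forbidden character")
    return problems


def collapse_dots(tag):
    """Naive repair (assumption): squash runs of dots into a single dot."""
    return re.sub(r"\.{2,}", ".", tag)


tag = naive_mangle("0:1.2:3..4-1~~2..3.lock")
print(tag)                    # 0%1.2%3..4-1__2..3.lock
print(refname_problems(tag))  # still illegal: '..' and the '.lock' suffix

# The naive repair is lossy: two distinct versions end up with one tag.
print(collapse_dots(naive_mangle("1..2-1")) == collapse_dots(naive_mangle("1.2-1")))
```

The character swap alone leaves an illegal refname, and this particular
repair throws away the information needed to get the version back.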


> In that case, a debian version tag really could have one valid semantic
> meaning: one can derive the debian version from the tag, and also the
> opposite.
> 
> What one cannot do is try to use either the debian version tag or the debian
> version to infer anything about the upstream version/version tags.  We agree
> on this.

There's really no reason to need to depend on either.  There are robust
ways to actually get this information correctly if you do need it.
There's no need to invent a new one that only partly works when the
planets are aligned auspiciously.

I do strongly believe we should not be encouraging any tool to be that
kind of fragile.  But that's a separate question to "should we include
the epochs in tags anyway, despite the fact they should *never* be
used for this".


> This assumes that DEP-14 will mandate exactly how debian version tags are
> formed, and that tools that require DEP-14 semantics to work correctly will
> be upfront about that.

Which is itself a fragile assumption.  It can't *mandate* anything,
or go back and change history to date.


> > what it is that people want to do and/or assume, and we probably need
> > to look at less fragile solutions that really do satisfy that need.
> 
> Trying to handle collisions due to missing epoch information in the git tag
> is far more fragile, IMO.
> 
> Any tool that needs the full debian version information would either need it
> in the tag in a known format, or it would have to parse debian/changelog,
> which requires fetching the correct blob from git, and parsing it using
> dpkg-parsechangelog.  This is *not* fun to implement correctly, especially
> when you don't want to mess with the work tree.

This is, a) trivial, gitpkg has done it for years.  b) it doesn't need
to mess with the work tree.  c) any tool that breaks depending on the
mess currently in your work tree is arguably already broken and fragile
in any case.

gitpkg can export *any* version you want, at any time, regardless of
the current state of your work tree.

Even if you're in the middle of working on a new version, you can still
safely export any version from it at any time.
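
As a minimal sketch of the general idea (not a description of how
gitpkg itself is implemented), the changelog can be read straight from
the object database with "git show <ref>:debian/changelog", which never
touches the work tree.  The helper names are hypothetical, and the
regex on the header line is a simplification; feeding the text to
dpkg-parsechangelog would be the more robust choice.

```python
import re
import subprocess

# First changelog line looks like: "package (version) distribution; urgency=..."
HEADER = re.compile(r"^(?P<source>\S+) \((?P<version>[^)]+)\) ")


def changelog_at(ref, repo="."):
    """Return debian/changelog as stored at `ref`, without any checkout."""
    result = subprocess.run(
        ["git", "-C", repo, "show", f"{ref}:debian/changelog"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def version_from_changelog(text):
    """Extract the full version, epoch included, from the newest entry."""
    match = HEADER.match(text)
    if match is None:
        raise ValueError("unrecognised changelog header line")
    return match.group("version")


sample = "foo (1:2.3.5-1) unstable; urgency=low\n\n  * Example entry.\n"
print(version_from_changelog(sample))
```

With a real repository this would be called as
version_from_changelog(changelog_at(ref)) for whatever ref you care
about, regardless of what is currently checked out.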


If other tools can't do that, people should fix them, or switch to
better tools, not formalise broken assumptions as a 'permanent'
state of affairs.  We have a working proof that broken assumptions
aren't required; going backwards from that in a 'best practices'
document seems like an oxymoron.


> > Henrique, would you care to elaborate on your definition of "safer"?
> 
> Sure.  I consider safer a design that loses no information when translating
> between the debian version namespace and debian version git-tag namespace,
> because you won't have permanent collisions among debian versions that differ
> only in the epoch.

Ok, I can agree with that definition, but as I showed above, this
plan doesn't satisfy it.  You'd need to fully URL encode them or
similar to have that, and I don't think we should do that.
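
For illustration, a sketch of what "fully URL encode" could look like,
assuming a hypothetical scheme that escapes every character outside
[A-Za-z0-9+-] as %XX.  It does round-trip, but the result shows why it
is not an attractive thing to recommend.

```python
from urllib.parse import unquote

# Characters passed through unchanged in this hypothetical scheme.
SAFE = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-")


def encode(version):
    """Percent-encode everything outside SAFE, including '.', ':' and '~'."""
    return "".join(c if c in SAFE else f"%{ord(c):02X}" for c in version)


def decode(tag):
    """Reverse encode() by undoing the %XX escapes."""
    return unquote(tag)


version = "0:1.2:3..4-1~~2..3.lock"
tag = encode(version)
print(tag)  # 0%3A1%2E2%3A3%2E%2E4-1%7E%7E2%2E%2E3%2Elock
print(decode(tag) == version)
```

Even an ordinary version such as 1.2-1 becomes 1%2E2-1 under this
scheme, so the tag stops being a useful clue to humans.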

You can still semi-meaningfully encode the epoch in a way that
is a clue to humans, but you *can't* guarantee the version is
reversible from the necessary mangling.


> > They still aren't guaranteed to be so even with this convention,
> > because there's no hard guarantee that everyone or everything will
> > use them.  And even if they did, I still don't think you've covered
> > every combination of things that are illegal in a git refname which
> > might need to be mangled here.
> 
> I'd expect tools that make use of DEP-14 strongly recommended/mandatory
> definitions to require DEP-14-compliant repositories.  I don't think there
> is any other sane way to deal with this, other than "don't write such a
> tool".

Well, don't write such a tool, and don't recommend things that
would mislead people into writing fragile tools :)

I haven't seen anybody suggest any use case yet where there actually
is no (already proven) way around being fragile - so I still do think
the sane way forward here is not to formalise a broken design and then
tell people who get bitten by it "yeah, you're screwed. because DEP14"
when we still have the chance to make good recommendations that aren't
broken like that.

Are there actually already existing tools that rely on reversing this?
If there aren't, we certainly shouldn't be encouraging creating new ones
with this flaw in the future.

  Cheers,
  Ron


