
Re: [RFC] Proposal for new source format



Bastian Blank <waldi@debian.org> writes:
> On Mon, Oct 21, 2019 at 09:29:05PM -0700, Russ Allbery wrote:

>> If we're going to go to the trouble of defining a new source format,
>> I'd prefer we embrace a VCS-based one rather than once again rolling
>> our own idiosyncratic representation of a tree of files

> I'm not completely sure what you mean by "VCS-based".  You want to add
> a complete repository (dump) to the source?  Do we need to define a
> subformat for each VCS then?  CVS, SVN, GIT, just to name some used
> ones.  In any case we would be defining our own representation anyway,
> because each VCS behaves differently.

I think it's safe at this point to just use Git.  It's the dominant VCS by
far and seems likely to remain so for the foreseeable future.  We'll need
to have some mechanism to generate a simple Git tree from source packages
in some other format, but that's not really a problem.  Right now, we
convert 100% of non-native packages to a different format than their
original.  If we based the source format on Git, we would be closer to
most of our upstreams, and the difference for non-Git upstreams doesn't
seem too significant.

If at some point something else takes over from Git, we could always
switch to a 5.0 format.

> Also this would negate all the things we've accomplished on
> reproducibility of source packages.

That seems excessively pessimistic.  What about Git makes you think it's
impossible to create a reproducible source package?
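
As a small illustration of why Git content seems a reasonable basis for
reproducibility: Git names each file blob by a hash of its contents, so
identical file contents always get the same object ID.  The sketch below
recomputes a blob ID in Python.  The function name git_blob_id is made up
for this example, and it assumes the default SHA-1 object format.  Commit
objects also record author and committer timestamps, so reproducing whole
commits would need those pinned as well; that part is not shown here.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content.

    Hypothetical helper for illustration; assumes the SHA-1 object format.
    """
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Same bytes in, same ID out, regardless of who computes it or when.
    print(git_blob_id(b"hello\n"))
```

The result should match what `git hash-object` reports for a file with the
same bytes.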

> We never shipped history as part of our source.  Was this asked for?

We've always shipped one version of history as part of our source.  That's
a large part of the point of separate upstream and Debian tarballs.  With
the addition of quilt, we ship even more history in the form of the patch
sequence.

And yes, this has been repeatedly requested and wanted by the project
going all the way back to Joey Hess's original proposal for the 3.0 (git)
source package format.  I think that was at least ten years ago?

Those of us who wanted it then haven't stopped wanting it.

> dpkg currently supports "3.0 (git)" as format, however it was never
> accepted by the archive.

To be clear, many of us would be happily using it right now.  It wasn't
accepted by the archive because the archive team vetoed it.

> There is a reason for that, as this would force license reviews not only
> on the current state, but on the history as well.  We would also just
> distribute arbitrary information we don't actually need to ship to
> hundreds of unrelated mirror operators and would put them in jeopardy.

> If something really problematic slips in, we also would be forced to
> remove all intermediate versions, because they ship the history.

I understand the concerns with shipping *all* of the history, and I think
we'll need to get somewhat creative about what history to include and what
history to elide if we still have concerns about non-free elements
sneaking into the archive via history (which I'm dubious about, but see
below).  But Git has mechanisms to handle this (shallow clones, for
instance) that still preserve some of the utility of having a native Git
package.
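
To make the shallow-clone option concrete, here is a rough sketch of how a
tool might fetch only the most recent revision of a packaging repository.
The helper names, the example URL, and the tag name are all hypothetical;
the only real interface used is `git clone` with its `--depth` and
`--branch` options.  This is one possible shape, not a proposal for how
the archive would do it.

```python
import subprocess
from typing import List, Optional


def shallow_clone_command(url: str, dest: str, depth: int = 1,
                          branch: Optional[str] = None) -> List[str]:
    """Build a `git clone` command that truncates history to `depth` commits.

    Hypothetical helper; `branch` may name a branch or a tag.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    cmd = ["git", "clone", "--depth", str(depth)]
    if branch is not None:
        cmd += ["--branch", branch]
    cmd += [url, dest]
    return cmd


def shallow_clone(url: str, dest: str, depth: int = 1,
                  branch: Optional[str] = None) -> None:
    """Run the clone; raises CalledProcessError if git exits non-zero."""
    subprocess.run(shallow_clone_command(url, dest, depth, branch), check=True)


if __name__ == "__main__":
    # Placeholder URL and tag, shown only to illustrate the resulting command.
    print(shallow_clone_command("https://example.org/pkg.git", "pkg",
                                branch="debian/1.0-1"))
```

With depth set to 1 the clone carries a single revision, which is the
"shallow clone with only one revision of history" case.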

> I think we are talking about different things.  I'm talking about the
> source we _must_ provide to fulfil several licenses and our own policy.
> If we save them in the form of snapshots.d.o for example, we have a
> complete history of the releases.

> You are talking about the detailed history, a history that might not
> even be accurate, as it can be changed retrospectively.

To be clear, I think including the history is just one of many advantages
to basing the source format on Git.  The overall advantage is that for
many packages the Debian source package becomes a familiar construct,
rather than some idiosyncratic invention of Debian, that can be
manipulated with standard tools and that is far closer to something that
one can immediately start hacking on.  This has huge benefits even if we
ship only a shallow clone with only one revision of history.

It has more benefits if we can include history, of course.

It also more clearly unblocks releasing by pushing signed tags, which is
how many free software packages (if not most by total number, though
perhaps not by significance) do releases these days.  That lowers the
barrier to entry for people packaging for Debian and again standardizes
on common tools.

> Just think about what would happen if a contributor adds code he must
> not distribute for whatever reason.  Another contributor finds this and
> removes it before any release happens.  This shows up some time later
> and we get angry mails or letters stating we ship stuff we must not.  So
> we now need to purge this information everywhere, even if it was never
> inside a release.

How do you plan to deal with this problem with Salsa right now?  Can't the
archive use the same mechanism that Salsa would?

There are also plenty of packages where the risk of this happening seems
low and where the Debian package maintainer might want to accept the risk
of possibly having to throw out history in the future (most native
packages, for instance, or packages where the Debian package maintainer is
also upstream).

> I think I understand what you mean and we have different goals.

> I want to modify how we ship the source we _must_ ship, where we don't
> have the option not to.  Just make the handling of it less painful,
> without sacrificing too many things we currently have.

> You want to ship more info in immutable form.  Info that has the
> ability to bite us, the whole project, and many other people, just by
> distributing it.

> I hope this makes clear why I proposed what I did and did not go
> further.

I'm disappointed that the archive team seems to be refusing to engage with
goals that many of us have been advocating for over the past ten years.

Revising the source format without supporting tag2upload and native VCS
representations of packages strikes me as a waste of everyone's time and
will mean that we'll need a 5.0 format in the near future that does
support those goals.

-- 
Russ Allbery (rra@debian.org)              <https://www.eyrie.org/~eagle/>

