
Notes from the DebConf Source Format BoF



After a discussion on IRC, I organized a BoF at DebConf10 to discuss new
source formats, specifically 3.0 (git).  Below are the notes from that
discussion.  I tried to take reasonably comprehensive notes, but I'm sure
that I missed things.  Other participants, please add any additional bits
that I forgot!

Notes for source package format ad hoc session
Friday, 2010-08-06, 10:30 - 11:30 (EDT)

Agenda
 * 3.0 (git)
 * ftp-team worries about VCS-embedded source package formats
 * git push as an upload mechanism
 * how much does debcheckout make this irrelevant?
 * 4.0?

ftp-team is concerned about doing license checks across the entire git
archive.  Colin points out that we're in the same situation with Alioth for
redistributability.  However, it is easier to withdraw things from Alioth
than from the archive.  And redistributability (the legal requirement) is
a lot less of a bar than what we check for DFSG.
 - shallow clones do bound the amount of work that has to be done here
   * Colin thinks that people may want to upload a lot more than that, but
     Joey doesn't think they will.
 - Colin: straw man: why is the answer not a shallow clone containing one
   revision?
   * You can pick how much history you include
   * ftp-master can make their own policy, only allow for native packages,
     limit to shallow clones
     - Lintian for that check if possible
   * Remember internal use where 3.0 (git) may be a lot more attractive
   * The default should be something ftp-master would accept
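
As a rough illustration of what a "limit to shallow clones" check could
look at: Git records the commits whose parents were cut off by a shallow
clone in a file named "shallow" inside the Git directory.  The sketch below
only inspects that file.  The function names are made up for this example,
and this is not how Lintian or ftp-master actually check anything; it is
just one possible starting point.

```python
import os


def shallow_boundary(git_dir):
    """Return the commit IDs listed in <git_dir>/shallow.

    Git lists here the commits whose parents were cut off by a shallow
    clone.  A full clone normally has no such file, so the result is [].
    """
    path = os.path.join(git_dir, "shallow")
    if not os.path.isfile(path):
        return []
    with open(path, encoding="ascii") as handle:
        return [line.strip() for line in handle if line.strip()]


def is_shallow(git_dir):
    """Hypothetical policy helper: True if the repository looks shallow."""
    return bool(shallow_boundary(git_dir))
```

A real policy check would presumably also want to bound the number of
revisions, which needs Git itself rather than a look at one file.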

10 revisions doesn't really multiply the work by 10; it's equivalent to a
3.0 (quilt) package with 10 patches.  Is that enough history?  How do we
manage the size of the history versus the ftp-master review to keep from
imposing lots of additional work on ftp-master?
 * Having policy in advance is important; decisions on the fly cause arguments
   and frustration.

Concern that 3.0 (git) formats require Git to unpack.
Colin has heard from people auditing what's in Debian packages that they
don't want to have to trawl through version control repositories and would
rather see something more akin to what you get from a patch system.
 - Question: isn't that what you get from VCS log?
   - Answer: no, because a single change may be spread across lots of commits.
     stgit/topgit, bzr loom and hg patch queues do something like this, but
     those are all still vaguely experimental.

People who are trying to figure out what we're doing, without following
what we're doing with VCS, may be better served by 3.0 (quilt).

Joey says he pretty much agrees right now, but tools will improve and change.

You can sign a tag or branch: would it be possible for ftp-master to use a
signature or tag to mark how far their review has reached?  This would
require re-signing the source package, so probably not.
 - It's unlikely that ftp-master would be doing incremental checks

Pluses for the format:
* Working in a VCS and exporting to patches is really clumsy
* TopGit and the like are rather kludgy and kind of annoying
* Part of Joey's motivation is that if you look at GitHub, the people using it
  a lot consider Git to be a source package format, and Debian should think
  about how we attract those people and how we work with them.  That does
  suggest non-native may be key in the long run, though.
  - If we're trying to get these people to be Debian contributors, native may
    be okay.

* Colin talked to quite a lot of people trying to understand our source
  packages and from their point of view they know what a patch is but
  beyond that it gets fairly variable.
  - Joey: that may get back to whether we have it for non-native packages.
    Maybe just start with native packages.

rra has tried TopGit and finds it rather annoying, and ends up using a single
Debian patch with 3.0 (quilt), which is essentially the same as having 1.0.
 - One big patch is not really a seller.
 - Sponsors have not been looking at the Git repository and just building that;
   they're in a mode of getting the source package and looking at it, which
   makes clean 3.0 (quilt) formats important.
 - Packaging teams review from the VCS.

Colin uses a patch system and double-commits, because it becomes more readable
in the end.

Why not use debcheckout and some simple package format?
- I don't trust your repository will be up -- but if it's a shallow clone, you
  still don't get the history.  But at least you get something.
  * Does that give you anything that useful, though?  It's a lot less than
    what you get from debcheckout.  debcheckout doesn't work all that
    often, though.
- Colin wonders why we don't have a central directory of all the source
  package packaging repositories rather than putting it in package metadata.
  * Even with that, if you look at stuff in stable, the chances are that a
    lot of those repositories have gone away.
- debcheckout is only really useful if you're about to do development
  * There's no uniform way to get a particular revision of the package.
  * It may not be tagged, it may be on another branch, etc.

Joey would really rather upload his whole repository for things that he
knows are clean, but that's a problem for ftp-master review, and you have
to get into who you trust to make that determination.

You might be able to do a shallow clone of depth one and include every signed
tag that matches an entry in debian/changelog, but it may be too bloaty.  That
might ease the review.
 - How would topic branches fit into this scheme?
 - The default is to include only master, but you can include other things if
   you want.
 - This might be a good default behavior.
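
To make the "tags matching debian/changelog" idea concrete, here is a small
sketch of the selection step only.  It assumes tags are named
"debian/<version>" with ":" and "~" replaced (Git ref names can't contain
either character); that naming follows a common convention but is an
assumption here, not something decided at the BoF.  It also assumes the
caller has already produced the list of tags whose signatures verified,
since that part needs Git.  All function names are hypothetical.

```python
import re

# First line of a debian/changelog entry: "package (version) dist; urgency=..."
ENTRY_RE = re.compile(
    r"^(?P<package>[a-z0-9][a-z0-9+.-]*) \((?P<version>[^)]+)\) ",
    re.MULTILINE,
)


def changelog_versions(changelog_text):
    """Return the versions in debian/changelog, newest entry first."""
    return [m.group("version") for m in ENTRY_RE.finditer(changelog_text)]


def tag_for_version(version, prefix="debian/"):
    """Map a version to a tag name (assumed convention, see above)."""
    return prefix + version.replace(":", "%").replace("~", "_")


def tags_to_include(changelog_text, signed_tags):
    """Keep only the already-verified tags that match a changelog entry."""
    wanted = {tag_for_version(v) for v in changelog_versions(changelog_text)}
    return [tag for tag in signed_tags if tag in wanted]
```

Anything not on the returned list (topic branches, unsigned tags, tags for
versions never recorded in the changelog) would be left out by default.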
 
Best practices for Git repository layout?
- git-buildpackage documentation is closest to that

git push as an upload mechanism
- Attractive because over time it builds a Git repository for the package
- However, it assumes binaryless uploads, which we currently don't allow.
- You can't have a smart server that tries the build and rejects it if it
  fails because then how do you sign the package?
  * You could do this personally and then use debsign -r.
  * You would really have to trust this if it were done centrally; easier if
    you do it yourself with your own upload server.
- If anyone is interested, they should develop the software and try it for
  their own uploads and go from there.
  
If you're implementing a 3.0 format, please don't hard-code the extensions that
you "know" will be found in source packages, because as we add additional
files listed in *.dsc, we may add other types of files.
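
One way to follow that advice is to take the file list from the Files field
of the .dsc and never match on extensions at all.  The sketch below does
only that; it is a minimal illustration, not a complete .dsc parser (it
ignores signatures and the Checksums-* fields), and the function name is
made up for this example.

```python
def dsc_files(dsc_text):
    """Return (md5sum, size, filename) tuples from the Files field.

    The file names are taken as-is; nothing here assumes which
    extensions a source package may contain.
    """
    entries = []
    in_files = False
    for line in dsc_text.splitlines():
        if line.startswith("Files:"):
            in_files = True
            continue
        if in_files:
            # Continuation lines of a field start with whitespace.
            if line.startswith((" ", "\t")) and line.strip():
                parts = line.split()
                if len(parts) == 3:
                    entries.append((parts[0], int(parts[1]), parts[2]))
            else:
                # Next field or blank line: the Files field is over.
                in_files = False
    return entries
```

A tool built this way keeps working when a new kind of file shows up in the
list, and can decide what to do with each file after it has the names.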

One issue with 3.0 (quilt) is that when you check it out when it's
maintained in a VCS, you have two choices: commit the .pc directory and
files, or leave it out and then have to run some magic after you fetch the
VCS in order to work on the quilt patches.  (Assuming you check in to your
VCS the results of having all the patches applied.)  Colin has been
talking with bzr people about having it notice it and collect it on clone.
- Why don't you just check in with patches not applied?
  * Colin really hates that, because the great thing about 3.0 (quilt) for
    him is that patches are already applied.  It means that you can use
    vcs blame, vcs log, etc., and you get a consistent view of what
    changed this line in this file.  Without that, you don't get anything
    better than a traditional patch system.
  * If you use topic branches in the VCS and something like TopGit that
    generates them, you can get this, but then you can't maintain the patches
    in the VCS as well.
  * You can do this with a rebased patch branch, but then you don't get
    history on modifications to the patch.

What about building a unified VCS repository location (with whatever
VCSes) for all of Debian that everyone uses, which would simplify this a
lot?
- We lose a lot of flexibility by doing this, and Debian can't agree on
  doing this for everything.
- We do need to distribute source packages on CDs, so we need a source
  package format that includes all the source.
- We have lots of child distributions, and it's useful to have a source
  format that's useful when importing their packages when they may not
  have a central revision control system.
  
What about repository size bloat if revision control history is included?
- That's one of the reasons why shallow clones are important.
- Also, we do have a reasonable amount of archive space, particularly for
  source.
- The history ftp-master will review is more of a bottleneck.
- Are the advantages written down anywhere?
- Joey finds that in his usage patterns, git blame is the only case where he
  really cares about the whole history; in other situations, he mostly looks
  at recent revisions.
  
Currently in 3.0 (git), origin points to the bundle and doesn't embed the
actual repository, but Joey is working on fixing that.  (Setting origin
based on Vcs-Git.)
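
A minimal sketch of what "setting origin based on Vcs-Git" could amount to
after unpacking, assuming the Vcs-Git field is available from the package
metadata.  This is not Joey's implementation; the function names are
hypothetical, and the only Git command involved is "git remote set-url".

```python
def vcs_git_url(control_text):
    """Return the URL from the Vcs-Git field, or None if there is none."""
    for line in control_text.splitlines():
        if line.startswith((" ", "\t")):
            continue  # continuation line of some other field
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "vcs-git":
            fields = value.split()
            # Take only the first word in case anything follows the URL.
            return fields[0] if fields else None
    return None


def set_origin_command(control_text):
    """Build the argv that would repoint origin, or None without Vcs-Git."""
    url = vcs_git_url(control_text)
    if url is None:
        return None
    return ["git", "remote", "set-url", "origin", url]
```

The returned list could be passed to subprocess.run() inside the unpacked
tree; when the package has no Vcs-Git field, origin is simply left alone.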

3.0 (git) does ensure that you get whatever branch is the Debian build
branch, not possibly the upstream branch that you might get if you
debcheckout something with a repository that combines local and upstream.

source.debian.org is working on importing source packages into a Git
repository and storing the history as one revision per new source package
upload.

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>

