Request for help - faithful source format (related to dgit)
This is a call for help for one or two volunteers who:
- are keen on gitish (or similar) workflows
- have some time right now
- can speak Perl (as found in dpkg-source)
- are willing and able to do some negotiation as well as coding
Introduction:
One persistent difficulty with our current source package
representations, for non-native packages, is that none of them
represent deletions of files which are present in upstream tarball(s)
but which the maintainer deletes (or wishes to delete).
This causes especial trouble for dgit. (See dgit(7).)
I would like to see a better solution to this, preferably in stretch.
I think this is still possible. I started an attempt but I got
slightly discouraged by some unexpected qualms in some quarters. And
I have substantial amounts of other dgit work to do (notably, sane
handling of gbp git trees), so I am not going to be able to put in the
time to fix this properly for stretch.
Is there anyone who would like to help with this?
Solution-neutral problem statement:
* The primary objective is to be able to represent any tree (let us
say, anything which is representable as a git "tree object") as a
Debian source package, while still retaining the space/bandwidth
and traceability advantages of basing the source package on
upstream tarballs.
* So there should be a non-native source format which is capable of
accurately representing at least any git-representable tree, as
`differences' from any source tarballs.
* "Accurately" means that it must represent:
- changes to executableness of files
- files present in upstream but removed in Debian
- replacement of a directory with a file or vice versa
- symlinks
- removal of "unusual" filesystem objects such as fifos,
sockets, devices, or whatever (which some upstream
tarballs ship)
- creation, deletion and modification of binary files (ideally,
efficiently representing small changes to big binaries)
* This format should be available in a patch-stack-less version, at
the very least.
* This format should support multiple orig tarballs in the same way
as `3.0 (quilt)'.
* Ideally this would be in stretch and backportable to jessie or
even earlier.
* It is NOT a requirement that `the same changes' (as represented by
whatever actual container file - .diff or .rsync or whatever - ends
up being in the source package) can be applied to different
upstream tarballs. The assumption is that maintainers who use this
format will use other means (eg, a version control system) to
rebase the Debian changes onto a new upstream.
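To make the "Accurately" list above concrete, here is a rough Python
sketch that compares an unpacked upstream tree with the Debian tree and
labels each difference. It is an illustration only, not part of dpkg or
dgit; the function names (scan_tree, classify_changes) and the change
labels are made up for this example, and it reads whole files into
memory, so it is not suitable for big binaries.

```python
import os
import stat


def scan_tree(root):
    """Map each path under root (relative to root) to a (kind, detail) pair.

    Symlinks are recorded, never followed.
    """
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            mode = os.lstat(full).st_mode
            if stat.S_ISLNK(mode):
                entries[rel] = ("symlink", os.readlink(full))
            elif stat.S_ISDIR(mode):
                entries[rel] = ("dir", None)
            elif stat.S_ISREG(mode):
                # git records only one executable bit, so track just that.
                entries[rel] = ("file", bool(mode & stat.S_IXUSR))
            else:
                # fifo, socket, device node, or anything else unusual
                entries[rel] = ("special", None)
    return entries


def read_bytes(path):
    with open(path, "rb") as f:
        return f.read()


def classify_changes(upstream_root, debian_root):
    """Return {relative path: [labels]} for every path that differs."""
    up = scan_tree(upstream_root)
    deb = scan_tree(debian_root)
    changes = {}
    for rel in sorted(set(up) | set(deb)):
        if rel not in deb:
            changes[rel] = ["deleted"]
        elif rel not in up:
            changes[rel] = ["added"]
        elif up[rel][0] != deb[rel][0]:
            # e.g. a directory replaced by a file, or a file by a symlink
            changes[rel] = ["type-changed"]
        elif up[rel][0] == "symlink":
            if up[rel][1] != deb[rel][1]:
                changes[rel] = ["symlink-retargeted"]
        elif up[rel][0] == "file":
            labels = []
            old = read_bytes(os.path.join(upstream_root, rel))
            new = read_bytes(os.path.join(debian_root, rel))
            if old != new:
                labels.append("content-changed")
            if up[rel][1] != deb[rel][1]:
                labels.append("mode-changed")
            if labels:
                changes[rel] = labels
    return changes
```

Every label other than "added" and "content-changed" on a text file
marks a kind of change that the problem statement asks the new format
to carry.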
Stakeholders:
* The dpkg maintainers (CC'd) obviously have a big stake in this
and their support will be essential.
* ftpmaster will need to support the new format too.
* Consultation with the wider community of Debian contributors, and
derivatives, would be helpful, but should not be allowed to block.
Technical ideas:
* rsync batch mode provides a stable binary format, and rsync is
extremely reliable software of exceptional quality and stability.
rsync has a very good history of backward-compatibility.
A format a bit like 1.0 but supporting multiple tarballs and
replacing the .diff.gz with a .rsync.gz (with a specified rsync
protocol version, to ensure compatibility) would work.
I proposed this some time last year. ftpmaster were concerned that
they would lose an easy way to view the diff between upstream and
Debian; so, this approach would require some kind of update to the
ftpmaster tools.
If a patch stack version is wanted, the .debian.tar.gz could be
replaced by a .rsync.gz which creates debian/patches/* along with
whatever other changes are wanted. This may be confusing, because
not every Debian change would be in debian/patches/.
* There has been talk of using advanced GNU diff/patch features. I
am rather sceptical of these because they are very new, and because
diff and patch have a rather poor history in backward compatibility
terms.
Also it may be difficult to control exactly which diff/patch
features end up being used. This has already caused some lossage,
where a new not-backward-compatible source sub-format was
accidentally introduced into our archive (!)
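For the rsync batch idea above, the two halves (producing the batch
file when building the source package, and replaying it when
unpacking) might look roughly like the following Python sketch. It only
constructs command lines and runs nothing. The function names and the
example file names are hypothetical; the rsync options used
(--only-write-batch, --read-batch, --protocol, --archive, --delete) are
existing rsync options, but whether --protocol alone is enough to pin
the batch format is an assumption that would need checking. The gzip
layer of a .rsync.gz is left out.

```python
def write_batch_argv(debian_tree, upstream_tree, batch_file, protocol=None):
    """Command line that records how to turn upstream_tree into debian_tree.

    --only-write-batch writes the batch file without modifying the
    destination (the unpacked upstream tree).
    """
    argv = ["rsync", "--archive", "--delete",
            f"--only-write-batch={batch_file}"]
    if protocol is not None:
        # Assumption: forcing a protocol version keeps the batch readable
        # by the rsync versions we care about.
        argv.append(f"--protocol={protocol}")
    # Trailing slashes make rsync operate on directory contents.
    argv.append(debian_tree.rstrip("/") + "/")
    argv.append(upstream_tree.rstrip("/") + "/")
    return argv


def read_batch_argv(batch_file, target_tree, protocol=None):
    """Command line that replays the batch onto a freshly unpacked upstream tree."""
    argv = ["rsync", "--archive", "--delete", f"--read-batch={batch_file}"]
    if protocol is not None:
        argv.append(f"--protocol={protocol}")
    argv.append(target_tree.rstrip("/") + "/")
    return argv
```

For example, write_batch_argv("debianised", "upstream", "foo.rsync",
protocol=30) gives the packaging-side command, and
read_batch_argv("foo.rsync", "unpacked") the unpacking-side one; the
protocol number 30 here is just a placeholder value.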
One part of the task I need help with is negotiating and selecting an
appropriate technical approach.
Thanks for your attention.
Ian.
--
Ian Jackson <ijackson@chiark.greenend.org.uk> These opinions are my own.
If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.