Hi,

this is a follow-up to my question after the dgit talk today: It would be great to have a git view of a package’s history in Debian. There is some overlap with dgit, in the sense that if everyone had been using dgit from the start, we would already have that. But dgit’s objectives are slightly different, so maybe my question can be posed and answered separately.

There is precedent for what I want: http://hdiff.luite.com/ is a service that imports every Haskell package upload into a git repository and provides a cgit interface to it. This has been very useful to me as a tool to investigate what happened when, and to easily view diffs.

Now snapshot.debian.org already contains all the data that should go into these git repositories. What would stop us from importing all of the source packages into per-package git repositories? Given that it’s only source and there is compression, I would expect the resource usage to be acceptable.

If the answer is “Nothing is stopping us, just that someone has to do it”, then I’m volunteering, as long as I can do most of it during DebConf. Peter, what do you think? I probably do not need more than access to snapshot.debian.org and a directory there to work in.

Technically, this is how I would do it. I phrase it in terms of the git data model, and not in terms of the git commands that produce it, as that gives a cleaner specification.

 * Every source package from snapshot.d.o, extracted with dpkg-source -x as usual, produces a git tree object. I’d probably simply ignore empty directories.

 * Every source package also produces a git commit, with
   - Tree: the above
   - Author: top changelog entry
   - Date: also top changelog entry
   - Description summary: the version number
   - Description text: the top changelog entry
   - Parents: This is the interesting bit.

   The set of parents should be the commits corresponding to any version mentioned in debian/changelog, minus those that are already transitively reachable from another commit in that set. This ensures that we get a nice git DAG for things like packages that have been in experimental for a while, merging from unstable repeatedly.

   The order of parents could correspond to the order in debian/changelog, so that the second changelog entry becomes the first parent.

   These rules should, unless new historic packages suddenly appear, ensure that we get identical git hashes if we re-run this tool, which is good.

 * Every suite (unstable, jessie, ...) becomes a branch, pointing to the corresponding commit.

 * Optionally: one tag per version, pointing to the corresponding commit. Although maybe that would produce too many tags...

Greetings,
Joachim

-- 
Joachim "nomeata" Breitner
Debian Developer
  nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: F0FBF51F
  JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata
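
P.S. To make the parent rule from the proposal concrete, here is a minimal Python sketch of it, not a definitive implementation. The names ancestors, select_parents and parents_of are made up for illustration. It assumes that versions are plain strings, that the changelog versions are passed newest first and without the version being imported, and that versions which were never found on snapshot.debian.org are simply skipped (the proposal does not say what should happen to those).

```python
def ancestors(version, parents_of):
    """Return every version reachable from `version` via parent links.

    `parents_of` is a hypothetical mapping from an already imported
    version to the list of its parent versions. The starting version
    itself is not part of the result.
    """
    seen = set()
    stack = list(parents_of.get(version, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents_of.get(current, []))
    return seen


def select_parents(changelog_versions, parents_of):
    """Pick the parents for a new commit.

    `changelog_versions` lists the versions mentioned in debian/changelog,
    newest first, excluding the version being imported.
    """
    # Assumption: versions that were never imported are ignored.
    candidates = [v for v in changelog_versions if v in parents_of]

    # Everything that some candidate can already reach is redundant.
    reachable = set()
    for v in candidates:
        reachable |= ancestors(v, parents_of)

    # Keep changelog order, so the second changelog entry comes first.
    return [v for v in candidates if v not in reachable]


# Made-up history: 1.0-1 and 1.0-2 went to unstable, 1.1-1 to experimental.
history = {
    "1.0-1": [],
    "1.0-2": ["1.0-1"],
    "1.1-1": ["1.0-1"],
}

# 1.1-2 is uploaded to experimental after merging 1.0-2 from unstable.
parents = select_parents(["1.1-1", "1.0-2", "1.0-1"], history)
print(parents)
```

In this made-up history, 1.1-2 becomes a merge commit with the parents 1.1-1 and 1.0-2, in that order. 1.0-1 is dropped because both of them already reach it.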