- To: firstname.lastname@example.org
- Subject: Tracking versions
- From: Matt Zimmerman <email@example.com>
- Date: Thu, 12 Dec 2002 15:23:21 -0500
- Message-id: <20021212202320.GL29455@mizar.alcor.net>
- Mail-followup-to: Matt Zimmerman <firstname.lastname@example.org>, email@example.com
- In-reply-to: <20021212181145.GA6249@riva.ucam.org>
- References: <OF184EFD11.6AEC20B5-ON86256C8C.005C25EE@norlight.com> <20021211193929.GA16925@netexpress.net> <20021211231729.GA23676@home.ouaza.com> <20021212012656.B24788@gandalf.drinsama.de> <20021212082559.GA10638@zombie.inka.de> <20021212103122.GB2342@riva.ucam.org> <20021212165506.GE29455@mizar.alcor.net> <20021212181145.GA6249@riva.ucam.org>
On Thu, Dec 12, 2002 at 06:11:45PM +0000, Colin Watson wrote:
> On Thu, Dec 12, 2002 at 11:55:06AM -0500, Matt Zimmerman wrote:
> > The obstacle I have encountered (in my efforts to automate tracking
> > this sort of information for security issues) is that there is no way
> > to reliably determine whether two versions of a package are on the
> > same branch of development.
> On several occasions aj has suggested changelog parsing for this,
> building a tree of known versions and their inheritance for every
> package. I don't quite see all the details of how this will work yet,
> but it does seem like the most obviously correct approach. Modifying
> dpkg-parsechangelog and/or apt-ftparchive will probably be necessary.
> See the archives of debian-debbugs for discussion.
I saw that now, in the IRC log that someone posted a URL for. I can't say
that I like the idea much, because it requires a lot of data, and that data
can't be extracted from (e.g.) the .changes file. It's also inconvenient to
work with, because in order to compare two versions properly, they must be
traced back to a common ancestor. It's not even trivial to find the right
changelog data if you're working with binary packages (as in debbugs), since
that requires maintaining historical information about source->binary
mappings. The recent changelog entries for binary package foo could be in
source package bar now, even though foo used to be built from a source
package named foo, which is where the old changelog entries still are.
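
To make the cost of that comparison concrete, here is a rough sketch of tracing two versions back to a common ancestor. It is only an illustration, not anything debbugs or dpkg does today: it assumes each changelog has already been reduced to a list of version strings, newest first, and the function names and sample versions are made up for the example.

```python
# Hypothetical sketch: each changelog is a list of version strings,
# newest first, in the order they appear in debian/changelog.

def build_parents(changelogs):
    """Map each version to the version listed directly below it."""
    parents = {}
    for versions in changelogs:
        for newer, older in zip(versions, versions[1:]):
            # First changelog seen wins; real data would need some
            # policy for handling inconsistent histories.
            parents.setdefault(newer, older)
        if versions:
            parents.setdefault(versions[-1], None)
    return parents

def ancestry(parents, version):
    """Return the chain from version back to the oldest known entry."""
    chain = []
    seen = set()
    while version is not None and version not in seen:
        chain.append(version)
        seen.add(version)
        version = parents.get(version)
    return chain

def common_ancestor(parents, a, b):
    """Return the newest version that both a and b descend from, or None."""
    a_chain = set(ancestry(parents, a))
    for v in ancestry(parents, b):
        if v in a_chain:
            return v
    return None

def same_branch(parents, a, b):
    """True if one of the two versions is an ancestor of the other."""
    return common_ancestor(parents, a, b) in (a, b)

# Made-up example data: one changelog from unstable, one from a stable update.
unstable = ["1.2-1", "1.1-1", "1.0-2", "1.0-1"]
stable_update = ["1.0-2woody1", "1.0-2", "1.0-1"]
parents = build_parents([unstable, stable_update])
```

With this sample data, `same_branch(parents, "1.2-1", "1.0-2")` is true, while "1.2-1" and "1.0-2woody1" only share the ancestor "1.0-2" and so sit on different branches. Note that even this toy version needs the full changelog of both packages before it can answer anything.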
I wrote a script a long time ago which would extract changelog data and put
it into an RDBMS, allowing for tools that would extract a specified range of
changelogs to create reports (a server-side companion to apt-listchanges),
but I didn't get any response to my inquiries about hooking into the
processing of newly uploaded packages and moved on to other projects.
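
For illustration only, a minimal sketch of that kind of tool follows. It is not the original script: the table layout, the function names and the choice of SQLite are all assumptions made for the example. It relies only on the standard changelog entry header, "package (version) distribution; urgency=...", and orders entries by their position in the file rather than by comparing version strings.

```python
import re
import sqlite3

# Matches an entry header such as "foo (1.2-1) unstable; urgency=low".
HEADER = re.compile(r"^(\S+) \(([^)]+)\) ([^;]+);")

def parse_changelog(text):
    """Return a list of (package, version, body) tuples, newest first."""
    entries = []
    current = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            current = [m.group(1), m.group(2), []]
            entries.append(current)
        elif current is not None:
            current[2].append(line)
    return [(p, v, "\n".join(b).strip()) for p, v, b in entries]

def load(conn, text):
    """Store one changelog; position 0 is the oldest entry.

    Assumption: one changelog per package. Merging several histories
    for the same package would need a better ordering than this.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "package TEXT, version TEXT, position INTEGER, body TEXT, "
        "PRIMARY KEY (package, version))"
    )
    entries = parse_changelog(text)
    for position, (package, version, body) in enumerate(reversed(entries)):
        conn.execute(
            "INSERT OR REPLACE INTO entries VALUES (?, ?, ?, ?)",
            (package, version, position, body),
        )

def changes_between(conn, package, old, new):
    """Return (version, body) rows after `old` up to `new`, newest first."""
    rows = conn.execute(
        """SELECT version, body FROM entries
           WHERE package = ?
             AND position > (SELECT position FROM entries
                             WHERE package = ? AND version = ?)
             AND position <= (SELECT position FROM entries
                              WHERE package = ? AND version = ?)
           ORDER BY position DESC""",
        (package, package, old, package, new),
    )
    return rows.fetchall()
```

A report for an upgrade from 1.0-1 to 1.2-1 would then be `changes_between(conn, "foo", "1.0-1", "1.2-1")` on a connection opened with `sqlite3.connect(...)` and filled by `load`.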
> You don't have to keep track of the distribution at which every upload was
> targeted; you "only" need to get a path through the tree from each
> changelog you encounter, build the tree as you go from that (coping with
> inconsistencies somehow), and know what versions are currently in each
> distribution. Certainly not at all trivial, but doable, and I don't think
> storing the version trees will take a significant amount of space compared
> to the size of the bug database.
Right, and in the course of normal processing of bugs, "each changelog you
encounter" is no changelogs at all. :-) So this data would need to be
imported from the packages themselves, which is inconvenient. Changelogs
are not the cleanest data around, either, when taken as a whole rather than
as the snippet carried in .changes.
The amount of data is not outrageous, but it is not straightforward to
obtain or to operate on, which is why I would prefer a different approach.
The simple and intuitive way to keep track of branches is to use version
numbers. Since our branches never merge per se, it would be enough to be
able to compare two version numbers and say "same branch" or "different