Bug#585432: python-apt: source package handling is slow and cumbersome
Package: python-apt
Version: 0.7.94.2
Severity: normal
I have an application (lp:xdeb) that needs to get source package records
in order to manipulate them in various ways. Unfortunately:
>>> import apt
>>> import apt_pkg
>>> import timeit
>>> sr = apt_pkg.SourceRecords()
>>> timer = timeit.Timer('sr.lookup("man-db")', 'from __main__ import sr')
>>> timer.timeit(1000)
28.415705919265747
That is roughly 28 ms per lookup. The interface is this slow because it
has to reparse the text of the Sources files over and over again. There
doesn't appear to be a better
way to do this within python-apt. The cache only covers binary package
records, and the Python bindings for SourceRecords only expose lookup()
and restart() but not e.g. step(), so there isn't even any way to build
a cache outside python-apt.
I ended up using 'apt-get --print-uris update' to get a list of Sources
file names and feeding them through python-debian's parser, but this is
of course pretty awful, not to mention rather slow in itself - I end up
with a long delay on startup while the cache builds. (On reflection, I
could probably use apt_pkg.TagFile, which has a step() method that might
be faster.)
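
To make the workaround concrete, here is a minimal sketch of the idea: read each Sources file once and keep an in-memory index keyed by source package name, so that later lookups are plain dictionary accesses. It is an illustration only. The small stanza parser is a pure-Python stand-in for python-debian's parser or apt_pkg.TagFile, and the names iter_stanzas, build_source_index and SAMPLE are hypothetical.

```python
import io


def iter_stanzas(lines):
    """Yield each blank-line-separated stanza as a dict of field -> value."""
    fields, key = {}, None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line ends the current stanza.
            if fields:
                yield fields
            fields, key = {}, None
        elif line[0] in " \t":
            # Continuation line: append to the most recent field.
            if key is not None:
                fields[key] += "\n" + line.strip()
        else:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    if fields:
        yield fields


def build_source_index(lines):
    """Parse the input once and index the stanzas by source package name."""
    index = {}
    for stanza in iter_stanzas(lines):
        name = stanza.get("Package")
        if name is not None:
            # A source may appear in several versions, so keep a list.
            index.setdefault(name, []).append(stanza)
    return index


# Made-up sample data standing in for the contents of a Sources file.
SAMPLE = """\
Package: man-db
Version: 2.5.7-2
Build-Depends: debhelper (>= 7),
 libgdbm-dev

Package: hello
Version: 2.5-1
"""

index = build_source_index(io.StringIO(SAMPLE))
records = index["man-db"]  # dictionary access, no reparsing
```

The parsing cost is paid once, at startup, which is exactly the delay described above; a cache inside python-apt could avoid even that.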
Ideally, though, I'd like source package records to be more like
first-class citizens in apt/python-apt rather than feeling a bit like
afterthoughts, and to have their own cache and wrapper objects so that
e.g. apt.SourceCache()[src] can return an apt.Package.Source object,
which might have methods like fetch_source() and
install_build_dependencies() as well as offering attribute access.
Would this be at all feasible?
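
A rough sketch of the shape such an interface might take, purely for illustration: SourceCache and SourcePackage below are hypothetical stand-ins and do not exist in python-apt, and the sample record is made up. Methods such as fetch_source() and install_build_dependencies() are left out because they would need real apt machinery behind them.

```python
class SourcePackage:
    """Hypothetical wrapper offering attribute access to one source record."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def __getattr__(self, name):
        # Only called for attributes not found normally; refuse private
        # names so a missing _fields can never cause infinite recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        # Map attribute names such as build_depends to Build-Depends.
        wanted = name.replace("_", "-").lower()
        for key, value in self._fields.items():
            if key.lower() == wanted:
                return value
        raise AttributeError(name)


class SourceCache:
    """Hypothetical cache mapping source package names to wrapper objects."""

    def __init__(self, records):
        self._by_name = {r["Package"]: SourcePackage(r) for r in records}

    def __getitem__(self, name):
        return self._by_name[name]

    def __contains__(self, name):
        return name in self._by_name


# Made-up record; a real cache would be populated from the Sources files.
cache = SourceCache([
    {"Package": "man-db", "Version": "2.5.7-2", "Build-Depends": "debhelper"},
])
src = cache["man-db"]
version = src.version
```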
Thanks,
--
Colin Watson [cjwatson@debian.org]