
Bug#585432: python-apt: source package handling is slow and cumbersome



Package: python-apt
Version: 0.7.94.2
Severity: normal

I have an application (lp:xdeb) that needs to get source package records
in order to manipulate them in various ways.  Unfortunately:

  >>> import apt
  >>> import apt_pkg
  >>> import timeit
  >>> sr = apt_pkg.SourceRecords()
  >>> timer = timeit.Timer('sr.lookup("man-db")', 'from __main__ import sr')
  >>> timer.timeit(1000)
  28.415705919265747

This interface is really slow (roughly 28 ms per lookup above), because
it has to reparse the text of the Sources files over and over again.
There doesn't appear to be a better way to do this within python-apt.
The cache only covers binary package records, and the Python bindings
for SourceRecords only expose lookup() and restart() but not e.g.
step(), so there isn't even any way to build a cache outside
python-apt.

I ended up using 'apt-get --print-uris update' to get a list of Sources
file names and feeding them through python-debian's parser, but this is
of course pretty awful, not to mention rather slow in itself - I end up
with a long delay on startup while the cache builds.  (On reflection, I
could probably use apt_pkg.TagFile, which has a step() method that might
be faster.)
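
As a rough illustration of the workaround described above, the sketch
below builds an in-memory index from Sources-style text in one pass.
It uses a small hand-written stanza parser rather than python-debian or
apt_pkg.TagFile so that it runs without either installed. The function
name index_sources and the sample records (including the version
numbers) are made up for the example.

```python
import io


def index_sources(fileobj):
    """Parse Sources-style text into {package name: [stanza dict, ...]}.

    Hypothetical helper: a stand-in for a real deb822 parser.
    """
    index = {}
    stanza = {}
    last_key = None

    def flush():
        # Several versions of one source package may appear, so keep a list.
        if "Package" in stanza:
            index.setdefault(stanza["Package"], []).append(dict(stanza))

    for raw in fileobj:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line terminates the current stanza.
            flush()
            stanza.clear()
            last_key = None
        elif line[0] in " \t" and last_key is not None:
            # Continuation line: append to the previous field's value.
            stanza[last_key] += "\n" + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key
            stanza[key] = value.strip()
    flush()  # The last stanza may not be followed by a blank line.
    return index


# Made-up sample data, for illustration only.
SAMPLE = """\
Package: man-db
Binary: man-db
Version: 2.5.7-2
Build-Depends: debhelper (>= 7),
 libgdbm-dev

Package: hello
Version: 2.4-3
"""

index = index_sources(io.StringIO(SAMPLE))
record = index["man-db"][0]
```

Once the index is built, each lookup is a dictionary access instead of
a reparse; the cost moves to the one-off pass at startup, which is the
delay mentioned above.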

Ideally, though, I'd like source package records to be more like
first-class citizens in apt/python-apt rather than feeling a bit like
afterthoughts, and to have their own cache and wrapper objects so that
e.g. apt.SourceCache()[src] can return an apt.Package.Source object,
which might have methods like fetch_source() and
install_build_dependencies() as well as offering attribute access.
Would this be at all feasible?
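
To make the request concrete, here is a minimal sketch of what such
wrapper objects might look like from the caller's side. Nothing here
exists in python-apt: the classes Source and SourceCache are
hypothetical, the records are passed in as plain dicts, and the
proposed fetch_source() and install_build_dependencies() methods are
deliberately left out because they would need real apt support.

```python
class Source:
    """Hypothetical wrapper giving attribute access to one source record."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        # Map e.g. build_depends onto the Build-Depends field.
        wanted = name.replace("_", "-").lower()
        for key, value in self._fields.items():
            if key.lower() == wanted:
                return value
        raise AttributeError(name)


class SourceCache:
    """Hypothetical dict-like cache of Source objects keyed by name."""

    def __init__(self, records):
        self._by_name = {}
        for fields in records:
            self._by_name[fields["Package"]] = Source(fields)

    def __getitem__(self, name):
        return self._by_name[name]

    def __contains__(self, name):
        return name in self._by_name


# Made-up record, for illustration only.
cache = SourceCache([
    {"Package": "man-db", "Version": "2.5.7-2", "Build-Depends": "debhelper"},
])
src = cache["man-db"]
```

With this shape, cache["man-db"].version and
cache["man-db"].build_depends read like ordinary attributes, which is
the kind of access the binary package cache already offers.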

Thanks,

-- 
Colin Watson                                       [cjwatson@debian.org]


