
Re: Aptitude question



On Mon, Jan 16, 2006 at 10:57:22AM +0900, Miles Bader <miles.bader@necel.com> was heard to say:
> Daniel Burrows <dburrows@debian.org> writes:
> > When you say that normal operation is getting slower, do you mean just
> > the load time or its overall performance?  The time required to load
> > in all the state files is a bit long, but once they're loaded the
> > program seems to run reasonably quickly to me.
> 
> The really problematic thing is indeed the state-file loading time; it
> seems to take 30 - 40 seconds on my machine (with no disk I/O as the
> files are already cached by the kernel).

  It would be interesting to see profile data for this.  The new aptitude
has a "nop" command that does nothing but load the cache and then stop,
the better to get profiling information (but it's too fast on my computer
to yield useful results).

  A lot of what aptitude is doing there is parsing RFC822-alike files
using core apt code, and it's possible that the apt end of things could
be optimized.  There's even a patch in the BTS that eliminates various
unnecessary copies (#319377), although it might be better in some cases
to prevent aptitude from calling those routines so many times (e.g., it
should really be caching configuration values rather than doing lookups
in the middle of a long loop).
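
  To illustrate the caching point, here is a minimal sketch of hoisting a
configuration lookup out of a loop.  The Config type, the
"Example::Include-All" key, and both function names are hypothetical
stand-ins; this is not apt's or aptitude's actual code, only the general
pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical string-keyed configuration store (NOT apt's real class).
struct Config {
    std::map<std::string, bool> values;
    mutable std::size_t lookups = 0;  // counts calls, for illustration only

    bool find_bool(const std::string &key, bool fallback) const {
        assert(!key.empty());
        ++lookups;
        auto it = values.find(key);  // string comparisons on every call
        return it == values.end() ? fallback : it->second;
    }
};

// Slow pattern: the same value is looked up once per package.
std::size_t count_lookup_in_loop(const Config &cfg,
                                 const std::vector<std::string> &pkgs) {
    std::size_t n = 0;
    for (const auto &p : pkgs)
        if (cfg.find_bool("Example::Include-All", false) || !p.empty())
            ++n;
    return n;
}

// Cached pattern: look the value up once, then reuse the local copy.
std::size_t count_cached(const Config &cfg,
                         const std::vector<std::string> &pkgs) {
    const bool include_all = cfg.find_bool("Example::Include-All", false);
    std::size_t n = 0;
    for (const auto &p : pkgs)
        if (include_all || !p.empty())
            ++n;
    return n;
}
```

  Both functions return the same count; the second performs one lookup
regardless of how many packages there are.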

> Normal operation is generally OK, though some searches (e.g. "~dfoo")
> are so slow as to be almost useless -- especially given that it's
> "i-search", so a super-slow search gets repeated for every key as you
> type the search string!

  I suspect that the problem with searches is due to locality: aptitude
has to access several structures/files to perform a search, and (IIRC)
it only attempts to order accesses along one "axis".  That is, it accesses
packages in the order that they occur in the main cache, but this isn't
necessarily the same order that they occur in, e.g., the Description files.

  If you look at the apt-cache code, in contrast, you'll see that it does
a two-pass search, first iterating through the package cache to match
package names, then sorting packages according to their order in the
description file to match descriptions.  This is easy for apt-cache, since
it only has one type of match; aptitude's much more complex matching
language and search architecture would require some more effort to optimize
this way.
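
  The two-pass idea can be sketched roughly as follows.  This is a
simplified illustration, not apt-cache's actual code: the Package struct,
its desc_offset field, and two_pass_search are hypothetical names, and an
in-memory string stands in for the text that would really be read from
the description file at that offset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical package record (not apt's real cache structures).
struct Package {
    std::string name;
    std::size_t desc_offset;  // where its description sits in the file
    std::string description;  // stands in for the text at that offset
};

// Returns matching package names in cache order.  If visit_order is
// non-null, it records the order in which descriptions were examined.
std::vector<std::string> two_pass_search(
    const std::vector<Package> &cache, const std::string &term,
    std::vector<std::size_t> *visit_order = nullptr) {
    assert(!term.empty());
    std::vector<bool> matched(cache.size(), false);

    // Pass 1: walk the cache in its own order, looking at names only.
    for (std::size_t i = 0; i < cache.size(); ++i)
        matched[i] = cache[i].name.find(term) != std::string::npos;

    // Pass 2: visit the remaining packages in description-file order,
    // so that reads of the description file are sequential.
    std::vector<std::size_t> rest;
    for (std::size_t i = 0; i < cache.size(); ++i)
        if (!matched[i])
            rest.push_back(i);
    std::sort(rest.begin(), rest.end(),
              [&cache](std::size_t a, std::size_t b) {
                  return cache[a].desc_offset < cache[b].desc_offset;
              });
    for (std::size_t idx : rest) {
        if (visit_order)
            visit_order->push_back(idx);
        if (cache[idx].description.find(term) != std::string::npos)
            matched[idx] = true;
    }

    std::vector<std::string> results;
    for (std::size_t i = 0; i < cache.size(); ++i)
        if (matched[i])
            results.push_back(cache[i].name);
    return results;
}
```

  The sort is what buys the locality: each pass touches only one
structure, in that structure's own order.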

  I had some notes at one point on modifying aptitude's search to be
multi-pass in order to increase locality, but I didn't get around to
carrying through on the project.

> >   The main thing that changed recently that would impact the program's speed
> > under normal use is the switch to using Unicode internally, which means that
> > many string manipulations take 4x as long, and input strings (e.g., from
> > package descriptions) have to be decoded before they're used.
> 
> Do you know if the package/state files are so large that it's really running
> against fundamental memory bandwidth problems?  I've noticed (in my own
> programs) that some standard C++ library code, e.g. reading from
> io-streams, seems suspiciously slow (though I've not confirmed this with
> measurements)...

  I doubt that the code is hitting any fundamental limits, since you
mentioned that the program is slow even when everything is cached.  The
standard library generally seems reasonably quick to me, although I avoid
iostream input like the plague.

  Daniel
