On Mon, Jan 16, 2006 at 10:57:22AM +0900, Miles Bader <miles.bader@necel.com> was heard to say:
> Daniel Burrows <dburrows@debian.org> writes:
> > When you say that normal operation is getting slower, do you mean just
> > the load time or its overall performance?  The time required to load
> > in all the state files is a bit long, but once they're loaded the
> > program seems to run reasonably quickly to me.
>
> The really problematic thing is indeed the state-file loading time; it
> seems to take 30-40 seconds on my machine (with no disk I/O, as the
> files are already cached by the kernel).

  It would be interesting to see profile data for this.  The new aptitude
has a "nop" command that does nothing but load the cache and then stop,
the better to get profiling information (but it's too fast on my computer
to yield useful results).

  A lot of what aptitude is doing there is parsing RFC822-like files
using core apt code, and it's possible that the apt end of things could
be optimized.  There's even a patch in the BTS that eliminates various
unnecessary copies (#319377), although it might be better in some cases
to prevent aptitude from calling those routines so many times (e.g., it
should really be caching configuration values rather than doing lookups
in the middle of a long loop).

> Normal operation is generally OK, though some searches (e.g. "~dfoo")
> are so slow as to be almost useless -- especially given that it's
> "i-search", so a super-slow search gets repeated for every key as you
> type the search string!

  I suspect that the problem with searches is due to locality: aptitude
has to access several structures/files to perform a search, and (IIRC)
it only attempts to order accesses along one "axis"; i.e., it accesses
packages in the order in which they occur in the main cache, but this
isn't necessarily the order in which they occur in, e.g., the
Descriptions files.
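As an aside on the configuration-lookup point above, here is a minimal sketch of the kind of hoisting meant. The `Configuration` struct here is a simplified stand-in for apt's real configuration class (the names `Find` and `Aptitude::Hidden-Section` are illustrative assumptions, not apt's actual key), but the pattern is the same: do the lookup once, before the loop, and reuse the cached value.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for apt's configuration store.  In the real
// library, Find() walks a tree of config items, so calling it on
// every iteration of a long loop is pure wasted work.
struct Configuration {
  std::map<std::string, std::string> items;
  std::string Find(const std::string &key) const {
    auto it = items.find(key);
    return it == items.end() ? "" : it->second;
  }
};

// Slow pattern: the lookup is repeated on every pass through the loop.
int count_hidden_slow(const Configuration &cfg,
                      const std::vector<std::string> &sections) {
  int hidden = 0;
  for (const auto &s : sections)
    if (s == cfg.Find("Aptitude::Hidden-Section"))  // lookup per element
      ++hidden;
  return hidden;
}

// Faster: hoist the lookup out of the loop and cache the result.
int count_hidden_fast(const Configuration &cfg,
                      const std::vector<std::string> &sections) {
  const std::string hidden_name = cfg.Find("Aptitude::Hidden-Section");
  int hidden = 0;
  for (const auto &s : sections)
    if (s == hidden_name)
      ++hidden;
  return hidden;
}
```

Both functions return the same answer; the second just avoids re-walking the configuration tree once per package.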
  If you look at the apt-cache code, in contrast, you'll see that it
does a two-pass search: it first iterates through the package cache to
match package names, then sorts packages according to their order in
the description file before matching descriptions.  This is easy for
apt-cache, since it has only one type of match; aptitude's much more
complex matching language and search architecture would require more
effort to optimize this way.  I had some notes at one point on making
aptitude's search multi-pass in order to increase locality, but I never
got around to carrying the project through.

> > The main thing that changed recently that would impact the program's
> > speed under normal use is the switch to using Unicode internally,
> > which means that many string manipulations take 4x as long, and
> > input strings (e.g., from package descriptions) have to be decoded
> > before they're used.
>
> Do you know if the package/state files are so large that it's really
> running against fundamental memory-bandwidth problems?  I've noticed
> (in my own programs) that some standard C++ library code, e.g. reading
> from iostreams, seems suspiciously slow (though I haven't confirmed
> this with measurements)...

  I doubt that the code is hitting any fundamental limits, since you
mentioned that the program is slow even when everything is cached.  The
standard library generally seems reasonably quick to me, although I
avoid iostream input like the plague.

  Daniel
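For concreteness, here is a sketch of the two-pass, locality-friendly search described above. The `Package` struct and `desc_offset` field are illustrative assumptions, not apt's actual data structures: pass one matches against fields held in the in-memory cache, and pass two sorts the candidates by their offset in the description file so the expensive description matching reads that file sequentially instead of seeking back and forth.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical package record: cache order generally differs from the
// position of the package's description in the Descriptions file.
struct Package {
  std::string name;
  long desc_offset;  // byte offset of this package's description record
};

// Pass 1: match on data available in the package cache itself (here,
// just a substring match on the name), collecting candidates in cache
// order.  Pass 2: sort candidates by description-file offset, then run
// the (more expensive) description predicate in file order.
std::vector<std::string>
two_pass_search(const std::vector<Package> &cache, const std::string &pattern,
                bool (*desc_matches)(const Package &, const std::string &)) {
  std::vector<const Package *> candidates;
  for (const auto &p : cache)
    if (p.name.find(pattern) != std::string::npos)
      candidates.push_back(&p);

  std::sort(candidates.begin(), candidates.end(),
            [](const Package *a, const Package *b) {
              return a->desc_offset < b->desc_offset;
            });

  std::vector<std::string> hits;
  for (const Package *p : candidates)
    if (desc_matches(*p, pattern))  // sequential reads in a real impl
      hits.push_back(p->name);
  return hits;
}
```

The hard part for aptitude, as noted, is that its matching language mixes many match types, so splitting an arbitrary pattern into cache-order and description-order passes is much less mechanical than in apt-cache's single-match case.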