
Re: dpkg list-file performance



On 08/30/2009 09:28 AM, Cyril Brulebois wrote:
> David Benjamin <davidben@MIT.EDU> (29/08/2009):
>> The current list-files are good for the query "given a package, what
>> did it install". They also have fairly fast updates. However, they
>> are extremely poorly suited for the query "given a file, what
>> package(s) installed it" or if you need to read it all in at
>> once. So, I propose adding a cache for the data.
>
> You want to give “dlocate” a try?

The main nuisance is that dpkg --install is slow because it has to read the whole files database. I'm mostly using dpkg-query --search as my test case, to isolate the database step: dpkg --install does extra work that, for now, I don't care about. dpkg-query --search does some extra work of its own, but it's negligible compared to the 30s penalty for reading the files database. (Both --install and --search call ensure_allinstfiles_available.)

From what I can tell, dpkg's primary operations on the list-files do not correspond to "given a package, what did it install". The current implementation will often read in everything, which the *.list files are bad at. The point of reading them all in often seems to be to make findnamenode() work, in which case what we actually want is the "given a file, what package(s) installed it" query. The *.list files are not very good at that either.
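
To make the mismatch concrete, here is a rough sketch in Python (not dpkg's actual C code) of what answering "given a file, what package(s) installed it" from the *.list files involves. It assumes one <package>.list file per package with one installed path per line; the function names build_reverse_index and search are made up for illustration. Every list file has to be opened and read before a single lookup can be answered, and that full scan is the step a cache would avoid.

```python
import os
from collections import defaultdict


def build_reverse_index(info_dir):
    """Read every *.list file and map each path to the packages listing it.

    Assumes info_dir holds one '<package>.list' file per package,
    with one installed path per line (hypothetical helper, for illustration).
    """
    index = defaultdict(set)
    for name in sorted(os.listdir(info_dir)):
        if not name.endswith(".list"):
            continue
        package = name[: -len(".list")]
        list_path = os.path.join(info_dir, name)
        # surrogateescape keeps non-UTF-8 file names from raising errors
        with open(list_path, encoding="utf-8", errors="surrogateescape") as f:
            for line in f:
                path = line.rstrip("\n")
                if path:
                    index[path].add(package)
    return index


def search(index, path):
    """Return the sorted package names that list the given path."""
    return sorted(index.get(path, ()))
```

Once the index exists, search(index, "/bin/ls") is a single dictionary lookup; all of the cost sits in build_reverse_index, which is proportional to the total size of all the list files.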

I'm also not entirely convinced that providing "alternative" versions of core dpkg operations just to get acceptable performance is quite the right way to do this, but that's not the main issue. The main issue is that dlocate does nothing to help dpkg --install.


David Benjamin

