Re: dpkg list-file performance
On 08/30/2009 09:28 AM, Cyril Brulebois wrote:
> David Benjamin <davidben@MIT.EDU> (29/08/2009):
> > The current list-files are good for the query "given a package, what
> > did it install". They also have fairly fast updates. However, they
> > are extremely poorly suited for the query "given a file, what
> > package(s) installed it" or if you need to read it all in at
> > once. So, I propose adding a cache for the data.
> You want to give “dlocate” a try?
The main nuisance is that dpkg --install is slow as a result of this.
I'm using dpkg-query --search as my test case mostly to isolate the
database step. dpkg --install does some extra work that, for now, I
don't care about. dpkg-query, on the other hand, does some extra work,
but it's negligible compared to the 30s penalty for reading the files
database. (Both --install and --search call ensure_allinstfiles_available.)
From what I can tell, dpkg's primary operations on the list-files do
not correspond to "given a package, what did it install". The current
implementation will often read in everything, which the *.list files
are bad at. The point of reading them all in often seems to be to make
findnamenode() work. In that case, what we actually want is the "given
a file, what package(s) installed it" query, and the *.list files are
not very good at that either.
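
To make the mismatch concrete, here is a rough Python sketch. It is not
dpkg's actual C code, and the function names are made up for
illustration. It only assumes the usual layout of one <package>.list
file per package with one installed path per line. The forward query
touches a single file; the reverse query has to read every file before
it can answer anything.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def files_of_package(info_dir, package):
    """Forward query: read exactly one <package>.list file."""
    list_file = Path(info_dir) / f"{package}.list"
    return list_file.read_text().splitlines()


def build_reverse_index(info_dir):
    """Reverse query support: every *.list file must be read once."""
    owners = defaultdict(set)
    for list_file in sorted(Path(info_dir).glob("*.list")):
        # Hypothetical naming rule: package name is the file name
        # without the trailing ".list".
        package = list_file.name[: -len(".list")]
        for path in list_file.read_text().splitlines():
            if path:
                owners[path].add(package)
    return owners


def demo():
    """Build a tiny fake info directory and run both queries."""
    with tempfile.TemporaryDirectory() as info_dir:
        Path(info_dir, "coreutils.list").write_text("/bin\n/bin/ls\n")
        Path(info_dir, "bash.list").write_text("/bin\n/bin/bash\n")
        owners = build_reverse_index(info_dir)
        return (
            files_of_package(info_dir, "bash"),
            sorted(owners["/bin"]),
            sorted(owners["/bin/ls"]),
        )
```

A cache along the lines I proposed would amount to persisting something
like the result of build_reverse_index() so it does not have to be
rebuilt from thousands of small files on every run.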
I'm also not entirely convinced that providing "alternative" versions
of core dpkg operations to get acceptable performance is quite the
right way to do this, but that's not the main issue. The main issue is
that dlocate does nothing to help dpkg --install.