Hello,

I've been using Debian for several years. I'm not a developer, but I try to contribute when possible. For about a year now I've been noticing how slow dpkg is at reading its database.

It's been a long time since this thread died (April 2007). I think the number of packages and libraries installed on a desktop has grown lately: I count more than 1700 packages and more than 200000 installed files. This means that dpkg has to open 1700 list files and read 200000 file entries from them. That takes a long time, and please don't ask me to profile it... I think everybody knows how long it takes to get through (Reading database... 5% - 10% - 15%...). This may not matter if you do your upgrades while you're away from the computer, but for installing or removing _just one package_ it is very tedious. Also, from the development perspective, reading the list files, together with the generation of the manpages db, is one of the reasons why it takes forever to build packages in a chroot.

> Quote from Ian Jackson:
> dpkg needs to be
> very reliable; its databases must not get corrupted even under
> situations of stress.

This is desirable in all situations. But dpkg can still rely on the filesystem, while the sqlite backend could be a cache on top of the filesystem that gets regenerated when it is corrupted or missing (as APT does). I think everybody who has tried an rpm-based distribution has feared database corruption, but that is not the case here. The backend is still the filesystem, only sped up by a cache. There could also be a configuration option to avoid using the cache, so the feature could even be disabled by default.

> dpkg is very close to the bottom of the
> application stack; making it depend on a big and complex library like
> a SQL engine is a bad idea.

From this perspective it might be a bad idea; in that case one could consider creating a dedicated format for the cache (as APT does).
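To make the idea of "a cache on top of the filesystem" a bit more concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions, not dpkg code: the function names (`scan_list_files`, `fingerprint`, `load_file_lists`), the JSON cache format and the `cache_path` location are all made up for this example. The only thing taken from a real system is that the per-package file lists are `*.list` files, which on Debian normally live in /var/lib/dpkg/info. The list files stay authoritative; the cache is thrown away and rebuilt whenever it is missing, unreadable or out of date.

```python
import json
import os


def scan_list_files(info_dir):
    """Slow path: open and read every *.list file in info_dir."""
    files = {}
    for name in sorted(os.listdir(info_dir)):
        if name.endswith(".list"):
            path = os.path.join(info_dir, name)
            with open(path, encoding="utf-8") as fh:
                files[name[:-len(".list")]] = fh.read().splitlines()
    return files


def fingerprint(info_dir):
    """Validity stamp: name, size and mtime of each *.list file.

    This only stats the files, it does not read their contents.
    """
    stamp = []
    for name in sorted(os.listdir(info_dir)):
        if name.endswith(".list"):
            st = os.stat(os.path.join(info_dir, name))
            stamp.append([name, st.st_size, st.st_mtime_ns])
    return stamp


def load_file_lists(info_dir, cache_path, use_cache=True):
    """Return (files, cache_hit). The filesystem remains the source of truth."""
    if use_cache:
        try:
            with open(cache_path, encoding="utf-8") as fh:
                cached = json.load(fh)
            if cached["stamp"] == fingerprint(info_dir):
                return cached["files"], True
        except (OSError, ValueError, KeyError, TypeError):
            pass  # missing or corrupted cache: fall through and rebuild

    files = scan_list_files(info_dir)
    if use_cache:
        # Write to a temporary file and rename it, so an interrupted
        # write never leaves a half-written cache at cache_path.
        tmp = cache_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump({"stamp": fingerprint(info_dir), "files": files}, fh)
        os.replace(tmp, cache_path)
    return files, False
```

Calling `load_file_lists(info_dir, cache_path, use_cache=False)` corresponds to the configuration option mentioned above: the cache is neither read nor written. Note that this sketch still stats every list file to validate the cache; a real design would need to decide how cheap the validity check can be made while staying reliable.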
Being at the bottom of the application stack is one more reason for dpkg to use a cache: software uses caches to speed up operations that take a long time to compute. In this case the slow operation is reading the database, and it can be cached.

> But there are some significant performance problems elsewhere:
>  * The status and available file parser is too slow. I think this
>    needs some optimisation work.
>  * We still need the `smallmem' in-memory model for the file list
>    data (removed by Adam Heath while he was ripping out my nicely
>    simple counting allocator and replacing it with use of glibc
>    obstacks).

I think you will agree that these performance problems are far less significant than reading the database, which is not to say they are less important.

I hope my message has not come across as offensive. I'd like to thank you all for the work you do maintaining dpkg.

Best regards,

-- 
http://www.debian.org - The Universal Operating System