
realise diff-updates with dpkg

Hi ;)

I would like to propose an approach that could make certain updates easier.

With databases that change daily, such as virus databases, blacklists, card data or similar, part of the old data is always removed as new data is added. Programs such as clamav ship their own update client for this.
I would, however, like to use dpkg for this as well. A new package containing the new database would then be built every day, and the clients could download it with aptitude. For small data sets this is feasible, but with a very large volume of data it takes much longer, which is a disadvantage on slow internet connections. Moreover, it is pointless to download 1 million records only to end up removing 10 from the database and adding 25.

There is also the possibility of using debdelta: then only the changes to the package are downloaded and the package is patched locally. This is generally a very good idea, but the whole package is still replaced on installation, rather than only the change being applied. For small data sets this is again fine, but deleting more than 1 million records and reloading them into the database causes huge performance problems.

To solve this problem I have now devised the following idea:
- At regular intervals, a new version of the package X is created (as in the "old" way).
- A package Y contains no data, only the control archive with the maintainer scripts. (Data in /tmp is also possible, but only temporarily!)
The fields "update-package" and "upgrade-version" in DEBIAN/control specify which package, and which version of it, the update is meant for. "Version" is the new version, "upgrade-version" the old one.
dpkg then performs the update by running the maintainer scripts, sets the version of the package to the new version, and then deletes the data that was needed for the update.
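
To make the version check concrete, here is a rough sketch in Python of how the proposed fields might be evaluated. Everything in it is an assumption for illustration: dpkg knows neither "update-package" nor "upgrade-version", the package names virusdb and virusdb-update are made up, and versions are compared as plain strings rather than with dpkg's real version ordering.

```python
# Hypothetical sketch: "Update-Package" and "Upgrade-Version" are the fields
# proposed in this mail; dpkg itself does not implement them.

def parse_control(text):
    """Parse one control stanza into a dict with lower-cased field names."""
    fields = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and key is not None:
            # Continuation line: append it to the previous field.
            fields[key] += "\n" + line.strip()
        else:
            name, _, value = line.partition(":")
            key = name.strip().lower()
            fields[key] = value.strip()
    return fields


def plan_update(update_control, installed):
    """Return (package, new_version) if the update applies, else None.

    `installed` maps package name -> installed version string.
    """
    fields = parse_control(update_control)
    target = fields.get("update-package")
    old = fields.get("upgrade-version")
    new = fields.get("version")
    if not (target and old and new):
        return None
    if installed.get(target) != old:
        # Wrong base version: the client would need the full package X.
        return None
    return target, new


# Made-up control file for a package Y that carries only maintainer scripts.
CONTROL_Y = """\
Package: virusdb-update
Version: 20100102
Update-Package: virusdb
Upgrade-Version: 20100101
Architecture: all
Description: incremental update for virusdb
 Carries only maintainer scripts, no data.
"""

print(plan_update(CONTROL_Y, {"virusdb": "20100101"}))
print(plan_update(CONTROL_Y, {"virusdb": "20091231"}))
```

If the installed version does not match "upgrade-version", the sketch returns nothing, which corresponds to falling back to the complete package X.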

It is now possible to create a package Y whose postinst script deletes our 10 records and adds the 25 new ones. This should bring a huge performance gain. This type of update is, however, NOT intended for "normal" packages such as software or libraries, but only for frequently changing data sets. The approach would remain compatible with existing dpkg versions: an older aptitude would simply keep updating the package in full, and only the new version would download the diffs.
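
As an illustration of what such a postinst might do, here is a minimal sketch in Python. It assumes the data lives in an SQLite database with a hypothetical table "records"; the table layout, the IDs and the payloads are invented for the example, and a real package would of course use whatever storage the data set actually has. A small in-memory table stands in for the 1 million records.

```python
import sqlite3


def apply_delta(conn, remove_ids, add_rows):
    """Delete obsolete records and insert new ones in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "DELETE FROM records WHERE id = ?",
            [(record_id,) for record_id in remove_ids],
        )
        conn.executemany(
            "INSERT INTO records (id, payload) VALUES (?, ?)",
            add_rows,
        )


# Stand-in for the installed database of package X (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
with conn:
    conn.executemany(
        "INSERT INTO records (id, payload) VALUES (?, ?)",
        ((i, f"sig-{i}") for i in range(1000)),
    )

# The delta shipped in package Y: 10 records to remove, 25 to add.
remove_ids = range(10)
add_rows = [(1000 + i, f"sig-{1000 + i}") for i in range(25)]

apply_delta(conn, remove_ids, add_rows)
count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
print(count)  # 1000 - 10 + 25 = 1015
```

Only 35 statements touch the database here, however large the table is, whereas replacing the whole package means dropping and reloading every record.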

What do you think about this idea? Does it make sense, or should it be implemented differently?

