
Package metadata server



(picked up from http://www.debianplanet.org/article.php?sid=633)

The scalability of the Packages file is a recognised problem that
has been discussed many times on this list. I think the following idea
could go a long way towards solving it.

The current method of checking for updates is to retrieve a new
Packages.gz file and discard the old Packages.gz file. The problem with
this method is that commonly less than 1% of the Packages.gz file has
changed. A number of solutions have been proposed to overcome this
problem. These include:
 - Compressing the Packages.gz in an rsync-friendly manner.
 - Making diffs from older Packages files available.
 - Splitting the Packages.gz into multiple files.
 - Reducing the size of the metadata for each package.
Each of the above ideas has its own problems that have been discussed on
this list.


An idea that I haven't heard mentioned here is to create a client/server
application specifically for handling our metadata. Clients can query the
server so that it sends only the metadata they require.

Checking for updates could go something like this:
 1) Query the server for all the package names and versions in woody.
 2) Compare the results to your previous metadata to determine which
    packages have changed.
 3) Query the server for the metadata of the changed packages.
 4) Reconstruct the Packages file with the new metadata.
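
The steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not a proposed protocol: the
MetadataServer class and its list_versions / get_metadata methods are
made-up names, and the "server" is simulated in memory instead of being
reached over the network. The only thing taken from the real format is
that Packages stanzas are separated by blank lines and carry Package: and
Version: fields.

```python
def parse_packages(text):
    """Parse Packages-style text into {name: (version, stanza_text)}."""
    result = {}
    for block in text.strip().split("\n\n"):
        block = block.strip()
        if not block:
            continue
        fields = {}
        for line in block.splitlines():
            if line[:1] in (" ", "\t"):
                continue  # continuation line of a multi-line field
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        result[fields["Package"]] = (fields["Version"], block)
    return result


class MetadataServer:
    """Hypothetical server, simulated in memory for this sketch."""

    def __init__(self, packages_text):
        self._packages = parse_packages(packages_text)

    def list_versions(self):
        # Step 1: names and versions only, no other metadata.
        return {name: ver for name, (ver, _) in self._packages.items()}

    def get_metadata(self, names):
        # Step 3: full stanzas, but only for the requested packages.
        return {name: self._packages[name][1] for name in names}


def update_packages(local_text, server):
    """Return (new Packages text, list of package names that were fetched)."""
    local = parse_packages(local_text)
    remote_versions = server.list_versions()                 # step 1
    changed = [name for name, ver in remote_versions.items()
               if name not in local or local[name][0] != ver]  # step 2
    fresh = server.get_metadata(changed)                     # step 3
    # Step 4: rebuild from the server's package list, so packages that
    # were removed upstream drop out of the local file as well.
    stanzas = [fresh[name] if name in fresh else local[name][1]
               for name in sorted(remote_versions)]
    return "\n\n".join(stanzas) + "\n", changed
```

Only the stanzas named in "changed" cross the wire in step 3; everything
else is reused from the local copy. Note that step 1 still transfers one
name and version per package, so its cost grows with the size of the
archive.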

Advantages
 - Compatible with existing packaging tools; it can complement rather than
   replace the existing method.
 - Reduces bandwidth to a minimum.
 - Flexible; different queries could be implemented to handle other
   unforeseen situations.
 - Other?

Disadvantages
 - Implementation may take a bit of work.
 - Requires a new server to be run rather than using standard file
   transport tools.
 - Other?

One idea is to do it with LDAP, but I don't know enough to comment.



Glenn



