
Re: Package Pool Proposal



On 21 Nov 1999, Guy Maor wrote:

> I am going to implement something like this.
> 
> The archive will look like this:
> debian/dists/pool/{binary-arch,source}/x/package
>   x is the first letter of the filename.  Note that packages are not
>   organized according to section in the archive, just filename.

I would be tempted to say this should be improved:

Wakko{jgg}~/work/dsync/build/bin#apt-cache dumpavail | grep -i ^Package  |
awk -- '{print substr($2,1,1)}' |  sort | uniq --count | sort -n | tail
    157 f
    190 m
    202 t
    206 d
    210 c
    270 s
    321 g
    323 p
    365 x
    857 l

That says we are looking at about 1/5 of the packages in a single
directory (lib* basically) [assuming 1 version per package]. I'd say the only
way to deal with this is to hash the package name/filename into one of, say,
255 buckets. That makes manual downloading virtually impossible, though.
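
To make the hashing idea concrete, here is a minimal sketch. Everything in it
is an assumption for illustration only: the choice of MD5, the two-hex-digit
bucket names, the "binary-i386" component, and the function names are not
part of the proposal.

```python
import hashlib
from collections import Counter

# Assumption: 255 buckets, the figure floated in this mail; any count would work.
NUM_BUCKETS = 255


def bucket_for(name, buckets=NUM_BUCKETS):
    """Map a package name to a stable bucket directory name (two hex digits)."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    # Read the first 4 bytes of the digest as an integer, then reduce to a bucket.
    return format(int.from_bytes(digest[:4], "big") % buckets, "02x")


def pool_path(name, kind="binary-i386"):
    """Hypothetical pool path with the hash bucket in place of the first letter."""
    return "debian/dists/pool/%s/%s/%s" % (kind, bucket_for(name), name)


def largest_bucket(names, key):
    """Size of the fullest directory when names are grouped by key(name)."""
    return max(Counter(key(n) for n in names).values())
```

Comparing largest_bucket(names, lambda n: n[0]) with
largest_bucket(names, bucket_for) over the package list shows how much the
lib* pile-up shrinks. The cost is the one already mentioned: nobody can guess
the bucket for a package by eye.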

> The Packages.gz and Sources.gz are built from this database alone.
> The database is editable from a web-page by maintainers/ftpmasters
> (not every field by everybody of course).  All changes to the archive
> are made through the database.  The change is then reflected in the
> next day's archive run.

You know what sort of an improvement this will yield for generation time
alone? Wow!

Have you considered what sort of database you'd like to use [LDAP/SQL
basically..]? 

Also, we should consider implementing authorization. With the Developer DB
we have now, this is pretty simple.

And finally, we should make a dump of the DB downloadable. We could put
some neat smarts into an APT GUI that allowed the user to specify their own
filter someday.

> Mirroring by architecture and by freeness is still possible, but
> mirroring by distribution is not possible without a specialized tool.

It would be pretty trivial to write this specialized mirroring
tool on top of APT's library. If someone is interested in doing this,
I'd love to tell them how :>
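
A rough sketch of the core of such a tool, in plain Python rather than on top
of APT's library. It assumes that each distribution still gets its own
Packages file and that every stanza's Filename field points into the pool;
the function name and the sample paths in the usage note are hypothetical.

```python
def pool_files(packages_text):
    """Return the sorted, de-duplicated Filename entries of a Packages index."""
    files = set()
    # Stanzas in a Packages file are separated by blank lines.
    for stanza in packages_text.split("\n\n"):
        for line in stanza.splitlines():
            if line.startswith("Filename:"):
                files.add(line.split(":", 1)[1].strip())
    return sorted(files)
```

Feeding it the uncompressed Packages file of one distribution yields exactly
the pool files that distribution needs, and that list can then be handed to
whatever fetch tool the mirror already uses.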

Getting rid of the symlinks is a big thing; they really took up a lot of
space and a lot of mirror time. Not to mention that we get to ditch those
awful hardlinks :|

> distribution "unreviewed" would.  (You'd be surprised at the schlock
> people upload that gets rejected.)

I think we should consider things like that carefully; injecting random
crap into the archive isn't so hot for our mirrors :|
 
I can probably contribute a few lines of code to this. If there is a
discussion forum for implementation, please stick me on it.

Thanks,
Jason


