
Re: RFC: implementation of package pools



On Thu, 19 Oct 2000, Eray Ozkural wrote:

> > >  [..]/pool/main/libm/libmng/
> > >          [...]

> > What kind of distribution does this give us?  How many packages in each dir,
> > etc.  I would have preferred a 2 level hash.

> I agree with Adam, splitting into directories according to a prefix
> of the name is _nonsense_. By the way, this topic had been discussed
> before and my sober proposal was silently dismissed. [1]

This topic has been discussed many times - this is the solution that meets
the most goals.
 
> * Ideally, all packages would lie flat in a single directory. example
>  /package-pool

No. This is not ideal; it impedes the ability of the ftp team to
manipulate the archive. Packaging into subdirectories by source restores
some of this ability. In fact, organizing by source may be a big win for
them, though it is too early to tell for sure.

> * This means that all packages or package directories would be in a
>  single directory. Unfortunately this is not possible because ext2fs
>  is a terrible filesystem. [can't handle dirs with a lot of entries well]

Right.

> * You have to split the pools directory in order to preserve
>  file-system performance for a large number of packages. That is,
>  _performance_ is the only reason for doing this. The directory
>  names ultimately don't have to be human readable, since such _simple minded_
>  prefixing doesn't at all ease browsing the ftp archive.

Wrong.

It is absolutely critical that someone who knows what they are looking
for and is well versed in the archive structure (say, the FTP admins) can
go directly to a package's directory without having to run a hashing
function. Any scheme which does not allow this must be rejected.
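
To illustrate how little there is to compute, here is a minimal sketch of
the lookup. It assumes the rule implied by the libm/libmng example above:
a source package is filed under its first letter, except that "lib*"
sources are filed under "lib" plus the following letter. The function name
pool_dir and the default component "main" are made up for the example and
are not part of any archive tool.

```python
def pool_dir(source, component="main"):
    """Return the relative pool directory for a source package name.

    Assumed rule (inferred from the libm/libmng example, not taken from
    any archive tool): "lib*" sources use their first four characters as
    the prefix, every other source uses its first character.
    """
    if source.startswith("lib") and len(source) > 3:
        prefix = source[:4]   # e.g. "libmng" -> "libm"
    else:
        prefix = source[:1]   # e.g. "apt" -> "a"
    return "pool/%s/%s/%s" % (component, prefix, source)


if __name__ == "__main__":
    for name in ("libmng", "apt"):
        print(pool_dir(name))
```

The point is that the prefix can be read straight off the package name, so
a person can type the path by hand.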

> * Then, you must use a REAL static hash function for determining how
>  this split is going to happen. If you don't know hash functions well,
>  someone else surely does. Feel free to ask for advice!

Actually the hashing results from this function are well within the
limits for good ext2 performance. A more even distribution is not
important for this application. See past discussions on this list for some
numbers.

Jason


