
Re: an idea for next generation APT archive caching



On Wed, 2004-10-20 at 12:11 +0200, martin f krafft wrote:
> also sprach martin f krafft <madduck@debian.org> [2004.10.20.0211 +0200]:
> > Here's an idea I just had about apt-proxy/apt-cacher NG. Maybe this
> > could be interesting, maybe it's just crap. Your call.

> 3. squid:
>   Squid works reliably, but it has no concept of the APT repository
>   and thus it is impossible to control what is cached and for how
>   long. The release-codename symlinks can be worked around with
>   a simple rewriter, but other than that, there are three parameters
>   that seem relevant:
> 
>   maximum_object_size 131072 KB
>   cache_dir aufs /var/spool/squid-apt 1024 16 256
>   store_avg_object_size 100 Kb
> 
>   These values are what I came up with after two days of testing.
>   The problematic one is the last one. It's at 13 Kb per default,
>   and this causes squid not to reliably cache objects larger than
>   35 Mb. Increasing it to 100 Kb causes even openoffice.org to be
>   cached for some time, but the high average also causes smaller
>   files to be removed earlier than they should be.

store_avg_object_size should have no impact on what is and is not
cached. It is used to estimate the hash size required to fully populate
the cache. Having too low a value there will simply cause squid to
create a hash table that is larger than optimal: it will not enable or
prevent caching of debs, nor will it cause smaller files to be removed.
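
To make the arithmetic concrete, here is a rough sketch of the kind of
estimate involved, assuming the expected object count is simply the
cache size divided by the average object size. The function name is
made up for illustration, and squid's real calculation has more terms
than this; the point is only that a smaller average yields a larger
table, nothing more.

```python
def estimated_object_count(cache_mb: int, avg_object_kb: int) -> int:
    """Rough number of objects a full cache would hold (illustrative only)."""
    return (cache_mb * 1024) // avg_object_kb


# The 1024 MB cache_dir quoted above, with the default and the raised average.
for avg_kb in (13, 100):
    print(f"avg {avg_kb} KB -> about {estimated_object_count(1024, avg_kb)} objects")
```

Either way the same objects are eligible for caching; only the size of
the index differs.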

If you are using Bloom-filter inter-cache digests, the average object
size estimator is also used there to tune the digest for optimal
density. Again, this has no impact on whether the local squid caches
any given object.

I usually bump my max object size up past 720MB, so that I can cache
ISOs.

maximum_object_size 740 MB

One of the problems with sid debs is that they are often very recent
and delivered without cache expiry metadata, so squid's heuristic, which
looks at time since modification, gives them an inappropriately low
maximum lifetime. So let's specify a 1 day minimum for debs without
expiry metadata, 1/5 of their age as the age-based freshness, and cap
that at 1 month. This is heavily geared; if you are happy with
revalidation (i.e. your deb mirror returns the same mod time & etag
when squid checks, and you aren't trying to use this offline), then this
can be reduced. As deb names are not reused, this should be safe as is.
Note that refresh_pattern takes its min and max in minutes, so 1 day is
1440 and 30 days is 43200.

refresh_pattern \.deb$ 1440 20% 43200
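
For illustration, a minimal sketch of how a min / percent / max rule
decides freshness when there is no explicit expiry. This is a
simplified reading of the rule order in the squid documentation, not
squid's actual code, and details differ between versions. The function
and parameter names are hypothetical; all times are in minutes, with
1 day = 1440 and 1 month = 43200.

```python
from typing import Optional


def is_fresh(age: float, lm_age: Optional[float],
             min_age: int, percent: int, max_age: int) -> bool:
    """Simplified freshness check for an object without Expires.

    age    -- minutes since the object was fetched
    lm_age -- minutes between Last-Modified and the fetch, or None
    """
    if age > max_age:
        return False                      # older than the hard cap: stale
    if lm_age is not None and lm_age > 0:
        if age / lm_age < percent / 100:  # LM-factor rule
            return True
    return age <= min_age                 # otherwise only the minimum keeps it fresh


# A deb uploaded one hour before we fetched it, checked ten hours later:
# the LM-factor alone would allow only 12 minutes, the 1 day minimum saves it.
print(is_fresh(age=600, lm_age=60, min_age=1440, percent=20, max_age=43200))
```

With a minimum of 0 the same object would already be stale, which is
the "inappropriately low lifetime" problem for freshly uploaded debs.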

Likewise the control files, but these we expect to change daily.

refresh_pattern (Packages(\.gz)?|Release|Sources\.gz)$ 1430 20% 1440

(I think that regex is right, haven't tested it).
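
A quick way to sanity-check a pattern like this is to run it against a
few sample paths. The sketch below uses Python's re module as a
stand-in; squid uses POSIX regexes, which agree with Python for
something this simple, but that equivalence is an assumption. The
sample paths are made up. It compares the pattern with unescaped and
with escaped dots.

```python
import re

UNESCAPED = re.compile(r"(Packages(.gz)?|Release|Sources.gz)$")
ESCAPED = re.compile(r"(Packages(\.gz)?|Release|Sources\.gz)$")

samples = [
    "dists/sid/main/binary-i386/Packages.gz",
    "dists/sid/main/binary-i386/Packages",
    "dists/sid/Release",
    "dists/sid/main/source/Sources.gz",
    "dists/sid/Release.gpg",        # should not match: the pattern is anchored at the end
    "pool/main/f/foo/Sources-gz",   # matches only when the dot is left unescaped
]

for path in samples:
    print(path, bool(UNESCAPED.search(path)), bool(ESCAPED.search(path)))
```

Both forms catch the index files; escaping the dots just avoids
accidental matches on names where some other character sits in the
dot's place.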

You probably want heap LFUDA cache replacement policy.

cache_replacement_policy heap LFUDA
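
For anyone unfamiliar with LFUDA (least frequently used with dynamic
aging), here is a toy sketch of the idea, not squid's implementation.
It assumes the commonly described rule: each object's key is its hit
count plus a global "cache age", and the cache age is raised to the key
of whatever was last evicted. The class name is hypothetical, and a
linear scan stands in for the heap.

```python
class LFUDACache:
    """Toy LFUDA cache: key = hits + cache age at last access."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.used = 0
        self.age = 0.0      # dynamic aging term, grows as objects are evicted
        self.entries = {}   # url -> [key, hits, size]

    def access(self, url: str, size: int) -> bool:
        """Record a request; return True on a hit, False on a miss."""
        if url in self.entries:
            entry = self.entries[url]
            entry[1] += 1
            entry[0] = entry[1] + self.age
            return True
        if size > self.capacity:
            return False    # too large to store at all
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda u: self.entries[u][0])
            self.age = self.entries[victim][0]   # age the whole cache
            self.used -= self.entries[victim][2]
            del self.entries[victim]
        self.entries[url] = [1 + self.age, 1, size]
        self.used += size
        return False


cache = LFUDACache(capacity_bytes=100)
for _ in range(3):
    cache.access("a.deb", 40)   # popular object
cache.access("b.deb", 40)       # fetched once
cache.access("c.deb", 40)       # forces an eviction: b.deb goes, a.deb stays
print(sorted(cache.entries))
```

Size plays no part in the key, so a large but popular deb is not
penalised for being large, which is why this policy suits a package
cache.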


Cheers,
Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
