Bug#425283: apt: send hints to proxies through custom HTTP headers
Package: apt
Version: 0.6.46.4-0.1
Severity: wishlist
Hello,
I have a wish that you may discuss, sooner or later. I would like APT's
HTTP client to send some of the information about a package file that
APT already knows in advance. When fetching index files, it could also
send the base URL, i.e. the directory against which the relative paths
listed in that index file are resolved.
The purpose is the following: a proxy (e.g. apt-cacher) may use the
expected package size, checksum and basename to locate the appropriate
file in its pool of cached files even if the exact location differs.
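To make the idea concrete, here is a minimal Python sketch of the proxy
side. It is only an illustration: APT sends no such headers today, and
the header names (X-Apt-Expected-Size, X-Apt-Expected-Checksum) as well
as the function cache_key are invented for this example.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical header names; APT does not send these today.
SIZE_HEADER = "X-Apt-Expected-Size"
CHECKSUM_HEADER = "X-Apt-Expected-Checksum"


def cache_key(url, headers):
    """Build a location-independent key from basename, size and checksum.

    Returns None when the hints are missing, so the proxy can fall back
    to its normal path-based lookup.
    """
    size = headers.get(SIZE_HEADER)
    checksum = headers.get(CHECKSUM_HEADER)
    if size is None or checksum is None:
        return None
    basename = posixpath.basename(urlsplit(url).path)
    return (basename, int(size), checksum.lower())
```

With such a key, requests for the same package via /pub/debian on one
mirror and /debian on another would map to the same cached file.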
I need this feature to simplify apt-cacher's work when solving the
package filename clash problem in #415398. A simple database would be
enough to identify the file ID in a flat directory as well as in a
nested structure. This would also help apt-proxy; IIRC it requires the
user to do all the path-to-mirror mapping manually, and doing this
automatically would be a PITA. I tried to implement such a
database-backed lookup in apt-cacher-ng and it sucked. Why? Assume
that you want to send a file from the cache on request, but the file is
not in exactly the same location [1]. Solution? Track the files in the
local pool using their basename+size+checksum as IDs. However, this does
not work efficiently because the data needs to be extracted from the
index files. Here the problems arise:
- To get the file ID, you need the (huge) lookup table mapping all
  paths ("server/path/filename") to the file IDs. So you need to
  extract the information from index files, e.g. Packages.bz2.
- Extraction costs time. You cannot do it in realtime with bzip2 on a
  Pentium-equipped proxy machine. OTOH you need the data in realtime
  because APT may start fetching the package file very soon after
  fetching Packages.bz2.
- The location of the index file cannot reliably be used as a reference
  path for the files because APT appends some suffix depending on the
  user's configuration. The proxy just cannot know for sure; APT does.
- The last subproblem also matters when the daily expiration of cached
  contents needs to be done. The exact file paths are needed; guessing
  is not an option.
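For illustration, the extraction step described in the list looks
roughly like the Python sketch below. It is not apt-cacher-ng code. It
assumes the usual Packages stanza fields (Filename, Size, MD5sum), and
the function names and the base_url argument are made up for the
example; base_url is exactly the value the proxy has to guess.

```python
import bz2
import posixpath


def parse_index(text, base_url):
    """Map "base_url/Filename" to (basename, size, md5) for each stanza."""
    table = {}
    for stanza in text.split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines start with whitespace; not needed here.
            if line[:1].isspace() or ":" not in line:
                continue
            name, _, value = line.partition(":")
            fields[name.strip()] = value.strip()
        # Skip stanzas that lack any of the fields the ID is built from.
        if not {"Filename", "Size", "MD5sum"} <= fields.keys():
            continue
        path = base_url.rstrip("/") + "/" + fields["Filename"]
        table[path] = (
            posixpath.basename(fields["Filename"]),
            int(fields["Size"]),
            fields["MD5sum"],
        )
    return table


def load_index(path, base_url):
    """Decompress a Packages.bz2 file and parse it (the slow step)."""
    with bz2.open(path, "rt", encoding="utf-8") as fh:
        return parse_index(fh.read(), base_url)
```

The whole index has to be decompressed and parsed before the first
package request can be answered, which is where the time goes.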
All that could be simplified a lot if the relevant bits of information
were sent by APT when requesting the file.
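On the client side this could be as small as the following Python
sketch. It does not reflect APT's actual code or any agreed header
format: the header names and both functions are hypothetical, and the
stanza argument stands for the index fields APT has already parsed.

```python
def package_hints(stanza):
    """Extra headers for a package file request (hypothetical names)."""
    return {
        "X-Apt-Expected-Size": str(int(stanza["Size"])),
        "X-Apt-Expected-Checksum": stanza["MD5sum"].lower(),
    }


def index_hints(base_url):
    """Extra header for an index file request (hypothetical name).

    base_url is the directory the index's relative paths refer to.
    """
    return {"X-Apt-Base-Url": base_url}
```

The proxy would then never need to parse the index itself or guess the
base directory.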
[1] This happens because there are multiple paths to the Debian archive
on the mirror (e.g. /pub/debian and /pub/linux/debian and /debian) or
just because the mirror name is different. OTOH you cannot simply
ignore the directories anymore because Ubuntu files may also be stored
there.
-- System Information:
Debian Release: lenny/sid
APT prefers unstable
APT policy: (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.22-rc2
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages apt depends on:
ii debian-archive-keyring 2007.02.19 GnuPG archive keys of the Debian a
ii libc6 2.5-7 GNU C Library: Shared libraries
ii libgcc1 1:4.1.2-6 GCC support library
ii libstdc++6 4.1.2-6 The GNU Standard C++ Library v3
apt recommends no packages.
-- no debconf information