
Bug#425283: apt: send hints to proxies through custom HTTP headers



Package: apt
Version: 0.6.46.4-0.1
Severity: wishlist

Hello,

I have a wish that you may want to discuss, sooner or later. I would
like APT's HTTP client to send some of the information about a package
file that APT already knows in advance. When fetching index files, it
could also send the base URL of the package files, i.e. the base
directory that the relative paths in that index file refer to.

The purpose is the following: a proxy (e.g. apt-cacher) may use the
expected package size, checksum and basename to locate an appropriate
file in the cached pool of files even if the exact location differs.
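
To illustrate the idea, here is a rough Python sketch of the proxy side.
The header names (X-Apt-Expected-Size, X-Apt-Expected-Checksum) and the
Pool class are made up for this example; nothing like them exists in APT
or apt-cacher today. It also assumes the header dict has already been
normalized to exactly these spellings.

```python
import posixpath
from typing import Dict, Optional, Tuple

# Hypothetical header names, invented for this sketch only.
HDR_SIZE = "X-Apt-Expected-Size"
HDR_SUM = "X-Apt-Expected-Checksum"

# A location-independent file ID: (basename, size, checksum).
FileId = Tuple[str, int, str]


def file_id_from_request(url_path: str, headers: Dict[str, str]) -> Optional[FileId]:
    """Build a file ID from the request, or None if the hints are missing or bad."""
    raw_size = headers.get(HDR_SIZE)
    checksum = headers.get(HDR_SUM)
    if raw_size is None or checksum is None:
        return None
    try:
        size = int(raw_size)
    except ValueError:
        return None
    if size < 0:
        return None
    # Only the basename is kept, so the mirror name and directory do not matter.
    return (posixpath.basename(url_path), size, checksum.lower())


class Pool:
    """Maps file IDs to paths in the local cache (in-memory stand-in for a database)."""

    def __init__(self) -> None:
        self._by_id: Dict[FileId, str] = {}

    def add(self, fid: FileId, cache_path: str) -> None:
        self._by_id[fid] = cache_path

    def lookup(self, fid: Optional[FileId]) -> Optional[str]:
        if fid is None:
            return None
        return self._by_id.get(fid)
```

With this, a request for /pub/debian/pool/.../foo.deb and one for
/debian/pool/.../foo.deb resolve to the same cached file as long as size
and checksum agree, while a same-named file with different contents gets
a different ID.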

I need this feature to simplify apt-cacher's work when solving the
package filename clash problem in #415398. A simple database would be
enough to identify the file ID in a flat directory as well as in a
nested structure. This would also help apt-proxy; IIRC it requires the
user to do all the path-to-mirror mapping manually, and doing this
automatically would be a PITA. I tried to implement such a
database-backed lookup in apt-cacher-ng and it sucked. Why? Assume
that you want to send a file from the cache on request, but the file is
not in exactly the same location [1]. Solution? Track the files in the
local pool using their basename+size+checksum as IDs. However, this does
not work efficiently because the data needs to be extracted from index
files. Here the problems arise:

 - to get the file ID, you need the (huge) lookup table, mapping all
   paths ("server/path/filename") to the file ids. So you need to
   extract the information from index files, e.g. Packages.bz2.
 - extraction costs time. You cannot do it in realtime with bzip2 on a
   Pentium-equipped proxy machine. OTOH you need the data in realtime
   because apt may start fetching the package files very soon after
   it has fetched Packages.bz2.
 - the location of the index file cannot reliably be used as a reference
   path for the files because APT appends some suffix depending on the
   user's configuration. The proxy just cannot know for sure; APT does.
 - the last subproblem also matters when the daily expiration of cached
   contents needs to be done. The exact file paths are needed; guessing
   is not an option.
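
For reference, the extraction step described in the list looks roughly
like the Python sketch below. It is only an illustration, not code from
apt-cacher-ng. Filename, Size and MD5sum are the usual fields of a
Packages stanza; the function names are invented here, and base_url has
to be supplied by the caller, which is exactly the part the proxy cannot
determine reliably on its own.

```python
import bz2
import posixpath
from typing import Dict, Iterator, Tuple

FileId = Tuple[str, int, str]  # (basename, size, checksum)


def iter_stanzas(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per stanza; stanzas are separated by blank lines."""
    fields: Dict[str, str] = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            if fields:
                yield fields
            fields, key = {}, None
        elif line[0] in " \t":
            # Continuation line of a multi-line field such as Description.
            if key is not None:
                fields[key] += "\n" + line.strip()
        else:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    if fields:
        yield fields


def build_lookup(index_bz2: bytes, base_url: str) -> Dict[str, FileId]:
    """Map each full package URL to its file ID.

    The whole index is decompressed and parsed before any lookup is
    possible, which is the expensive part on a slow proxy machine.
    """
    text = bz2.decompress(index_bz2).decode("utf-8", errors="replace")
    table: Dict[str, FileId] = {}
    for stanza in iter_stanzas(text):
        if not all(k in stanza for k in ("Filename", "Size", "MD5sum")):
            continue
        name = stanza["Filename"]
        url = base_url.rstrip("/") + "/" + name
        table[url] = (posixpath.basename(name), int(stanza["Size"]), stanza["MD5sum"])
    return table
```

Every new Packages.bz2 means another full pass like this, and the table
is only as correct as the guessed base_url.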

All that could be simplified a lot if APT sent the relevant bits of
information when requesting the file.
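
On the client side this amounts to a few extra request headers. The
Python sketch below only shows the shape of such a request; APT's HTTP
method is written in C++ and none of these header names or helper
functions exist, they are assumptions made for the example.

```python
import urllib.request
from typing import Dict

# Hypothetical header names, invented for this sketch only.
HDR_SIZE = "X-Apt-Expected-Size"
HDR_SUM = "X-Apt-Expected-Checksum"
HDR_BASE = "X-Apt-Base-Url"


def package_hints(size: int, checksum: str) -> Dict[str, str]:
    """Hints for a package file: values the client already has from the index."""
    return {HDR_SIZE: str(size), HDR_SUM: checksum}


def index_hints(base_url: str) -> Dict[str, str]:
    """Hint for an index file: the base URL its relative Filename entries refer to."""
    return {HDR_BASE: base_url}


def build_request(url: str, hints: Dict[str, str]) -> urllib.request.Request:
    """Create (but do not send) a GET request carrying the hint headers."""
    return urllib.request.Request(url, headers=hints)


# Example: the request a client could issue for one package file.
req = build_request(
    "http://mirror.example/debian/pool/main/h/hello/hello_2.2-1_amd64.deb",
    package_hints(1234, "0123abcd"),
)
```

A proxy that does not understand the headers would simply ignore them,
so existing setups should keep working unchanged.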

[1] This happens because the mirror exposes the Debian archive under
multiple paths (e.g. /pub/debian, /pub/linux/debian and /debian) or just
because the mirror name is different. OTOH you cannot simply ignore the
directories because Ubuntu packages may be stored there as well.

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.22-rc2
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages apt depends on:
ii  debian-archive-keyring        2007.02.19 GnuPG archive keys of the Debian a
ii  libc6                         2.5-7      GNU C Library: Shared libraries
ii  libgcc1                       1:4.1.2-6  GCC support library
ii  libstdc++6                    4.1.2-6    The GNU Standard C++ Library v3

apt recommends no packages.

-- no debconf information


