
apt-get update hashsum mismatch prevent



Hi,

I would like to resurrect the discussion about preventing apt-get
update hashsum mismatch failures. This was discussed a while ago
already [1] but there was no real consensus.

Let me summarize the problem first:

When the archive (or the mirrors) update their InRelease file and
indexfiles (Packages,Sources,Translations,Packages.diff/Index) there
are two inherent race conditions for clients:

(1) apt fetches the InRelease file. If the server updates its
indexfiles while that fetch is in progress, the subsequent GET for the
indexfiles fails with a hashsum mismatch, because the InRelease file
apt holds lists the hashes of the previous generation of the indexfiles.

(2) apt fetches a new InRelease file but the new indexfiles are not
updated/mirrored yet. A hashsum mismatch error is found because the
new InRelease file hashes do not match the old indexfiles.
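
To make race (1) concrete, here is a minimal sketch in Python. The
ToyMirror class, the dict standing in for InRelease and the update()
helper are all made up for illustration; this is not how apt or the
archive is implemented.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ToyMirror:
    """Hypothetical mirror: one index file plus the hash published for it."""

    def __init__(self, packages: bytes):
        self.publish(packages)

    def publish(self, packages: bytes) -> None:
        # A new generation replaces the index and its published hash.
        self.packages = packages
        self.inrelease = {"main/binary-amd64/Packages": sha256_hex(packages)}


def update(mirror: ToyMirror, between_fetches=None) -> bool:
    """Return True if the fetched index matches the fetched hash list."""
    expected = dict(mirror.inrelease)      # step 1: GET InRelease
    if between_fetches is not None:
        between_fetches()                  # the archive updates right here
    data = mirror.packages                 # step 2: GET Packages
    return sha256_hex(data) == expected["main/binary-amd64/Packages"]


mirror = ToyMirror(b"generation 1")
quiet = update(mirror)
raced = update(mirror, lambda: mirror.publish(b"generation 2"))
print(quiet, raced)
```

The second call fails verification only because the index changed
between the two GETs; the client did nothing wrong.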


Problem (2) is of less relevance right now because AIUI our mirror
scripts update in multiple steps, i.e. sync pool, sync indexes, sync
release. But it's still worth thinking about whether this could be
simplified.


Problem (1) is more prevalent and to solve it we need to keep (at
least) one generation of the previous file around in some way.

So the goals would be:
(g1) apt should always be able to update (getting an authenticated set
of Packages/Sources files)
(g2) mirroring should be as simple as possible with as few
     special cases as possible.

Here are the solutions I would like to outline. Credit for the
solutions should go to all the people who took part in the mailing list
discussion and who discussed this in RL and on irc. Thank you!

Proposed solution for (g1):
- we add a flag to InRelease like "Content-Addressable-Indexes: yes" or
  "Get-By-Hash: sha256,sha1" (suggestions for a good name welcome) [2]
- we keep the previous N (depending on how fast the index is rebuilt
  this N should be 1-3 maybe) generations of the indexfiles available
  via something like
  dist/$release/main/binary-$arch/by-hash/$hashtype/$hash
  and provide a symlink back to
  dist/$release/main/binary-$arch/Packages.xz
  for compatibility
- apt-ftparchive adds the flag and keeps the N generations around
  and generates the links (and may write out a index 
- apt sends a GET dist/$release/main/binary-$arch/by-hash/sha256/$hash
  when it's told to do so (e.g. via the InRelease file) [3]
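
The archive side of this proposal could be sketched roughly as follows,
in Python. The IndexPublisher class, its keep parameter and the
in-memory list of generations are assumptions made up for this example;
this is not apt-ftparchive code. Only the by-hash/$hashtype/$hash
layout and the compatibility symlink come from the list above.

```python
import hashlib
import os
import tempfile


def by_hash_path(component_dir: str, hashtype: str, digest: str) -> str:
    # Same relative layout as proposed: by-hash/$hashtype/$hash
    return os.path.join(component_dir, "by-hash", hashtype, digest)


class IndexPublisher:
    """Hypothetical helper: keeps the current index plus `keep` older ones."""

    def __init__(self, component_dir: str, name: str = "Packages.xz", keep: int = 2):
        self.component_dir = component_dir
        self.name = name
        self.keep = keep
        self.generations = []  # sha256 digests, oldest first

    def publish(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        target = by_hash_path(self.component_dir, "sha256", digest)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as f:
            f.write(data)

        # Repoint the compatibility name with a rename, so old clients
        # never see a missing or half-written file.
        link = os.path.join(self.component_dir, self.name)
        tmp = link + ".new"
        if os.path.lexists(tmp):
            os.remove(tmp)
        os.symlink(os.path.join("by-hash", "sha256", digest), tmp)
        os.replace(tmp, link)

        if digest in self.generations:
            self.generations.remove(digest)
        self.generations.append(digest)
        # Drop everything older than the N previous generations.
        while len(self.generations) > self.keep + 1:
            old = self.generations.pop(0)
            os.remove(by_hash_path(self.component_dir, "sha256", old))
        return digest


with tempfile.TemporaryDirectory() as d:
    pub = IndexPublisher(d, keep=1)
    first = pub.publish(b"generation 1")
    pub.publish(b"generation 2")
    # The previous generation is still reachable by its hash.
    print(os.path.exists(by_hash_path(d, "sha256", first)))
```

A client holding the older InRelease file can still fetch the index it
expects by hash, while clients that only know the plain file name
follow the symlink to the newest generation.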

Proposed solution for (g2):
- first, the mirror script mirrors everything except
  InRelease,Release{.gpg,}
- then, as the last step, InRelease,Release{.gpg,} is mirrored
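
The two-step order can be sketched like this, in Python. The
two_phase_sync() function and the local directory copy are stand-ins
for illustration only; real mirror scripts typically use rsync and are
not written this way.

```python
import os
import shutil
import tempfile

RELEASE_FILES = {"InRelease", "Release", "Release.gpg"}


def two_phase_sync(src: str, dst: str) -> list:
    """Copy src to dst, release files last. Returns the copy order."""
    order = []
    for release_phase in (False, True):
        for root, _dirs, files in os.walk(src):
            for fname in sorted(files):
                if (fname in RELEASE_FILES) != release_phase:
                    continue
                rel = os.path.relpath(os.path.join(root, fname), src)
                target = os.path.join(dst, rel)
                os.makedirs(os.path.dirname(target), exist_ok=True)
                shutil.copy2(os.path.join(root, fname), target)
                order.append(rel)
    return order


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    index_dir = os.path.join(src, "dist", "sid", "main", "binary-amd64")
    os.makedirs(index_dir)
    with open(os.path.join(index_dir, "Packages.xz"), "wb") as f:
        f.write(b"index")
    with open(os.path.join(src, "dist", "sid", "InRelease"), "wb") as f:
        f.write(b"signed hashes")
    for rel in two_phase_sync(src, dst):
        print(rel)
```

Because the by-hash files from the first step already exist on the
mirror when the new InRelease arrives, a client never sees an InRelease
that points at indexfiles the mirror does not have yet.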

Please let me know what you think! I would really love to fix this
issue and created a branch for apt with some proof-of-concept code
(based on the work of Robie Basak in [4]) that seems to work really
nicely. So if we can agree on a design I would love to finish this
work.

Cheers,
 Michael

[1] https://lists.debian.org/debian-dak/2012/05/msg00006.html
[2] Either going with a simple boolean flag, in which case the by-hash
    files need to be provided for all hash types in the Release file
    (e.g. via symlinks), or by using specific hashes mentioned in the
    Release file. I lean towards just using a boolean as it's
    the simplest solution.
[3] For mirroring tools like debmirror that do not know about the new
    by-hash files yet, apt may still have to support a fallback mode
    to the old "Packages.gz"
[4] https://wiki.ubuntu.com/AptByHash

