apt-get update hashsum mismatch prevent
Hi,
I would like to resurrect the discussion about preventing apt-get
update hashsum mismatch failures. This was discussed a while ago
already [1], but there was no real consensus.
Let me summarize the problem first:
When the archive (or the mirrors) update their InRelease file and
indexfiles (Packages, Sources, Translations, Packages.diff/Index), there
are two inherent race conditions for clients:
(1) apt fetches the InRelease file, and if the server updates its
indexfiles right after that fetch, the subsequent GET for the indexfiles
will fail with a hashsum mismatch because the InRelease file has the
hashes of the previous generation of the indexfiles.
(2) apt fetches a new InRelease file but the new indexfiles are not
updated/mirrored yet. A hashsum mismatch error occurs because the
new InRelease file hashes do not match the old indexfiles.
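To make race (1) concrete, here is a minimal Python sketch. It is only a
toy model: ToyArchive, publish() and update() are hypothetical names made
up for illustration, and the "InRelease" here holds nothing but a single
SHA256 hash. It is not how apt or the archive are implemented.

```python
import hashlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ToyArchive:
    """Hypothetical in-memory archive: one index file plus a release file with its hash."""

    def __init__(self, packages: bytes):
        self.publish(packages)

    def publish(self, packages: bytes) -> None:
        # Index and release file are replaced together; nothing old is kept around.
        self.files = {
            "Packages": packages,
            "InRelease": sha256(packages).encode(),
        }

    def get(self, path: str) -> bytes:
        return self.files[path]


def update(archive, between_requests=None) -> bool:
    """Return True if the fetched index matches the hash taken from InRelease."""
    expected = archive.get("InRelease").decode()
    if between_requests:
        between_requests()  # the archive changes while the client is mid-update
    return sha256(archive.get("Packages")) == expected


archive = ToyArchive(b"generation 1")
ok_quiet = update(archive)
ok_raced = update(archive, lambda: archive.publish(b"generation 2"))
print(ok_quiet, ok_raced)
```

The second update fails only because the generation-1 index is gone by the
time the client asks for it.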
Problem (2) is less relevant right now because AIUI our mirror
scripts update in multiple steps, i.e. sync pool, sync indexes, sync
release. But it's still worth thinking about whether this could be simplified.
Problem (1) is more prevalent and to solve it we need to keep (at
least) one generation of the previous file around in some way.
So the goals would be:
(g1) apt should always be able to update (getting an authenticated set
of Packages/Sources files)
(g2) mirroring should be as simple as possible with as few
special cases as possible.
Here are the solutions I would like to outline. Credit for the
solutions should go to all the people who took part in the mailing list
discussion and who discussed this in RL and on IRC. Thank you!
Proposed solution for (g1):
- we add a flag to InRelease like "Content-Addressable-Indexes: yes" or
"Get-By-Hash: sha256,sha1" (suggestions for a good name welcome) [2]
- we keep the previous N (depending on how fast the index is rebuilt
this N should be 1-3 maybe) generations of the indexfiles available
via something like
dist/$release/main/binary-$arch/by-hash/$hashtype/$hash
and provide a symlink back to
dist/$release/main/binary-$arch/Packages.xz
for compatibility
- apt-ftparchive adds the flag and keeps the N generations around
and generates the links (and may write out a index
- apt sends a GET dist/$release/main/binary-$arch/by-hash/sha256/$hash
when it's told to do so (e.g. via the InRelease file) [3]
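A rough Python sketch of the by-hash idea follows, just to illustrate the
layout. It assumes the simple boolean flag from [2]; ByHashStore,
request_path() and keep_previous are hypothetical names for this example,
only sha256 is handled, and it is not the proof-of-concept code from the
apt branch.

```python
import hashlib
import os
import tempfile


class ByHashStore:
    """Hypothetical publisher for one index directory, e.g. dist/$release/main/binary-$arch."""

    def __init__(self, index_dir: str, keep_previous: int = 2):
        self.index_dir = index_dir
        self.by_hash = os.path.join(index_dir, "by-hash", "sha256")
        self.keep_previous = keep_previous  # the "N" from the proposal
        self.history = []                   # published digests, oldest first
        os.makedirs(self.by_hash, exist_ok=True)

    def publish(self, name: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        with open(os.path.join(self.by_hash, digest), "wb") as fh:
            fh.write(data)
        # Swap the compatibility symlink (e.g. Packages.xz) via an atomic rename.
        link = os.path.join(self.index_dir, name)
        tmp_link = link + ".new"
        if os.path.lexists(tmp_link):
            os.remove(tmp_link)
        os.symlink(os.path.join("by-hash", "sha256", digest), tmp_link)
        os.replace(tmp_link, link)
        # Keep the current generation plus keep_previous older ones.
        if digest in self.history:
            self.history.remove(digest)
        self.history.append(digest)
        while len(self.history) > self.keep_previous + 1:
            os.remove(os.path.join(self.by_hash, self.history.pop(0)))
        return digest


def request_path(release_fields: dict, index_dir: str, name: str, sha256_hex: str) -> str:
    """Path the client asks for, falling back to the plain file name as in [3]."""
    if release_fields.get("Content-Addressable-Indexes") == "yes":
        return f"{index_dir}/by-hash/sha256/{sha256_hex}"
    return f"{index_dir}/{name}"


root = tempfile.mkdtemp()
store = ByHashStore(root, keep_previous=1)
old = store.publish("Packages.xz", b"generation 1")
new = store.publish("Packages.xz", b"generation 2")

# A client that still holds the previous InRelease can fetch the previous index.
flags = {"Content-Addressable-Indexes": "yes"}
with open(request_path(flags, root, "Packages.xz", old), "rb") as fh:
    old_data = fh.read()
# A client that knows nothing about by-hash follows the symlink to the current index.
with open(os.path.join(root, "Packages.xz"), "rb") as fh:
    current_data = fh.read()
```

With keep_previous=1, publishing a third generation would remove the first
one, so N bounds how long a client with a stale InRelease can still
complete its update.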
Proposed solution for (g2):
- the mirror script first mirrors everything except InRelease,Release{.gpg,}
- InRelease,Release{.gpg,} are mirrored last
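The two-stage order could look roughly like the Python sketch below. This
is an illustration using local directory copies only; sync(), mirror() and
RELEASE_FILES are hypothetical names, and real mirror scripts would use
rsync or similar rather than this.

```python
import os
import shutil
import tempfile

RELEASE_FILES = {"InRelease", "Release", "Release.gpg"}


def sync(src: str, dst: str, want) -> list:
    """Copy every file under src whose name satisfies want(); return the relative paths copied."""
    copied = []
    for dirpath, _dirs, files in os.walk(src):
        for fname in sorted(files):
            if not want(fname):
                continue
            rel = os.path.relpath(os.path.join(dirpath, fname), src)
            os.makedirs(os.path.dirname(os.path.join(dst, rel)), exist_ok=True)
            shutil.copy2(os.path.join(src, rel), os.path.join(dst, rel))
            copied.append(rel)
    return copied


def mirror(src: str, dst: str):
    # Stage 1: pool, indexfiles, by-hash files; everything but the release files.
    stage1 = sync(src, dst, lambda f: f not in RELEASE_FILES)
    # Stage 2: only now publish the files that tell clients which hashes to expect.
    stage2 = sync(src, dst, lambda f: f in RELEASE_FILES)
    return stage1, stage2


# Tiny demo tree with one index file and one release file.
src, dst = tempfile.mkdtemp(), tempfile.mkdtemp()
idx = os.path.join(src, "dist", "sid", "main", "binary-amd64")
os.makedirs(idx)
for path in (os.path.join(idx, "Packages.xz"), os.path.join(src, "dist", "sid", "InRelease")):
    with open(path, "wb") as fh:
        fh.write(b"demo")
stage1, stage2 = mirror(src, dst)
```

Because the release files arrive last, a client never sees an InRelease on
the mirror that refers to indexfiles the mirror does not have yet.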
Please let me know what you think! I would really love to fix this
issue and have created a branch for apt with some proof-of-concept code
(based on the work of Robie Basak in [4]) that seems to work really
nicely. So if we can agree on a design I would love to finish this
work.
Cheers,
Michael
[1] https://lists.debian.org/debian-dak/2012/05/msg00006.html
[2] Either going with a simple boolean flag, in which case the by-hash
files need to be provided for all hash types in the Release file (e.g.
via symlinks), or by using specific hashes mentioned in the
Release file. I lean towards just using a boolean as it's
the simplest solution.
[3] For mirroring tools like debmirror that do not know about the new
by-hash files yet, apt may still have to support a fallback mode
to the old "Packages.gz".
[4] https://wiki.ubuntu.com/AptByHash