
Bug#641769: [apt] fetches InRelease file, problematic on several mirrors (aka "Packages Hash Sum mismatch")



Package: apt
Version: 0.8.15.6
Severity: important

Since 0.8.11, APT has fetched InRelease in preference to the classic Release file. InRelease is an inline-signed Release file (announced in http://lists.debian.org/debian-devel-announce/2009/11/msg00001.html ). If I understand correctly, the only noteworthy advantage of InRelease at the moment is that it saves one HTTP request, which improves performance.

Although InRelease has been available since 2009, the mirrors team reportedly only became aware of its existence in February 2011, when it changed ftpsync to account for the InRelease file by excluding it from the files synchronized in the first mirror synchronization stage. http://lists.debian.org/debian-devel/2011/03/msg00598.html gives some background. The first ftpsync version to take InRelease into account was 80387, released on 2011-02-23: http://lists.debian.org/debian-mirrors-announce/2011/02/msg00001.html

When a mirror uses a problematic ftpsync version (one that does not account for InRelease), its InRelease files are incoherent with its Packages files from some moment in the first stage until some moment in the second stage. This could (and I think basically does) mean the files are incoherent for the duration of the synchronization. APT verifies each downloaded Packages file against the checksum listed in the Release data, so when the two are incoherent, updating APT indices fails, for example:

# LANG=C apt-get update
Hit http://ftp.ca.debian.org sid InRelease
Get:1 http://ftp.ca.debian.org sid/main i386 Packages [7703 kB]
Hit http://ftp.ca.debian.org sid/main TranslationIndex
Fetched 1 B in 3s (0 B/s)
W: Failed to fetch bzip2:/var/lib/apt/lists/partial/ftp.ca.debian.org_debian_dists_sid_main_binary-i386_Packages Hash Sum mismatch

E: Some index files failed to download. They have been ignored, or old ones used instead.
root@vinci:/etc/apt#
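
To illustrate the failure above: the Release/InRelease file lists a checksum and size for each index file, and a Packages file that does not match is rejected. The following is a minimal sketch of that check, not APT's actual code. The helper names (parse_sha256_section, index_matches, make_release) and the file contents are made up for illustration. Only the general layout of the SHA256 section (indented "checksum size path" lines) is taken from the real format.

```python
import hashlib

def parse_sha256_section(release_text):
    """Return {path: (sha256_hex, size)} from the SHA256 block of a Release file."""
    entries = {}
    in_section = False
    for line in release_text.splitlines():
        if line.startswith("SHA256:"):
            in_section = True
            continue
        if in_section:
            if not line.startswith(" "):
                break  # an unindented line starts the next field
            digest, size, path = line.split()
            entries[path] = (digest, int(size))
    return entries

def index_matches(release_text, path, data):
    """True if data has the size and SHA256 that release_text lists for path."""
    expected = parse_sha256_section(release_text).get(path)
    if expected is None:
        return False
    digest, size = expected
    return len(data) == size and hashlib.sha256(data).hexdigest() == digest

def make_release(packages_data):
    """Build a tiny, made-up Release body describing one Packages file."""
    return "Suite: unstable\nSHA256:\n %s %d main/binary-i386/Packages\n" % (
        hashlib.sha256(packages_data).hexdigest(), len(packages_data))

# Made-up index contents before and after a dinstall.
old_packages = b"Package: apt\nVersion: 0.8.15.5\n"
new_packages = b"Package: apt\nVersion: 0.8.15.6\n"

# Mid-synchronization: InRelease already new, Packages still old.
new_inrelease = make_release(new_packages)
path = "main/binary-i386/Packages"
print(index_matches(new_inrelease, path, old_packages))  # incoherent mirror
print(index_matches(new_inrelease, path, new_packages))  # after the sync completes
```

The first call models the window in which the mirror serves a new InRelease with an old Packages file. The second models the state once both stages have finished.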

ftp.ca.debian.org is a round-robin that, until yesterday, used 2 mirrors: mirror.mountaincable.net (aka debian.mirror.rafal.ca) and debian.mirror.iweb.ca. Both mirrors used problematic ftpsync versions, and both are children of the primary mirror syncproxy.wna.debian.org, which runs on rietz.debian.org. rietz is the North American archive sync proxy and appears to have 25 children (see http://mirror-master.debian.org/status.html ). If the archive receives on the order of 1 GB of new packages daily, rietz must upload about 25 GB a day just for synchronization, so it is quite possible that its uplink is saturated a good proportion of the time. There are currently 4 dinstall-s a day. The mirror I'm using, mirror.mountaincable.net, stayed incoherent for more than 2 hours the only time I was able to measure it somewhat precisely. With a dinstall every 6 hours, that is a large share of the time. Mirror synchronization is explained on http://www.debian.org/mirror/ftpmirror#how
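
The rough arithmetic behind these estimates, taking the figures in the paragraph above at face value. The 1 GB/day figure is a guess, and the 2-hour window comes from a single measurement. The sketch also assumes every dinstall triggers such a window, so the result is only indicative.

```python
# Figures from the paragraph above; all are estimates, not measurements,
# except the single 2-hour observation.
children = 25              # mirrors pulling from rietz
new_gb_per_day = 1         # guessed order of magnitude of daily new packages
upload_gb_per_day = children * new_gb_per_day

dinstalls_per_day = 4
hours_between_dinstalls = 24 / dinstalls_per_day

# Assumption: each dinstall leaves the mirror incoherent for about 2 hours.
incoherent_hours = 2
incoherent_fraction = incoherent_hours / hours_between_dinstalls

print("upload per day: %d GB" % upload_gb_per_day)
print("hours between dinstall-s: %g" % hours_between_dinstalls)
print("share of time incoherent: %.0f%%" % (incoherent_fraction * 100))
```

Under these assumptions the mirror would be incoherent roughly a third of the time, and more if the window is longer than the one measured.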

This issue only affects Packages/Sources files that a dinstall has modified. For sid main, that is surely the case for most dinstall-s. It therefore affects sid systems and, to a lesser extent, testing systems. Running a testing/unstable mix that I usually upgrade every day, I had to retry index updates about every other day. If everyone were using my mirror, that would have affected quite a lot of people.

The problem with ftpsync is that the mirrors team doesn't seem to have much infrastructure to ensure it stays up-to-date. There is no package for ftpsync. When I reported my problem to #debian-mirrors, Simon Paillard contacted my mirror's administration and they upgraded their ftpsync in less than a day, fixing a problem which had been affecting me for half a year. Each mirror reports its ftpsync version in debian/project/trace/ but this information is apparently not collected centrally, which would make it possible to see at a glance which mirrors are affected and to contact all their administrators quickly. If your mirror is affected, see http://anonscm.debian.org/viewvc/webwml/webwml/english/mirror/Mirrors.masterlist to find the contact address. I am aware of 2 synchronization methods that don't cause this problem: using ftpsync 80387 or later, and using ftpsync-pushrsync, which runs on the parent server rather than pulling updates. Unfortunately the latter is reportedly used on very few mirrors, and there are hundreds of official mirrors. Furthermore, there is no definition of what is a supported/acceptable ftpsync version and what should be considered outdated. Overall, coordination between the mirrors team and mirror administrators appears to have little structure.

I don't know what proportion of mirrors are affected. As I said, as of 2 days ago, both mirrors behind ftp.ca.debian.org were affected. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=625160 has some more reports, citing 4/4 ftp.us.debian.org mirrors affected as of May. Some believe http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=636292 stems from this issue too. The proportion looks high enough to warrant action. I'm setting the severity based on my own impression, and could be very wrong.

A Debian installation administrator may work around this by switching to an updated mirror. With the number of mirrors available, this is not a big problem in itself. The problem is that unless the administrator changes mirrors blindly until things start to work, figuring out what's wrong can be time-consuming. It took me about 2 hours of work in total to understand what was going on (which is why I only started investigating after being affected for 4 months...). We don't want too many people to have to do that.

A few solutions come to mind:
* Upgrading mirrors
* Removing problematic mirrors
* Avoiding InRelease

The first solution requires some work to first obtain a list of problematic mirrors. Then, since Debian doesn't directly control its mirror network, we have to contact mirror administrators and wait until they act. This is not a very short-term solution. The second solution also needs a list of problematic mirrors, and would only help Debian installations made in the future, which largely defeats the point, as we're trying to mitigate a short-term problem.

The third solution, which after polling #debian-mirrors I concluded should be implemented, is to make APT avoid InRelease. Josh Triplett filed an RFE asking APT to provide an option to disable use of InRelease files: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=626026 This has not been implemented so far, but I believe APT should go further and avoid using InRelease with problematic mirrors *by default*. Even if APT had offered that option from the start, I think I would still have lived with my problem for a long time and spent a couple of hours debugging it.

There are 2 ways to implement that third solution:
* Detecting the synchronization method and only using InRelease on mirrors using a recent one
* Never using InRelease

Detecting the version could be tricky, as I don't think the way it's published is standardized. Dropping InRelease altogether would be reliable and very easy to implement (just revert the InRelease support). If its only disadvantage is an extra HTTP request per update, then dropping InRelease is better than detecting the synchronization method, as detection would cost at least one HTTP request too.
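
For comparison, here is a rough sketch of what the detection option could look like. It is purely illustrative. The trace-file line format ("Used ftpsync version: N") is an assumption, as are the function names, the sample trace contents, and the older version number 80286. Only the threshold 80387 comes from the announcement cited earlier in this report.

```python
import re

# From this report: 80387 is the first ftpsync version that accounts for InRelease.
MIN_FTPSYNC_VERSION = 80387

# Assumption: the trace file has a line like "Used ftpsync version: 80387".
# The real layout is not standardized as far as I know and may differ.
VERSION_LINE = re.compile(r"^Used ftpsync version:\s*(\d+)\s*$", re.MULTILINE)

def ftpsync_version(trace_text):
    """Return the ftpsync version found in a trace file body, or None."""
    match = VERSION_LINE.search(trace_text)
    return int(match.group(1)) if match else None

def may_use_inrelease(trace_text, minimum=MIN_FTPSYNC_VERSION):
    """Use InRelease only when the mirror reports a recent enough ftpsync."""
    version = ftpsync_version(trace_text)
    return version is not None and version >= minimum

# Made-up trace file bodies for illustration.
trace_recent = "Fri Sep 16 02:12:31 UTC 2011\nUsed ftpsync version: 80387\n"
trace_old = "Fri Sep 16 02:12:31 UTC 2011\nUsed ftpsync version: 80286\n"
trace_unknown = "Fri Sep 16 02:12:31 UTC 2011\n"

for trace in (trace_recent, trace_old, trace_unknown):
    print(may_use_inrelease(trace))
```

Note that the sketch falls back to not using InRelease whenever the version cannot be determined. Fetching the trace file is also the extra HTTP request mentioned above.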

Bottom line, I recommend disabling InRelease support for now. There could be optional support behind a configuration option, but I'm not sure that's worth the effort. The question is then when InRelease use should be re-enabled [by default]. I suggest filing an issue report against the mirrors pseudo-package to track this issue, and asking the mirrors team to close it when they consider that APT can safely use InRelease.

Thanks to Simon Paillard for his help mitigating the problem and for helping me to discover solutions. David Kalnischkies's message explaining why the problem happens helped a lot.


