
Bug#128818: [patch] packages.gz diff support for apt



Michael Vogt wrote:
So the index file I was imagining looked like:
While all the information is certainly useful, I wonder if it's all
needed.

It's not /needed/ but it is /useful/. On the server side, it's useful to have the source timestamps and ordering available, eg.

A problem I see is that, even with the index file, the client still
needs to download a bunch of patches.

Upside is that you can download the Index file, then all the patches simultaneously. That's essentially two round trips instead of N (for your current version) or one (for Jeroen's idea).
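A rough sketch of the "Index first, then every patch at once" flow, in Python. The `fetch` callable, the file name "Index" and the whitespace-separated index layout are all placeholders invented for illustration; nothing here is apt's actual code or file format.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def fetch_all(names: List[str], fetch: Callable[[str], bytes],
              workers: int = 4) -> Dict[str, bytes]:
    """Second round trip: request every listed patch concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of `names`.
        return dict(zip(names, pool.map(fetch, names)))


def update_in_two_rounds(fetch: Callable[[str], bytes],
                         index_name: str = "Index") -> Dict[str, bytes]:
    """First round trip gets the index, second gets all patches it lists.

    Assumption: the index is just whitespace-separated patch names.
    """
    patch_names = fetch(index_name).decode().split()
    return fetch_all(patch_names, fetch)
```

Without an index the client only learns the name of the next patch after applying the previous one, which is where the N sequential round trips come from.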

I wonder if the idea of Jeroen van Wolffelaar to
use only one ed-style diff is workable. It would indeed have a much
better performance for the client.

I don't know that it's such a big deal -- your general use case is a daily "apt-get update" anyway, and that'll only become more so once those require downloading kBs instead of MBs. The other issue is that it makes server-side space requirements quadratic instead of linear (you've got N patches, the most recent of which is stored N times, the oldest of which is stored 1 time). If we've got enough space for N=10, then the choice is between storing 10 days of patches Jeroen-style, or 55 days of patches (11*10/2) ordinary style. The bandwidth hit might also be obnoxious, I'm not sure.
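The space arithmetic above, spelled out as a small Python sketch. It assumes every day's changes are about the same size (one "patch-unit" per day), which is a simplification; the function names are made up for this example.

```python
def cumulative_storage(n: int) -> int:
    """Patch-units needed to keep n single-diff ("Jeroen-style") patches.

    The diff from k days ago to today carries roughly k days of changes,
    so the total is 1 + 2 + ... + n = n * (n + 1) / 2.
    """
    return sum(range(1, n + 1))


def incremental_days_in_same_space(n: int) -> int:
    """Days of ordinary one-per-day patches that fit in that same space."""
    return cumulative_storage(n)  # one patch-unit per day


print(cumulative_storage(10))              # 55
print(incremental_days_in_same_space(10))  # 55
```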

I'd be interested in seeing how that actually ends up looking for unstable and testing, though.

I'm half tempted to suggest thinking about an annotated patch file, that looks like:

	patch-for abcdef12341231def1123 4123 2004-11-23-131421.1234
	* a 31
	* blahblah
	* .
	patch-for a4234534562bce123423f ...
	* ...

that concatenates all the information for the patches in a single file, most recent to least recent with some index stuff at the top, and you just stop downloading once you've got enough information, or you find out it's not going to work. Might be overly complicated though.
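A sketch of how a client might read such a file, in Python. The format is only the half-serious proposal above, so the details are guesses: "patch-for X" is taken to mean "this section applies to a file whose checksum is X", the remaining header fields are ignored, and each "* " line is taken as one line of an ed script.

```python
from typing import Iterable, List, Optional, Tuple

Patch = Tuple[str, List[str]]  # (checksum the patch applies to, ed script lines)


def patches_needed(lines: Iterable[str], local_sum: str) -> Optional[List[Patch]]:
    """Read sections newest-first and stop once ours has been read.

    Returns the patches in the order they should be applied (oldest first),
    or None if no section matches local_sum.
    """
    collected: List[Patch] = []
    current: Optional[Patch] = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("patch-for "):
            if current is not None and current[0] == local_sum:
                break  # our section is complete; no need to read further
            current = (line.split()[1], [])
            collected.append(current)
        elif line.startswith("* ") and current is not None:
            current[1].append(line[2:])
    if current is None or current[0] != local_sum:
        return None  # it's not going to work; fall back to the full file
    return list(reversed(collected))
```

Because `lines` can be any iterator, it could be fed from a streaming download and the connection dropped as soon as the function returns.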

Below I outline my thoughts on the index file. I would very much
appreciate your comments. My current feeling is that we may go without
an explicit index-file. But I may be wrong here of course.

"we", huh?

Knowing the md5sum/size of what you're going to end up with is a useful sanity check, so that you can stop halfway through if you've somehow managed to get yourself into a loop or similar.
If the patch fails for some reason the next calculated md5sum will not
match any file on the server and the code will fall back to downloading
the Packages.gz file.

What makes you think it won't match a file on the server? It's easy to write a CGI script that'll return a patch that adds lines to your file no matter what md5sum you ask for. If it returned a script like "a\na\n.\n%s/a/aaaaaa/g\n" it should do a good job of breaking your system reasonably quickly.
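A sketch of the kind of sanity check meant here, in Python. The `fetch_patch` and `apply_patch` callables are placeholders, and the step limit and size slack are arbitrary numbers picked for the example; the point is only that knowing the target md5sum and size up front lets the client give up instead of following a hostile or broken server forever.

```python
import hashlib
from typing import Callable, Optional


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def update_with_patches(data: bytes, target_md5: str, target_size: int,
                        fetch_patch: Callable[[str], Optional[bytes]],
                        apply_patch: Callable[[bytes, bytes], bytes],
                        max_steps: int = 50, slack: float = 2.0) -> Optional[bytes]:
    """Follow the patch chain until data matches target_md5.

    Returns None (meaning: download the full file instead) if a patch is
    missing, the file grows far past the expected size, or the chain does
    not reach the target within max_steps.
    """
    for _ in range(max_steps):
        if md5_hex(data) == target_md5:
            return data
        patch = fetch_patch(md5_hex(data))
        if patch is None:
            return None
        data = apply_patch(data, patch)
        if len(data) > slack * target_size:
            return None  # growing without bound, e.g. an "always add lines" server
    return data if md5_hex(data) == target_md5 else None
```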

Knowing the md5sum of the patches is useful just in case diff has a
root exploit.
I'm not sure if I understand this correctly. You think that someone
could sneak in a rogue diff to exploit apt?

It'd be a rogue diff that'd exploit patch, or ed, or whatever you used to apply it. Hopefully pretty unlikely, but defense in depth is always good.
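In code, the check amounts to refusing to hand anything to patch or ed until it matches what the index promised. A minimal Python sketch, assuming the index supplies an md5sum and size per patch (the function name is made up):

```python
import hashlib


def patch_is_expected(patch_bytes: bytes, expected_md5: str, expected_size: int) -> bool:
    """True only if the downloaded patch matches the index entry exactly."""
    if len(patch_bytes) != expected_size:
        return False  # cheap check first
    return hashlib.md5(patch_bytes).hexdigest() == expected_md5.lower()
```

Of course this only helps if the index itself is trustworthy.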

Knowing the size of the patches you need to download is good for
progress bars.
http/ftp will tell us about that and it should already work with the
current patch.

It'll tell you how much you're downloading for the current patch; but not if you need to download another 100kB of patches after that one's done.
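A tiny Python illustration of the difference, assuming the index lists the size of every patch the client will need (the sizes below are made-up numbers):

```python
from typing import Sequence


def overall_fraction(patch_sizes: Sequence[int], bytes_done: int) -> float:
    """Progress across all pending patches, not just the current one."""
    total = sum(patch_sizes)
    return 1.0 if total == 0 else min(bytes_done / total, 1.0)


# First patch (4123 bytes) finished, 100000 more bytes still to come:
# http/ftp alone would show the current download as complete here.
print(overall_fraction([4123, 100_000], 4123))
```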

Cheers,
aj
