> On Tue, Nov 01, 2005 at 09:54:09AM -0500, Michael Vogt wrote:
> > A problem is that zsync needs to be taught to deal with deb files
> > (that is, that it needs to unpack the data.tar and use that for
> > the syncs).

[Anthony Towns]
> That seems kinda awkward -- you'd need to start by downloading the ar
> header, working out where in the file the data.tar.gz starts, then
> redownloading from there. I guess you could include that info in the
> .zsync file though.

Right, the latter. Having downloaded the .zsync file, you calculate
your local checksums against the ones in that file, and then you know
exactly what's left to be downloaded and what to do with it. The
.zsync file includes a sort of map of the structure of the target, not
unlike a jigdo file.

> OTOH, there should be savings in the control.tar.gz too, surely --
> it'd change less than data.tar.gz most of the time, no?

He was only comparing data.tar.gz because that made for a simpler
mock-up. zsync doesn't currently dig into a .deb at all, so this was
just a simulation, as it were.

> Hrm, thinking about it, I guess zsync probably works by storing the
> state of the gzip table at certain points in the file and doing a
> rolling hash of the contents and recompressing each chunk of the
> file

I haven't actually looked at the implementation of zsync, but I've
always assumed that it presumes a homogeneous (i.e., predictable) gzip
algorithm everywhere, works out the unknown variables by trial and
error, and stores just enough state to reproduce the gzip file
exactly, given those assumptions about the gzip implementation. For
that to be correct assumes a certain homogeneity in the zlib used by
zsync implementations; for it to be efficient, it assumes the same
about whatever was used to compress the files in gzip format. I've
always harbored doubts about how well that approach scales in
deployment.
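For what it's worth, the ar header AJ mentions is cheap to parse, so the
"map of the structure" really only needs member names, offsets, and
sizes. A minimal sketch (mine, not apt's or zsync's code) of walking an
ar(1) container such as a .deb to find where data.tar.gz starts:

```python
def ar_members(blob):
    """Yield (name, data_offset, size) for each member of an ar archive.

    An ar archive is the 8-byte magic "!<arch>\\n" followed by members,
    each a 60-byte ASCII header (name 16, mtime 12, uid 6, gid 6,
    mode 8, size 10, trailer 2) and then the member data, padded to an
    even byte boundary.
    """
    assert blob[:8] == b"!<arch>\n", "not an ar archive"
    pos = 8
    while pos + 60 <= len(blob):
        header = blob[pos:pos + 60]
        name = header[0:16].rstrip().decode("ascii")
        size = int(header[48:58])
        yield name, pos + 60, size
        pos += 60 + size + (size & 1)  # member data is 2-byte aligned
```

With that map in the .zsync file, the client can skip straight to a
ranged download of the interesting member instead of fetching and
re-fetching to discover the layout.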
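On the rolling-hash half of AJ's guess: that part at least works much
like rsync. A toy sketch of an rsync-style weak rolling checksum (my
illustration of the general technique, not zsync's actual code), where
sliding the window one byte costs O(1) instead of rehashing the block:

```python
def weak_sum(block):
    """rsync-style weak checksum of a block of bytes."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * c for i, c in enumerate(block)) & 0xFFFF
    return (b << 16) | a

def roll(a, b, old_byte, new_byte, blocksize):
    """Update the (a, b) halves when the window slides right one byte."""
    a = (a - old_byte + new_byte) & 0xFFFF
    b = (b - blocksize * old_byte + a) & 0xFFFF
    return a, b
```

The client slides this over its local data; only positions whose weak
sum matches an entry in the .zsync file get checked against the strong
checksum, which is what makes scanning the whole local file affordable.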
> Anyway, just because you get a different file, that doesn't mean
> it'll act differently; so we could just use an "authentication"
> mechanism that reflects that. That might involve providing sizes and
> sha1s of the uncompressed contents of the ar in the packages file,
> instead of the md5sum of the ar.

Authenticating uncompressed content is a good design choice anyway.
Makes it easier, for instance, to add gpg signatures inside the ar
file without invalidating existing checksum authentication.
Conceptually, authenticating content based on a container which is
essentially nondeterministic is a bit like refusing to authenticate a
person because he or she is wearing different clothes from the ones
noted in the auth database.
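To put the clothes-vs-person point in concrete terms: the same payload
compressed twice can yield byte-different containers, so a checksum of
the container fails to match while a checksum of the decompressed
content still authenticates. A small sketch (mine; the mtime values are
only there to force the gzip headers to differ):

```python
import gzip
import hashlib

payload = b"member contents of data.tar, say"

# Two gzip "containers" of the identical payload; RFC 1952 stores an
# MTIME field in the header, so these differ byte-for-byte.
box_a = gzip.compress(payload, mtime=0)
box_b = gzip.compress(payload, mtime=1)

container_sums = {hashlib.md5(box_a).hexdigest(),
                  hashlib.md5(box_b).hexdigest()}
content_sums = {hashlib.sha1(gzip.decompress(box_a)).hexdigest(),
                hashlib.sha1(gzip.decompress(box_b)).hexdigest()}

assert len(container_sums) == 2  # the "clothes" differ
assert len(content_sums) == 1    # the "person" is the same
```

Which is exactly why shipping sizes and sha1s of the uncompressed
contents in the Packages file survives recompression, re-signing, and
other container-level churn.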