Re: New method for Packages/Sources file updates
Anthony Towns <email@example.com> writes:
> Goswin von Brederlow wrote:
>>>I think the whole approach is needlessly complicated. Could you
>>>read the idea covered in the thread starting at
>>>and explain what's wrong with it in your opinion?
>> + no performance difference for daily updates (+- a few bytes)
> That's not correct: the "ed diff" format only needs to transmit the
> new "Version:", "Filename:", size, and md5sum fields if they're the
> only things that change. The unchanged fields such as "Package:",
> "Description:" don't appear in the transmitted data.
> In any case, do the measurements, don't just randomly hypothesise.
Go back a few mails and read the statistics I posted covering a full
month of real world data. I compared those numbers with filesizes for
diffs provided by other people (not posted) and that is what I base my
claim on.
If someone can show me diffs covering the same time period that are
better, I'm open to it. Maybe the diff files I looked at weren't using
the best diff mode or something.
> (Or, hey, use the measurements that've already been done,
Those measurements seem to be purely fictional / theoretical.
Or do you truly believe there was an exactly 12KB diff for every day
over a period of 15 days? The mail seems to discuss the benefits and
overhead of the diff scheme, using a hypothetical 12KB diff size
to illustrate the point.
I would call that hypothesising.
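
For readers following the "ed diff" argument quoted above, here is a
minimal sketch of what such a diff contains when only a few fields of a
stanza change. It is an illustration only, not the scheme proposed in
either thread: the ed_script helper, the package "foo", and all field
values (including the placeholder checksums) are made up, and the script
is built with Python's difflib rather than with diff -e.

```python
import difflib


def ed_script(old_lines, new_lines):
    """Build an ed-style script (similar in spirit to `diff -e`)
    that turns old_lines into new_lines. Hypothetical helper."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    script = []
    # Emit commands from the bottom of the file upwards, so that the
    # line numbers of earlier commands stay valid while ed applies them.
    for tag, i1, i2, j1, j2 in reversed(matcher.get_opcodes()):
        if tag == "equal":
            continue  # unchanged lines are never transmitted
        if tag == "insert":
            script.append(f"{i1}a")  # append after old line i1
        else:
            rng = f"{i1 + 1}" if i2 - i1 == 1 else f"{i1 + 1},{i2}"
            script.append(rng + ("d" if tag == "delete" else "c"))
        if tag != "delete":
            script.extend(new_lines[j1:j2])
            script.append(".")  # terminates the inserted text
    return script


# Made-up stanza: only Version, Filename, Size and MD5sum differ.
old = [
    "Package: foo",
    "Version: 1.0-1",
    "Architecture: i386",
    "Filename: pool/main/f/foo/foo_1.0-1_i386.deb",
    "Size: 12345",
    "MD5sum: 00000000000000000000000000000000",
    "Description: an example package",
]
new = [
    "Package: foo",
    "Version: 1.0-2",
    "Architecture: i386",
    "Filename: pool/main/f/foo/foo_1.0-2_i386.deb",
    "Size: 12400",
    "MD5sum: 11111111111111111111111111111111",
    "Description: an example package",
]

script = ed_script(old, new)
print("\n".join(script))
print("script bytes:", sum(len(line) + 1 for line in script))
print("stanza bytes:", sum(len(line) + 1 for line in new))
```

The unchanged "Package:", "Architecture:" and "Description:" lines do not
appear in the script. Whether that saving matters for daily updates still
depends on how many stanzas change per day, which only measurements on
real archive data can settle.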