Re: New method for Packages/Sources file updates
Anthony Towns <aj@azure.humbug.org.au> writes:
> Goswin von Brederlow wrote:
>>>I think the whole approach is needlessly complicated. Could you
>>>read the idea covered in the thread starting at
>>>http://lists.debian.org/debian-devel/2004/07/msg00128.html
>>>and explain what's wrong with it in your opinion?
>> + no performance difference for daily updates (+- a few bytes)
>
> That's not correct: the "ed diff" format only needs to transmit the
> new "Version:", "Filename:", size, and md5sum fields if they're the
> only things that change. The unchanged fields such as "Package:",
> "Description:" don't appear in the transmitted data.
>
> In any case, do the measurements, don't just randomly hypothesise.
Go back a few mails and read the statistics I posted covering a full
month of real-world data. I compared those numbers with file sizes for
diffs provided by other people (not posted), and that is what I base my
opinion on.
If someone can show me diffs covering the same time period that are
better, I'm open to it. Maybe the diff files I looked at weren't using
the best diff mode or something.
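
To make the quoted claim about the "ed diff" format concrete, here is a
minimal sketch in Python. It builds an ed-style script (the kind of output
`diff -e` produces) for a single Packages stanza in which only the version,
filename, size and md5sum change. The stanza contents, the package name and
the helper ed_diff() are made up for illustration; this is not the tool
anyone in the thread used, and it says nothing about real archive-wide
numbers, which still have to be measured.

```python
import difflib


def ed_diff(old_lines, new_lines):
    """Return an ed-style script (list of lines) that turns old into new."""
    script = []
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    # ed scripts list hunks from the bottom up, so that applying one hunk
    # does not shift the line numbers used by the hunks that follow it.
    for tag, i1, i2, j1, j2 in reversed(matcher.get_opcodes()):
        if tag == "equal":
            continue  # unchanged lines never appear in the script
        if tag == "insert":
            script.append(f"{i1}a")  # append after old line i1 (1-based)
        else:
            span = f"{i1 + 1}" if i2 - i1 == 1 else f"{i1 + 1},{i2}"
            script.append(span + ("d" if tag == "delete" else "c"))
        if tag != "delete":
            script.extend(new_lines[j1:j2])
            script.append(".")  # terminates the inserted text
    return script


# Hypothetical stanza: all values below are invented for this example.
old = [
    "Package: hello",
    "Version: 2.1.1-3",
    "Architecture: i386",
    "Filename: pool/main/h/hello/hello_2.1.1-3_i386.deb",
    "Size: 41234",
    "MD5sum: 0123456789abcdef0123456789abcdef",
    "Description: The classic greeting, and a good example",
]
new = [
    "Package: hello",
    "Version: 2.1.1-4",
    "Architecture: i386",
    "Filename: pool/main/h/hello/hello_2.1.1-4_i386.deb",
    "Size: 41260",
    "MD5sum: fedcba9876543210fedcba9876543210",
    "Description: The classic greeting, and a good example",
]

script = ed_diff(old, new)
print("\n".join(script))
print("script bytes:", len("\n".join(script)),
      "full stanza bytes:", len("\n".join(new)))
```

The script carries only the four changed lines plus a few command lines;
"Package:", "Architecture:" and "Description:" are not transmitted. Whether
that adds up to a real saving over a month of archive changes is exactly
the part that needs measured data.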
> (Or, hey, use the measurements that've already been done,
> <http://lists.debian.org/debian-devel/2002/04/msg01076.html>)
Those measurements seem to be purely fictional / theoretical.
Or do you truly believe there was a diff of exactly 12KB for every day
over a period of 15 days? The mail seems to discuss the benefits and
overhead of the diff scheme, using a hypothetical 12K diff size
to illustrate the point.
I would call that hypothesising.
> Cheers,
> aj
MfG
Goswin