Re: New method for Packages/Sources file updates
Goswin von Brederlow wrote:
[snip]
> >> Following Seteves idea of marking removals in the Packages file itself
> >> I've come up with the following:
> >
> > I think the whole approach is needlessly complicated. Could you
> > read the idea covered in the thread starting at
> > http://lists.debian.org/debian-devel/2004/07/msg00128.html
> > and explain what's wrong with it in your opinion?
> >
> >
> > Thiemo
>
> + no performance difference for daily updates (+- a few bytes)
+ Full backward compatibility. Changing the Packages format is likely
to break some tools.
> - performance penalty for repeated patching of the same package
> (e.g. the zsh-beta upload every odd day)
>
> - compression penalty due to lots of small files instead of one big
> one from gzip, even worse with bzip2
>
> - performance penalty due to lots of small files instead of one big
> one from apt-method, forking gunzip, forking patch
Client-side performance is mostly irrelevant. Also, this particular
set of problems can be solved by using cumulative diffs instead of
several incremental ones.
> - multi pass method where a failure in any one of them is fatal
Failure isn't fatal, it just triggers fallback to the full
Packages file. Btw, this can also be solved by cumulative diffs.
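To make the fallback concrete, here is a minimal sketch of the client-side logic. The callables fetch_diff, apply_diff and fetch_full are hypothetical placeholders, not existing apt functions; the only point is that any failure on the diff path ends in a plain download of the full Packages file.

```python
from typing import Callable, Optional


def update_packages(
    current: Optional[bytes],
    fetch_diff: Callable[[], bytes],
    apply_diff: Callable[[bytes, bytes], bytes],
    fetch_full: Callable[[], bytes],
) -> bytes:
    """Try the diff path first; on any failure fall back to the full file.

    All three callables are hypothetical stand-ins for whatever the
    client actually uses to download and patch.
    """
    if current is not None:
        try:
            return apply_diff(current, fetch_diff())
        except Exception:
            # Missing diff, failed patch, checksum mismatch: none of
            # these is fatal, we simply take the slow path below.
            pass
    return fetch_full()
```

With a cumulative diff there is only one fetch and one patch step, so the try block above covers the whole diff path.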
> - timestamp on package is timezone/clock dependent but the index
> should protect that
You haven't read the complete thread. The timestamp problem can be
avoided.
> - extra space needed for the diff files
Which is minimal in comparison to the archive size.
> - limited update interval to stop the extra space from exploding,
> 2 weeks suggested
Rather a heuristic based on patch sizes << Packages size and the
number of update cycles. The absolute timespan isn't a good measure,
just think about the typical update cycles for unstable, stable and
security/stable.
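One possible reading of such a heuristic, as a rough sketch only: keep diffs, newest first, as long as their combined size stays well below the Packages size and the number of update cycles stays below a cap. The function name and both thresholds (max_ratio, max_cycles) are assumptions made up for illustration, not values proposed in the thread.

```python
from typing import Sequence


def diffs_to_keep(
    diff_sizes: Sequence[int],
    packages_size: int,
    max_ratio: float = 0.5,   # assumed threshold, not from the thread
    max_cycles: int = 30,     # assumed threshold, not from the thread
) -> int:
    """Return how many of the most recent diffs are worth keeping.

    diff_sizes is ordered newest first. Diffs are kept while their
    combined size stays below max_ratio * packages_size and the count
    stays below max_cycles; older diffs are dropped and clients that
    far behind fetch the full Packages file.
    """
    kept = 0
    total = 0
    for size in diff_sizes:
        if kept >= max_cycles or total + size > max_ratio * packages_size:
            break
        total += size
        kept += 1
    return kept
```

Because the cutoff depends on sizes and cycle counts rather than on wall-clock time, the same rule would cover very different timespans for unstable, stable and security/stable.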
> while my method can cover full releases for a few
> K extra
True. OTOH, covering many update cycles isn't that useful for typical
use.
> - new files that mirrors won't pick up for a long time,
> can only be used on mirrors that are reconfigured to mirror diffs too
I think full mirrors already cover the whole directory contents, so this
is only a problem for partial mirrors. It's also non-fatal, as it falls
back to the current method (failing index file accesses are the only
difference).
Reduced server load provides an incentive to those mirror admins to
change their scripts.
> - no benefit for rsync or zsync
True. OTOH, low-bandwidth users are unlikely to update their machines
via rsync.
> - not applicable (due to number of files) to archives with hourly
> updates (like amd64, and we might even do 15m updates to prevent
> Build-Depends stalls)
This suggests interested parties do frequent updates anyway. This
eventually allows shortening the timespan covered, which means the
number of files won't increase much.
> - probably unusable on snapshots.debian.net like archives with tons of
> Packages files due to too many tiny files
Which is a good thing, since archived Packages files aren't supposed
to get updated. :-)
> Need any more? :)
Yes.
Thiemo