Re: Please test gzip -9n - related to dpkg with multiarch support

On Thu, Feb 9, 2012 at 22:29, Guillem Jover <guillem@debian.org> wrote:
> On Thu, 2012-02-09 at 21:50:17 +0200, Riku Voipio wrote:
>> On Thu, Feb 09, 2012 at 03:34:28AM +0100, Guillem Jover wrote:
>> > Riku mentioned as an argument that this increases the data to download
>> > due to slightly bigger Packages files, but pdiffs were introduced
>> > exactly to fix that problem. And, as long as the packages do not get
>> > updated one should not get pdiff updates. And with the splitting of
>> > Description there's even less data to download now.
>> off-topic but often pdiffs don't really speed up apt-get update. Added
>> roundtrip time latency on pulling several small files slows down the
>> download unless you run update nightly.
> One of the reasons of this I think, is that the current pdiff
> implementation in apt is really not optimal, see #372712.

The real slowdown is that APT currently works on one pdiff at a time.
The solution for this is two-fold: First, get all the pdiffs needed. For Debian
this is easy as it's strictly sequential, but other archives can (and some
even do) use different paths, so we need a bit more metadata to support these,
too. Second, once we have all these pdiffs, we can merge them into one "big"
pdiff and apply that one. As we then walk over the > 25 MB files only once
and not once per patch, we should be quite a bit faster. The theory and even
Python code for the merge part can be found at [0]; it's just that the APT team
has been so overcrowded for years that we haven't yet decided who can pick
this one up [/irony].
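
The merge step described above can be sketched roughly as follows. This is
only an illustration of the idea, not APT's code and not the code from [0]:
it assumes a simplified, hypothetical patch model (a list of "copy" and "lit"
pieces describing the new file) instead of the real ed-style pdiff format,
and all function names are made up for the example.

```python
from functools import reduce

# Hypothetical patch model (NOT the real pdiff format). A patch is a list of
# pieces that describe the new file in order:
#   ("copy", start, end) -> keep lines old[start:end]
#   ("lit", [lines])     -> insert these literal lines


def piece_len(piece):
    """Number of output lines a single piece produces."""
    if piece[0] == "copy":
        return piece[2] - piece[1]
    return len(piece[1])


def slice_pieces(pieces, start, end):
    """Return the pieces that produce output lines [start, end) of a patch."""
    out = []
    pos = 0  # output line offset at which the current piece begins
    for piece in pieces:
        n = piece_len(piece)
        lo, hi = max(start, pos), min(end, pos + n)
        if lo < hi:
            a, b = lo - pos, hi - pos
            if piece[0] == "copy":
                out.append(("copy", piece[1] + a, piece[1] + b))
            else:
                out.append(("lit", piece[1][a:b]))
        pos += n
    return out


def merge(first, second):
    """Compose two patches; the result equals applying first, then second."""
    merged = []
    for piece in second:
        if piece[0] == "copy":
            # A copy from the intermediate file is rewritten in terms of
            # the original file by slicing the first patch.
            merged.extend(slice_pieces(first, piece[1], piece[2]))
        else:
            merged.append(piece)
    return merged


def merge_all(patches):
    """Merge a sequential chain of patches into one "big" patch."""
    return reduce(merge, patches)


def apply_patch(pieces, old):
    """Apply a patch to the old file (a list of lines) in a single pass."""
    new = []
    for piece in pieces:
        if piece[0] == "copy":
            new.extend(old[piece[1]:piece[2]])
        else:
            new.extend(piece[1])
    return new
```

With this model the large file is read once by apply_patch(merge_all(patches),
old), while the merging itself only touches the (small) patches.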

If someone wants to work on that, feel free to drop a line to deity@l.d.o
(and to Anthony) and I will try to help if time permits.

[0] http://lists.debian.org/deity/2009/08/msg00169.html

>> But the more interesting slowdown is that the number of packages in general
>> slows down apt operations at a rate that is around O(dependencies^2) (pure guess,
>> perhaps someone has better knowledge?).

My question would be why you are guessing O(d^2) for a situation which
should intuitively be an O(d*2). My empirical testing seems to support this,
given that the runtime roughly doubles (a bit less, in fact).
(We have fewer than twice the packages as we have arch:all packages, but a bit
 more than twice the deps given that we have new implicit ones for multiarch.)
But as a team member and the implementer of multiarch in APT I might be a bit
biased here… ;)
(Note though that the numbers/timings are based on experimental; sid currently
 has a slightly different implementation, but it shouldn't be that bad either.)
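
To make the comparison concrete: if the cost grew as dependencies^2, doubling
the dependencies would roughly quadruple the runtime, whereas linear growth
only doubles it. The following is a small sketch of that arithmetic; the
function names are made up for illustration and no measured numbers are
implied.

```python
import math


def predicted_slowdown(dep_growth, exponent):
    """Runtime ratio if the cost scales as dependencies**exponent."""
    return dep_growth ** exponent


def estimated_exponent(runtime_ratio, dep_growth):
    """Exponent k such that dep_growth**k == runtime_ratio."""
    return math.log(runtime_ratio) / math.log(dep_growth)


linear = predicted_slowdown(2.0, 1)     # doubling deps, linear cost
quadratic = predicted_slowdown(2.0, 2)  # doubling deps, quadratic cost
```

A runtime that "roughly doubles" when the dependencies double corresponds to
an exponent close to 1, not 2.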

>> We do remember apt-get slowing down
>> to crawl on maemo platforms with much smaller repositories..

As an owner of an N810 I don't remember that, but I might just be used to
pain, given that I managed to bootstrap Debian with a recent (partly working)
kernel on it (the gentoo/openwrt have details on that if someone is interested).

So if you can go into detail about what exactly you remember, we might be able
to work on it. Until then, my only comment on adding more packages:
"What could possibly go wrong?" ;)

If APT survives i386 packages in amd64, it might survive some new ones, too.

Best regards

David Kalnischkies