
Bug#761869: marked as done (debsources: "update statistics" stage is too slow)



Your message dated Wed, 19 Aug 2015 21:32:38 +0200
with message-id <20150819193238.GA21398@upsilon.cc>
and subject line Re: Bug#761869: debsources: "update statistics" stage is too slow
has caused the Debian Bug report #761869,
regarding debsources: "update statistics" stage is too slow
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
761869: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=761869
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: qa.debian.org
Severity: normal
User: qa.debian.org@packages.debian.org
Usertags: debsources

The "update statistics" stage of the Debsources updater is currently too slow,
taking ~12 minutes on the current sources.d.n machine.

It could easily be optimized by no longer redoing the queries for each live
suite (currently: 9), where each query does a sequential scan (due to count(*),
despite "index mostly scan") over the same data.  Instead, we can use GROUP BY
queries, collecting stats for all suites at once.
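
To illustrate the idea, here is a minimal sketch in Python using an in-memory
SQLite database. It is not the Debsources code or its benchmark: the table
name source_files, its columns, the sample rows and the function names are all
hypothetical, and SQLite stands in for the production database. It only
contrasts one count(*) query per suite with a single GROUP BY query.

```python
import sqlite3


def make_db():
    """Build a toy database. The schema is hypothetical, not Debsources' own."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE source_files (suite TEXT, package TEXT, path TEXT)"
    )
    rows = [
        ("jessie", "hello", "hello.c"),
        ("jessie", "hello", "Makefile"),
        ("sid", "hello", "hello.c"),
        ("sid", "zsh", "main.c"),
        ("sid", "zsh", "zle.c"),
    ]
    conn.executemany("INSERT INTO source_files VALUES (?, ?, ?)", rows)
    return conn


def stats_per_suite(conn, suites):
    """Slow pattern: one count(*) query, hence one pass over the data, per suite."""
    stats = {}
    for suite in suites:
        (count,) = conn.execute(
            "SELECT count(*) FROM source_files WHERE suite = ?", (suite,)
        ).fetchone()
        stats[suite] = count
    return stats


def stats_grouped(conn):
    """Proposed pattern: a single GROUP BY query returns counts for all suites."""
    query = "SELECT suite, count(*) FROM source_files GROUP BY suite"
    return dict(conn.execute(query))


if __name__ == "__main__":
    conn = make_db()
    print(stats_per_suite(conn, ["jessie", "sid"]))
    print(stats_grouped(conn))
```

One difference to keep in mind: the GROUP BY variant returns no row for a suite
that has no matching data, whereas the per-suite loop reports it with a count
of 0, so a caller may need to fill in the missing suites.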

(See proof of concept and benchmarks available in
doc/update-stats-query.bench.sql)


-- System Information:
Debian Release: jessie/sid
  APT prefers testing
  APT policy: (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.14-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=it_IT.UTF-8, LC_CTYPE=it_IT.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

--- End Message ---
--- Begin Message ---
On Tue, Sep 16, 2014 at 02:38:40PM +0200, Stefano Zacchiroli wrote:
> It could easily be optimized by no longer redoing the queries for each live
> suite (currently: 9), where each query does a sequential scan (due to count(*),
> despite "index mostly scan") over the same data.  Instead, we can use GROUP BY
> queries, collecting stats for all suites at once.

This is now done, in commit 18facdfcec8de1ea2f5784b6141796b82d993159,
thanks to Orestis' work.

Cheers.
-- 
Stefano Zacchiroli  . . . . . . .  zack@upsilon.cc . . . . o . . . o . o
Maître de conférences . . . . . http://upsilon.cc/zack . . . o . . . o o
Former Debian Project Leader . . . . . @zacchiro . . . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »

--- End Message ---
