
Re: Fun with the BTS and an SQL DB



On 04/05/07 at 18:24 +0200, Filippo Giunchedi wrote:
> just out of curiosity: have you already thought how to size the
> threshold n ?

Not yet. The algorithm will probably be something like this:
- get a list of interesting packages and maintainers
- add the missing ones to the "permanent" list to monitor
- for each entry in the "permanent" list of stuff to monitor, get
  the stats and store them.
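
As a rough illustration of the three steps above, here is a minimal Python
sketch. Everything in it is an assumption made for the example: the function
name daily_update, the in-memory set and dict, and the reading of
"interesting" as "more than n open bugs". It is not the actual
implementation.

```python
def daily_update(permanent, bug_counts, n):
    """Run one daily pass and return the rows to store.

    permanent  -- set of entries (packages or maintainers) already monitored
    bug_counts -- dict mapping each entry to its current open-bug count
    n          -- threshold: an entry is "interesting" above n open bugs
    """
    # Step 1: get the list of currently interesting entries.
    interesting = {entry for entry, count in bug_counts.items() if count > n}
    # Step 2: add the missing ones to the permanent list (modified in place).
    permanent |= interesting
    # Step 3: record stats for everything in the permanent list, including
    # entries that have since dropped back to the threshold or below.
    return [(entry, bug_counts.get(entry, 0)) for entry in sorted(permanent)]


# Hypothetical data: "foo" was added on an earlier day, "bar" is new today.
permanent = {"foo"}
rows = daily_update(permanent, {"foo": 3, "bar": 25, "baz": 7}, n=20)
print(rows)  # [('bar', 25), ('foo', 3)]
```

Once an entry is in the permanent list it keeps being recorded, so its
history has no holes even when its bug count later falls below n.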

The idea is to avoid wasting space on uninteresting data. But don't
forget that if n = 10 and you maintain lots of packages with 10 bugs
each (not higher than n), then *your* stats will still be recorded, but
your packages' won't.
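
A small worked example of that case, in Python. The package names and counts
are invented, and the "strictly more than n" comparison is an assumption
taken from the wording above.

```python
N = 10

# Hypothetical maintainer with three packages at exactly 10 open bugs each.
packages = {"pkg-a": 10, "pkg-b": 10, "pkg-c": 10}

# No single package exceeds the threshold, so none of them is recorded...
recorded_packages = [name for name, bugs in packages.items() if bugs > N]

# ...but the maintainer's total (30) does, so the maintainer is recorded.
maintainer_total = sum(packages.values())
maintainer_recorded = maintainer_total > N

print(recorded_packages)    # []
print(maintainer_recorded)  # True
```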


On 04/05/07 at 18:32 +0200, Filippo Giunchedi wrote:
> just a few more comments on bugs_bugs:
> 
> can it be useful to store usertags?
> 
> what about added tags? I guess this would need to change the schema accordingly,
> and given that it doesn't happen very often it isn't a concern right now

Actually, adding/removing tags is easy. What I need to get right is
which data I want to store over time. But the "snapshot" of the current
BTS can evolve without breaking anything.

usertags are stored in a separate DB (see
merkel:/org/bugs.debian.org/spool/user/). It would be quite easy to add
a different table to store them, but it would be a list of (user,
usertag, bug) tuples. Since I don't see much use in storing them yet,
I'll leave that for later, but if you have a specific idea in mind, I
can change that :-)
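
For illustration only, such a table of (user, usertag, bug) tuples could look
like the sketch below. It uses Python's sqlite3 module with an in-memory
database purely so that the example is self-contained; the table name
bugs_usertags, the column types, and the sample rows are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per (user, usertag, bug) tuple. "user" is double-quoted because
# it is a reserved word in some SQL dialects.
conn.execute(
    """
    CREATE TABLE bugs_usertags (
        "user"  TEXT    NOT NULL,
        usertag TEXT    NOT NULL,
        bug     INTEGER NOT NULL,
        PRIMARY KEY ("user", usertag, bug)
    )
    """
)

# Invented sample data.
conn.executemany(
    "INSERT INTO bugs_usertags VALUES (?, ?, ?)",
    [
        ("someone@example.org", "example-tag", 100002),
        ("someone@example.org", "example-tag", 100001),
        ("other@example.org", "other-tag", 100001),
    ],
)

# All bugs carrying a given usertag for a given user.
bugs = [
    row[0]
    for row in conn.execute(
        'SELECT bug FROM bugs_usertags WHERE "user" = ? AND usertag = ? ORDER BY bug',
        ("someone@example.org", "example-tag"),
    )
]
print(bugs)  # [100001, 100002]
```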

Some quick results:
n  | packages with more than n bugs open in unstable
50 | 165
40 | 227
30 | 332
20 | 495
15 | 695
10 | 1043
5  | 1944
0  | 6783

n   | Emails with more than n bugs open in unstable (excluding co-maintained packages)
100 | 121
50  | 269
25  | 421
20  | 494
15  | 601
10  | 751
5   | 996

n   | Emails with more than n bugs open in unstable (INCLUDING co-maint)
25  | 650
20  | 726
15  | 848
10  | 1010
5   | 1261

So it would probably be reasonable to use n=20 for packages and n=10
for both the maintainer and the co-maintainer stats, which would
generate about 2300 rows per day (and we can expire the old ones quite
quickly).
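
The estimate can be checked against the tables above with a few lines of
Python. The sketch assumes that the "maintainer" and "co-maintainer" stats
are stored as two separate series, each with its own row per email; that
reading is an assumption, though it does add up to roughly the quoted figure.

```python
# Counts copied from the three tables above, keyed by threshold n.
packages_over = {50: 165, 40: 227, 30: 332, 20: 495,
                 15: 695, 10: 1043, 5: 1944, 0: 6783}
maint_over = {100: 121, 50: 269, 25: 421, 20: 494,
              15: 601, 10: 751, 5: 996}
comaint_over = {25: 650, 20: 726, 15: 848, 10: 1010, 5: 1261}


def rows_per_day(n_pkg, n_maint):
    """Rows stored per day: one per package, maintainer and co-maintainer entry."""
    return packages_over[n_pkg] + maint_over[n_maint] + comaint_over[n_maint]


print(rows_per_day(20, 10))  # 2256
```

That is 495 + 751 + 1010 = 2256 rows, i.e. about 2300 per day.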
-- 
| Lucas Nussbaum
| lucas@lucas-nussbaum.net   http://www.lucas-nussbaum.net/ |
| jabber: lucas@nussbaum.fr             GPG: 1024D/023B3F4F |


