[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: Bug#1060201: qa.debian.org: [udd] carnivore_emails is lacking lots of entries



Control: usertag -1 udd

Am Sun, Jan 07, 2024 at 01:38:35PM +0100 schrieb Andreas Tille:
> Package: qa.debian.org
> Severity: normal
> 
> Hi,
> 
> I tried to analyse closed bugs using done_email via carnivore_emails but realised
> that this table is lacking lots of entries where I could easily add several from
> my own memory:
> 
> SELECT done_email, COUNT(*) FROM (
> SELECT done_email FROM archived_bugs WHERE id IN (SELECT id FROM (SELECT ab.id, ce.id AS ce_id
>           FROM archived_bugs ab
>           LEFT JOIN carnivore_emails ce ON ce.email = ab.done_email
> ) noid WHERE ce_id IS NULL ) AND done_email NOT IN ('ftpmaster@ftp-master.debian.org','noreply@salsa.debian.org','unknown')
> ) miss GROUP BY done_email
> ORDER BY count DESC
> ;
> 
>                        done_email                       | count 
> --------------------------------------------------------+-------
>  gothicx@sapo.pt                                        |  5221
>  bartm@quantz.debian.org                                |  2665
>  doko@cs.tu-berlin.de                                   |  2555
>  kitame@northeye.org                                    |  2371
>  md@Linux.IT                                            |  2056
>  herbert@gondor.apana.org.au                            |  1900
>  damog@merkel.debian.org                                |  1788
>  daniel.baumann@progress-technologies.net               |  1393
>  max@stro.at                                            |  1327
>  bunk@fs.tum.de                                         |  1278
>  debian-bts@adam-barratt.org.uk                         |  1155
>  ccheney@cheney.cx                                      |  1031
>  sramacher@respighi.debian.org                          |   992
> ...
>  zweistein12@gmx.de                                     |     1
> (9075 rows)
> 
> I wonder how the carnivore_* tables are filled and whether you want me
> to draft some INSERT statements filling up the most relevant emails
> where I would volunteer to sort the according IDs.
> 
> Kind regards
>    Andreas.

BTW, its probably pretty easy to resolve >900 of these missing e-mails:

CREATE TEMPORARY TABLE missing_in_carnivore_emails AS
SELECT done_email, COUNT(*) FROM (
SELECT done_email FROM archived_bugs WHERE id IN (SELECT id FROM (SELECT ab.id, ce.id AS ce_id
          FROM archived_bugs ab
          LEFT JOIN carnivore_emails ce ON ce.email = ab.done_email
) noid WHERE ce_id IS NULL ) AND done_email NOT IN ('ftpmaster@ftp-master.debian.org','noreply@salsa.debian.org','unknown')
) miss GROUP BY done_email
ORDER BY count DESC
;

SELECT DISTINCT done_name, done_email, cn.id FROM
  (SELECT BTRIM(done_name, '"') AS done_name, done_email FROM archived_bugs) ab
  LEFT JOIN carnivore_names cn ON cn.name = ab.done_name
  WHERE done_email in (SELECT done_email FROM missing_in_carnivore_emails WHERE count > 10)
    AND done_name IS NOT NULL AND done_name != ''
    AND id IS NOT null
;

           done_name            |                   done_email                    |  id  
---------------------------------+-------------------------------------------------+------
 Camm Maguire                    | camm@enhanced.com                               | 6158
 Ross Vandegrift                 | ross@kallisti.us                                |  734
 Michael Ablassmeier             | abi@grinser.de                                  | 2751
 Neil McGovern                   | maulkin@halon.org.uk                            | 3708
 Torsten Landschoff              | torsten@pclab.ifg.uni-kiel.de                   | 6320
 Agney Lopes Roth Ferraz         | agney@users.sourceforge.net                     | 4000
 Galen Hazelwood                 | galenh@micron.net                               | 1241
 Anand Kumria                    | wildfire@progsoc.org                            | 4175
 Adam Rogoyski                   | rogoyski@cs.utexas.edu                          | 1102
 Christophe Barbe                | christophe.barbe@ufies.org                      | 2054
 Yann Dirson                     | ydirson@fr.alcove.com                           | 5804
 Arjan Oosting                   | arjanoosting@home.nl                            | 5366
 Julian Gilbey                   | J.D.Gilbey@qmw.ac.uk                            | 3875
 Norman Jordan                   | njordan@shaw.ca                                 | 3513
 Michael Piefel                  | piefel@informatik.hu-berlin.de                  | 1111
 Frederic Lepied                 | Lepied@debian.org                               | 2460
...
 Neil Williams                   | linux@codehelp.co.uk                            | 1552
 Christopher Martin              | chrsmrtn@freeshell.org                          | 2754
 Andrew Lenharth                 | adl@cs.washington.edu                           | 3085
(922 rows)


This statement could be easily turned into injects and would be a first approach to enhance
the carnivore_emails table with more ids.

If you give some green light I could create such a statement and maybe more enhancements
by looking into more tables.

Kind regards
     Andreas.

-- 
http://fam-tille.de


Reply to: