
Bug#1064913: marked as done (UDD/upload_history: strange values in the 'distribution' column)



Your message dated Thu, 29 Feb 2024 17:49:42 +0100
with message-id <ZeC1piqZajS-rxNu@grub.nussbaum.fr>
and subject line Re: Bug#1064913: UDD/upload_history: strange values in the 'distribution' column
has caused the Debian Bug report #1064913,
regarding UDD/upload_history: strange values in the 'distribution' column
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
1064913: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1064913
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: qa.debian.org
User: qa.debian.org@packages.debian.org
Usertags: udd

The email parsing that is at the core of the upload_history table
generates some erroneous values.

----- Forwarded message from Patrice Duroux <patrice.duroux@gmail.com> -----

From: Patrice Duroux <patrice.duroux@gmail.com>
To: debian-qa <debian-qa@lists.debian.org>
Date: Wed, 14 Feb 2024 19:36:24 +0100
Subject: remark about the UDD upload_history table
Message-ID: <CAGKjw9LQN5_6KqgE+k-8Y096qhwDdFF8tJsfTtxAyxOUaCVWpw@mail.gmail.com>
X-Mailing-List: <debian-qa@lists.debian.org> archive/latest/25424
List-Id: <debian-qa.lists.debian.org>

Hi,

I am a bit surprised by the following output:

udd=> select distinct distribution from upload_history;
          distribution
---------------------------------
 experimental
 froxzen unstable
 froze unstable
 frozen
 frozen  unstable
 frozen unstable
 frozen unstable contrib
 frozen woody
 frozen-contrib contrib
 non-free
 rc-buggy
 sid
 sid=20
 stable
 stable frozen unstable
 stable unstable
 stable-security
 testing
 testing unstable
 testing-security
 unstable
 unstable  frozen
 unstable contrib
 unstable frozen
 unstable non-free
 unstable stable
 unstable stable frozen
 unstable testing
 unstable testing stable
 unstable unstable
 unstable=20
 woody-proposed-updates unstable
(32 rows)


This makes me unsure whether other SQL queries that rely on this
column's values return consistent results.

Regards,
Patrice



----- End forwarded message -----

--- End Message ---
--- Begin Message ---
On 27/02/24 at 18:19 +0100, Lucas Nussbaum wrote:
> Package: qa.debian.org
> User: qa.debian.org@packages.debian.org
> Usertags: udd
> 
> The email parsing that is at the core of the upload_history table
> generates some erroneous values.
> 
> ----- Forwarded message from Patrice Duroux <patrice.duroux@gmail.com> -----
> 
> From: Patrice Duroux <patrice.duroux@gmail.com>
> To: debian-qa <debian-qa@lists.debian.org>
> Date: Wed, 14 Feb 2024 19:36:24 +0100
> Subject: remark about the UDD upload_history table
> Message-ID: <CAGKjw9LQN5_6KqgE+k-8Y096qhwDdFF8tJsfTtxAyxOUaCVWpw@mail.gmail.com>
> X-Mailing-List: <debian-qa@lists.debian.org> archive/latest/25424
> List-Id: <debian-qa.lists.debian.org>
> 
> Hi,
> 
> I am a bit surprised by the following output:
> 
> udd=> select distinct distribution from upload_history;
>           distribution
> ---------------------------------
>  experimental
>  froxzen unstable
>  froze unstable
>  frozen
>  frozen  unstable
>  frozen unstable
>  frozen unstable contrib
>  frozen woody
>  frozen-contrib contrib
>  non-free
>  rc-buggy
>  sid
>  sid=20
>  stable
>  stable frozen unstable
>  stable unstable
>  stable-security
>  testing
>  testing unstable
>  testing-security
>  unstable
>  unstable  frozen
>  unstable contrib
>  unstable frozen
>  unstable non-free
>  unstable stable
>  unstable stable frozen
>  unstable testing
>  unstable testing stable
>  unstable unstable
>  unstable=20
>  woody-proposed-updates unstable
> (32 rows)
> 
> 
> This makes me unsure whether other SQL queries that rely on this
> column's values return consistent results.

Hi,

The values ending in "=20" were caused by email decoding (=20 is a
quoted-printable encoded space). They are fixed by changes to the code,
followed by reprocessing all emails.
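
For illustration, the =20 artefact can be reproduced and removed with
Python's standard quopri module. This is only a sketch: the helper name
decode_distribution is hypothetical and is not taken from the UDD code,
whose actual fix is not shown in this thread.

```python
import quopri


def decode_distribution(raw: str) -> str:
    """Decode a quoted-printable field value and trim surrounding spaces.

    Hypothetical helper for illustration; not the UDD implementation.
    """
    # "=20" is the quoted-printable escape for a space character.
    decoded = quopri.decodestring(raw.encode("ascii")).decode("ascii")
    # The decoded space is trailing whitespace, so strip it.
    return decoded.strip()


print(decode_distribution("unstable=20"))
print(decode_distribution("sid=20"))
```

Values without any escape sequence pass through unchanged.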

The remaining strange values are values that were probably valid at some
point, but haven't been for a long time:

udd=> select distribution, count(*), max(date) from upload_history group by distribution order by 2;
          distribution           | count  |          max           
---------------------------------+--------+------------------------
 unstable non-free               |      1 | 1999-05-18 06:32:53+00
 froxzen unstable                |      1 | 1999-01-08 01:03:06+00
 frozen-contrib contrib          |      1 | 1999-01-17 18:55:29+00
 frozen  unstable                |      1 | 2000-02-02 15:19:52+00
 frozen unstable contrib         |      1 | 1998-12-03 19:06:21+00
 frozen woody                    |      1 | 2000-02-10 14:12:53+00
 froze unstable                  |      1 | 1999-01-05 00:31:44+00
 unstable testing stable         |      1 | 2004-01-07 08:32:00+00
 unstable contrib                |      1 | 1999-05-18 06:46:28+00
 unstable  frozen                |      2 | 1998-11-16 20:05:57+00
 stable                          |      2 | 1998-09-05 23:11:05+00
 unstable stable frozen          |      2 | 2000-02-10 09:53:03+00
 testing-security                |      2 | 2005-05-25 15:32:08+00
 testing                         |      3 | 2001-07-08 18:55:03+00
 unstable stable                 |      3 | 2001-02-06 19:55:56+00
 non-free                        |      6 | 2000-01-24 19:55:07+00
 stable frozen unstable          |      8 | 2000-01-28 08:50:34+00
 unstable unstable               |      9 | 2001-01-18 20:02:09+00
 woody-proposed-updates unstable |     10 | 2002-05-10 20:04:56+00
 stable-security                 |     11 | 2005-09-29 07:32:05+00
 unstable testing                |     18 | 2002-03-15 16:47:32+00
 testing unstable                |     52 | 2002-04-23 17:17:27+00
 rc-buggy                        |     54 | 2024-02-10 03:52:33+00
 stable unstable                 |    203 | 2003-03-25 06:30:16+00
 unstable frozen                 |    289 | 2002-04-10 14:47:17+00
 frozen                          |    370 | 2002-04-26 00:47:16+00
 frozen unstable                 |   3042 | 2002-05-14 14:17:22+00
 sid                             |   4800 | 2024-02-27 19:54:32+00
 experimental                    |  82062 | 2024-02-29 10:50:32+00
 unstable                        | 774360 | 2024-02-29 12:37:00+00
(30 rows)
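
Anyone who needs to query the historical rows consistently could
normalise the column on the client side. A minimal sketch, assuming the
values are whitespace-separated lists of distribution names;
normalize_distribution is a hypothetical helper and not part of UDD:

```python
from __future__ import annotations


def normalize_distribution(value: str) -> tuple[str, ...]:
    """Split a raw 'distribution' value into unique tokens, keeping order.

    Hypothetical helper for illustration; not part of UDD.
    """
    tokens: list[str] = []
    # str.split() with no argument collapses runs of whitespace, so
    # "frozen  unstable" and "frozen unstable" yield the same tokens.
    for token in value.split():
        if token not in tokens:  # drop repeats such as "unstable unstable"
            tokens.append(token)
    return tuple(tokens)


print(normalize_distribution("frozen  unstable"))
print(normalize_distribution("unstable unstable"))
```

This does not correct misspelled values such as "froxzen"; those would
still need an explicit mapping.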

If we restrict to uploads made after 2005, the picture is much cleaner:

udd=> select distribution, count(*), max(date) from upload_history where date > '2006-01-01' group by distribution order by 2;
 distribution | count  |          max           
--------------+--------+------------------------
 rc-buggy     |     54 | 2024-02-10 03:52:33+00
 sid          |   4800 | 2024-02-27 19:54:32+00
 experimental |  78144 | 2024-02-29 10:50:32+00
 unstable     | 618028 | 2024-02-29 12:37:00+00
(4 rows)

Lucas

--- End Message ---
