
Bug#966649: marked as done (UDD: 'upload_history' importer broken; needs porting to Python3)



Your message dated Thu, 27 Aug 2020 14:52:10 +0200
with message-id <20200827125210.GA2595@xanadu.blop.info>
and subject line Re: Bug#966649: Unfortunately there are several Uploads missing (Was: upload_history is back)
has caused the Debian Bug report #966649,
regarding UDD: 'upload_history' importer broken; needs porting to Python3
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
966649: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=966649
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: qa.debian.org
User: qa.debian.org@packages.debian.org
Usertags: udd

Hi,

The upload_history importer works as follows:

1) /srv/udd.debian.org/email-archives/debian-devel-changes/ contains a copy
of the email archives, copied manually from master.debian.org. The
latest emails are received directly on ullmann (in
/srv/udd.debian.org/email-archives/debian-devel-changes/debian-devel-changes.current).
This part is mostly OK. It would be better if DSA provided a way to
access those archives from ullmann without having to copy them from time
to time.

2) When started, the importer first runs 'make' in /srv/udd.debian.org/upload-history/. This:
2.1) updates local copies of keyrings
2.2) using 'munge_ddc.py', converts email archives into summarized versions, stored as, e.g.:
/srv/udd.debian.org/upload-history/debian-devel-changes.201209.gz.out

3) Then the importer reads the *.out files and imports them into PostgreSQL.

'munge_ddc.py' has the following issues:
- it's not version-controlled
- it doesn't support xz email archives, so it's broken for recent
  archives
- it's python2 (but the lzma module is python3-only)
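
For whoever picks this up, here is a rough Python 3 sketch of how the
archives could be opened regardless of compression. It is not taken from
munge_ddc.py: the helper name open_archive and the dispatch on file
suffix are assumptions, as is treating any other file (such as
debian-devel-changes.current) as uncompressed.

```python
import gzip
import lzma
from pathlib import Path


def open_archive(path):
    """Open a mail archive for binary reading, whatever its compression.

    Hypothetical helper: the name and the suffix-based dispatch are
    assumptions, not code taken from munge_ddc.py.
    """
    suffix = Path(path).suffix
    if suffix == ".xz":
        # lzma is in the Python 3 standard library
        return lzma.open(path, "rb")
    if suffix == ".gz":
        return gzip.open(path, "rb")
    # Anything else is assumed to be an uncompressed file
    return open(path, "rb")
```

All three branches return a binary file object, so the parsing code
could stay the same for old .gz archives and recent .xz ones.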

Help would be welcome to port it to python3 and resolve the other
issues. Also, the data files around the upload_history gatherer should
probably be reorganized with a cleaner separation between code (that
should be versioned in UDD) and data.

Lucas

--- End Message ---
--- Begin Message ---
Hi,

On 26/08/20 at 09:13 -0300, Eriberto Mota wrote:
> On Wed, 26 Aug 2020 at 06:45, Andreas Tille <tille@debian.org> wrote:
> >
> > Control: reopen -1
> >
> > Hi Asheesh,
> >
> > On Tue, Aug 25, 2020 at 10:52:56PM -0700, Asheesh Laroia wrote:
> > > Test yourself with e.g. this command (which queries the public UDD mirror,
> > > but you can use the real UDD if you can connect to ullmann.debian.org)!
> >
> > So the bad news is that there is something wrong with the importer.
> 
> Please, check my UDD[1] that lists all sources in Debian Sid (see the
> 'Upload Sid' column).
> 
> [1] https://people.debian.org/~eriberto/udd/help_a_package.html
> 
> Currently there are 31,371 sources in Sid and 19,353 have no upload date.
> 
> Thanks a lot for your efforts.

I think I messed up the initial import. I re-did it, and then re-did a
normal import, and everything seems to be fine.

To check, use:
select source,version from sources where release='sid' and extra_source_only is null except select source,version from upload_history order by source;

This gives 31 packages with no known uploads. I dug into a couple, and this
seems to be caused by mails missing from the archive. For example,
kerneltop 0.91-2 was uploaded in 04/2014, but is not in the corresponding
email archive.
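
To see what the check query does without access to UDD, the same
set-difference logic can be tried on toy data. This is only a sketch: it
uses an in-memory SQLite database instead of PostgreSQL, the two tables
are reduced to the columns the query touches (an assumed, simplified
schema), and the rows are made up except for the kerneltop example
mentioned above.

```python
import sqlite3

# In-memory stand-in for UDD; schema reduced to the columns the query uses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sources (
        source TEXT, version TEXT, release TEXT, extra_source_only BOOLEAN
    );
    CREATE TABLE upload_history (source TEXT, version TEXT);

    -- Made-up rows, apart from kerneltop 0.91-2.
    INSERT INTO sources VALUES
        ('example-pkg', '1.0-1', 'sid', NULL),
        ('kerneltop', '0.91-2', 'sid', NULL),
        ('example-pkg', '0.9-1', 'buster', NULL);
    INSERT INTO upload_history VALUES
        ('example-pkg', '1.0-1'),
        ('example-pkg', '0.9-1');
""")

# Sid sources minus everything that has a recorded upload.
missing = conn.execute("""
    SELECT source, version FROM sources
    WHERE release = 'sid' AND extra_source_only IS NULL
    EXCEPT
    SELECT source, version FROM upload_history
    ORDER BY source
""").fetchall()

print(missing)
```

With these toy rows, only the kerneltop entry is left over, because it
is the only sid source with no matching row in upload_history.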

I'm closing this bug. Please reopen it if the issue reappears, of course.

Lucas

--- End Message ---
