Your message dated Tue, 25 Aug 2020 22:52:56 -0700 with message-id <CAMumaChEOPvfjTO54pVnig432Sbr1RMgAs3A+xaN+=3fbq2Q=Q@mail.gmail.com> and subject line upload_history is back has caused the Debian Bug report #966649, regarding UDD: 'upload_history' importer broken; needs porting to Python3 to be marked as done.

This means that you claim that the problem has been dealt with. If this is not the case it is now your responsibility to reopen the Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this message is talking about, this may indicate a serious mail system misconfiguration somewhere. Please contact owner@bugs.debian.org immediately.)

--
966649: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=966649
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
- To: submit@bugs.debian.org
- Subject: UDD: 'upload_history' importer broken; needs porting to Python3
- From: Lucas Nussbaum <lucas@debian.org>
- Date: Sat, 1 Aug 2020 08:48:02 +0200
- Message-id: <20200801064802.GA9103@xanadu.blop.info>
Package: qa.debian.org
User: qa.debian.org@packages.debian.org
Usertags: udd

Hi,

The upload_history importer works as follows:

1) /srv/udd.debian.org/email-archives/debian-devel-changes/ contains a copy of the email archives, copied manually from master.debian.org. The latest emails are received directly on ullmann, to /srv/udd.debian.org/email-archives/debian-devel-changes/debian-devel-changes.current
This part is about OK. It would be better if DSA provided a way to access those archives from ullmann without having to copy them from time to time.

2) When started, the importer first runs 'make' in /srv/udd.debian.org/upload-history/. This:
2.1) updates local copies of keyrings
2.2) using 'munge_ddc.py', converts email archives into summarized versions, stored as, e.g.: /srv/udd.debian.org/upload-history/debian-devel-changes.201209.gz.out

3) then the importer reads *.out and imports them into postgres.

'munge_ddc.py' has the following issues:
- it's not version-controlled
- it doesn't support xz email archives, so it's broken for recent archives
- it's python2 (but the lzma module is python3-only)

Help would be welcomed to port it to python3 and resolve the other issues.

Also, the data files around the upload_history gatherer should probably be reorganized with a cleaner separation between code (that should be versioned in UDD) and data.

Lucas
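
To illustrate the xz point above, here is a minimal Python 3 sketch of reading an mbox archive whether it is gzip-compressed, xz-compressed, or plain. It is not the code of 'munge_ddc.py' or of the eventual fix: the names open_archive and iter_messages are hypothetical, and the sketch assumes the archives are ordinary mbox files in which each message starts with a "From " separator line and body lines beginning with "From " have been escaped.

```python
import email
import gzip
import lzma
from pathlib import Path

# Map a file suffix to the function that opens it.
# Any other suffix (e.g. ".current") is read as an uncompressed mbox.
_OPENERS = {".gz": gzip.open, ".xz": lzma.open}


def open_archive(path):
    """Open an mbox archive for binary reading, decompressing by suffix."""
    opener = _OPENERS.get(Path(path).suffix, open)
    return opener(path, "rb")


def iter_messages(path):
    """Yield one email.message.Message per message in the archive.

    Assumption: messages are delimited by lines starting with "From ",
    and such lines inside message bodies are escaped (">From ").
    """
    lines = []
    with open_archive(path) as fh:
        for line in fh:
            if line.startswith(b"From "):
                # Separator line: emit the message collected so far, if any.
                if lines:
                    yield email.message_from_bytes(b"".join(lines))
                lines = []
                continue
            lines.append(line)
    # Emit the last message, which has no separator after it.
    if lines:
        yield email.message_from_bytes(b"".join(lines))
```

A caller could then loop over iter_messages(path) for each archive file and read headers such as msg["Subject"] or msg["Date"] without caring how the file was compressed.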
--- End Message ---
--- Begin Message ---
- To: 966649-done@bugs.debian.org, Andreas Tille <andreas@fam-tille.de>
- Subject: upload_history is back
- From: Asheesh Laroia <asheesh@asheesh.org>
- Date: Tue, 25 Aug 2020 22:52:56 -0700
- Message-id: <CAMumaChEOPvfjTO54pVnig432Sbr1RMgAs3A+xaN+=3fbq2Q=Q@mail.gmail.com>
Thanks to Lucas for reviewing & merging & QA-ing this UDD merge request https://salsa.debian.org/qa/udd/-/merge_requests/26 and a few follow-up pushes/merge-requests.

I went with the approach Lucas suggested, where we still read the mbox files that were mentioned at the start of the bug.

Test yourself with e.g. this command (which queries the public UDD mirror, but you can use the real UDD if you can connect to ullmann.debian.org)!

$ echo 'select date,source,version from upload_history order by date desc limit 5;' | psql "postgresql://udd-mirror:udd-mirror@udd-mirror.debian.net/udd"
          date          |       source        |     version
------------------------+---------------------+------------------
 2020-08-26 00:33:28+00 | folding-mode-el     | 0+20200825.748-1
 2020-08-25 23:50:23+00 | supervisor          | 4.2.1-1
 2020-08-25 23:21:29+00 | php-doctrine-bundle | 2.1.2-1
 2020-08-25 23:21:09+00 | firefox-esr         | 68.12.0esr-1
 2020-08-25 22:33:30+00 | pandas              | 1.0.5+dfsg-1
(5 rows)
--- End Message ---