Re: [UDD] Suspected problem of upload_history importer



On 13/06/13 at 10:39 +0200, Martin Zobel-Helas wrote:
> Hi, 
> 
> On Thu Jun 13, 2013 at 10:29:33 +0200, Lucas Nussbaum wrote:
> > On 13/06/13 at 09:46 +0200, Andreas Tille wrote:
> > 
> > The upload_history is broken since I don't have a way to access
> > debian-devel-changes@ archives from UDD since master moved to
> > new-master.
> > 
> > In order to clarify that this importer is broken, I've just removed all
> > data in the upload_history table.
> > 
> > This has been discussed multiple times with DSA, but no solution has
> > been found yet.
> > 
> > This is tracked in #702085.
> 
> you have also been told to work out a solution with ftp-master.

The main discussions about this topic happened on March 2nd and March
28th on #debian-admin.

On March 2nd, I was asked to discuss with ftp-masters what needed to be
done on the dak side to provide historical data. I did that the same day
on #debian-ftp, and then sent
https://lists.debian.org/debian-dak/2013/03/msg00000.html, which was
never answered.

On March 28th, I re-raised the topic on #debian-admin. Slightly edited
log (to remove unrelated conversations) follows:
03/28/13 16:15:50< lucas> One thing that is still broken is UDD's importer for history of uploads. that works by parsing debian-devel-changes@ archives. We would need a way to bring those archives from master to ullmann.
03/28/13 20:55:46<@luca> lucas: how did that work before master was moved?
03/28/13 20:57:10< lucas> luca: the archives were processed locally, and the resulting files were downloaded over HTTP
03/28/13 20:57:24< lucas> what broke that was the lack of httpd on master (which I understand)
03/28/13 20:58:29<@luca> so master could push them to ullmann?
03/28/13 21:19:44<@zobel> lucas: what is the status of the dak export of that needed information from ftp-master to udd?
03/28/13 21:20:01<@zobel> or do i mix up something here?
03/28/13 21:20:24< lucas> zobel: no progress AFAIK
03/28/13 21:20:26< Ganneff> no you dont, but as noone has put any work into it, the status is same like last time
03/28/13 21:21:19< lucas> luca: master could push them, or, if you provided a script to sync them using a specific ssh key pair, that would work too
03/28/13 21:21:27< Ganneff> üatches welcome, of course
03/28/13 21:21:39<@luca> lucas: however it seems that the arrangement is for dak to provide these?
03/28/13 21:21:51<@luca> lucas: rather than exporting from master
03/28/13 21:21:57< lucas> luca: I don't have time to work on dak
03/28/13 21:21:58<@luca> lucas: based on what i just read
03/28/13 21:22:09< lucas> luca: it seems that it won't happen if nobody has time to work on dak
03/28/13 21:22:15< Ganneff> luca: someone needs to write the code. the export is easy, we "just" have to get the historical data into dak once.
03/28/13 21:22:29< Ganneff> (and a new table in the db, but thats even simpler)
03/28/13 21:22:34< lucas> luca: otoh, the code already exists and was functional to get that info from the mail archives
03/28/13 21:22:51< lucas> luca: (the code for UDD)
03/28/13 21:23:55< lucas> luca: so I think that a good compromise is to fix UDD ASAP by pushing or rsyncing the archives to ullmann, and switch to dak if the change happens there
03/28/13 21:29:43<@luca> seems like we need some master data management and an enterprise service bus ...
03/28/13 21:30:33<@zobel> lucas: so you need a script that parses mail on master, right?
03/28/13 21:30:47<@zobel> and import that data to udd?
03/28/13 21:30:50<@luca> dak -> email -> aggregator -> website -> wget -> database seems more prone to data loss (emails do get lost) than dak -> database -> view -> etl -> table -> database -> udd
03/28/13 21:31:43< lucas> zobel: not necessarily. I think that the easier would be to do all the processing on ullmann
03/28/13 21:32:00< lucas> zobel: that is, push archives to ullmann. or even autosshfs them
03/28/13 21:32:40<@zobel> maybe Ganneff can add a mail address on ullman in dak config to be notified?
03/28/13 21:32:59<@zobel> baeh, i don't like that idea.
03/28/13 21:33:15< lucas> can ullmann receive mail?
03/28/13 21:35:09<@luca> looks like zobel has this topic well in hand
03/28/13 21:35:28<@zobel> no.
03/28/13 21:35:40< lucas> what's easier for you?
03/28/13 21:35:41<@zobel> i am currently debugging, where 25G are lost on beach.d.o
03/28/13 21:35:43<@luca> okay, then what are we doing?
03/28/13 21:35:53<@luca> well, this doesn't need solving this minute
03/28/13 21:35:59<@zobel> it needs.
03/28/13 21:36:07<@zobel> 630M left 
03/28/13 21:36:09<@zobel> only
03/28/13 21:36:17<@luca> my 'this' is udd
03/28/13 21:36:44<@luca> the dak->udd problem needs a solution that everyone agrees to
03/28/13 21:36:58<@luca> am i facilitating this conversation or you?
03/28/13 21:37:02<@luca> let me know, either way
03/28/13 21:37:05<@luca> no rush
03/28/13 21:37:20< Ganneff> that would be "generate in dak" i think. i dislike the mail thing, but it would work.
03/28/13 21:37:32< Ganneff> and would be easy, but then udd needs to receive those mails
03/28/13 21:37:52<@zobel> lucas: could you work out with Ganneff or ansgar how the database layout could be and offer them data to import?
03/28/13 21:38:51< lucas> I can offer them data to import. re dak's db schema, they are more likely to know it than I am;)
03/28/13 21:39:33< Ganneff> lucas: can we discuss it tomorrow evening over in #debian-ftp? the db schema isnt hard, but im somewhat distracted elsewhere.
03/28/13 21:40:28<@luca> wearing my DSA hat: not keen on having ullmann receive email
03/28/13 21:40:41< lucas> I'm not sure I'll be around tomorrow evening. but please ping me, I'll ping you back
03/28/13 21:40:59< lucas> luca: what about getting access to mail archives on ullmann? that sounded easier
03/28/13 21:41:01<@luca> wearing my 'enterprise architect' hat (feel free to ignore me): not keen on using email as a data transfer mechanism since lossy
03/28/13 21:41:10<@luca> lucas: see second comment :)
03/28/13 21:41:32< lucas> ok, but it worked quite fine for the last 3 years
03/28/13 21:41:48<@luca> we think it worked fine?
03/28/13 21:42:27< lucas> there are a few queries that rely on the fact that each package in the archive has an email in the d-d-changes archives
03/28/13 21:43:07<@luca> ah, okay
03/28/13 21:43:19<@luca> let's start with the lucas/Ganneff conversation
03/28/13 21:43:50<@luca> then we can resume here once you have determined that you like / dislike the email approach (hopefully dislike)
03/28/13 21:45:04< lucas> mmh. I'm just a bit annoyed by this being broken since the master move. requiring dak changes is not a good way to have it fixed soon. :/
03/28/13 21:45:50<@zobel> lucas: i can understand you, but i would like to have it fixed the proper way.
03/28/13 21:54:14<@zobel> Ganneff: any way we can abuse archvsyn to mirror master:~debian/lists/debian-devel-changes/* to ullmann?
03/28/13 21:55:01<@zobel> i think, DSA broke the setup, DSA should get it fixed.
03/28/13 21:55:09< Ganneff> zobel: ewww. well, its an rsync. we can sure do that using archvsync and a crontab entry.
03/28/13 21:55:53<@zobel> so maybe we fix that now by syncing files, but encourage UDD and ftp-master people to come up with a better solution in long term.
03/28/13 21:56:28<@zobel> Ganneff: do you want to do it, or shall i just do it?
03/28/13 21:56:41< Ganneff> just do it
03/28/13 21:58:17<@zobel> lucas: i will try to fix it over the weekend by just mirroring the ML to ullmann, but i am in the middle of moving houses, so bear with me, if i don't get it done the next 48h.
03/28/13 21:58:33< lucas> thanks a lot :)
03/28/13 21:59:15<@zobel> still my words stand: i encourage you to speak to ftp-master ppl to find a nicer longterm solution. :)
03/28/13 22:00:21< lucas> yup


So, I think that the conclusion of that discussion is:
03/28/13 21:58:17<@zobel> lucas: i will try to fix it over the weekend by just mirroring the ML to ullmann, but i am in the middle of moving houses, so bear with me, if i don't get it done the next 48h.
03/28/13 21:59:15<@zobel> still my words stand: i encourage you to speak to ftp-master ppl to find a nicer longterm solution. :)

I pinged again on May 7th:
05/07/13 17:26:54< lucas> zobel: at some point you were going to export debian-devel-changes archives to UDD. I'm not sure this was done
(no answer)

and on May 24th:
05/24/13 18:43:03< lucas> hi, could you please look into the  debian-devel-changes@ archives on ullmann thing?
05/24/13 18:43:26< lucas> -publicity@ was asking about the "new contributors" script for DPN that depends on that
(still no answer)

Lucas

