
Re: Bug#1035960: All of sudden, the Spanish PO debconf templates is getting full of alien translators :-)

Thanks, everybody, for the extra info, and Cyril for the research.

I have killed the spider jobs.
I have modified the crontab on tye.debian.org to disable the cron.hourly job (spiderbts) for now.

I have removed the status files [1] and launched a spiderinit job [2] to re-create them.

[1] in /srv/i18n.debian.org/dl10n/data/spiderbts/data/status.??
[2] sudo -u debian-i18n /srv/i18n.debian.org/dl10n/git/cron/spiderinit &

Tomorrow I'll have a look at the logs of the spiderinit job [3] and launch the cron.hourly job once.

[3] /srv/i18n.debian.org/log/spiderinit/spiderinit.20230512-2134.[err|log]

Then I'll see how long it takes and whether there are any issues.
If everything goes well, the webpages should show correct data. Then I'll set the "hourly" job to run 6 times a day and keep an eye on it over the next few days.

I agree that a lockfile is needed. I'll try to work on that too, and once it is in place and the issue is fixed, I'll update the cron to run hourly again.
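
As a rough idea of what such a lockfile could look like, here is a minimal sketch. It assumes flock(1) from util-linux is available; the function name run_locked, the exit code 75 and the lock path in the usage note are only illustrative and are not taken from the dl10n scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of dl10n): run a command only if no
# other instance currently holds the given lock file.
# Usage: run_locked LOCKFILE COMMAND [ARGS...]
run_locked() {
    lockfile=$1
    shift
    (
        # -n: do not wait; fail immediately if the lock is already held.
        if ! flock -n 9; then
            echo "another instance holds $lockfile, skipping this run" >&2
            exit 75
        fi
        # The lock stays held until the command (and anything that
        # inherited file descriptor 9 from it) has finished.
        "$@"
    ) 9>"$lockfile"
}
```

The cron job could then call something like `run_locked /path/to/spiderbts.lock <the spider command>` (both placeholders), so that an overlapping run exits right away instead of piling up next to the one still running.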

Kind regards

On 12/5/23 at 12:04, Cyril Brulebois wrote:
Cyril Brulebois <kibi@debian.org> (2023-05-12):
I'll keep looking at what's supposed to happen on tye, but I'm not
sure I'll be able to get to the bottom of it on my own.

At least there's a HUGE red flag on tye. Load to the roof, RAM/swap
almost full, lots of dl10n-spider processes running for the same
language, some of them started May 9th.

     kibi@tye:~$ uptime
      10:02:58 up 12 days, 21:47,  2 users,  load average: 63.24, 64.57, 66.51

     kibi@tye:~$ free -h
                    total        used        free      shared  buff/cache   available
     Mem:           1.9Gi       1.7Gi        69Mi       1.0Mi       125Mi        57Mi
     Swap:          511Mi       511Mi       0.0Ki

     kibi@tye:~$ ps faux|grep dl10n-spider|grep -o -- '--check-bts ..'|sort|uniq -c
           4 --check-bts ca
           1 --check-bts cs
           1 --check-bts da
          51 --check-bts de
           7 --check-bts es
           2 --check-bts fr

     kibi@tye:~$ ps faux|awk '/CRON/ {print $9}'|sort|uniq -c
          11 May09
          23 May10
          23 May11
           1 00:15
           1 02:15
           1 03:15
           1 04:15
           1 05:15
           1 06:15
           1 07:15
           1 08:13
           1 08:15
           1 09:15
           2 10:00
           1 10:01

Note that many de.po occurrences appear in the status files for other
languages; it looks like the processes are heavily stepping on each
other's toes?

It looks to me there should be some locking at the very least to avoid
that amount of concurrency. And that it would probably be best to start
afresh, killing all those processes, maybe disabling the cron jobs,
cleaning temporary and maybe corrupted data files, and triggering a
single run manually to see if it works.

But then, I have 0 knowledge about the spider, and I'll leave that up to
someone else: I don't want to risk making the matter worse!


Laura Arjona Reina
