Re: buildd timeouts
Wouter Verhelst <firstname.lastname@example.org> writes:
> (not 100% sure whether this is the right list for this, but anyway)
> Buildd has a 'feature' to check for a very specific type of hangs in
> builds, where the build would produce no output for a certain amount of time.
> Personally, I've started to question the usefulness of this 'feature'.
> In all the years that I've been a buildd maintainer, I cannot remember
> a single instance where the timeout hit because the build did actually
> hang; but I can remember _lots_ of false positives. And since a loop
> usually does produce output every few seconds, I can remember a few
> false negatives.
Python2.4 currently hangs in a testcase on several archs and gets
killed after 150 minutes; see amd64, for example.
> Since a false positive will usually kill a build that has been running
> for hours and hours already (the (insufficient) default of 300 minutes
> is already 5 hours), this would mean that a false positive is a very
> painful matter, since a build which needs a long amount of time to build
> (and which *has* already taken quite a while to get at the point where
> it was killed) needs to be redone.
> In other words, I feel that the timeout code is doing more harm than it
> is helping. Wouldn't it be better if we would just drop the timeout
On fast archs, 150m without output is practically forever, so a timeout
there means something is seriously wrong. On slow archs like m68k, a step
taking 150m isn't too uncommon. That's why several packages have a much
longer timeout and you default to 300m. Something like 15000m would
probably be comparable to 150m on i386 or amd64.
For m68k I totally agree with you that killing the job probably does
more harm than good. Maybe instead of killing the job, the buildd could
send a mail to the admin to look into the matter. The admin can then
kill the job, or let it run and set a package-specific timeout for the
next time.
This should be an option. On faster archs killing the job is just fine
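
To make the proposal concrete, here is a rough sketch of what such an
option could look like. This is not buildd's actual code (buildd is not
written in Python); every name below (run_with_inactivity_watchdog,
on_timeout, notify, timeout_minutes) is made up for illustration, and
the notify callback just stands in for "send a mail to the admin".

```python
import subprocess
import sys
import threading
import time
from dataclasses import dataclass


@dataclass
class WatchdogResult:
    returncode: int
    timed_out: bool  # the inactivity timeout fired at least once
    killed: bool     # the watchdog killed the build


def timeout_minutes(package, default, overrides):
    """Hypothetical lookup: package-specific timeout, else the arch default."""
    return overrides.get(package, default)


def run_with_inactivity_watchdog(cmd, timeout_s, on_timeout="kill",
                                 notify=print, poll_s=0.1):
    """Run cmd; if it prints nothing for timeout_s seconds, either kill it
    or only notify the admin and let it keep running."""
    if on_timeout not in ("kill", "notify"):
        raise ValueError("on_timeout must be 'kill' or 'notify'")

    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    last_output = time.monotonic()
    lock = threading.Lock()

    def reader():
        # Every line of build output resets the inactivity clock.
        nonlocal last_output
        for _line in proc.stdout:
            with lock:
                last_output = time.monotonic()

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    timed_out = killed = False
    while proc.poll() is None:
        time.sleep(poll_s)
        with lock:
            idle = time.monotonic() - last_output
        if idle >= timeout_s and not timed_out:
            timed_out = True
            notify(f"no output for {idle:.0f}s from {cmd!r}")
            if on_timeout == "kill":
                proc.kill()
                killed = True
            # In "notify" mode the build keeps running; the admin decides.

    thread.join()
    return WatchdogResult(proc.wait(), timed_out, killed)


if __name__ == "__main__":
    # A silent child process stands in for a hung build.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_inactivity_watchdog(hung, timeout_s=1, on_timeout="kill"))
```

A fast arch would run with on_timeout="kill", while m68k would use
on_timeout="notify" together with a larger default from
timeout_minutes().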