
Re: buildd timeouts

Wouter Verhelst <wouter@debian.org> writes:

> Hi
> (not 100% sure whether this is the right list for this, but anyway)
> Buildd has a 'feature' to check for a very specific type of hangs in
> builds, where the build would produce no output for a certain amount of
> time.
> Personally, I've started to question the usefulness of this 'feature'.
> In all the years that I've been a buildd maintainer, I cannot remember
> a single instance where the timeout hit because the build did actually
> hang; but I can remember _lots_ of false positives. And since a loop
> usually does produce output every few seconds, I can remember a few
> false negatives.

Python2.4 currently hangs in a test case on several archs and gets
killed after 150 minutes; see amd64, for example.

> Since a false positive will usually kill a build that has been running
> for hours and hours already (the (insufficient) default of 300 minutes
> is already 5 hours), this would mean that a false positive is a very
> painful matter, since a build which needs a long amount of time to build
> (and which *has* already taken quite a while to get at the point where
> it was killed) needs to be redone.
> In other words, I feel that the timeout code is doing more harm than it
> is helping. Wouldn't it be better if we would just drop the timeout
> code?

On fast archs 150m without output is like forever, and a build that
hits the timeout has something seriously wrong with it. On slow archs
like m68k something taking 150m isn't too uncommon. That's why several
packages have a much longer timeout and you default to 300m. Something
like 15000m would probably be comparable to 150m on i386 or amd64.

For m68k I totally agree with you that killing the job probably does
more harm than good. Maybe, instead of killing the job, the buildd
could send a mail to the admin asking them to look into the matter.
The admin can then kill the job, or let it run and set a
package-specific timeout for the next time.

Mailing instead of killing should be an option. On faster archs
killing the job is just fine, I think.
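
To make the proposal concrete, here is a rough sketch of the decision
I have in mind. It is not buildd code; the class, the field names and
the "somepkg" override are made up for illustration. The only numbers
taken from this thread are the 150m and 300m timeouts and the 15000m
guess for slow archs.

```python
from dataclasses import dataclass, field


@dataclass
class TimeoutPolicy:
    """Hypothetical per-arch policy for the no-output timeout."""

    default_minutes: int = 300      # default mentioned in this thread
    kill_on_timeout: bool = True    # the proposed option: kill or mail
    # Package-specific overrides, in minutes (names are illustrative).
    per_package: dict = field(default_factory=dict)

    def limit_for(self, package: str) -> int:
        """Return the timeout that applies to this package."""
        return self.per_package.get(package, self.default_minutes)

    def action(self, package: str, idle_minutes: int) -> str:
        """Decide what to do after idle_minutes without build output."""
        if idle_minutes < self.limit_for(package):
            return "continue"
        # Timeout reached: fast archs kill, slow archs only notify.
        return "kill" if self.kill_on_timeout else "mail-admin"


# Fast arch: short timeout, kill the job when it is reached.
fast = TimeoutPolicy(default_minutes=150, kill_on_timeout=True)

# Slow arch: mail the admin instead; "somepkg" got a longer timeout
# after the admin looked at an earlier build.
slow = TimeoutPolicy(
    default_minutes=300,
    kill_on_timeout=False,
    per_package={"somepkg": 15000},
)
```

With this, fast.action("anypkg", 150) gives "kill", while
slow.action("anypkg", 300) gives "mail-admin" and the build keeps
running until the admin decides.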

