
Re: Salsa CI overload



Alexander Wirt writes ("Re: Salsa CI overload"):
> I took some statistics.

Thanks.  This is extremely illuminating.  Earlier I wrote this:

> > Perhaps some of our jobs are wasteful, but I think that overall,
> > adding capacity would be well justified.

but these startling numbers suggest that I was wrong.

Is there any mechanism we could use for allocating capacity more
fairly?


If not, I think we must rely on ad-hoc response to overload events,
and social pressure.  That is much less comfortable than an automatic
resource allocation system, as the latter (however imperfect) is
objective, and the message is delivered by a computer rather than an
exasperated human.

Anyway, in that case we don't want that to be a burden just on the
salsa admins, so it would be good to be able to see the numbers
publicly (at least, most of the numbers).

It should be noted of course that number of jobs is perhaps a poor
proxy for resource use, given that jobs can be of widely different
sizes.  But many jobs are so small that the overhead dominates.


> Top 5 projects today:
>   js-team/node-glob: 2751

I went and looked at this package.  I found that:

The pipeline has been failing on the main branch for some days, so it
doesn't seem like CI is being used to prevent landing broken changes
(or, it isn't effective in doing so).

I looked at one of the pipelines via the web UI and I didn't see the
thousands of jobs I was expecting based on the number above; I don't
know why.  Maybe the web UI just can't show so many.  I looked at two
arbitrary successful jobs.  They each took about 3 minutes.  Our
"docker-machine" executor container takes 1.5 minutes to set up per
job.  With small jobs like this, that time must be considered wasteful
overhead.  That's tolerable if there aren't very many.

But in this case *this project alone* has *wasted* 60 thread-hours'
worth of some kind of capacity, in a 24 hour period.  This is quite
extraordinary.
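
As a rough check on that figure, here is a back-of-envelope sketch.
It assumes that every one of the 2751 jobs counted today paid the full
1.5 minute setup cost; the function name is made up for illustration.

```python
# Back-of-envelope estimate of executor setup overhead.
# Inputs are the figures quoted in this thread; the assumption that
# every job pays the full setup cost is mine, not a measurement.

def setup_overhead_hours(jobs: int, setup_minutes: float) -> float:
    """Total time spent on per-job executor setup, in hours."""
    return jobs * setup_minutes / 60.0

jobs = 2751          # js-team/node-glob jobs today
setup_minutes = 1.5  # approximate setup time per job

print(f"{setup_overhead_hours(jobs, setup_minutes):.1f} hours of setup overhead")
```

Under that assumption the result is a little under 69 hours, which is
in the same ballpark as the round figure of 60 above.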

The remaining ones in the top 5, using 10x fewer jobs each, mostly
seem like repos we might expect to be under intense development.
Probably the computer resource usage there is more proportionate to
the human effort input.

Ian.

-- 
Ian Jackson <ijackson@chiark.greenend.org.uk>   These opinions are my own.  

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.

