
Salsa upgrade, history and future



Hello everyone,

you may have noticed that we had a bit of a downtime with Salsa
recently. What follows is a short summary on how it came to be, why it
took so long, and a bit about the future of Salsa.

But before we start on that, we want to thank Bastian Blank for his long
work on Salsa and its ansible infrastructure; the service would be
quite a bit less maintainable without the setup he helped create.

Next, a bit of needed background about services running on DSA (Debian
System Administrators) maintained machines (anything in .debian.org):
You may (or may not) know that we are running on volunteer-maintained
machines (just as we are volunteers maintaining Salsa). DSA has put
together a set of rules they follow on how they run our machines and
what they expect from services. They offer a good number of services
(say, databases if you need one, webservers, ...) and they very much
prefer software that is either installed from Debian (stable or
backports) or installed / provided by the service admin(s) themselves.

Those requirements have led to the Salsa service *not* using the
upstream provided package (their "Omnibus" installation method), as that
package is one *huge* beast, bundling everything needed, configuring
everything centrally, outside of the usual known ways. Pretty obviously
this goes against the basic rules from DSA. Instead, we are using the
"install from source" variant, compiling stuff on our own, using ansible
to help do that in a somewhat reliable way. You can find the repository
with our ansible code at https://salsa.debian.org/salsa/salsa-ansible.

Having run Salsa for a while, Salsa Admins found various points in the
setup that can be improved, both for easier maintenance of the service
and for the user experience - as the setup as it was (and still is)
does have a few deficiencies. A proposal for a changed setup was
written and circulated among Salsa Admin, DPL and DSA. The short
summary is that the discussion around it took quite a long time and
led neither to a good/useful conclusion nor to an implementation of
any improvements.

Due to that, Salsa has been in a kind of low-maintenance mode for the
installed parts, which led to Salsa falling behind upstream versions.
Recently GitLab published a critical security fix which forced us into
action - we had to disable the service and could not open it up again
without upgrading it to a recent release. That upgrade in turn required
an upgrade of the underlying machine, from buster to bullseye.

Thankfully DSA acted quickly on our request to upgrade the machine to
bullseye, unblocking the upgrade path on our side to install more
recent versions of GitLab. We then had to adjust the setup, configs and
local builds to work again. This took a fair bit of work and some more
help from DSA, but the majority of the downtime was actually spent on
something we could only wait for: database migrations. GitLab has
changed various parts of the database across its releases and includes
migrations to upgrade your database. Usually these can run in parallel
with the normal operation of the service, and as such they are
optimized to not interrupt it. But in our situation we had to wait for
the migrations to finish before we could finally upgrade to a released
version that no longer included the security hole that started this
upgrade round.

A big thank you has to go to Alexander Wirt, who has invested a huge
amount of work and energy in dealing with this upgrade. We also have to
thank the DSA members who helped with upgrading the system and with the
adjustments needed later on - Tollef Fog Heen, Aurelien Jarno and
Adam D. Barratt.


With all the above, what's our current status? Simple: we are, again, on
the latest released GitLab version, and while we had a few reports of
errors, they appear fixed now, and Salsa is back in operation. We still
have a few points that we want to (again) discuss with DSA to see how
the setup can be adjusted, as some of the identified trouble points are
still there. But there is less pressure behind this now, as we currently
are able to closely follow upstream again.


Before we get to the final point, some statistics about Salsa: Salsa
currently hosts 58125 projects for 10930 users across 665 groups. It has
seen 15527 forks, 36650 merge requests and 302133 notes. Salsa knows of
5789 SSH keys, and users have created 9425 issues. A total of 342812
pipelines have been run, of which 226575 were successful and 101198
failed; the rest were either cancelled or skipped.

Salsa is running inside a virtual machine with 8 CPU cores, 32G of
available RAM and uses about 1.6TB of space for the git repositories.
GitLab's background job system "Sidekiq" claims it has processed 68917652
background jobs, of which it declared only 84587 as failed.
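
For a rough sense of proportion, the pipeline and Sidekiq figures above
can be turned into rates. The following is only a minimal sketch in
Python; the variable names and the share() helper are illustrative
assumptions and not part of any Salsa or GitLab tooling.

```python
# Figures copied from the statistics quoted above.
pipelines_total = 342812
pipelines_success = 226575
pipelines_failed = 101198

sidekiq_processed = 68917652
sidekiq_failed = 84587

# "The rest" of the pipelines were either cancelled or skipped.
pipelines_other = pipelines_total - pipelines_success - pipelines_failed


def share(part, whole):
    """Return part/whole as a percentage (hypothetical helper)."""
    return 100.0 * part / whole


print(f"cancelled or skipped pipelines: {pipelines_other}")  # 15039
print(f"successful pipelines: {share(pipelines_success, pipelines_total):.1f}%")
print(f"failed pipelines:     {share(pipelines_failed, pipelines_total):.1f}%")
print(f"failed Sidekiq jobs:  {share(sidekiq_failed, sidekiq_processed):.3f}%")
```

Run as-is, this prints the number of cancelled or skipped pipelines and
the success and failure rates derived from the numbers in this mail.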


Want to help?
If you are a Debian Developer[1] and interested in helping us maintain the
Salsa service, including possibly digging into the "bits below" directly
on the machine to make it better for the users, better to maintain, and
in general just keep one huge git forge running, please feel free to
mail us at salsa-admin@debian.org. We also hang out in #salsa on
irc.oftc.net, though that is mainly one of our public support channels.

--
For the salsa admins:
 Joerg

[1] Sorry for that requirement, but with Salsa Admin being a delegated
role, volunteers have to be members. Additionally, Salsa hosts a huge
bunch of Debian repos, some of which are not available to the public
but are visible to Salsa admins, so we require admins to be DDs.
