
Re: Introducing salsa-status.debian.net



Apologies for the delayed response. I wanted to get back sooner, but things got a bit chaotic while wrapping up GSoC.


> First of all, nice work.
> Yes! It's a nice-looking dashboard indeed :)
Thanks, Ananthu and Andrea! :)

> But that aside, the name sound incredibily misleading to me, as it's
> not salsa-status, but rather salsa-ci-status.
The domain name, salsa-status.debian.net, was intentionally kept generic to leave room for future expansion, such as covering global status and statistics for Salsa as a whole rather than only Salsa CI. The dashboard provides a much richer visualization and improves transparency, and I believe many folks would be interested in having this. That, of course, depends on collaboration with the Salsa admins :) There is an issue for exploring these possibilities at https://salsa.debian.org/salsa/todo/-/issues/41.

> I'd like it more if I can see it on my mobile screen without needing desktop view though :p
Added to my todo list! :^)


Quoting Alexander,
> Which api endpoint do you scrape and how often?

Thanks, Alex! I understand your concern.
We mostly fetch from these endpoints:
`/projects/:id`
`/projects/:id/pipelines/:pipeline_id`
`/projects/:id/pipelines/:pipeline_id/jobs`

This results in roughly 250-300 API calls per hour on a sunny day, which I think is reasonably low. Early in development, I expected much higher traffic, so the backend was designed to be extremely frugal and avoid hitting API rate limits (even without a token). However, testing revealed that the actual number of daily pipeline runs is far lower than anticipated. Fewer runs are also a good sign, as they show that pipelines aren’t being abused. I doubt these API calls will put any noticeable load on Salsa.
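
For illustration, the three endpoints listed above map to URLs like the ones built below. This is a minimal sketch and not the dashboard's actual backend code: it assumes Salsa serves the standard GitLab REST API under `/api/v4`, and the helper name `endpoint_urls` is hypothetical.

```python
from urllib.parse import quote

# Assumption: Salsa exposes the standard GitLab REST API under /api/v4.
API_BASE = "https://salsa.debian.org/api/v4"


def endpoint_urls(project, pipeline_id):
    """Return the three URLs for one project/pipeline pair.

    `project` may be a numeric ID or a "namespace/name" path; a path
    has to be URL-encoded before it can be used as :id.
    """
    pid = quote(str(project), safe="")
    project_url = f"{API_BASE}/projects/{pid}"
    pipeline_url = f"{project_url}/pipelines/{int(pipeline_id)}"
    return {
        "project": project_url,    # /projects/:id
        "pipeline": pipeline_url,  # /projects/:id/pipelines/:pipeline_id
        "jobs": f"{pipeline_url}/jobs",
    }
```

Each tracked pipeline therefore costs at most three requests per polling round, which is consistent with the call volume staying low.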

During the last 10 days, we noticed 55+ projects that haven’t been updated in over 2 years but still run multiple scheduled pipelines daily or weekly. By my estimate, that adds up to more than 3300 pipelines and 34600 jobs run in vain each year. These projects have been running this way continuously for 2 to 5 years. This is a sheer waste of a huge amount of resources and creates unnecessary traffic, often contributing to the long queue delays that I am sure many of us have already experienced. Thanks to the website, we can now spot and prevent this :D
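
As a back-of-the-envelope reading of those estimates (treating the quoted figures as rough lower bounds, and with variable names chosen only for this sketch), the per-project averages work out as follows:

```python
# Figures quoted above; all of them are rough estimates.
projects = 55
pipelines_per_year = 3300
jobs_per_year = 34600

# Average wasted pipelines per inactive project per year.
pipelines_per_project = pipelines_per_year / projects

# Average number of jobs in each of those pipelines.
jobs_per_pipeline = jobs_per_year / pipelines_per_year

# The same average expressed per week.
pipelines_per_project_per_week = pipelines_per_project / 52

print(pipelines_per_project)                     # 60.0
print(round(jobs_per_pipeline, 1))               # 10.5
print(round(pipelines_per_project_per_week, 2))  # 1.15
```

That is about one wasted pipeline of roughly ten jobs per project every week, on average.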

Disabling scheduled pipelines for such inactive projects would prevent this waste, free up significant resources, ease pipeline congestion, and yield benefits far beyond the modest API savings.

I've noticed long pipeline pending times mostly between 16:00 and 22:00 UTC. We also often receive requests on #salsaci and even on #salsa for insights into pending pipeline queues. Perhaps the website could help ease this pain point as well someday (wink wink, Salsa admins).

> That is useful information, but in the end leaves me with even more questions. I really want to have an answer to my questions. Otherwise it may happen that we just block the service.
Let’s not rush this :) It would be better to let it bear some fruit first and see what benefits it brings to Salsa and the wider Debian packaging ecosystem that relies on Salsa CI, and then revisit this discussion if needed. From my experience working with the Salsa CI Team, the potential upside feels much needed and valuable. Thanks, Lucas, for responding!


Quoting Richard,
> maybe the x-axis on the grpah should be 24 hours to capture all timezones
Indeed, the graphs already span 24 hours. Note that the x-axis labels were reduced to 8, one every 3 hours (8*3=24), to prevent overcrowding.
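
A small sketch of that labelling scheme, assuming one data bucket per hour and an `HH:00` label format (both are assumptions made for illustration, not taken from the dashboard's code):

```python
HOURS_PER_DAY = 24
LABEL_STEP = 3  # draw one axis label every 3 hours

# Data points: one bucket per hour (assumed granularity).
buckets = list(range(HOURS_PER_DAY))

# Axis labels: only every third hour gets a label, so 24 / 3 = 8 labels.
labels = [f"{hour:02d}:00" for hour in buckets if hour % LABEL_STEP == 0]

print(len(buckets), len(labels))  # 24 8
```

All 24 hours are still plotted; only the labels are thinned out.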

> i was surprised how low the total number of CI runs are - it's only able
> to run at most 35 pipelines at once? or there's lots of unused capacity?
I witnessed 90+ pipelines running simultaneously during the spike a few days ago, so yes, there should be plenty of unused capacity.

>  what is the significant of the people on
> the graph? at the moment is looks like you are blaming them for the
> peaks
Those are the merged MRs from the Salsa CI pipeline repo. The idea is that if an MR or feature rollout causes a regression or an improvement in pipelines or jobs, we can see it by looking at the trends after the merge. They act more like markers on the timeline.

Thanks for clarifying, IOhannes! Yes, as the header on the homepage indicates, they represent the MRs from the Salsa CI pipeline.


Quoting IOhannes,
> but afaict, jobs are being collected via a webhook called from the default pipelines, which I think means that only standard pipelines are going to be captured
Somewhat similar! The Salsa CI pipeline first makes an API call to register the pipeline and project, and then we fetch its status once an hour to determine the final state and record the stats. So all public pipelines that use Salsa CI with the `SALSA_CI_ENABLE_STATS` flag on (currently enabled by default) will be registered.
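
The register-then-poll flow described above can be sketched roughly as follows. This is a hypothetical illustration rather than the real collector: the class name `PipelineTracker`, the injected `fetch_status` callable, and the set of statuses treated as final are all assumptions.

```python
import time

# Assumption: these GitLab pipeline statuses are treated as final.
FINAL_STATUSES = {"success", "failed", "canceled", "skipped"}
POLL_INTERVAL = 3600  # seconds, i.e. "once an hour"


class PipelineTracker:
    def __init__(self, fetch_status):
        # fetch_status(project_id, pipeline_id) -> status string.
        # Hypothetical callable, e.g. a wrapper around
        # GET /projects/:id/pipelines/:pipeline_id.
        self.fetch_status = fetch_status
        self.pending = set()
        self.finished = {}

    def register(self, project_id, pipeline_id):
        """Called when a running pipeline announces itself."""
        self.pending.add((project_id, pipeline_id))

    def poll_once(self):
        """Check each pending pipeline and record those that have finished."""
        for key in list(self.pending):
            status = self.fetch_status(*key)
            if status in FINAL_STATUSES:
                self.finished[key] = status
                self.pending.discard(key)

    def run_forever(self):
        """Poll all pending pipelines once per interval."""
        while True:
            self.poll_once()
            time.sleep(POLL_INTERVAL)
```

With this shape, a pipeline that never registers (for example, one with the stats flag turned off) is never polled at all.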


Quoting Johannes (or Josch :),
> Could you please remove these lines? The dashboard seems to work fine even
> when instructing my adblocker to not send my data to google.
Hah, nice catch! Those lines are for the Urbanist font we use on the website. I’m not entirely sure if they send any data to Google, but I’ll go ahead and remove the links and add the font to the assets instead. Thanks for pointing it out!


PS.: The issue tracker is at https://salsa.debian.org/salsa-ci-team/pipeline/-/issues/413.

Thanks so much, everyone! ^^

Aayush Raj
GSoC student mentored by Otto Kekäläinen

On Mon, 1 Sept 2025 at 17:36, Alexander Wirt <formorer@debian.org> wrote:
On Mon, Sep 01, 2025 at 02:00:00PM +0200, Lucas Nussbaum wrote:
> On 01/09/25 at 12:56 +0200, Alexander Wirt wrote:
> > On Fri, Aug 29, 2025 at 01:03:19PM +0200, Alexander Wirt wrote:
> > > On Fri, Aug 29, 2025 at 06:05:12AM +0530, Aayush Raj wrote:
> > > > Greetings folks!
> > > >
> > > > I wanted to announce that salsa-status.debian.net is now up and running!
> > > >
> > > > This Status Page provides both Salsa CI users and the Salsa CI developer
> > > > team with visibility into CI performance, helping identify wasteful
> > > > practices, broken configurations, and optimization opportunities across the
> > > > entire Debian package ecosystem.
> > > >
> > > > One of the main objectives is to help catch wasteful CI usage, which hasn't
> > > > been possible/easy before due to lack of overview/stats.
> > > >
> > > > *The main features of the Salsa CI status page are:*
> > > >
> > > >    - Real-time Pipeline Monitoring: Pipeline stats, success rates,
> > > >      performance trends, and related metrics.
> > > >    - Project Analytics: Detailed insights into projects' CI history and
> > > >      configurations
> > > >    - Job Type Analysis: Insights into types of jobs running in and on top of
> > > >      Salsa CI
> > > >    - CI Stats & Performance: CI duration trends and resource consumption
> > > >    - Matrix Alerts - Automated notifications for performance degradation at
> > > >      https://matrix.to/#/#salsa-stats:matrix.org
> > >
> > > Which api endpoint do you scrape and how often?
> > Ping?
>
> Hi Alex,
>
> AFAIK¹, the CI pipeline provided by the salsa-ci-team team² is
> instrumented to collect data. See
> https://salsa.debian.org/salsa-ci-team/pipeline/-/blob/master/salsa-ci.yml?ref_type=heads#L223
>
> That's why people are rightfully saying that the name is not correct and
> should be salsa-ci-stats.debian.net (or even
> salsa-ci-team-pipeline-stats.debian.net)
>
> I suppose that when the statistics collector endpoint learns about a
> pipeline, then it polls salsa about that pipeline's status.
>
> Lucas
>
> ¹ I'm not involved with status-status.d.n
> ² https://salsa.debian.org/salsa-ci-team/pipeline

That is useful information, but in the end leaves me with even more questions. I really want to have an
answer to my questions. Otherwise it may happen that we just block the service.

Alex

