Hi,

On 07/04/2025 13:06, Adrian Bunk wrote:
> On Sun, Apr 06, 2025 at 07:33:22PM +0200, Bastien Roucaries wrote:
>> On Sunday, 6 April 2025, 09:25:58 Central European Summer Time, Roberto C. Sánchez wrote:
>>> ...
>>> As one example, some time ago I encountered the issue of the size of
>>> data/CVE/list, specifically in the context of a git blame operation
>>> taking a few hours to complete. I became convinced that data/CVE/list
>>> needs to be split. As I've done some research on the topic, the answer
>>> to that is far from clear. I'm less convinced now that "split
>>> data/CVE/list" is the de facto right solution, and I'm definitely
>>> convinced that a big change here will not be accepted without many good
>>> reasons and proof that doesn't also include some massive drawbacks.
>>
>> split per year will help here.
>> ...
>
> Which is not easy, see https://salsa.debian.org/security-tracker-team/security-tracker-service/-/issues/1

Back then I put together a git-filter-branch rewrite & subdir of security-tracker/data/CVE/ to isolate triaged CVEs one per file, which allows near-instant history/blame for any specific CVE, e.g.:
$ gitk 2025/31115

https://lists.debian.org/debian-lts/2020/10/msg00017.html
https://lists.debian.org/debian-lts/2020/10/pngYP1m7tAWfw.png

It took a day to run IIRC; it would probably take much longer now, as data/CVE/list has more than doubled in size and gets slower to process.
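
To illustrate the per-file layout only (this is a rough sketch, not the actual git-filter-branch script, and it ignores history rewriting entirely): the Python below splits one snapshot of the list into "year/id" paths such as 2025/31115. It assumes that every entry starts with an unindented "CVE-<year>-<id>" header line followed by indented lines; the names split_cve_list and SAMPLE, and the sample entries, are made up for the example.

```python
import re

# Assumption: an entry starts with an unindented "CVE-<year>-<id>" header;
# every following line up to the next header belongs to that entry.
HEADER = re.compile(r"^CVE-(\d{4})-(\S+)")

# Made-up sample data in the assumed format (not real tracker content).
SAMPLE = (
    "CVE-2025-31115 (made-up description)\n"
    "\t- examplepkg 1.0-1\n"
    "CVE-2024-0001 (another made-up description)\n"
    "\tNOT-FOR-US: example\n"
)


def split_cve_list(text):
    """Return a dict mapping 'year/id' paths to the text of each entry."""
    entries = {}
    current = None
    for line in text.splitlines(keepends=True):
        match = HEADER.match(line)
        if match:
            current = f"{match.group(1)}/{match.group(2)}"
        if current is not None:
            # Append rather than overwrite, so a repeated header
            # (e.g. a placeholder id) does not lose data.
            entries[current] = entries.get(current, "") + line
    return entries


if __name__ == "__main__":
    for path, body in split_cve_list(SAMPLE).items():
        print(path, len(body.splitlines()), "line(s)")
```

Running something like this for every commit that touches the list is what makes a full history rewrite slow: the cost grows with both the number of commits and the size of the file.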
Nobody gave a damn and I eventually removed it ¯\_(ツ)_/¯

I still update my local repo to help understand the triage of specific CVEs, thanks to the incremental feature of git-filter-branch. I just checked the newer git-filter-repo (which is based on git-fast-import/export and is much faster in general), but sadly it doesn't seem to fit this use case (no incremental/resume, not-so-fast *content* rewrite, splitting files is tricky).
Note: this is orthogonal to the split-by-year issue mentioned above, which is more involved (full path rewrite, tooling updates, ELTS security-tracker fork breakage, etc.)
I can re-upload my rewrite back on Salsa, if there's interest?

Cheers!
Sylvain