Bug#908678: Update on the security-tracker git discussion
Zobel brought up the security-tracker git discussion in the 
#debian-security IRC channel again and I'd like to record a few of the 
items touched on there for others who were not present:
DLange has been running a mirror of the git repo with split files for 
three months. This is based on anarcat's scripts published previously in 
this bug. The rewriting mirror repo works flawlessly. All history is 
retained, sans GPG commit signatures.
Corsac noted that "redoing the tooling is a pain", and anarcat and DLange 
reiterated that we are willing to help fix the tools. But we need a 
commitment from the security team that the migration to a split-file 
repo is wanted. And we need a prioritized list of tools that need to be 
split-files enabled.
The discussion reiterated that "moving elsewhere" doesn't really fix the 
underlying git-usage issue. So while this would take load off salsa, it 
would not improve clone times and would hamper collaboration with Debian 
people outside the security team.
Still - to gain some data - DLange tried to push the security-tracker 
repo to GitHub. This bails out as the history contains a file > 100MB 
(hard limit for GitHub):
remote: error: GH001: Large files detected. You may want to try Git 
Large File Storage - https://git-lfs.github.com.
[..]
remote: error: File data/CVE/allitems.html is 111.44 MB; this exceeds 
GitHub's file size limit of 100.00 MB
So we would have to re-write history for pushing to GitHub. Commits from 
2017-12-29 that introduce "data/CVE/allitems.html" and drop it again 
would need to be modified. Technically all commits after these have to 
be re-written as well. I have not tested whether GitHub supports 
refs/replace substitutes which would be a work-around.
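
To see which blobs in the history would trip such a limit before 
pushing, something like the following could be used. This is only a 
rough sketch: the function names are made up for illustration, and the 
threshold assumes GitHub's "100.00 MB" means 100 * 1024 * 1024 bytes. 
It relies on "git rev-list --objects --all" piped into 
"git cat-file --batch-check".

```python
import subprocess

# Assumption: GitHub's "100.00 MB" limit is counted as 100 * 1024 * 1024 bytes.
GITHUB_LIMIT = 100 * 1024 * 1024

# Output format requested from git: "<type> <sha> <size> <path>"
BATCH_FORMAT = "%(objecttype) %(objectname) %(objectsize) %(rest)"


def oversized_blobs(batch_lines, limit=GITHUB_LIMIT):
    """Return (size, path) for every blob larger than limit, largest first."""
    hits = []
    for line in batch_lines:
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees and malformed lines
        size = int(parts[2])
        path = parts[3] if len(parts) > 3 else ""
        if size > limit:
            hits.append((size, path))
    return sorted(hits, reverse=True)


def scan_repo(repo_dir):
    """List every object reachable from any ref and report oversized blobs."""
    objects = subprocess.run(
        ["git", "rev-list", "--objects", "--all"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    checked = subprocess.run(
        ["git", "cat-file", "--batch-check=" + BATCH_FORMAT],
        cwd=repo_dir, input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return oversized_blobs(checked.splitlines())
```

Calling scan_repo() on a local clone should list the paths that a 
history rewrite would have to deal with; it does not say which commits 
touch them.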
As can be seen on Salsa, and per 
https://gitlab.com/gitlab-com/support-forum/issues/230, GitLab does not 
enforce per-file size limits.
But the pain of hosting and using this repo is not really different on 
any GitLab instance.
So that means self-hosting of a non-split-file repo would probably have 
to be on a security DSA machine or similar.
Again, as said above, discussion participants outside the security team 
would prefer a commitment to split the offending data/CVE/list file into 
annual chunks, enable the tooling and stay on salsa.
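
For illustration, splitting into annual chunks could look roughly like 
the sketch below. This is not anarcat's script. It assumes that every 
entry in data/CVE/list starts with an unindented "CVE-YYYY-..." header 
line and that the lines following it belong to that entry; the function 
name is made up.

```python
import re
from collections import defaultdict

# Assumption: an entry starts with an unindented "CVE-<year>-..." header line.
HEADER = re.compile(r"^CVE-(\d{4})-")


def split_by_year(lines):
    """Group the lines of a CVE list by the year in each entry's header.

    Lines that follow a header belong to the same entry and so to the same
    year. Lines seen before the first header are collected under None.
    """
    chunks = defaultdict(list)
    year = None
    for line in lines:
        match = HEADER.match(line)
        if match:
            year = match.group(1)
        chunks[year].append(line)
    return dict(chunks)
```

Each value can then be written to its own per-year file. The order of 
entries within a year is kept as found in the input.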