Package: qa.debian.org
Severity: wishlist
We already have all the file checksums in the database. File-level
duplication in the file storage can be removed using hard links, and
the deduplication pass can be run safely offline, i.e., as long as no
debsources update is ongoing.
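
A minimal sketch of what such an offline pass could look like. It is not
taken from the debsources code base: the function names and the shape of
the input (a dict mapping each sha256 to the list of paths that have it,
e.g. built from the checksums table) are assumptions for illustration only.

```python
import hashlib
import os
import stat


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dedup_with_hardlinks(paths_by_sha256):
    """Replace duplicate files with hard links to one canonical copy.

    paths_by_sha256 is a hypothetical input: {sha256: [path, ...]}.
    Returns the number of files that were re-linked.
    Caveat: hard-linked files share one inode, so they also share
    permissions and timestamps afterwards.
    """
    relinked = 0
    for paths in paths_by_sha256.values():
        if len(paths) < 2:
            continue  # nothing to deduplicate for this checksum
        canonical = paths[0]
        canonical_st = os.lstat(canonical)
        if not stat.S_ISREG(canonical_st.st_mode):
            continue  # only regular files are handled in this sketch
        for dup in paths[1:]:
            dup_st = os.lstat(dup)
            if not stat.S_ISREG(dup_st.st_mode):
                continue  # skip symlinks, directories, etc.
            if dup_st.st_dev != canonical_st.st_dev:
                continue  # hard links cannot cross filesystems
            if dup_st.st_ino == canonical_st.st_ino:
                continue  # already linked to the canonical copy
            # Link under a temporary name, then rename over the duplicate,
            # so the path never disappears if the pass is interrupted.
            tmp = dup + ".dedup-tmp"
            os.link(canonical, tmp)
            os.replace(tmp, dup)
            relinked += 1
    return relinked
```

The sketch trusts the checksums it is given; a more cautious pass could
recompute sha256_of() on both files before linking them.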
Micro-benchmark (from my DebConf14 Debsources talk) of the expected disk
space saving:
select count(*) from checksums; -> 35'370'653
select count(distinct sha256) from checksums; -> 15'822'745
--------------------------
=> deduplicated core: ~45% of files (i.e., ~55% of stored files are duplicates)
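
For reference, the arithmetic behind that figure, as a small sketch using
only the two counts above. Note that it counts files, not bytes, so the
actual disk space saving depends on the sizes of the duplicated files,
which these queries do not measure.

```python
# Row counts reported by the two queries above.
total_rows = 35_370_653
distinct_sha256 = 15_822_745

# Fraction of files that would remain after deduplication,
# and fraction that would become hard links.
unique_fraction = distinct_sha256 / total_rows
duplicate_fraction = 1 - unique_fraction

print(f"unique: {unique_fraction:.1%}, duplicates: {duplicate_fraction:.1%}")
# prints: unique: 44.7%, duplicates: 55.3%
```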
Cheers.
--
Stefano Zacchiroli . . . . . . . zack@upsilon.cc . . . . o . . . o . o
Maître de conférences . . . . . http://upsilon.cc/zack . . . o . . . o o
Former Debian Project Leader . . @zack on identi.ca . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »