git-style file storage for .deb
I've read a few comments along the lines of the following:
http://209.85.141.104/search?q=cache:LkSwhS5wzn0J:madism.org/~madcoder/tmp/git-nopause.pdf+dpkg+git+repository&hl=en&ct=clnk&cd=12&gl=au&lr=lang_en&client=firefox-a
"GIT storage is very efficient and optimized. Some numbers:
- xorg-xserver.git, goes back to 2000, is 20MB big. The last orig.tar.gz
is 8MB big, more than 84MB unpacked.
- dpkg.git, whole history since April 1996, generates a git pack of
15MB. The last dpkg release is 17MB big unpacked.
- GNU libc version 2.7 weights 115MB unpacked. The full glibc history
(starts in the eighties) generates a GIT pack of 104MB.
Though, this won’t probably be true for packages with a lot of binary
stuff in it, where delta compression is less likely to produce good
results"
I've had the thought a few times that it could make sense to store a
repo's files in a git hierarchy, rather than in a package pool.
That is, raw files, plus package description files that look up the SHA
for each file in the package when the package is installed.
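
To make the idea concrete, here is a minimal sketch of such a store,
assuming a flat directory of objects named by SHA-1 and a per-package
manifest mapping install paths to digests. The names store_blob,
add_package and install_package are made up for illustration and are not
part of dpkg or git. It also hashes the raw bytes, whereas git prefixes a
"blob <size>" header before hashing, so the digests will not match git's
own object IDs.

```python
import hashlib
import os


def store_blob(store_dir, data):
    """Write data into the object store under its SHA-1; return the hex digest."""
    digest = hashlib.sha1(data).hexdigest()
    path = os.path.join(store_dir, digest)
    if not os.path.exists(path):  # identical content is stored only once
        with open(path, "wb") as f:
            f.write(data)
    return digest


def add_package(store_dir, files):
    """Store every file of a package; return a manifest {relative path: digest}."""
    return {path: store_blob(store_dir, data) for path, data in files.items()}


def install_package(store_dir, manifest, root):
    """Materialize a package under root by looking up each digest in the store.

    Manifest paths are assumed to be relative (no leading slash).
    """
    for rel_path, digest in manifest.items():
        target = os.path.join(root, rel_path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(os.path.join(store_dir, digest), "rb") as src:
            with open(target, "wb") as dst:
                dst.write(src.read())
```

Two packages that ship the same file (a licence text, say) end up with
the same digest in their manifests and a single object in the store.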
Points for consideration:
- overlap of identical files (benefit)
  - this can work inter-release and inter-distro
    (debian ubuntu, even * *)
- different low level storage, and transfer protocols (changes)
  - package storage - as git patch perhaps?
  - package download
  - higher level tools may continue to use lower level tools transparently
- similar for package src storage (already underway with git deb stuff
  happening)
- with some extra tools, could provide the ultimate gentoo-envy fix
My primary thought is that repository size might be drastically reduced.
Running md5sum across the files in a repo would show how many of them
are identical, and so test this.
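
A rough way to get such numbers, assuming the packages have already been
unpacked into one directory tree (for example with dpkg-deb -x), is to
hash every file and compare total bytes with bytes of unique content.
This is only a sketch: the function names are hypothetical, and it
measures whole-file duplicates only, not what git's delta compression
would save on near-identical files.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to bound memory use."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def dedup_stats(root):
    """Return (total_bytes, unique_bytes) over all regular files under root."""
    total = 0
    unique = 0
    seen = set()
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            total += size
            digest = file_digest(path)
            if digest not in seen:  # count each distinct content once
                seen.add(digest)
                unique += size
    return total, unique
```

Calling dedup_stats on the unpacked tree and comparing the two numbers
gives the fraction of the pool that is made up of exact duplicates.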
Hope this is not too OT.
Zen
--
Homepage: www.SoulSound.net -- Free Australia: www.UPMART.org
Please respect the confidentiality of this email as sensibly warranted.