
git-style file storage for .deb



I've read a few comments along the lines of the following:
http://209.85.141.104/search?q=cache:LkSwhS5wzn0J:madism.org/~madcoder/tmp/git-nopause.pdf+dpkg+git+repository&hl=en&ct=clnk&cd=12&gl=au&lr=lang_en&client=firefox-a
"GIT storage is very efficient and optimized. Some numbers:
- xorg-xserver.git, goes back to 2000, is 20MB big. The last orig.tar.gz
  is 8MB big, more than 84MB unpacked.
- dpkg.git, whole history since April 1996, generates a git pack of
  15MB. The last dpkg release is 17MB big unpacked.
- GNU libc version 2.7 weights 115MB unpacked. The full glibc history
  (starts in the eighties) generates a GIT pack of 104MB.
Though, this won’t probably be true for packages with a lot of binary
stuff in it, where delta compression is less likely to produce good
results"

I've had the thought a few times that it could make sense to store a
repo's files in a git hierarchy, rather than in a package pool.

That is, the repository would hold the raw files, and each package would
have a description file listing the SHA of every file it contains. Those
SHAs would be looked up when the package is installed.
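
A rough sketch of what that could look like, in Python. Everything here
is an assumption made for illustration: the function names, the flat
store directory, and the manifest being a plain dict of install path to
digest. It also hashes the raw content with SHA-1, whereas git itself
hashes a "blob" header plus the content, so these digests would not match
real git object names.

```python
import hashlib
import os


def store_blob(store_dir, data):
    """Store bytes under their SHA-1 hex digest; identical content is kept once."""
    digest = hashlib.sha1(data).hexdigest()
    path = os.path.join(store_dir, digest)
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(data)
    return digest


def build_manifest(store_dir, files):
    """Hypothetical package description: install path -> digest of the content."""
    return {install_path: store_blob(store_dir, data)
            for install_path, data in files.items()}


def install(store_dir, manifest, root):
    """Look up each digest in the store and write the file out under root."""
    for install_path, digest in manifest.items():
        target = os.path.join(root, install_path.lstrip("/"))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(os.path.join(store_dir, digest), "rb") as src:
            with open(target, "wb") as dst:
                dst.write(src.read())
```

Two packages that ship the same file (a licence text, say) would end up
pointing at the same object in the store, which is where any saving
would come from.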

Points for consideration:
- overlap of identical files (benefit)
  - this can work across releases and across distros
    (Debian, Ubuntu, even * *)
- different low-level storage and transfer protocols (changes)
  - package storage - as a git patch, perhaps?
  - package download
- higher-level tools may continue to use lower-level tools transparently
- similar for package source storage (already underway with the git deb
  work that is happening)
- with some extra tools, could provide the ultimate Gentoo-envy fix

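My primary thought is that repository size might be drastically reduced,
since a file shared by several packages would only be stored once.
Perhaps md5sum could be run over the unpacked contents of an existing
pool to count how many bytes are duplicates and so test this.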
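
One way such a measurement might be done, again as a Python sketch rather
than a finished tool. It assumes the packages have already been unpacked
into a directory tree (for example with dpkg-deb -x, one subdirectory per
package); the names file_md5 and dedup_stats are made up here. It only
counts whole-file duplicates, so it says nothing about what git's delta
compression might save on top of that.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """MD5 of a file's content, read in chunks so large files are fine."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedup_stats(top):
    """Return (total_bytes, unique_bytes) over regular files under top."""
    size_by_digest = {}
    total = 0
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            total += size
            size_by_digest[file_md5(path)] = size
    return total, sum(size_by_digest.values())
```

Calling dedup_stats on the unpacked tree gives the two totals; the
difference between them is the space that storing each distinct file
once would save, before any compression.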

Hope this is not too OT.

Zen

-- 
Homepage: www.SoulSound.net -- Free Australia: www.UPMART.org
Please respect the confidentiality of this email as sensibly warranted.

