
Re: ITP: auto-apt (automatic apt-get)

At Fri, 7 Jul 2000 11:59:34 -0700,
Joey Hess <joeyh@debian.org> wrote:
> Fumitoshi UKAI wrote:
> > I'm developing a tool to run apt-get on demand.
> As I said, very neat. The contents database could use a lot of work
> though. The file compaction you're doing is, frankly, disgusting.
>     {"boot/",           "!/"},
>     {"bin/",            "b/"},
>     {"sbin/",           "s/"},
>     {"usr/bin/",        "B/"},
>     {"usr/sbin/",       "S/"},
>     {"etc/init.d/",     "ei/"},
>     {"etc/texmf/",      "et/"},
> Is this just done to save space? There are better ways. It's also (as you
> know) incredibly slow, and needs a lot of optimization.

Yes, without these tricks the database becomes too huge (>= 40MB).
Generating the database is very slow, but lookups are not so slow.
Of course, it needs more optimization.

> What I think you should do is talk to Jason Gunthorpe. I think Jason has been
> working on Contents file stuff for apt already, and has some stuff that
> can build a binary db in a few seconds (or at least in a minute -- not half
> an hour). Clearly, if apt has support for loading contents files or some
> equivalent structure, it will be useful for auto-apt and lots of other things
> too. I think Jason had some ideas about efficient ways to pack Contents data
> into a db file.

Oh, I didn't know that.  I'd like to use apt's db file.

At Fri, 7 Jul 2000 12:34:45 -0700,
Joey Hess <joeyh@debian.org> wrote:

> Aha, see
> http://cvs.debian.org/apt/ftparchive/Attic/contents.cc.diff?r1=1.1&r2=

Thanks, I'll look into this code and integrate it into a future version of auto-apt.

> +   The GenContents class is a back end for an archive contents generator. 
> +   It takes a list of per-deb file names and merges it into a memory 
> +   database of all previous output. This database is stored as a set
> +   of binary trees linked across directories to form a tree of all files+dirs
> +   given to it. The tree will also be sorted as it is built up thus 
> +   removing the massive sort time overhead.
> +   
> +   By breaking all the pathnames into components and storing them 
> +   separately a space saving is realized by not duplicating the string
> +   over and over again. Ultimately this saving is sacrificed to storage of
> +   the tree structure itself but the tree structure yields a speed gain
> +   in the sorting and processing. Ultimately it takes about 5 seconds to
> +   do 141000 nodes and about 5 meg of ram.

Wow, great.  How fast are lookups in this database?

Fumitoshi UKAI
