
Re: ITP: auto-apt (automatic apt-get)

At Fri, 7 Jul 2000 11:59:34 -0700,
Joey Hess <joeyh@debian.org> wrote:
> Fumitoshi UKAI wrote:
> > I'm developing a tool to run apt-get on demand.
> As I said, very neat. The contents database could use a lot of work
> though. The file compaction you're doing is, frankly, disgusting.
>     {"boot/",           "!/"},
>     {"bin/",            "b/"},
>     {"sbin/",           "s/"},
>     {"usr/bin/",        "B/"},
>     {"usr/sbin/",       "S/"},
>     {"etc/init.d/",     "ei/"},
>     {"etc/texmf/",      "et/"},
> Is this just done to save space? There are better ways. It's also (as you
> know) incredibly slow, and needs a lot of optimization.

Yes, without these tricks the database becomes too huge (>= 40MB).
Generating the database is very slow, but lookups are not so slow.
Of course, it needs more optimization.

> What I think you should do is talk to Jason Gunthorpe. I think Jason has been
> working on Contents file stuff for apt already, and has some stuff that
> can build a binary db in a few seconds (or at least in a minute -- not half
> an hour). Clearly, if apt has support for loading contents files or some
> equivalent structure, it will be useful for auto-apt and lots of other things
> too. I think Jason had some ideas about efficient ways to pack Contents data
> into a db file.

Oh, I didn't know that.  I'd like to use apt's db file.

At Fri, 7 Jul 2000 12:34:45 -0700,
Joey Hess <joeyh@debian.org> wrote:

> Aha, see
> http://cvs.debian.org/apt/ftparchive/Attic/contents.cc.diff?r1=1.1&r2=

Thanks, I'll look into this code and integrate it into a future version of auto-apt.

> +   The GenContents class is a back end for an archive contents generator. 
> +   It takes a list of per-deb file names and merges it into a memory 
> +   database of all previous output. This database is stored as a set
> +   of binary trees linked across directories to form a tree of all files+dirs
> +   given to it. The tree will also be sorted as it is built up thus 
> +   removing the massive sort time overhead.
> +   
> +   By breaking all the pathnames into components and storing them 
> +   separately a space saving is realized by not duplicating the string
> +   over and over again. Ultimately this saving is sacrificed to storage of
> +   the tree structure itself but the tree structure yields a speed gain
> +   in the sorting and processing. Ultimately it takes about 5 seconds to
> +   do 141000 nodes and about 5 meg of ram.

Wow, great.  How fast are lookups in this database?

Fumitoshi UKAI
