Re: "dselect" replacement team

On Fri, 11 Apr 1997, Tom Lees wrote:

> On Fri, 11 Apr 1997, Jason Gunthorpe wrote:
> > > Sure, a binary format would be faster.  But I don't think that it would
> > > be so fast that we should lose our ascii files.  I believe the only
> > > file in debian which is not plain ascii is the .deb.
> > 
> > That was why I was doing those timings; reading a text file isn't exactly
> > deadly slow. 
> But searching a binary-sorted tree is always guaranteed to be faster. And
> considering how fast the number of packages has been growing, it may well
> become a problem in the not-too-distant future.

The only important factors are the time to read the entire text file and
the memory usage of the resulting internal structure. Binary searches are
performed on in-memory copies, which will naturally be optimized for
whatever operations are most frequently performed on them.
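
As a rough illustration of that point (a sketch only, not code from dpkg or from the proposed library): the text file is parsed once into memory, sorted by package name, and lookups are then binary searches on the in-memory copy. The sample data, version numbers, and the names `parse_stanzas`, `build_index` and `find_package` are all made up for the example, and the format is simplified to `Field: value` lines separated by blank lines, with no continuation lines.

```python
import bisect

# Made-up sample data in a simplified status-file layout.
SAMPLE = """\
Package: dpkg
Version: 1.4.0
Status: install ok installed

Package: bash
Version: 2.0
Status: install ok installed

Package: slang
Version: 0.99
Status: install ok installed
"""


def parse_stanzas(text):
    """Parse blank-line-separated 'Field: value' stanzas into dicts."""
    packages = []
    for block in text.split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields:
            packages.append(fields)
    return packages


def build_index(packages):
    """Sort once by package name so later lookups can use binary search."""
    ordered = sorted(packages, key=lambda p: p["Package"])
    names = [p["Package"] for p in ordered]
    return names, ordered


def find_package(names, packages, name):
    """Binary search on the in-memory copy; returns None if absent."""
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return packages[i]
    return None


names, packages = build_index(parse_stanzas(SAMPLE))
```

The on-disk order of the text file does not matter here; the sort happens once after parsing, and `find_package(names, packages, "dpkg")` then runs against the sorted in-memory list.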

Storage on disk only becomes a concern when the overhead of a text file
starts to get sickening (gzip helps with this) or when the parse time is
very high, which I doubt we will see. I was doing read tests on 6M
package files and it was only up to about 4 seconds on my 486.
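
A read test of this kind could be sketched roughly as follows. This is not the test that produced the figure above; it writes a made-up gzip-compressed package list to a temporary file and times one full pass over it. The file name, the stanza contents, and the helper names `write_sample` and `timed_read` are assumptions for the example.

```python
import gzip
import os
import tempfile
import time


def write_sample(path, count):
    """Write `count` made-up package stanzas to a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as out:
        for i in range(count):
            out.write(f"Package: pkg{i}\nVersion: 1.0-{i}\n\n")


def timed_read(path):
    """Return (stanza_count, seconds) for one full pass over the file."""
    start = time.perf_counter()
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as src:
        for line in src:
            if line.startswith("Package:"):
                count += 1
    return count, time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Packages.gz")
    write_sample(path, 5000)
    stanza_count, seconds = timed_read(path)

print(f"read {stanza_count} stanzas in {seconds:.3f}s")
```

The time reported depends entirely on the machine and the file size, so it only makes sense to compare numbers taken on the same hardware.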

> > > All other files like debian/rules, debian/control, configuration files
> > > etc. etc. are plain ascii files.  I have a very strong feeling for 
> > > keeping them.
> > 
> > Of course, there are too many reasons to stick with simple parsed ascii.
> Most of which are probably "tradition", or "compatibility", which we
> should eventually be able to eliminate.

Even if those are bad reasons, there are a lot of good reasons for having
a text control file -- the biggest being that it is totally and completely
extensible: anyone can add any field and any program can still use the
file. Of course this can be done with binary, but since we humans generate
the control file we'd just end up with a binary pre-parser.
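
To make the extensibility point concrete, here is a small sketch. The field `X-Build-Host`, the package data, and the function names are invented for the example; the format is again simplified to single-line `Field: value` pairs. A reader that only knows about `Depends` keeps working when someone else adds a field, and a writer can pass unknown fields through untouched.

```python
def parse_control(text):
    """Parse one 'Field: value' stanza, keeping every field in file order."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def format_control(fields):
    """Write the stanza back out, including fields this program never used."""
    return "".join(f"{key}: {value}\n" for key, value in fields.items())


def dependencies(fields):
    """A consumer that knows only about 'Depends' and ignores the rest."""
    raw = fields.get("Depends", "")
    return [dep.strip() for dep in raw.split(",") if dep.strip()]


# Made-up control data with one field the consumer has never heard of.
CONTROL = """\
Package: hello
Version: 1.3-1
Depends: libc5, ncurses
X-Build-Host: example
"""

fields = parse_control(CONTROL)
```

Here `dependencies(fields)` never looks at `X-Build-Host`, and `format_control(fields)` writes it back unchanged, which is the property a fixed binary record layout would not give for free.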

> > Exactly! That's what I call a cache file.
> OTOH, it may be better to eventually aim to support the text-file through
> a separate utility, which can extract and re-build the database. dpkg-ftp,
> and others would have to be updated to use the database first though.

Yes, that is a good idea, but like I said, let's see how it performs before
we make any judgements. In any event we cannot even think of removing the
status file while dpkg is still around, so until it uses our new library
we can't touch that stuff.

> But agreed, a cache-file is a good idea, since many more programs want
> read-access to the database than write. (Only dpkg and dselect want to
> write, as far as I am aware).
> Hmmm, I just had a thought. We _NEED_ some kind of perl interface to the
> libdpkg, for dpkg-ftp, etc. Arggh.

We can redo dpkg-ftp in C++ and make it a lot simpler and better. Right now
it's just sickeningly slow; if it were integrated with dselect it would
already know what packages it needs (90% of the time). We could also then
add a fancy slang GUI for it, with a little progress bar etc.

I think starting with built-in formats, and having one of the built-ins be
the old perl interface, is a good idea though.

