
Bug#43724: experimental patch for very much faster dpkg -R



Package: policy-manual
Version: not known

Richard Kettlewell writes ("experimental patch for very much faster dpkg -R"):
> Below is a patch against Ian's CVS tree which can massively accelerate
> the `dpkg -iGROEB' call from the disk method of dselect (and
> elsewhere).
> 
> It works by parsing filenames to determine whether it can skip
> particular files, instead of looking inside them.  When most of the
> packages on your system are up to date, it leads to an enormous
> speedup, as instead of opening and parsing thousands of *.deb files it 
> just does a bit of work on the filenames.

I think this is a good patch, and I'd always intended something like
it.  Richard observes ...

> In this version there are two special rules that you must follow:
...
   [ the first rule is uninteresting -iwj ]
> 
>  * Secondly, all version numbers must be unique EXCLUDING THE EPOCH.
>    What this means is that if you have two instances of a package where
>    the version number differs only in the epoch, you will get files
>    ignored when possibly they should not be.
> 
>    I do not know how many packages this affects.  Hopefully it is none
>    at all.  Clearly it is possible for it to be none at all, since we
>    have the Debian-specific revision field.
> 
>    If this patch is adopted then either (1) the rule must become
>    mandatory or (2) all filenames must be modified to include the
>    epoch and this patch adjusted to take advantage of this.
> 
>    Other package management tools would benefit from this too, I
>    imagine.

I think this is a reasonable requirement.  It means that it is
possible, given a mirror which preserves filenames (which is pretty
much essential given the way the other methods use the Packages
information), to determine from a file listing which files to install.
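
A minimal sketch of that idea, not the code from Richard's patch.  It
assumes filenames of the form name_version_arch.deb with the epoch left
out, and it takes the installed versions from a plain dictionary; the
function names and the `installed' mapping are hypothetical.

```python
def strip_epoch(version):
    """Drop a leading numeric 'epoch:' prefix from a version string, if any."""
    epoch, sep, rest = version.partition(":")
    return rest if sep and epoch.isdigit() else version


def parse_deb_filename(filename):
    """Return (name, version) for 'name_version[_arch].deb', else None."""
    if not filename.endswith(".deb"):
        return None
    parts = filename[: -len(".deb")].split("_")
    if len(parts) not in (2, 3):
        return None
    return parts[0], parts[1]


def can_skip(filename, installed):
    """True if the filename alone shows the installed version matches.

    `installed` maps package name -> full installed version (hypothetical
    stand-in for the status database).  Anything unparseable is not
    skipped, so the caller falls back to opening the file.
    """
    parsed = parse_deb_filename(filename)
    if parsed is None:
        return False
    name, version = parsed
    current = installed.get(name)
    return current is not None and strip_epoch(current) == version


installed = {"hello": "1:2.10-2", "foo": "1.0-1"}
print(can_skip("hello_2.10-2_i386.deb", installed))  # same version, skip
print(can_skip("foo_1.0-2_i386.deb", installed))     # different revision
```

Because the comparison ignores the epoch, two versions that differ only
in the epoch look identical here, which is exactly why the uniqueness
rule is needed.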

I don't think we should put the epochs in filenames, because (a) it
could be confusing to users and (b) `:' is a filename metacharacter
in some contexts, notably rcp and friends and most path-list
syntaxes.  (In some of these cases there is no quoting mechanism for
`:', which is bad, but we have to deal with it.)
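
To illustrate point (b) with a small, hypothetical example: a
colon-separated path list cannot carry a filename that itself contains
a colon.  The paths below are made up for the illustration.

```python
def roundtrip_path_list(entries):
    """Join entries with ':' as a path list would, then split them again."""
    joined = ":".join(entries)
    return joined.split(":")


# Hypothetical filenames; the first one embeds an epoch ("1:").
entries = ["/mirror/foo_1:2.0-1_i386.deb", "/mirror/bar_1.0-1_i386.deb"]
recovered = roundtrip_path_list(entries)

# The epoch's colon is indistinguishable from the list separator, so the
# first entry comes back in two pieces.
print(recovered)
print(recovered == entries)
```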

Note that the uniqueness requirement only needs to hold across the
possible skew between the Packages file and the mirrors (in practice, a
few months even with the worst mirrors).  All dpkg needs in order to
skip a file is to be able to tell whether it is the same one that is
named in the Packages file.

I plan to release a version of dpkg which has had this patch applied,
or something very similar.

I therefore propose that the policy manual be amended to require that
package version numbers be unique, even when the epoch is not counted,
within a period of three months.
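
One way such a rule could be checked, as a rough sketch only: given an
upload history, flag versions of the same package that differ only in
the epoch and appear within the window.  The upload records, the
function names and the choice of 90 days to stand for "three months"
are all assumptions made for the illustration.

```python
from datetime import date, timedelta


def strip_epoch(version):
    """Drop a leading numeric 'epoch:' prefix from a version string, if any."""
    epoch, sep, rest = version.partition(":")
    return rest if sep and epoch.isdigit() else version


def find_collisions(uploads, window=timedelta(days=90)):
    """Return (package, older_version, newer_version) for each violation.

    `uploads` is an iterable of (package, full_version, upload_date).
    A violation is two different full versions of one package that are
    equal once the epoch is removed and were uploaded within `window`.
    """
    last_seen = {}  # (package, epoch-less version) -> (full version, date)
    collisions = []
    for package, version, when in sorted(uploads, key=lambda u: u[2]):
        key = (package, strip_epoch(version))
        previous = last_seen.get(key)
        if (previous is not None and previous[0] != version
                and when - previous[1] <= window):
            collisions.append((package, previous[0], version))
        last_seen[key] = (version, when)
    return collisions


# Hypothetical upload history.
uploads = [
    ("foo", "1.0-1", date(1999, 1, 10)),
    ("foo", "1:1.0-1", date(1999, 2, 20)),  # 41 days later: too soon
    ("bar", "2.0-1", date(1998, 1, 1)),
    ("bar", "1:2.0-1", date(1999, 1, 1)),   # a year later: allowed
]
print(find_collisions(uploads))
```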

Ian.

