[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: 2 patches for dpkg



Hi,

On Fri, 05 Dec 2008, Modestas Vainius wrote:
> All 3 issues resolved. Patch attached (against lenny branch, applies to master 
> without problems). I have been using this patched dpkg quite heavily lately, 
> no problems noticed.

Cool, thanks.

> This patch optimizes dpkg-shlibdeps by caching parsed symbols files and
> objdump objects. This way neither of the libraries or symbols files are
> parsed more than once. This patch significantly improves performance of
> dpkg-shlibdeps bringing it near to performance levels of << 1.14.8
> dpkg-shlibdeps without loosing any of new functionally at all. Memory
> requirements are reduced too.

Next time, please try to follow the recommendations described on the wiki for
writing log messages:
http://wiki.debian.org/Teams/Dpkg/GitUsage

> +sub merge_from_symfile {
> +    my ($self, $src) = @_;
> +    while (($soname, $srcobj) = each(%{$src->{objects}})) {
> +	if (exists $self->{objects}{$soname}) {
> +	    # Update/override infos only
> +	    $self->{objects}{$soname}{deps} = $srcobj->{deps};

I don't think it makes sense to update the dependency if we already have
the object in the symfile. If the library is described in 2 symbols files,
either it's wrong or one of the two should take precedence, using the
symbol list of one and the dependencies of the second seems wrong.

I don't think that we will encounter such cases in practice, hence I would
suggest to just display a warning stating that we have two conflicting
objects for the same library. Or it means that we're trying to merge
several times the same symfile.

> +	} else {
> +	    # Shallow copy the soname object (because deps can be replaced later)
> +	    my %obj;
> +	    while (($key, $val) = each(%$srcobj)) {
> +		$obj{$key} = $val;
> +	    }
> +	    $self->{objects}{$soname} = \%obj;
> +	}
> +    }

Taking into account the previous comment, this "shallow copy" is probably
not needed. 

Otherwise, if it's a matter of API-cleanliness, then we should rather do a
deep copy to ensure we don't share anything with the merged object. Using
dclone() of Storable would be appopriate in that case. I'm not sure if it
represents a big run-time cost or not.

> -		if (defined $dpkg_symfile) {
> -		    # Load symbol information
> -		    print "Using symbols file $dpkg_symfile for $soname\n" if $debug;
> -		    $symfile->load($dpkg_symfile);
> +		if (defined($dpkg_symfile)) {
> +		    if (exists $dpkg_symfile_cache{$dpkg_symfile}) {
> +			print "Using symbols file $dpkg_symfile (cached) for $soname\n" if $debug;
> +		    } else {
> +			# Load symbol information
> +			print "Using symbols file $dpkg_symfile for $soname\n" if $debug;
> +			$dpkg_symfile_cache{$dpkg_symfile} = new Dpkg::Shlibs::SymbolFile();
> +			$dpkg_symfile_cache{$dpkg_symfile}->load($dpkg_symfile);
> +		    }
> +		    $symfile->merge_from_symfile($dpkg_symfile_cache{$dpkg_symfile});

There's a possibility we're going to merge the same file multiple times
but it's not big deal (except if we display an ugly warning). To avoid
doing too much we could however restrict the merge to only what is of
interest to us: the infos describing the library of soname $soname.

>  		}
>  	    }
>  	    if (defined($dpkg_symfile) && $symfile->has_object($soname)) {
> @@ -214,13 +225,25 @@ foreach my $file (keys %exec) {
>  		}
>  	    } else {
>  		# No symbol file found, fall back to standard shlibs
> -		my $id = $dumplibs_wo_symfile->parse($lib);
> +		my $id;
> +		my $libobj;
> +		if (exists $dpkg_objdump_cache{$lib}) {
> +		    $libobj = $dpkg_objdump_cache{$lib};
> +		    # We don't want to process the same lib more than once (redundant)
> +		    next if ($dumplibs_wo_symfile->get_object($libobj->get_id()));

Looks like an has_object() would be welcome for consistency with the
SymbolFile object.


> +		    $id = $dumplibs_wo_symfile->add_object($dpkg_objdump_cache{$lib});

$dpkg_objdump_cache{$lib} => $libobj

>  sub symfile_has_soname {
>      my ($file, $soname) = @_;
> +
> +    return 1 if (exists $symfile_has_soname_cache{$file}{$soname});
> +
>      open(SYM_FILE, "<", $file) ||
>          syserr(_g("cannot open file %s"), $file);
>      my $result = 0;
>      while (<SYM_FILE>) {
>  	if (/^\Q$soname\E /) {
>  	    $result = 1;
> +	    $symfile_has_soname_cache{$file}{$soname} = 1;
>  	    last;
>  	}
>      }

Why are you not caching a negative result as well? it's not more
complicated…

Otherwise it's fine and I will happily apply it to the master branch. Tell
me if you plan to submit an updated patch or if I should finish the work
myself.

Cheers,
-- 
Raphaël Hertzog

Le best-seller français mis à jour pour Debian Etch :
http://www.ouaza.com/livre/admin-debian/


Reply to: