
Re: 2 patches for dpkg



Hi,

On Monday, 19 May 2008, Raphael Hertzog wrote:
> First of all, they won't land up in lenny. It's too late, dpkg is frozen.
Yeah, I know, but 0002 does not change much code (in fact it is not supposed
to change the main code logic at all), yet it improves performance _a lot_. If
I noticed the difference on my rather fast amd64 box, I wonder how much time is
saved on slow arches... With that in mind, maybe an exception can be made...

> As you noted, the win is not that big. And to me the real question is
> "does it make sense to use symbols files" for C++ libraries when:
> - the files are huge and it's difficult to hand-edit since all symbols are
>   mangled
It's quite manageable when C++ libraries are compiled with
-fvisibility=hidden and -fvisibility-inlines-hidden.

> - there are (almost) always arch-specific differences which render files
>   even more difficult to maintain
There are still some differences, e.g. size_t is mangled differently on
different arches. Also, major compiler versions emit different symbols, so
supporting both gcc 4.2 and gcc 4.3 is a bit problematic. However, I'm going
to automate the handling of those differences in some way. I think it is worth
the effort given that:

1) symbols files make it possible to track when symbols are dropped.
2) bumping the shlibs blindly (esp. for snapshot packages) can sometimes be
quite painful for everyone. What I like about symbols files is that the
dependency of each package becomes dynamic: if the package does not use the
new API, it does not need to depend on the new version unnecessarily. That's
very important for such a long-lived package as kdelibs5.
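
To illustrate point 2, here is a rough sketch in Python (not dpkg's actual
Perl code). It assumes a made-up table mapping each symbol to the first
library version that provided it, and uses plain integer tuples instead of
real Debian version comparison. All symbol names are hypothetical.

```python
# Hypothetical data: symbol name -> first library version that provided it.
# A real symbols file carries more information than this.
SYMBOL_MIN_VERSION = {
    "foo_init": (4, 0, 0),
    "foo_render": (4, 0, 0),
    "foo_render_async": (4, 1, 0),
}


def minimal_dependency(used_symbols, symbol_versions):
    """Return the lowest library version that provides every used symbol."""
    missing = [s for s in used_symbols if s not in symbol_versions]
    if missing:
        # A symbol absent from the table was dropped or never exported.
        raise KeyError(f"symbols not provided by the library: {missing}")
    # The binary needs the newest of the "first seen" versions it relies on.
    return max((symbol_versions[s] for s in used_symbols), default=None)


old_api_user = minimal_dependency(["foo_init", "foo_render"], SYMBOL_MIN_VERSION)
new_api_user = minimal_dependency(["foo_init", "foo_render_async"], SYMBOL_MIN_VERSION)
print(old_api_user)  # (4, 0, 0)
print(new_api_user)  # (4, 1, 0)
```

A binary that sticks to the old API keeps the old dependency, while only the
binary that calls the newer symbol gets the bumped one.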

> Nevertheless, I'd be okay to implement something like that but you should
> really rework the patch to support multiple compression schemes. You can
> do that easily by reusing the regex $comp_regex from Dpkg::Compression and
> the objects Dpkg::Source::Compressor and/or Dpkg::Source::CompressedFile.
I have not looked at those classes. Yeah, I should probably rework the patch 
then.

> Why would it require less memory? Keeping a cache usually increases the
> memory requirement... or is there a problem with perl's garbage collector?
Well, I really don't know how good Perl's GC is, but the current
dpkg-shlibdeps keeps reloading the same symbols files and objdump'ing the same
libraries many times if the package contains quite a number of binaries. Since
those binaries are usually related, their library dependencies are very likely
to be quite similar. Every SymbolFile and Objdump object uses a fair amount of
memory. I don't know the Perl specifics or when it runs the GC, but I usually
don't have much confidence in garbage collection.

By the way, do you have any idea what that error code 11 from dpkg-shlibdeps 
really means (in the log I linked in the previous mail)?

> The output of find_symbols_file depends not only on $pkg but also on
> $soname and $lib. You can't assume that you can reuse the same symbols
> file simply because a previous call of find_symbols with the same $pkg
> returned something. The key of %dpkg_symfile_cache should really be
> $dpkg_symfile and not $pkg.
Point taken. I chose the key quite poorly. find_symbols_file() does a bit of 
repetitive I/O, which I wanted to avoid too (it is hardly worth it probably, 
but still...). I'll improve this part.
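
Roughly what I understand the suggestion to be, sketched in Python rather
than the Perl of the patch. The lookup table, the loader and all package,
soname and path names are made up for illustration; only the idea of keying
the cache on the symbols file path instead of the package comes from the
discussion above.

```python
# Hypothetical stand-in for find_symbols_file(): the chosen file depends on
# the soname as well as the package. All names and paths here are made up.
SYMBOLS_FILES = {
    ("libfoo1", "libfoo.so.1"): "debian/libfoo1/DEBIAN/symbols",
    ("libfoo1", "libfoo-extra.so.1"): "/var/lib/dpkg/info/libfoo1.symbols",
}


def find_symbols_file(pkg, soname):
    return SYMBOLS_FILES[(pkg, soname)]


class SymbolsFileCache:
    """Cache parsed symbols files, keyed by file path rather than package."""

    def __init__(self, loader):
        self._loader = loader  # callable that parses one file
        self._parsed = {}      # path -> parsed result
        self.loads = 0         # how many times the loader actually ran

    def get(self, path):
        if path not in self._parsed:
            self._parsed[path] = self._loader(path)
            self.loads += 1
        return self._parsed[path]


def fake_loader(path):
    # Placeholder for the expensive parse step.
    return {"source": path}


cache = SymbolsFileCache(fake_loader)
first = cache.get(find_symbols_file("libfoo1", "libfoo.so.1"))
again = cache.get(find_symbols_file("libfoo1", "libfoo.so.1"))
other = cache.get(find_symbols_file("libfoo1", "libfoo-extra.so.1"))
print(cache.loads)  # 2: one parse per distinct file, even though $pkg repeats
```

With the package as the key, the third call would wrongly have returned the
first file's data.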

> Why are you using $pkg and $lib as key for this cache? $lib should be
> enough as there's only one objdump output for a given binary file...
Because that part of the code is enclosed in a 'foreach my $pkg
(@{$file2pkg->{$lib}})' loop, which implies that there might be more than one
$pkg for each $lib.

I'll resend a fixed 0002 patch in a few days.

-- 
Modestas Vainius <modestas@vainius.eu>
