Hi,

On Monday, 19 May 2008, Raphael Hertzog wrote:
> First of all, they won't land up in lenny. It's too late, dpkg is frozen.

Yeah, I know, but 0002 does not change much code (actually it is not supposed
to change the main code logic at all) yet it improves performance _a lot_. If I
noticed this on my rather fast amd64 box, I wonder how much is saved on slow
arches... With that in mind, maybe an exception can be made...

> As you noted, the win is not that big. And to me the real question is
> "does it make sense to use symbols files" for C++ libraries when:
> - the files are huge and it's difficult to hand-edit since all symbols are
>   mangled

It's quite manageable when C++ libraries are compiled with -fvisibility=hidden
and -fvisibility-inlines-hidden.

> - there are (almost) always arch-specific differences which render files
>   even more difficult to maintain

There are still some differences, e.g. different mangling of size_t among
different arches. Also, major compiler versions choose to emit different
symbols, so supporting both gcc 4.2 and gcc 4.3 is a bit problematic. However,
I'm going to automate handling of those differences in some way. I think it is
worth the effort given that:

1) symbols files make it possible to track when symbols are dropped.
2) bumping shlibs blindly (especially for snapshot packages) can sometimes have
   quite painful effects for everyone.

What I like about symbols files is that the dependency for each package becomes
dynamic: if a package does not use new API, it does not need to depend on the
new version unnecessarily. That's very important for such a long-lived package
as kdelibs5.

> Nevertheless, I'd be okay to implement something like that but you should
> really rework the patch to support multiple compressions schemes. You can
> do that easily by reusing the regex $comp_regex from Dpkg::Compression and
> the objects Dpkg::Source::Compressor and/or Dpkg::Source::CompressedFile.

I have not looked at those classes.
Yeah, I should probably rework the patch then.

> Why would it require less memory? Keeping a cache usually increases the
> memory requirement... or is there a problem with perl's garbage collector?

Well, I really don't know how good the perl GC is, but the current
dpkg-shlibdeps keeps reloading the same symbols files and objdump'ing the same
libraries many times if the package contains quite a number of binaries. Since
those binaries are usually related, their library dependencies are very likely
to be quite similar. Every SymbolFile and Objdump object uses a fair amount of
memory. I don't know the perl specifics or when it runs the GC, but I usually
don't have much confidence in garbage collection.

By the way, do you have any idea what that error code 11 from dpkg-shlibdeps
really means (in the log I linked in the previous mail)?

> The output of find_symbols_file depend not only on $pkg but also on
> $soname and $lib. You can't assume that you can reuse the same symbols
> file simply because a previous call of find_symbols with the same $pkg
> returned something. The key of %dpkg_symfile_cache should really be
> $dpkg_symfile and not $pkg.

Point taken. I chose the key quite poorly. find_symbols_file() does a bit of
repetitive I/O, which I wanted to avoid too (it is probably hardly worth it,
but still...). I'll improve this part.

> Why are you using $pkg and $lib as key for this cache? $lib should be
> enough as there's only one objdump output for a given binary file...

Because that part of the code is enclosed in a
'foreach my $pkg (@{$file2pkg->{$lib}})' loop, which implies that there might
be more than one $pkg for each $lib.

I'll resend a fixed 0002 patch in a few days.

-- 
Modestas Vainius <modestas@vainius.eu>
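
For illustration only, the cache keying discussed above (keying on the symbols
file path rather than on the package name) could look roughly like the sketch
below. It is written in Python rather than the Perl of dpkg-shlibdeps, and
every name in it (SymbolFileCache, fake_loader, the example paths) is a
hypothetical stand-in, not something taken from the patch or from dpkg.

```python
from typing import Callable, Dict


class SymbolFileCache:
    """Cache parsed symbols files by file path, not by package name."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader              # hypothetical parser for one file
        self._by_path: Dict[str, object] = {}
        self.loads = 0                     # how often the loader really ran

    def get(self, symfile_path: str) -> object:
        # Parse the file only the first time this exact path is requested.
        if symfile_path not in self._by_path:
            self._by_path[symfile_path] = self._loader(symfile_path)
            self.loads += 1
        return self._by_path[symfile_path]


def fake_loader(path: str) -> dict:
    # Stand-in for real parsing; it only records which file was "read".
    return {"path": path}


cache = SymbolFileCache(fake_loader)

# Two related binaries resolve to the same symbols file: parsed once.
first = cache.get("/var/lib/dpkg/info/libfoo1.symbols")
second = cache.get("/var/lib/dpkg/info/libfoo1.symbols")

# A different library resolves to a different path: parsed separately.
other = cache.get("/var/lib/dpkg/info/libbar2.symbols")
```

Keying on the path means two lookups share an entry only when they really
resolve to the same file, whatever package, soname or library led there.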