Summary of C++ symbols experience (was: Do symbols make sense for C++)



Okay, I've spent parts of another couple of days working on this using the
pkg-kde-tools infrastructure, and I think I can draw some more
conclusions.

First of all, for those who haven't explored the pkg-kde-tools
infrastructure for this, it looks like the effective process goes
something like this:

1. Do a local build of the package and then generate an initial symbols
   template for that architecture.  So, for example, I did:

       fakeroot debian/rules build install
       pkgkde-gensymbols -plibxml-security-c16 -v1.6.1 -Osymbols.i386 \
           -edebian/libxml-security-c16/usr/lib/*/libxml-security-c.so.*.*
       pkgkde-symbolshelper create -o debian/libxml-security-c16.symbols \
           -v 1.6.1 symbols.i386

   This generates an initial symbols file that will work for i386.

2. Build and upload this version of the package.  Now wait for all the
   buildds to fail (because they will, on nearly every architecture other
   than your local one).

3. Download all of the build logs and feed them back into
   pkgkde-symbolshelper so that it can adjust your symbols template for
   all of the architecture variations.  So, for example, I did:

       pkgkde-getbuildlogs
       pkgkde-symbolshelper batchpatch -v 1.6.1 *_unstable_logs/*.build

   This will generate rather nice annotated symbols files with appropriate
   arch lists and using subst tags where appropriate (such as when an
   argument to a function is a size_t).

4. Modify your package to use pkg-kde-tools during the build, because
   subst tags are a really nice way to handle a lot of C++ symbol patterns
   but #533916 against dpkg-dev is wontfix, so you need the pkg-kde-tools
   replacement for dpkg-gensymbols.  I added a dependency on pkg-kde-tools
   and then added the (undocumented) pkgkde_symbolshelper add-on to my dh
   --with flag.

5. Build and upload the package again.  Now it will hopefully build on all
   architectures.

This is, as you might expect, a fair bit of work, but it *does* work, and
what comes out the other end is a symbols file that works across the
current buildd architectures and detects any mistakes by upstream that
change the symbols exported.
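
To make "detects any mistakes" concrete, here is a minimal sketch of the
kind of comparison a symbols check boils down to: the set of symbols
recorded in the symbols file versus the set the freshly built library
actually exports.  This is not how dpkg-gensymbols or pkg-kde-tools is
implemented; the function name and the mangled symbol names are made up
for illustration.

```python
def compare_symbols(recorded, exported):
    """Return (missing, new) as sorted lists of symbol names.

    missing: recorded in the symbols file but no longer exported
    new:     exported by the library but not yet recorded
    """
    recorded, exported = set(recorded), set(exported)
    return sorted(recorded - exported), sorted(exported - recorded)


# Hypothetical mangled names, for illustration only.
recorded = ["_ZN3Foo3barEv", "_ZN3Foo3bazEi"]
exported = ["_ZN3Foo3barEv", "_ZN3Foo3quxEv"]

missing, new = compare_symbols(recorded, exported)
print("missing:", missing)   # symbols that disappeared (likely ABI break)
print("new:", new)           # symbols that should be added to the file
```

A disappeared symbol is the case that should stop the build, since it
usually means an ABI break without a SONAME change.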

However, it does have some problems, some obvious, and some more subtle.
Here's a list of the issues that I see:

1. It feels like this symbols file is still likely to be fragile.  While
   pkg-kde-tools detects a *lot* of template instances and marks them
   optional, I was still seeing a lot of "leaked" symbols from other C++
   libraries that my package build-depends on, and which I suspect may
   also disappear.  There's also the problem of optimizers eliminating
   some things that are inlined; for example, that appears to be the
   variation that my package sees on arm.  And for a substantial C++
   library, the symbols file is HUGE.  Over 12,000 lines for one of my
   libraries.  It's kind of scary to think of how many fragility landmines
   are buried under there.

2. By the nature of this process, it's very sensitive to the set of
   currently supported architectures.  I expect any package with a symbols
   template built this way stands a good chance of FTBFS on a new
   architecture.  For example, my symbols file has a lot of !armel !armhf
   patterns.  Nothing about those symbols appears to be unique to arm; the
   C++ compiler just behaves slightly differently there due to inlining,
   and those happen to be the only two architectures in the current set
   with that behavior.

3. The lack of dpkg-gensymbols support for subst is very unfortunate,
   since it's used by the pkg-kde-tools utilities and it's the Right Thing
   To Do for symbols that take, for example, size_t as an argument.  I
   would strongly encourage the dpkg maintainers to accept the work in
   #533916, except maybe for the vt work.  The alternatives are both
   worse.  One is to generate regexes, which are significantly worse at
   catching mistakes (since they're going to accept either unsigned int
   or unsigned long, rather than only whichever one corresponds to size_t
   on that architecture, for example).  The other is to embed a lot of
   specific architecture restrictions listing, for instance, all 64-bit
   architectures, which guarantees FTBFS on every new architecture.

4. It's a little hard on the infrastructure, since adding a symbols file
   requires uploading a package that's guaranteed to fail to build almost
   everywhere just so that you can get all the build logs to feed into the
   machinery.  I always feel guilty about doing that.

5. It's still not clear that the benefit is worth the amount of effort,
   since I expect most C++ libraries to require frequent SONAME changes
   anyway, which means that the long-term binary compatibility angle of
   symbols is probably futile.  Mostly, all of this is to give you a tool
   to do ABI checking, which will catch some mistakes but not all of
   them.  And I'm guessing it's going to be substantially more work than
   maintaining symbols for a C library if one actually does it properly
   and checks every missing symbol and change in new builds.
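
To illustrate the point in item 3 about why a subst tag catches more than
a regex, here is a rough sketch.  It is not the pkg-kde-tools
implementation.  The architecture table is a small illustrative subset,
the class Buf and its mangled names are hypothetical, and the only thing
taken from the Itanium C++ ABI is that "j" mangles unsigned int and "m"
mangles unsigned long.

```python
import re

# Assumed mangling code for size_t per architecture (illustrative subset).
SIZE_T_CODE = {"i386": "j", "armhf": "j", "amd64": "m"}


def expand_subst(template, arch):
    """Replace the {size_t} placeholder with the code for one architecture."""
    return template.replace("{size_t}", SIZE_T_CODE[arch])


def subst_matches(template, arch, symbol):
    """A subst-style entry matches only the exact expansion for arch."""
    return expand_subst(template, arch) == symbol


def regex_matches(pattern, symbol):
    """A regex-style entry matches anything the pattern allows."""
    return re.fullmatch(pattern, symbol) is not None


TEMPLATE = "_ZN3Buf6resizeE{size_t}"   # hypothetical Buf::resize(size_t)
PATTERN = r"_ZN3Buf6resizeE[jm]"       # regex accepting either width

# Suppose upstream changes the argument to unsigned long by mistake, so an
# i386 build now exports the "m" variant instead of the "j" one.
wrong_on_i386 = "_ZN3Buf6resizeEm"

print(regex_matches(PATTERN, wrong_on_i386))           # True: slips through
print(subst_matches(TEMPLATE, "i386", wrong_on_i386))  # False: caught
```

The regex has to allow both spellings on every architecture, so it cannot
tell a legitimate 64-bit symbol from a mistaken one on a 32-bit build.
The subst form expands to exactly one symbol per architecture and needs
no per-architecture list in the symbols file.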

I'm going to try this for the Shibboleth packages for a while since I want
to see what it's like across a new upstream release, but I'm still very
much not convinced that this is a good use of a packager's time.

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>
