Summary of C++ symbols experience (was: Do symbols make sense for C++)
Okay, I've spent parts of another couple of days working on this using the
pkg-kde-tools infrastructure, and I think I can draw some more conclusions.
First of all, for those who haven't explored the pkg-kde-tools
infrastructure for this, it looks like the effective process goes
something like this:
1. Do a local build of the package and then generate an initial symbols
template for that architecture. So, for example, I did:
    fakeroot debian/rules build install
    pkgkde-gensymbols -plibxml-security-c16 -v1.6.1 -Osymbols.i386 \
    pkgkde-symbolshelper create -o debian/libxml-security-c16.symbols \
        -v 1.6.1 symbols.i386
This generates an initial symbols file that will work for i386.
2. Build and upload this version of the package. Now wait for all the
buildds to fail (because they will, on probably nearly every
architecture other than your local one).
3. Download all of the build logs and feed them back into
pkgkde-symbolshelper so that it can adjust your symbols template for
all of the architecture variations. So, for example, I did:
    pkgkde-symbolshelper batchpatch -v 1.6.1 *_unstable_logs/*.build
This will generate rather nice annotated symbols files with appropriate
arch lists and using subst tags where appropriate (such as when an
argument to a function is a size_t).
4. Modify your package to use pkg-kde-tools during the build, because
subst tags are a really nice way to handle a lot of C++ symbol patterns
but #533916 against dpkg-dev is wontfix, so you need the pkg-kde-tools
replacement for dpkg-gensymbols. I added a dependency on pkg-kde-tools
and then added the (undocumented) pkgkde_symbolshelper add-on to my dh
invocation.
5. Build and upload the package again. Now it will hopefully build on all
architectures.
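
To make the subst tags from steps 3 and 4 a bit more concrete: under the
Itanium C++ ABI, a size_t argument mangles as "j" (unsigned int) on 32-bit
architectures such as i386 and as "m" (unsigned long) on 64-bit ones such as
amd64, so one template line has to expand differently per architecture. The
following is only a rough sketch of that idea, not how pkg-kde-tools
implements it. The symbol (a made-up Foo::resize(size_t)) and the
expand_subst helper are both invented for illustration, and the sketch
ignores any per-architecture special cases.

```shell
#!/bin/sh
# Hypothetical helper: expand the {size_t} placeholder for a given word size.
# Assumption: "j" (unsigned int) for 32-bit, "m" (unsigned long) for 64-bit.
expand_subst() {
    bits=$1
    template=$2
    case "$bits" in
        32) code=j ;;
        64) code=m ;;
        *)  echo "unsupported word size: $bits" >&2; return 1 ;;
    esac
    # In a basic regular expression "{" is literal, so no escaping is needed.
    printf '%s\n' "$template" | sed "s/{size_t}/$code/g"
}

# Made-up symbol for illustration: Foo::resize(size_t)
template='_ZN3Foo6resizeE{size_t}@Base'

expand_subst 32 "$template"   # the name a 32-bit build would export
expand_subst 64 "$template"   # the name a 64-bit build would export
```

The point of the tag is that each architecture is checked against exactly
one expansion, rather than against a pattern that would accept either.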
This is, as you might expect, a fair bit of work, but it *does* work, and
what comes out the other end is a symbols file that works across the
current buildd architectures and detects any mistakes by upstream that
change the symbols exported.
However, it does have some problems, some obvious, and some more subtle.
Here's a list of the issues that I see:
1. It feels like this symbols file is still likely to be fragile. While
pkg-kde-tools detects a *lot* of template instances and marks them
optional, I was still seeing a lot of "leaked" symbols from other C++
libraries that my package build-depends on, and which I suspect may
also disappear. There's also the problem of optimizers eliminating
some things that are inlined; for example, that appears to be the
variation that my package sees on arm. And for a substantial C++
library, the symbols file is HUGE. Over 12,000 lines for one of my
libraries. It's kind of scary to think of how many fragility landmines
are buried under there.
2. By the nature of this process, it's very sensitive to the currently
supported architectures. I expect any package with a symbols template
built this way stands a good chance of FTBFS on a new architecture.
For example, my symbols file has a lot of !armel !armhf patterns, which
don't appear to be anything unique to arm except that the C++ compiler
behaves slightly differently there due to inlining, and those are the
only two architectures in the current set with that pattern.
3. The lack of dpkg-gensymbols support for subst is very unfortunate,
since it's used by the pkg-kde-tools utilities and it's the Right Thing
To Do for symbols that take, for example, size_t as an argument. I
would strongly encourage the dpkg maintainers to accept the work in
#533916 except maybe for the vt work, since the alternative is to
generate regexes that are significantly worse at capturing mistakes
(since they're going to accept either int or long, rather than only
whatever one corresponds to size_t, for example), or to embed a lot of
specific architecture restrictions listing, for instance, all 64-bit
architectures and guarantee FTBFS for every new architecture.
4. It's a little hard on the infrastructure, since adding a symbols file
requires uploading a package that's guaranteed to fail to build almost
everywhere just so that you can get all the build logs to feed into the
machinery. I always feel guilty about doing that.
5. It's still not clear that the benefit is worth the amount of effort,
since I expect most C++ libraries to require frequent SONAME changes
anyway, which means that the long-term binary compatibility angle of
symbols is probably futile. Mostly, all of this is to give you a tool
to do ABI checking, which will catch some mistakes but not all of
them. And I'm guessing it's going to be substantially more work than
maintaining symbols for a C library if one actually does it properly
and checks every missing symbol and change in new builds.
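
One cheap way to get a feel for how much of a symbols file rests on the
conditional machinery in points 1 through 3 is to count the tagged lines.
The sketch below is only an illustration under stated assumptions: the
sample file and every symbol in it are invented, and count_tag is a
hypothetical helper that just greps for a tag name inside the leading
parenthesised tag list of each symbol line.

```shell
#!/bin/sh
# Invented sample; a real file would be debian/<package>.symbols.
sample=$(mktemp)
cat > "$sample" <<'EOF'
libfoo.so.1 libfoo1 #MINVER#
 _ZN3Foo4openEv@Base 1.6.1
 (optional=templinst)_ZNSt6vectorIiSaIiEED1Ev@Base 1.6.1
 (arch=!armel !armhf)_ZN3Foo5closeEv@Base 1.6.1
 (subst)_ZN3Foo6resizeE{size_t}@Base 1.6.1
 (optional=templinst|arch=!armel !armhf)_ZNSt6vectorIcSaIcEED1Ev@Base 1.6.1
EOF

# Hypothetical helper: count symbol lines whose tag list mentions a tag.
# $1 is the tag name to look for, $2 is the symbols file.
count_tag() {
    grep -c "^ ([^)]*$1" "$2"
}

echo "optional: $(count_tag optional "$sample")"
echo "arch:     $(count_tag 'arch=' "$sample")"
echo "subst:    $(count_tag subst "$sample")"
echo "total:    $(grep -c '^ ' "$sample")"

rm -f "$sample"
```

Every arch-restricted line is a candidate for breaking on a new
architecture, so the ratio of tagged lines to the total is a rough proxy
for how fragile the file is likely to be.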
I'm going to try this for the Shibboleth packages for a while since I want
to see what it's like across a new upstream release, but I'm still very
much not convinced that this is a good use of a packager's time.
Russ Allbery (firstname.lastname@example.org) <http://www.eyrie.org/~eagle/>