
Do symbols make sense for C++



I'm currently working on the Policy modification to document (and
recommend) use of symbols instead of shlibs, but I'd only personally used
symbols with C libraries.  Today I decided that I should try adding a
symbols file to a C++ library, particularly if I'm going to recommend
everyone do it.  I tried this exercise with xml-security-c, which is, I
think, a reasonably typical C++ library.  Not the sort of core C++ library
that would sit at the center of the distribution, but a random software
package that's in Debian because other things use it.

The experience was rather interesting, and I ended up uploading the new
version without a symbols file and continuing to just use shlibs.  That's
for the following reasons:

1. The generated symbols file was HUGE.  Hundreds of lines.  This is a
   marked difference from the typical C symbols file, which is of quite
   manageable size.  Some of that is that the library provides a lot of
   different classes, but some of it is that C++ just generates a lot of
   exported symbols.  There's no way that I could do what I would do with
   a C library and understand those symbols, why they're there, and
   whether they are likely to have changed between revisions.

2. Generating a reasonable symbols file was a pain.  Generating an
   unreasonable symbols file that just contains all of the mangled symbols
   is largely mechanical and uninteresting, but that symbols file doesn't
   seem to me to convey useful information.  So I did some scripting to
   translate the symbols back with c++filt and add (c++) tags.  Then I
   tried to understand what I was looking at and to figure out whether I
   should re-sort the symbols list, since the default sort is by mangled
   name, which is meaningless.  This is a rather unappealing process.  It's not
   particularly difficult, but it's very awkward and feels like it's
   missing vital tools.

3. The resulting symbols file is incomprehensible to someone without
   strong knowledge of C++.  It's full of opaque entries that don't make
   sense to non-C++ programmers, who I suspect make up a substantial
   number of the people who package C++ libraries for Debian.  I know enough
   C++ from school that I can evaluate security fixes, make simple
   patches, and review upstream changes, and I think that's all that
   should be needed to package things for Debian.  But I'm deeply
   uncomfortable producing a symbols file on my own that contains entries
   for things that I know nothing about and cannot evaluate when they've
   last changed, like "non-virtual thunk to FooClass::~FooClass@Base".

4. Once I had a symbols file that resulted in a successful build and that
   I could have uploaded, I started thinking about how I was going to
   maintain it.  With a C program, I would change the symbols file
   versions when the underlying function implementation changes in a way
   that may not offer equivalence, similar to bumping shlibs.  I realized
   that I was going to have no idea when that happened, and the only way
   that I would maintain the symbols file would be to either trust
   upstream to maintain ABI equivalence and therefore only change the
   symbols file when upstream changes the SONAME, or not trust upstream to
   maintain ABI equivalence and therefore change all the versions with
   each new upstream release.  That gives me exactly the same semantics as
   a shlibs file, so what's the point in having a symbols file?

5. The exported symbols of the library contained many symbols that
   obviously weren't really from that library, but instead were artifacts
   of the C++ compilation process, things like instantiations of
   std::vector.  Do those go into the symbols file?  Do they change from
   architecture to architecture?  If they disappear again, is that
   actually an ABI break?  How do I know?  It's all very mysterious, and
   while shlibs provides the same semantics as just ignoring all of this,
   at least with shlibs I'm not shipping package data, generated by me,
   that lists things I'm just blindly ignoring.
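
For anyone who wants to repeat the exercise from point 2, here is a rough
sketch of the kind of scripting involved.  It is only an illustration, not
the script I actually used: the library name libfoo, the package name
libfoo1, the class FooClass, and the version 1.0 are all made up, and the
helper names demangle and format_symbols are hypothetical.  It assumes
c++filt from binutils is on the PATH, and that every symbol should get the
same minimum version, which is exactly the shlibs-like simplification that
point 4 complains about.

```python
import subprocess


def demangle(mangled_names):
    """Demangle C++ symbol names by piping them through c++filt.

    Assumes c++filt (from binutils) is installed and on the PATH.
    """
    result = subprocess.run(
        ["c++filt"],
        input="\n".join(mangled_names) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def format_symbols(header, demangled_names, version):
    """Build symbols-file text with (c++)-tagged entries.

    Entries are de-duplicated and sorted by demangled name rather than
    by mangled name, and all of them get the same minimum version.
    """
    lines = [header]
    for name in sorted(set(demangled_names)):
        # Each entry line starts with a single space; the quoted
        # demangled name carries the @Base symbol version suffix.
        lines.append(f' (c++)"{name}@Base" {version}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical demangled names; in practice these would come from
    # demangle() applied to the library's exported symbols.
    names = [
        "FooClass::parse(char const*)",
        "FooClass::FooClass()",
    ]
    print(format_symbols("libfoo.so.1 libfoo1 #MINVER#", names, "1.0"), end="")
```

This only handles the mechanical part.  It does nothing to answer which of
the resulting entries actually belong in the file, which is the real
problem.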

I came away from this experience thinking that I should revise the Policy
amendment to say that symbols files are really for C libraries and for C++
libraries that either have a tightly maintained symbol export list or are
maintained by a C++ expert, and that most C++ library maintainers should
just not bother with this and use shlibs, bumping the shlibs version or
not based on their impression of how good upstream is at maintaining ABI
equivalence.

But that feels like a result contrary to what I had previously thought was
the intended direction, so I wanted to ask the Debian development
community as a whole: am I missing something?  Are these symbols files
actually useful?  Am I missing some trick to make them useful?

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>

