
Re: C++ symbol mangling difference between arches



On Thu, 2009-06-25 at 22:40 +0200, Raphael Hertzog wrote:
> Hello,
> 
> it is well known that C++ symbol mangling results in different symbol
> names from one architecture to another. It means that libraries that
> want to provide symbol files have to maintain one symbol file for each
> architecture. To avoid this problem Modestas Vainius has written a patch
> that lets you encode in a specific way the small parts in the symbol name
> that vary between architectures.
> 
> You can check the patch here:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=0001-Implementation-of-the-subst-tag.patch;att=1;bug=533916
> 
> The symbol names in symbols files become templates that are used at build
> time to generate the correct symbol name. The parts that are substituted
> are delimited by curly brackets like this:
>  (subst)_ZN6Phonon11AudioOutput13volumeChangedE{qreal}@Base 4:4.2.0
> 
> In this specific case, the symbol is considered to be
> _ZN6Phonon11AudioOutput13volumeChangedEf@Base on armel
> and _ZN6Phonon11AudioOutput13volumeChangedEd@Base on other arches
> because qreal is a float on armel and a double otherwise.
> 
> Here are my questions for people that are more knowledgeable than me about
> C++ symbol name mangling:
> 
> - how stable are those substitutions over time? and is it reasonable
>   to maintain a list of substitutions in dpkg-dev for this purpose?

AFAIK the g++ name-mangling rules have been stable, modulo bugs, since
g++ 3.0.
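
Given that stability, a substitution table only has to record which
builtin type code each typedef maps to on each architecture.  Here is a
minimal sketch of the template expansion described in the quoted text.
It is not the code from the patch: expand_symbol() and the two tables are
hypothetical names of mine, and the table contents assume that qreal is
float on armel and double elsewhere (as stated above) and that size_t is
unsigned int ("j") on 32-bit and unsigned long ("m") on 64-bit Linux.

```cpp
#include <cassert>    // for callers that want to assert on the results
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

typedef std::map<std::string, std::string> SubstTable;

// Hypothetical helper: replace every "{name}" in a symbol template with
// the mangled type code listed for "name" in the per-architecture table.
std::string expand_symbol(const std::string& tmpl, const SubstTable& table) {
    std::string out;
    std::size_t pos = 0;
    while (pos < tmpl.size()) {
        std::size_t open = tmpl.find('{', pos);
        if (open == std::string::npos) {
            out.append(tmpl, pos, std::string::npos);  // no more placeholders
            break;
        }
        std::size_t close = tmpl.find('}', open);
        if (close == std::string::npos)
            throw std::invalid_argument("unterminated placeholder");
        out.append(tmpl, pos, open - pos);  // literal text before '{'
        SubstTable::const_iterator it =
            table.find(tmpl.substr(open + 1, close - open - 1));
        if (it == table.end())
            throw std::invalid_argument("unknown placeholder");
        out += it->second;
        pos = close + 1;
    }
    return out;
}

// Illustrative tables only (assumptions, see the text above).
SubstTable armel_table() { return {{"qreal", "f"}, {"size_t", "j"}}; }
SubstTable amd64_table() { return {{"qreal", "d"}, {"size_t", "m"}}; }
```

With these tables, the template
_ZN6Phonon11AudioOutput13volumeChangedE{qreal} expands to the name
ending in "Ef" for armel and "Ed" for amd64.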

> - it's probably impossible to have substitutions to cover all cases
>   for C++ symbol mangling... do you believe that it is possible
>   to have enough (stable) substitutions to cover most common cases?
> 
>   (in the current patch there is ssize_t, size_t, int64_t, uint64_t,
>   qreal and vt=<size> for C++ virtual table offset)
[...]

The implementation of the substitution vt=<size> seems to be intended to
cover virtual function thunks, but I think it is wrong.

First, some background.  In a class with multiple bases, all but one
base class instance must be offset from the address of the derived class
instance.  If the derived class overrides a virtual function from one of
these bases, the compiler must generate a wrapper function, a "virtual
function thunk", which subtracts that offset from the "this" pointer
before calling the implementation.  In the case of virtual inheritance,
a more complex adjustment is necessary.
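
To make that concrete, here is a small sketch with invented class names
(Base1, Base2 and Derived are mine, not taken from any real library).
Derived overrides a virtual function inherited from its second base, so
a call made through a Base2 reference has to go through a thunk that
adjusts "this" by the offset computed below.

```cpp
#include <cassert>    // for callers that want to assert on the results
#include <cstddef>

// Two unrelated polymorphic bases; each carries a vtable pointer and a
// data member, so each has a non-zero size.
struct Base1 {
    virtual ~Base1() = default;
    virtual int f() { return 1; }
    long a = 0;
};

struct Base2 {
    virtual ~Base2() = default;
    virtual int g() { return 2; }
    long b = 0;
};

// Base1 comes first in Derived; the Base2 subobject sits after it.
struct Derived : Base1, Base2 {
    int g() override { return 3; }  // overrides a function of the offset base
};

// Byte offset of the Base2 subobject inside a Derived object.  This is
// the amount a thunk for Derived::g() has to subtract from "this".
std::ptrdiff_t base2_offset_in_derived() {
    Derived d;
    Base2* as_base2 = &d;  // implicit conversion adds the offset
    return reinterpret_cast<char*>(as_base2) - reinterpret_cast<char*>(&d);
}

// The caller only knows about Base2; the virtual call reaches Derived::g().
int call_through_base2(Base2& obj) { return obj.g(); }
```

With g++ on a typical 64-bit target I would expect the offset to be 16
and the thunk to be named _ZThn16_N7Derived1gEv, and on a 32-bit target
8 and _ZThn8_N7Derived1gEv, but I have not checked this on every
architecture.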

The symbol names for virtual function thunks include a description of
the adjustment, taking the form:

    "Th" <non-virtual-offset> "_"
or  "Tv" <non-virtual-offset> "_" <virtual-offset> "_"

Since the <non-virtual-offset> depends on the sizes of base classes,
there is no simple relationship between its values on different
architectures.  This may be true for the <virtual-offset> as well but
I'm not sure.
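
As an illustration of what a tool would have to cope with, here is a
rough sketch that pulls the adjustment out of a thunk symbol.  It
assumes my reading of the Itanium C++ ABI mangling rules: each offset is
a decimal number, a leading "n" marks a negative value, and each offset
is terminated by "_".  ThunkAdjustment and parse_thunk() are hypothetical
names, and covariant-return thunks ("Tc") are deliberately not handled.

```cpp
#include <cassert>    // for callers that want to assert on the results
#include <cstddef>
#include <string>

struct ThunkAdjustment {
    bool has_virtual_part = false;  // true for "Tv", false for "Th"
    long nv_offset = 0;
    long v_offset = 0;              // only meaningful for "Tv"
};

// Parse [n]<digits> starting at pos; a leading 'n' means negative.
static bool parse_number(const std::string& s, std::size_t& pos, long& out) {
    bool negative = false;
    if (pos < s.size() && s[pos] == 'n') { negative = true; ++pos; }
    std::size_t first_digit = pos;
    long value = 0;
    while (pos < s.size() && s[pos] >= '0' && s[pos] <= '9') {
        value = value * 10 + (s[pos] - '0');
        ++pos;
    }
    if (pos == first_digit) return false;  // no digits at all
    out = negative ? -value : value;
    return true;
}

// Consume the '_' that terminates an offset.
static bool expect_underscore(const std::string& s, std::size_t& pos) {
    if (pos >= s.size() || s[pos] != '_') return false;
    ++pos;
    return true;
}

// Returns true and fills adj if symbol starts with "_ZTh" or "_ZTv"
// followed by well-formed offsets; returns false otherwise.
bool parse_thunk(const std::string& symbol, ThunkAdjustment& adj) {
    if (symbol.size() < 5 || symbol.compare(0, 3, "_ZT") != 0) return false;
    char kind = symbol[3];
    if (kind != 'h' && kind != 'v') return false;
    std::size_t pos = 4;
    adj = ThunkAdjustment();
    adj.has_virtual_part = (kind == 'v');
    if (!parse_number(symbol, pos, adj.nv_offset)) return false;
    if (!expect_underscore(symbol, pos)) return false;
    if (kind == 'v') {
        if (!parse_number(symbol, pos, adj.v_offset)) return false;
        if (!expect_underscore(symbol, pos)) return false;
    }
    return true;
}
```

Two names such as _ZThn16_N7Derived1gEv and _ZThn8_N7Derived1gEv (made-up
examples) would then yield -16 and -8 for what is the same function in
the source, and nothing in the name says how to get from one value to
the other.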

There's also a nasty possibility that on some architectures two
parameter types could be identical and so the second instance would be
represented by a back-reference ("S_" or "S" <number> "_"), whereas on
others they would be different and the second would be represented
independently.
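
A made-up example of this: for a function declared as
void f(size_t *, unsigned int *), I would expect _Z1fPjS_ where size_t
is unsigned int and _Z1fPmPj where it is unsigned long.  (Builtin types
themselves are never back-referenced, as far as I understand the rules,
but pointers to them are.)  The sketch below only demangles those two
strings to show that they describe different parameter lists; it relies
on abi::__cxa_demangle() from <cxxabi.h> as shipped with g++ and clang,
and demangle() is just my wrapper name.

```cpp
#include <cassert>    // for callers that want to assert on the results
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Thin wrapper around abi::__cxa_demangle(); returns an empty string if
// the input is not a valid mangled name.
std::string demangle(const char* mangled) {
    int status = 0;
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) return std::string();
    std::string result(raw);
    std::free(raw);  // the buffer was allocated with malloc()
    return result;
}
```

Note that replacing only the type code, as a {size_t} substitution
would, turns _Z1fP{size_t}Pj into _Z1fPjPj on a 32-bit architecture,
which is not the name the compiler emits there if the above is right.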

Ben.

-- 
Ben Hutchings
It is impossible to make anything foolproof because fools are so ingenious.
