
Re: C++ symbol mangling difference between arches


On Friday, 26 June 2009 02:02:48, Ben Hutchings wrote:

> > - it's probably impossible to have substitutions to cover all cases
> >   for C++ symbol mangling... do you believe that it is possible
> >   to have enough (stable) substitutions to cover most common cases?
> >
> >   (in the current patch there is ssize_t, size_t, int64_t, uint64_t,
> >   qreal and vt=<size> for C++ virtual table offset)
> [...]
> The implementation of the substitution vt=<size> seems to be intended to
> cover virtual function thunks, but I think it is wrong.
> First, some background.  In a class with multiple bases, all but one
> base class instance must be offset from the address of the derived class
> instance.  If the derived class overrides a virtual function from one of
> these bases, the compiler must generate a wrapper function, a "virtual
> function thunk", which subtracts that offset from the "this" pointer
> before calling the implementation.  In the case of virtual inheritance,
> a more complex adjustment is necessary.

OK, I agree with you: for non-virtual offsets there is no direct <size>*2
relation. So the name 'vt' is actually very misleading as it stands. It could
rather be something like ptr=<count> (the pointer size, i.e. 4 on 32-bit
arches and 8 on 64-bit arches, multiplied by <count>, which defaults to 1),
which would be 100% correct. However, in practice ptr may actually be enough
for some (most?) classes:

1) In order to keep binary compatibility, the base class can neither shrink
nor grow. Hence techniques like the d-pointer are frequently used in classes
with a stable public ABI to work around this limitation. So if the base class
contains only a d-pointer (or several pointers) and an internal pointer to the
virtual table, ptr will be enough.

2) However, nothing prevents a d-pointer based class from containing data
members which are never going to be removed (like int x, y in a Coord class).
Then ptr is no longer valid for such non-virtual offsets.

2a) The subst can be like vt=<constant_size>+<ptr_count> (e.g. vt=8+4), where
<constant_size> (defaults to 0) is the same on every arch while the
<ptr_count> part would scale with the pointer size. vt=8+4 would be 12 on
32-bit arches and 16 on 64-bit arches.
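
To make the arithmetic of 2a concrete, here is a minimal sketch. The function
expand_vt and its parameters are hypothetical (nothing like it exists in
dpkg), and it assumes that the second term is the pointer-dependent part
written as its size on a 32-bit arch, which is the reading under which vt=8+4
gives 12 and 16:

```cpp
#include <cassert>

// Hypothetical helper, not part of dpkg: expands "vt=<constant>+<ptr_part>".
// Assumption: ptr_part_32 is the pointer-dependent part as sized on a 32-bit
// arch (4 bytes per pointer), so it is rescaled by the real pointer size.
int expand_vt(int constant_size, int ptr_part_32, int pointer_size) {
    assert(ptr_part_32 % 4 == 0);  // must be a whole number of 32-bit pointers
    return constant_size + (ptr_part_32 / 4) * pointer_size;
}
```

Calling expand_vt(8, 4, sizeof(void*)) would give the expected offset for the
arch the code is compiled on.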

2b) Still, 2a is not enough if the base class contains data members such as
(s)size_t (on s390) or qreal (on armel). To support such cases, vt would have
to be a complex expression with recursive subst expansion, like
vt=size_t*2+ptr*2+qreal+4. But do we want this? The alternative would be to
maintain such symbols (like destructors) as arch-specific.
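
A rough sketch of what evaluating the 2b expression would involve. The struct
ArchSizes and the function expand_example_expr are made up for illustration,
and the per-arch sizes are supplied by the caller rather than looked up
anywhere:

```cpp
#include <cassert>

// Hypothetical per-arch type sizes in bytes; the caller fills them in.
struct ArchSizes {
    int size_t_size;
    int ptr_size;
    int qreal_size;
};

// Evaluates the example expression vt=size_t*2+ptr*2+qreal+4 for one arch.
int expand_example_expr(const ArchSizes& a) {
    assert(a.size_t_size > 0 && a.ptr_size > 0 && a.qreal_size > 0);
    return a.size_t_size * 2 + a.ptr_size * 2 + a.qreal_size + 4;
}
```

Every type named in the expression needs its own per-arch entry, which is
where the maintenance cost of 2b would come from.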

If I understand correctly, this adjustment is needed only with multiple
inheritance, and only for virtual functions of the same name which exist in
multiple base classes and are overridden in the derived class. Unfortunately,
virtual destructors satisfy those conditions in any case of multiple
inheritance and good C++ programming style :/ So this issue is important.
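
A small illustration of the destructor case, with made-up classes First,
Second and Derived. It only shows that the second base sits at a non-zero
offset and that deleting through it still reaches the derived destructor; the
exact offset is whatever the compiler's ABI chooses:

```cpp
#include <cassert>
#include <cstddef>

// Illustrative classes only; the names are made up for this sketch.
struct First {
    virtual ~First() {}
    void* d = nullptr;  // d-pointer style member
};

struct Second {
    virtual ~Second() {}
};

struct Derived : First, Second {
    explicit Derived(bool* flag) : destroyed(flag) {}
    ~Derived() override { *destroyed = true; }
    bool* destroyed;
};

// Byte distance from the start of a Derived to its Second subobject.
std::ptrdiff_t second_base_offset() {
    bool unused = false;
    Derived obj(&unused);
    Second* second = &obj;  // the implicit conversion adjusts the pointer
    return reinterpret_cast<char*>(second) - reinterpret_cast<char*>(&obj);
}

// Deletes a Derived through a Second*; the virtual destructor call has to
// move "this" back to the start of the Derived object.
bool destroyed_through_second_base() {
    bool destroyed = false;
    Second* p = new Derived(&destroyed);
    assert(!destroyed);  // the object is still alive at this point
    delete p;
    return destroyed;
}
```

Since First here holds a virtual table pointer plus one d-pointer, the offset
would be expected to follow the pointer size, which is case 1) above.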

So which way to choose: 2a, 2b or another?

> The symbol names for virtual function thunks include a description of
> the adjustment, taking the form:
>     "Th" <non-virtual-offset>
> or  "Tv" <non-virtual-offset> "_" <virtual-offset>
> Since the <non-virtual-offset> depends on the sizes of base classes,
> there is no simple relationship between its values on different
> architectures.  This may be true for the <virtual-offset> as well but
> I'm not sure.

I'm also not so sure about <virtual-offset>, but hopefully it is rarer than
<non-virtual-offset>. I guess ptr should always be enough for it.
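
For reference, a sketch of how the non-virtual form is put together, following
the Itanium C++ ABI grammar ("Th", the offset, "_", then the encoding of the
target function). The helper names are hypothetical, and the class name
Derived and the offsets in the usage note are only assumed example values:

```cpp
#include <cassert>
#include <string>

// Encodes a number the way mangled names do: negatives get an "n" prefix.
std::string encode_offset(long offset) {
    return offset < 0 ? "n" + std::to_string(-offset) : std::to_string(offset);
}

// Builds "_ZTh" <nv-offset> "_" <base encoding> for a non-virtual thunk.
// base_encoding is the target's mangled name without the leading "_Z".
std::string non_virtual_thunk(long nv_offset, const std::string& base_encoding) {
    assert(!base_encoding.empty());
    return "_ZTh" + encode_offset(nv_offset) + "_" + base_encoding;
}
```

With an assumed offset of -8 on a 32-bit arch and -16 on a 64-bit arch,
non_virtual_thunk(offset, "N7DerivedD1Ev") yields two different symbol names
for the same destructor thunk, which is exactly the problem for the symbols
file.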

> There's also a nasty possibility that on some architectures two
> parameter types could be identical and so the second instance would be
> represented by a back-reference ("S_" or "S" <number> "_"), whereas on
> others they would be different and the second would be represented
> independently.

There is definitely no need to support such cases via substitutions, because
they are rare and arch-specific symbols look like a perfect workaround
(solution?). Also, dpkg-gensymbols is never going to add substitutions (or any
other symbol tags) itself; this will solely be the responsibility of the
maintainer or some external tool.
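
To illustrate the back-reference case, assume a hypothetical function
f(size_t*, unsigned int*). Where size_t is unsigned int the two parameter
types coincide and the second becomes "S_"; where size_t is unsigned long
both are spelled out. The sketch below demangles both forms with
abi::__cxa_demangle from <cxxabi.h>, so it needs the GCC or Clang runtime:

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangles a symbol with the GCC/Clang runtime; returns "" on failure.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && text != nullptr) ? text : "";
    std::free(text);  // the buffer is malloc()ed by __cxa_demangle
    return result;
}

// size_t is unsigned int: the second parameter type repeats the first, so it
// is written as the back-reference "S_".
const char* const kSameTypes = "_Z1fPjS_";
// size_t is unsigned long: both parameter types are spelled out.
const char* const kDifferentTypes = "_Z1fPmPj";
```

The two strings differ in structure, not just in one type code, so no simple
textual substitution would map one onto the other.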

Modestas Vainius <modestas@vainius.eu>
