Re: i386 compatibility & libstdc++
Arnd Bergmann <arnd@arndb.de> writes:
> a) The patch gets merged upstream. It won't hurt anyone who is
> building i486+ optimized binaries and fixes a real bug.
Upstream won't accept the patch, because of the performance penalty.
Even if upstream did accept it, that would not happen before gcc 3.4.
However, the gcc 3.2/3.3 ABI will stay with us for a long time
(most likely until after the next Debian release).
> b) We provide a libstdc++-i386.so.$(version) file that contains
> only the __exchange_and_add function and is linked to
> libstdc++.so.
That would work, yes.
> We can shave a bit off by making the function __attribute__((regparm(2)))
Even with that change, and -fomit-frame-pointer, I get
inline: 2.39809
out-of-line: 4.0224
i.e. this is still a slowdown of roughly 68% (in a test case where the processor
does branch prediction correctly all the time, and everything is in
the cache). The assembler code is
_Z11atomic_add2PVii:
        lock; addl %edx,(%eax)
        ret
so it can't get any better. The performance hit is still unacceptable.
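
For illustration, the inline versus out-of-line comparison above could
be reproduced with a sketch along these lines. This is not the original
test program: it assumes std::atomic<int>::fetch_add in place of the
hand-written "lock; addl", the function names are made up, and
__attribute__((noinline)) (a GCC/Clang extension) merely stands in for
a call into a separate shared library.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>

// Inline variant: the compiler may emit the locked add at the call site.
inline void atomic_add_inline(std::atomic<int>& counter, int value) {
    counter.fetch_add(value, std::memory_order_seq_cst);
}

// Out-of-line variant: noinline forces a real call/ret around the same
// add, roughly modelling a function exported from a library.
__attribute__((noinline))
void atomic_add_outline(std::atomic<int>& counter, int value) {
    counter.fetch_add(value, std::memory_order_seq_cst);
}

// Time `iterations` inline adds; returns elapsed seconds.
double time_inline(int iterations) {
    std::atomic<int> counter{0};
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) atomic_add_inline(counter, 1);
    auto stop = std::chrono::steady_clock::now();
    assert(counter.load() == iterations);  // every add must be counted
    return std::chrono::duration<double>(stop - start).count();
}

// Time `iterations` out-of-line adds; returns elapsed seconds.
double time_outline(int iterations) {
    std::atomic<int> counter{0};
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) atomic_add_outline(counter, 1);
    auto stop = std::chrono::steady_clock::now();
    assert(counter.load() == iterations);  // every add must be counted
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_inline(n) and time_outline(n) with the same large n and
comparing the results gives the kind of ratio quoted above; the actual
numbers depend on the processor and compiler flags.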
> and perhaps by using a trivial non-locking variant when compiling
> without threads, as the i386 version uses the mutex only in those
> cases and AFAICS it is compatible with the i486 version otherwise.
That won't help. "Compiling without threads" isn't really
supported on Linux: whether threads are used is always decided at
link time or run time, not at compile time.
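
To make the distinction concrete, here is a minimal sketch of what a
run-time decision looks like, as opposed to a compile-time one. The
flag threads_active and the function name are hypothetical; a real
runtime would have to set such a flag itself when the first thread is
created. __atomic_fetch_add is the GCC/Clang builtin. Note that the
check itself costs a load and a branch on every call.

```cpp
#include <atomic>

// Hypothetical flag: assumed to be set by the runtime once a second
// thread exists. Nothing in this sketch sets it automatically.
static std::atomic<bool> threads_active{false};

// Returns the old value of *mem and adds val to it. The choice between
// the plain and the locked path is made on every call, at run time.
int exchange_and_add_dispatch(int* mem, int val) {
    if (!threads_active.load(std::memory_order_relaxed)) {
        // Single-threaded path: plain read-modify-write, no lock prefix.
        int old = *mem;
        *mem += val;
        return old;
    }
    // Multi-threaded path: atomic fetch-and-add (GCC/Clang builtin).
    return __atomic_fetch_add(mem, val, __ATOMIC_SEQ_CST);
}
```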
> If we know at compile time that locking (neither 'lock;' prefix nor
> the mutex call) is never needed, we can even get much faster than the
> current i486 code.
We never know that.
> Also, if an application or library cares about this sort of
> micro-optimization, it probably should be provided in an optimized
> version anyway.
I think the performance loss for applications like KDE will be
significant. I doubt that providing two versions of KDE (i386
and i486+) would be feasible.
Regards,
Martin