
Re: Please review packaging of clanlib2



On 06/03/12 09:46, Mark Page wrote:
> The ClanLib main components (clanCore, clanDisplay, clanGL, clanGL1,
> clanSound) optionally use SSE2 to improve performance (using
> intrinsic's)

One thing a general-purpose distribution like Debian could do is to
compile the libraries twice on x86 platforms - once with SSE2 and once
without - and put the SSE2 version in /usr/lib/sse2 [1], and the
non-SSE2 version in /usr/lib. At compile time, every game is linked
against the non-SSE2 version; at runtime, the dynamic linker
automatically substitutes the SSE2 version [2] if the running CPU
supports SSE2. (The glibc runtime linker is clever like that.)

This only works if the ABI of the SSE2 and non-SSE2 versions is the
same, or more precisely, if the SSE2 version is a drop-in replacement
for the non-SSE2 version.

On non-x86, only the generic version can exist, regardless.

[1] Actually /usr/lib/i386-linux-gnu/sse2 vs. /usr/lib/i386-linux-gnu
    if we use multiarch paths for it, but you get the idea.

[2] <http://wiki.debian.org/Multiarch/LibraryPathOverview>,
    search for "important hwcap"

> The ClanLib OpenGL software renderer (clanSWRender) only supports
> SSE2. It would be too slow to not use SSE2 (this is why it has not
> been ported)

The DarkPlaces Quake engine fork (Debian: src:darkplaces) is in the same
situation, for what it's worth. We compile the rest of the engine for
generic lowest-common-denominator x86 (the minimum is currently i486 on
Debian and i686 on Ubuntu), compile the software renderer for SSE2, and
refuse to use the software renderer at runtime unless the running CPU
actually supports SSE2.

This only works because the software renderer is in separate translation
units (i.e. .o files) - each translation unit has to be compiled for a
particular instruction set.
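
To illustrate the runtime check, here is a minimal sketch in C. It is not
the actual DarkPlaces code: all the function names are made up, and it
assumes GCC or Clang, whose __builtin_cpu_supports() is available on x86
only. In a real build, render_sse2() would live in its own .c file
compiled with -msse2, and everything else would be compiled for the
baseline CPU.

```c
#include <stddef.h>

/* Hypothetical names for illustration; not taken from DarkPlaces or ClanLib. */

typedef const char *(*render_fn)(void);

/* Returns 1 if the running CPU supports SSE2, 0 otherwise.
 * Relies on a GCC/Clang builtin; on non-x86 it always reports 0. */
static int cpu_has_sse2(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    return 0;
#endif
}

/* Stand-in for the software renderer. In a real build this function
 * would be in a separate translation unit compiled with -msse2, so that
 * no SSE2 instructions leak into code that runs on older CPUs. */
static const char *render_sse2(void)
{
    return "software renderer (SSE2)";
}

/* Returns the software renderer's entry point, or NULL if it must not be
 * used on this CPU. The caller only ever calls through the pointer, so
 * the SSE2 code is never executed on a CPU that lacks SSE2. */
static render_fn select_software_renderer(int have_sse2)
{
    return have_sse2 ? render_sse2 : NULL;
}
```

The caller would do something like
select_software_renderer(cpu_has_sse2()) at startup and fall back to
another renderer (or report an error) if it gets NULL back.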

On non-x86, we don't even try to compile the software renderer (and I've
had to patch the rest of the engine so it'll compile without it - as
provided by upstream, it's not portable).

> clanSound will have a drop in performance on non SSE2 builds. I am
> unsure if it would be usable on older CPU's, since it uses floating
> points for sample manipulation.

Requiring a fast FPU for it to be useful is fine, I think; insufficient
performance on i486 is acceptable, but abruptly dying with SIGILL
(illegal instruction trap) isn't.

    S

