RE: Please review packaging of clanlib2
My thoughts on the ClanLib 2.3 package in relation to SSE2 builds.
ClanLib's main components are (listing only the ones relevant to this conversation):
clanCore - (Contains SSE2 intrinsics that can be turned off by a #define)
clanDisplay - (Contains SSE2 intrinsics that can be turned off by a #define)
clanSound - (Contains SSE2 intrinsics that can be turned off by a #define)
clanGL - (Display Target: OpenGL 2.x and above)
clanGL1 - (Display Target: OpenGL 1.3)
clanSWRender - (Display Target: Software renderer. Only implemented with SSE2 intrinsics; there is no non-SSE2 code path)
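
To illustrate what "can be turned off by a #define" typically looks like, here is a minimal sketch. It is not ClanLib code: the macro name DEMO_DISABLE_SSE2 and the function demo_add_u32 are made up for illustration, and the real define used by ClanLib is not named in this mail.

```cpp
#include <cstddef>
#include <cstdint>

// DEMO_DISABLE_SSE2 is a hypothetical switch, standing in for whatever
// #define the library really uses.
#if !defined(DEMO_DISABLE_SSE2) && defined(__SSE2__)
#include <emmintrin.h>
#define DEMO_USE_SSE2 1
#else
#define DEMO_USE_SSE2 0
#endif

// Adds two arrays element by element (unsigned, so wrap-around is defined).
void demo_add_u32(const std::uint32_t* a, const std::uint32_t* b,
                  std::uint32_t* out, std::size_t n)
{
    std::size_t i = 0;
#if DEMO_USE_SSE2
    // SSE2 path: four 32-bit integers per iteration, unaligned loads/stores.
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i),
                         _mm_add_epi32(va, vb));
    }
#endif
    // Scalar path: handles the tail, or everything when SSE2 is compiled out.
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Built with -DDEMO_DISABLE_SSE2 (or for a target without SSE2), only the scalar loop remains, so the same source compiles on non-x86 platforms.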
Looking at these components in detail, disabling SSE2 would have the following effects:
clanCore and clanDisplay: Only a very slight performance penalty, so we can ignore it.
clanSound: Runs at a slower speed.
clanSWRender: This component is not available.
For the average game without sound, this is not a problem.
For example, clanCore and clanDisplay with clanGL1 work nicely on the SPARC platform (with minor fixes to some #includes).
All AMD64 builds should have SSE2 enabled, since AMD64 has always supported SSE2.
For the i386 platform the options are:
A) Disable SSE2 completely
B) Create 2 versions of the library (something like /usr/lib and /usr/lib/sse2 - http://wiki.debian.org/Multiarch/LibraryPathOverview )
C) Enable GCC's SSE2 option (-msse2) only for the ".o" files that contain SSE2 intrinsics. (The library chooses the correct code path at runtime)
Option A:
clanSWRender is not available.
There will be no difference for games without sound that use OpenGL.
With sound, the size of the slowdown depends on the CPU speed.
Option B:
This adds extra complexity for little gain.
Application developers have to be aware that SSE2 may or may not be available.
A non-SSE2 clanSWRender would have to be implemented that throws a "not available" exception.
In the future, ClanLib 2.4 may contain inline SSE2 intrinsics in the API. This might be an issue.
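
As a rough sketch of what "application developers have to be aware" would mean under Option B: the application tries the software renderer and falls back to another display target when the "not available" exception is thrown. Everything named here (DemoTargetUnavailable, demo_setup_swrender, demo_choose_target) is a hypothetical stand-in, not the ClanLib API.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception a non-SSE2 clanSWRender stub might throw.
struct DemoTargetUnavailable : std::runtime_error {
    explicit DemoTargetUnavailable(const std::string& what)
        : std::runtime_error(what) {}
};

// Hypothetical stand-in for setting up the software renderer.
// In the non-SSE2 build of the library this would always throw.
void demo_setup_swrender(bool sse2_build)
{
    if (!sse2_build)
        throw DemoTargetUnavailable("software renderer not available: SSE2 required");
}

// The application has to handle both library variants itself.
std::string demo_choose_target(bool sse2_build)
{
    try {
        demo_setup_swrender(sse2_build);
        return "swrender";
    } catch (const DemoTargetUnavailable&) {
        return "gl1"; // fall back to the OpenGL 1.3 target
    }
}
```

Every application that wants the software renderer would need a fallback like this, which is part of the extra complexity mentioned above.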
Option C:
This requires someone to change the ClanLib source code tree.
Having an "if (sse2_enabled) func_sse() else func()" check in the matrix functions would carry a performance penalty when they are called tens of thousands of times.
In the future, ClanLib 2.4 may contain inline SSE2 intrinsics in the API. This might be an issue.
It adds code complexity.
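
For reference, a minimal sketch of the runtime selection that Option C implies. Assumptions: GCC or Clang on x86, all names (demo_add, add_sse2, add_scalar, pick_add) are made up, and a per-function target attribute is used here in place of compiling separate ".o" files with -msse2, so that the sketch fits in one file. The CPU test is done once and the result is cached as a function pointer, rather than tested inside every call.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <emmintrin.h>
#define DEMO_HAVE_SSE2_PATH 1
#else
#define DEMO_HAVE_SSE2_PATH 0
#endif

using AddFn = void (*)(const std::uint32_t*, const std::uint32_t*,
                       std::uint32_t*, std::size_t);

// Plain C++ path, safe on any CPU.
static void add_scalar(const std::uint32_t* a, const std::uint32_t* b,
                       std::uint32_t* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

#if DEMO_HAVE_SSE2_PATH
// Only this function is compiled with SSE2 enabled.
__attribute__((target("sse2")))
static void add_sse2(const std::uint32_t* a, const std::uint32_t* b,
                     std::uint32_t* out, std::size_t n)
{
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i),
                         _mm_add_epi32(va, vb));
    }
    for (; i < n; ++i)  // remaining 0..3 elements
        out[i] = a[i] + b[i];
}
#endif

// Runs the CPU feature test and returns the matching implementation.
static AddFn pick_add()
{
#if DEMO_HAVE_SSE2_PATH
    if (__builtin_cpu_supports("sse2"))
        return add_sse2;
#endif
    return add_scalar;
}

// Public entry point: the feature test runs on the first call only;
// later calls cost the static-init guard check plus an indirect call.
void demo_add(const std::uint32_t* a, const std::uint32_t* b,
              std::uint32_t* out, std::size_t n)
{
    static const AddFn fn = pick_add();
    fn(a, b, out, n);
}
```

This reduces the per-call cost compared with testing the CPU in every call, but an indirect call still prevents inlining, so for small matrix functions called in tight loops some penalty remains. It also does not help if SSE2 intrinsics end up inline in public headers.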
So I think the simplest option, Option A, is preferable.
The work involved in Options B and C is not worth the effort. Also, the ClanLib core developers may be reluctant to maintain it, since they generally work on Windows 7 and have a minimum system specification.