Re: Correct way to build .deb with -mieee
First, Chris, I do use autoconf, and could do an AM_CONDITIONAL around
something like petscgraphics_c_CFLAGS+=-mieee (I'll have to search the
automake list archives for the right name there). What's the right
architecture test to use in configure.in? Something like:
AM_CONDITIONAL(IS_ALPHA, test "$ARCH" = "alpha")
but what can I use for $ARCH? Or is there a test like AC_NEEDS_MIEEE?
I don't see such things in the autoconf info pages.
Stefan Schroepfer wrote:
Of course, if it's possible to avoid divides by zero without performance
penalty, I would love to.
> On Tue, 19 Jun 2001, Adam C Powell IV wrote:
>> I have a program (petscgraphics) which, when built without -mieee, fails
>> with SIGFPE (division by zero); with -mieee, works perfectly (still
>> divides by zero, but works anyway).
> Please excuse my ignorance (I know nothing about petscgraphics),
> but this problem could almost surely be solved in source code.
> ("Works anyway, but still divides by zero" --- brrr --- Is there
> any chance to use some limits to prevent this to happen? --- Does
> the code work on other architectures by chance? --- On all other
> Just speaking from my own experience. In most cases I have seen,
> the need for denormals came from erroneous source code and not
> from a real dependency on that special feature of a CPU.
The offending code considers a tetrahedron with field values at the
corners, call them C0, C1, C2 and C3 (doubles), and calculates the edge
intercepts of the plane defined by C=q in the linearized C field defined
by the corner values. It uses this to generate zero to two triangles
representing the cut of C=q across that tetrahedron. The set of such
triangles on all of the tetrahedra make up the C=q isoquant surface
approximation, which is sent to Geomview for display.
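As a rough illustration of the edge-intercept step just described, the sketch below interpolates along a single tetrahedron edge. The Point3 type and both function names are hypothetical, made up for this example and not taken from petscgraphics.c.

```c
/* Hypothetical 3-D point type, for illustration only. */
typedef struct { double x, y, z; } Point3;

/* Fraction t along the edge from corner a to corner b at which the
   linearly interpolated field equals q.  A value strictly between 0
   and 1 means the C=q plane cuts the edge.  Assumes cb != ca. */
static double edge_fraction(double ca, double cb, double q)
{
  return (q - ca) / (cb - ca);
}

/* Position of the intercept on the edge: a + t*(b - a). */
static Point3 edge_point(Point3 a, Point3 b, double t)
{
  Point3 p;
  p.x = a.x + t * (b.x - a.x);
  p.y = a.y + t * (b.y - a.y);
  p.z = a.z + t * (b.z - a.z);
  return p;
}
```

With corner values 1 and 3 and q=2, the fraction is one half, so the intercept sits at the midpoint of the edge.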
Here's the code, which loops through the six tetrahedra that make up a
hexahedron (often a cube):
for(tet=0; tet<6; tet++)
/* Within a tetrahedron, edges 0 through 5 connect corners:
0,1; 1,2; 2,0; 0,3; 1,3; 2,3 respectively */
c0 = tetras[tet][0];
c1 = tetras[tet][1];
c2 = tetras[tet][2];
c3 = tetras[tet][3];
edge0 = (isoquant-vals[c0]) / (vals[c1]-vals[c0]);
edge1 = (isoquant-vals[c1]) / (vals[c2]-vals[c1]);
edge3 = (isoquant-vals[c0]) / (vals[c3]-vals[c0]);
whichplane = (edge0>0. && edge0<1.) | ((edge1>0. && edge1<1.) << 1) |
((edge3>0. && edge3<1.) << 2);
isoquant, edge0,edge1,edge3, whichplane, color); CHKERRQ (ierr);
DrawTetWithPlane() is a static inline function which actually creates
the triangle(s). So the problem is, if C1=C0, C2=C1, or C3=C0, then
edge0, edge1 or edge3 intercept (respectively) will be infinite, i.e.
there is no intercept. The whichplane variable tells which of the seven
possible cuts through the edges is made here, with zero indicating no
triangles in this tetrahedron.
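To make the encoding concrete, here is a minimal sketch of how the three edge tests pack into a bit mask when a division by zero is allowed to yield an infinity or NaN instead of trapping (which, as I understand it, is the behaviour -mieee provides on Alpha). The helper names edge_bit and which_plane are invented for this example.

```c
/* 1 if the intercept fraction lies strictly inside the edge, else 0.
   When cb == ca the quotient is +inf, -inf or NaN; none of those can
   satisfy both comparisons, so the result is 0. */
static int edge_bit(double ca, double cb, double q)
{
  double t = (q - ca) / (cb - ca);
  return t > 0. && t < 1.;
}

/* Bit 0: edge 0 (corners 0,1); bit 1: edge 1 (corners 1,2);
   bit 2: edge 3 (corners 0,3).  Zero means no edge of the three
   is cut. */
static int which_plane(double c0, double c1, double c2, double c3,
                       double q)
{
  return edge_bit(c0, c1, q)
       | (edge_bit(c1, c2, q) << 1)
       | (edge_bit(c0, c3, q) << 2);
}
```

For corner values 0, 0, 1, 1 and q=0.5, edge 0 divides by zero and contributes nothing, while edges 1 and 3 are both cut at their midpoints, so bits 1 and 2 are set.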
I could trap for these conditions with an if statement around them, but
wouldn't that slow things down? Note that it is possible to have one of
these three divisions by zero and still produce a triangle or two; one
can even have C1=C2<q and C0=C3>q and get two triangles, so we can't
just test for any zeroes and skip the whole thing. It just seems easier
to do it this way, and because it's just for graphics, I don't really
care about the divide by zero.
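One possible way around the division by zero, sketched here and not tested against petscgraphics itself: an intercept fraction lies strictly between 0 and 1 exactly when q lies strictly between the two corner values, so each bit can come from comparisons alone, and the division is done only for edges that are actually cut. All function names below are hypothetical. Results right at the ends of the interval may differ from the original by rounding, differences small enough to underflow are not handled, and whether the extra comparisons cost anything measurable would need profiling.

```c
/* 1 if q lies strictly between ca and cb, else 0.  No division is
   involved, so ca == cb simply yields 0. */
static int edge_cut(double ca, double cb, double q)
{
  return (ca < q && q < cb) || (cb < q && q < ca);
}

/* Intercept fraction for a cut edge, or `fallback` when the edge is
   not cut.  A cut edge always has cb != ca. */
static double edge_fraction_safe(double ca, double cb, double q,
                                 double fallback)
{
  return edge_cut(ca, cb, q) ? (q - ca) / (cb - ca) : fallback;
}

/* Same bit layout as the original whichplane expression:
   bit 0 = edge 0, bit 1 = edge 1, bit 2 = edge 3. */
static int which_plane_nodiv(double c0, double c1, double c2,
                             double c3, double q)
{
  return edge_cut(c0, c1, q)
       | (edge_cut(c1, c2, q) << 1)
       | (edge_cut(c0, c3, q) << 2);
}
```

In the case mentioned above (C1=C2<q and C0=C3>q), edges 1 and 3 are degenerate and drop out, while edge 0 is still reported as cut, so the tetrahedron is not skipped.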
Then again, I do have a test for whichplane in there, and a
switch(whichplane) in DrawTetWithPlane()... Hmm, with that switch, I
could probably lose the if(whichplane). Then again, the optimizer might
do that automatically...
I don't know whether this algorithm is optimal, nor whether my
implementation can be improved, just that it worked right the first
time, with lots and lots of code in DrawTetWithPlane(), and I haven't
touched it since. If you have the interest and time to try to lose the
SIGFPE without -mieee, or even speed things up, feel free to apt-get
source petscgraphics and look at petscgraphics.c. (It takes longer than
the other files to compile; I'll bet the optimizer has quite a good time
vectorizing inlined DrawTetWithPlane() within two functions... :-)
Share and enjoy, thanks in advance for any help you can provide.
GPG fingerprint: D54D 1AEE B11C CE9B A02B C5DD 526F 01E8 564E E4B6
Welcome to the best software in the world today cafe!