
Re: Correct way to build .deb with -mieee



Hello,

First, Chris, I do use autoconf, and could do an AM_CONDITIONAL around something like petscgraphics_c_CFLAGS+=-mieee (I'll have to search the automake list archives for the right name there). What's the right architecture test to use in configure.in? Something like:

 AM_CONDITIONAL(IS_ALPHA, test "$ARCH" = "alpha")

but what can I use for $ARCH? Or is there a test like AC_NEEDS_MIEEE? I don't see such things in the autoconf info pages.

Stefan Schroepfer wrote:

On Tue, 19 Jun 2001, Adam C Powell IV wrote:

I have a program (petscgraphics) which, when built without -mieee, fails
with SIGFPE (division by zero); with -mieee, works perfectly (still
divides by zero, but works anyway).


Please excuse my ignorance (I know nothing about petscgraphics),
but this problem could almost surely be solved in source code.
("Works anyway, but still divides by zero" --- brrr --- Is there
any chance to use some limits to prevent this from happening? --- Does
the code work on other architectures by chance? --- On all other
architectures?).

Just speaking from my own experience. In most cases I have seen,
the need for denormals came from erroneous source code and not
from a real dependency on that special feature of a CPU.

Of course, if it's possible to avoid divides by zero without performance penalty, I would love to.

The offending code considers a tetrahedron with field values at the corners, call them C0, C1, C2 and C3 (doubles), and calculates the edge intercepts of the plane defined by C=q in the linearized C field defined by the corner values. It uses these intercepts to generate zero to two triangles representing the cut of C=q across that tetrahedron. The set of such triangles on all of the tetrahedra makes up the C=q isoquant surface approximation, which is sent to Geomview for display.
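To make the edge-intercept idea concrete, here is a minimal sketch of the linear interpolation along one edge. The helper names edge_param() and edge_point() are made up for illustration and do not appear in petscgraphics.c; the sketch only assumes that the field varies linearly between the two corner values.

```c
#include <stddef.h>

/* Hypothetical helper: parametric position t of the level q along an edge
   whose end values are va and vb, so that t=0 is the first corner and t=1
   is the second.  The caller must make sure va != vb. */
static double edge_param(double va, double vb, double q)
{
    return (q - va) / (vb - va);
}

/* Hypothetical helper: the point at parameter t on the segment pa -> pb. */
static void edge_point(const double pa[3], const double pb[3], double t,
                       double out[3])
{
    for (size_t i = 0; i < 3; i++)
        out[i] = pa[i] + t * (pb[i] - pa[i]);
}
```

An intercept only lies on the edge itself when 0 < t < 1, which is the range test the loop applies to each edge.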

Here's the code, which loops through the six tetrahedra that make up a hexahedron (often a cube):

 for(tet=0; tet<6; tet++)
   {
     /* Within a tetrahedron, edges 0 through 5 connect corners:
        0,1; 1,2; 2,0; 0,3; 1,3; 2,3 respectively */
     c0 = tetras[tet][0];
     c1 = tetras[tet][1];
     c2 = tetras[tet][2];
     c3 = tetras[tet][3];
     edge0 = (isoquant-vals[c0]) / (vals[c1]-vals[c0]);
     edge1 = (isoquant-vals[c1]) / (vals[c2]-vals[c1]);
     edge3 = (isoquant-vals[c0]) / (vals[c3]-vals[c0]);
     whichplane = (edge0>0. && edge0<1.) | ((edge1>0. && edge1<1.) << 1) |
       ((edge3>0. && edge3<1.) << 2);
     if (whichplane)
       {
         ierr=DrawTetWithPlane
           (coords[c0&1],coords[2+((c0&2)>>1)],coords[4+((c0&4)>>2)],vals[c0],
            coords[c1&1],coords[2+((c1&2)>>1)],coords[4+((c1&4)>>2)],vals[c1],
            coords[c2&1],coords[2+((c2&2)>>1)],coords[4+((c2&4)>>2)],vals[c2],
            coords[c3&1],coords[2+((c3&2)>>1)],coords[4+((c3&4)>>2)],vals[c3],
            isoquant, edge0,edge1,edge3, whichplane, color); CHKERRQ (ierr);
       }
   }

DrawTetWithPlane() is a static inline function which actually creates the triangle(s). So the problem is, if C1=C0, C2=C1, or C3=C0, then the edge0, edge1 or edge3 intercept (respectively) will be infinite (or NaN, if the numerator is zero as well), i.e. there is no intercept. The whichplane variable tells which of the seven possible cuts through the edges is made here, with zero indicating no triangles in this tetrahedron.
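For what it's worth, here is a small sketch of why the range test gives the right answer once the division no longer traps. It assumes default IEEE 754 arithmetic with floating-point traps disabled, which is the behaviour I understand -mieee to provide on Alpha; in_unit_interval() and runtime_divide() are names invented for this illustration.

```c
/* Same range test as in the loop above: 1 if 0 < t < 1, else 0. */
static int in_unit_interval(double t)
{
    return t > 0. && t < 1.;
}

/* Divide through volatile operands so the compiler cannot fold the
   division away at compile time; the quotient is computed at run time. */
static double runtime_divide(double num, double den)
{
    volatile double n = num, d = den;
    return n / d;
}
```

Under those assumptions, runtime_divide(1., 0.) is +infinity, runtime_divide(-1., 0.) is -infinity, and runtime_divide(0., 0.) is NaN. All three fail the range test (every ordered comparison with NaN is false), so the corresponding bit of whichplane stays clear.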

I could trap for these conditions with an if statement around them, but wouldn't that slow things down? Note that it is possible to have one of these three divisions by zero and still produce a triangle or two; one can even have C1=C2<q and C0=C3>q and get two triangles, so we can't just test for any zeroes and skip the whole thing. It just seems easier to do it this way, and because it's just for graphics, I don't really care about the divide by zero.
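One way the trap might look, as an untested sketch rather than what petscgraphics.c does: decide whether an edge is cut by comparing q with the two corner values, and divide only when it is. The names edge_cut() and which_plane() are hypothetical. Storing 0 for an edge that is not cut is also an assumption, since whether DrawTetWithPlane() ever reads the intercept of such an edge is not shown here.

```c
/* Hypothetical helper: if q lies strictly between va and vb, store the
   intercept parameter in *t and return 1.  Otherwise store 0 in *t and
   return 0 without dividing at all. */
static int edge_cut(double va, double vb, double q, double *t)
{
    if ((va < q && q < vb) || (vb < q && q < va)) {
        *t = (q - va) / (vb - va);  /* va != vb is guaranteed here */
        return 1;
    }
    *t = 0.;  /* placeholder value for an edge that is not cut */
    return 0;
}

/* Hypothetical helper: build the same 3-bit mask as whichplane from the
   four corner values v[0..3], using edges 0 (0,1), 1 (1,2) and 3 (0,3). */
static int which_plane(const double v[4], double q,
                       double *e0, double *e1, double *e3)
{
    return edge_cut(v[0], v[1], q, e0)
         | (edge_cut(v[1], v[2], q, e1) << 1)
         | (edge_cut(v[0], v[3], q, e3) << 2);
}
```

Mathematically, q strictly between the two corner values is the same condition as 0 < t < 1, so the mask should match the original except possibly when rounding pushes a quotient onto exactly 0 or 1. The cost is two comparisons per edge in place of a division that is then compared twice anyway, so I would not expect a slowdown, though that would need measuring.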

Then again, I do have a test for whichplane in there, and a switch(whichplane) in DrawTetWithPlane()... Hmm, with that switch, I could probably lose the if(whichplane). Then again, the optimizer might do that automatically...

I don't know whether this algorithm is optimal, nor whether my implementation can be improved, just that it worked right the first time, with lots and lots of code in DrawTetWithPlane(), and I haven't touched it since. If you have the interest and time to try to lose the SIGFPE without -mieee, or even speed things up, feel free to apt-get source petscgraphics and look at petscgraphics.c. (It takes longer than the other files to compile; I'll bet the optimizer has quite a good time vectorizing inlined DrawTetWithPlane() within two functions... :-)

Share and enjoy, thanks in advance for any help you can provide.
--

-Adam P.

GPG fingerprint: D54D 1AEE B11C CE9B A02B  C5DD 526F 01E8 564E E4B6

Welcome to the best software in the world today cafe! <http://lyre.mit.edu/%7Epowell/The_Best_Stuff_In_The_World_Today_Cafe.ogg>




