Re: SIGFPE and -mieee
Personally I don't consider a SIGFPE under non-IEEE conformance an indicator
of buggy software, and I'm going to share my rant about why! *grin*
About a year ago I was coding up some Monte Carlo simulations on our local
Beowulf cluster, and I kept having this problem where about every 100th time
I ran my code it would mysteriously SIGFPE. I wasted a great deal of time
looking for the 'bug' in my code, only to discover that I was being hit by
-mieee.
I was calculating an error correction term, and under the right draws in the
Monte Carlo simulation the error term would approach zero. This led to a
very small possibility (about 1 in 10000000) of the math library (the
standard C erf function in this case) generating a denormalized number.
Once a denormalized number has been generated in a program, it's sunk. The
Alpha raises a SIGFPE as soon as you load it into a register (you can't even
do a comparison to detect it without generating a SIGFPE). The only
solution (besides linking against the Compaq math libraries -- which don't
generate denormalized numbers) is to recompile with -mieee.
At the time I considered it a bug in the GNU math libraries. I figured they
should add some code to detect if -mieee is in operation and only generate
denormalized numbers if that is the case. I wrote some emails to the
appropriate list and basically got told that if the FPU wasn't IEEE compliant
then that wasn't their problem.
I now realize, however, that what I was trying to propose isn't even possible.
-mieee actually causes the compiler to set a bit in the opcode of the
floating point instructions it generates. There is no global bit in a FP
control register somewhere that you can check at runtime.
-mieee applies instruction by instruction. It's entirely possible to compile
your code with -mieee, link against a library that someone else has compiled
without -mieee, and boom, have your program crash just because you pass a
denormalized number to the library. Even worse, there is no way to crash
gracefully, as you don't even have precise exception handling for the FPU
without -mieee (well, you can, but you lose the speed advantage of not
having -mieee).
So, what am I trying to say? I guess just that I disagree. Because not
specifying -mieee causes a program to foul up on more than just division by
zero (more specifically, in the denormalization region just before an
underflow -- which arises very naturally in perfectly non-buggy software), I
think -mieee should be the norm.
Later -T
PS: Personally I consider it a bug in GCC. Optimizations that break
perfectly reasonable code should be off by default (how does a -nomieee flag
sound *grin*)! A new user (or casual package maintainer) should not be
expected to be familiar with all the esoteric architecture-specific options
for the compiler. Architecture-specific speed-tweaking options should only
have to be of interest to the hardcore speed demons.
PPS: I've focused on denormalization in this email, because it is the one
that would broadside most standard apps (I would suspect that is what is
happening to both mpg321 and Xine). However, the rest of the IEEE standard
is quite natural and makes perfect sense under the correct (non-esoteric)
circumstances as well.
For example, when you are working with sums from distributions, it makes
perfect sense for a positive underflow to go to positive zero, and then a
division by the positive zero to go to infinity, and then the infinity to go
back to positive zero again (in something like exp(-x/b) with x>0 and b->0+).
It is an indication of a term that does not contribute anything of
significance to the sum, not buggy software.
--
Tyson Whitehead (-twhitehe@uwo.ca -- WSC-)
Computer Engineer Dept. of Applied Mathematics,
Graduate Student- Applied Mathematics University of Western Ontario,
GnuPG Key ID# 0x8A2AB5D8 London, Ontario, Canada