
[PATCH 1/2]: FastFPE 'lfm' fix for denormalized numbers

FastFPE's implementation of the lfm opcode ('load floating-point
multiple') marks every floating-point number that it loads as normalized.
Unfortunately, FastFPE represents zero and infinity as denormalized
numbers (with the 'normalized' bit clear), so lfm corrupts any stored
zero or infinity value when it is reloaded from memory.

The enclosed patch fixes this bug by testing for the two magic exponents
in FastFPE that indicate zero or infinity/NaN, and ensuring that the
'normalized' bit is cleared for those values.
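
For readers who do not read ARM assembly, the change can be sketched in C.
This is only an illustrative model, not code from FastFPE: the struct, the
function names and the constant names are hypothetical, and the field layout
is inferred from the offsets (0, 4, 8, 12) and the constants that appear in
the patch below. It assumes the first word in memory carries the sign in its
top bit, the same bit position that serves as the 'normalized' bit in the
internal high mantissa word.

```c
#include <stdint.h>

/* Hypothetical model of one FastFPE internal register, laid out to match
 * the offsets used in the patch: 0 = sign, 4 = high mantissa word,
 * 8 = low mantissa word, 12 = exponent. */
struct ff_reg {
    uint32_t sign;
    uint32_t mant_hi;
    uint32_t mant_lo;
    uint32_t exp;
};

#define FF_TOP_BIT      0x80000000u /* sign bit in memory, 'normalized' bit internally */
#define FF_EXP_ZERO     0x80000000u /* exponent the patch tests for zero */
#define FF_EXP_INF_NAN  0x7fffffffu /* exponent the patch tests for inf or NaN */

/* Behaviour before the patch: the 'normalized' bit is always set.
 * mem[0] = sign + high mantissa, mem[1] = low mantissa, mem[2] = exponent. */
static void lfm_old(struct ff_reg *r, const uint32_t mem[3])
{
    r->sign    = mem[0] & FF_TOP_BIT;
    r->mant_hi = mem[0] | FF_TOP_BIT;
    r->mant_lo = mem[1];
    r->exp     = mem[2];
}

/* Behaviour after the patch: the 'normalized' bit is cleared when the
 * exponent is one of the two magic values, and set otherwise. */
static void lfm_patched(struct ff_reg *r, const uint32_t mem[3])
{
    r->sign    = mem[0] & FF_TOP_BIT;
    r->mant_lo = mem[1];
    r->exp     = mem[2];
    if (r->exp == FF_EXP_ZERO || r->exp == FF_EXP_INF_NAN)
        r->mant_hi = mem[0] & ~FF_TOP_BIT;
    else
        r->mant_hi = mem[0] | FF_TOP_BIT;
}
```

In this model, reloading a stored zero through lfm_old leaves the high
mantissa word as 0x80000000 instead of 0, which is the corruption described
above; lfm_patched leaves it as 0.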

Minimal test cases that illustrate the problem are at
<http://www.booyaka.com/~paul/arm/fpa11_lfm_tests.tar.gz>.  With the patch
below applied, the test cases produce identical results under NWFPE and
FastFPE.  (The test cases are derived from code posted earlier by
Nicholas Clark <nick@unfortu.net> - thanks!)

Particular thanks go to Tony Lindgren <tony@atomide.com> for providing the
opportunity to catch the bug, and for rebuilding Linux kernels for his
Psion as required to test solutions.

The following bug reports in other software are known to be caused by this
bug:

. 'Re: [ID 20020504.002] 1 == 1 but 0 != 0'

. 'Perl IPC::Open3 bug, affects apt-get, compiler issue?' 

- Paul

--- linux-2.4.19-rmk2/arch/arm/fastfpe/CPDT.S	Thu Sep 26 12:09:16 2002
+++ linux-2.4.19-rmk2-pjw/arch/arm/fastfpe/CPDT.S	Thu Sep 26 14:39:32 2002
@@ -402,16 +402,25 @@
 	.globl	CPDT_lfm
 	add	r2,r10,r0,lsr#8
-	ldr	r3,[r6],#4
-	and	r4,r3,#0x80000000
-	orr	r3,r3,#0x80000000
-	str	r4,[r2,#0]
-	str	r3,[r2,#4]
+	ldr	r4,[r6],#4
+	and	r3,r4,#0x80000000
+	str	r3,[r2,#0]		
 	ldr	r3,[r6],#4
 	str	r3,[r2,#8]
 	ldr	r3,[r6],#4
 	str	r3,[r2,#12]
+	cmp	r3,#0x80000000		@ does the exp indicate zero?
+	biceq	r4,r4,#0x80000000	@ if so, indicate 'denormalized'
+	beq	CPDT_lfm_storer4
+	cmp	r3,#0x7fffffff		@ does the exp indicate inf or NaN?
+	biceq	r4,r4,#0x80000000	@ if so, indicate 'denormalized'
+	beq	CPDT_lfm_storer4
+	orrne	r4,r4,#0x80000000	@ otherwise, set normalized bit
+	str	r4,[r2,#4]
 	add	r0,r0,#1<<12
 	and	r0,r0,#7<<12
 	subs	r1,r1,#1
