
RE: #215067 mozilla FTBFS



ARM has 2 different floating point formats.

The newer one is called VFP and is ANSI/IEEE 754-1985 compliant.  I'm not
sure which processors (if any yet) have this hardware.

The older floating point format is called FPA.

Most ARM processors don't have floating point hardware.  These are
typically run with FP emulators in the kernel.  I think these emulators
use the older floating point format (for now).  There are also soft fp
libraries and compiler options.

The main difference between VFP and FPA is the fp instruction set.
But there is also a difference in memory layout (as you suspect).
From the ARM Architecture Reference Manual, page C2-4:

"The word order [for DP format] defined here for the VFP architecture
differs from that of the earlier FPA floating-point architecture.  In
the FPA architecture, the most significant word always appeared at the
lower memory address, with the least significant word at the higher,
regardless of the memory system endianness".

For VFP it states, for the DP format:

"When held in memory, the two words must appear consecutively and must
both be word-aligned.  The order of the two words depends on the
endianness of the memory system:

- In a little-endian memory system, the least significant word appears at
the lower memory address and the most significant word at the higher
memory address.

- In a big-endian memory system, the most significant word appears at
the lower memory address and the least significant word at the higher
memory address."
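
To make the two quotes concrete, here is a minimal sketch in C that
takes a double's bit pattern and reports which 32-bit word each layout
would put at the lower and the higher address.  The enum and function
names are made up for illustration.  It assumes the machine it runs on
has 64-bit IEEE doubles and uses the same word order for doubles and
64-bit integers (so it is meant for a desktop host, not an FPA system).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for the three cases described in the manual quotes. */
enum dp_layout { FPA_ANY_ENDIAN, VFP_LITTLE_ENDIAN, VFP_BIG_ENDIAN };

/* Fill out[0] (lower address) and out[1] (higher address) with the two
 * 32-bit words of d, in the order the given layout would store them.
 * Assumption: the host double is IEEE 754 binary64 and shares its word
 * order with uint64_t, so memcpy yields the plain 64-bit bit pattern. */
static void dp_words_in_memory(double d, enum dp_layout layout,
                               uint32_t out[2])
{
    uint64_t bits;
    uint32_t msw, lsw;

    assert(sizeof bits == sizeof d);
    memcpy(&bits, &d, sizeof bits);

    msw = (uint32_t)(bits >> 32);          /* sign, exponent, top of mantissa */
    lsw = (uint32_t)(bits & 0xffffffffu);  /* low 32 bits of mantissa */

    if (layout == VFP_LITTLE_ENDIAN) {
        /* Only this case puts the least significant word first. */
        out[0] = lsw;
        out[1] = msw;
    } else {
        /* FPA (either endianness) and big-endian VFP: most significant first. */
        out[0] = msw;
        out[1] = lsw;
    }
}
```

For 1.0 the most significant word is 0x3ff00000 and the least
significant word is 0, so FPA stores 0x3ff00000 at the lower address
while little-endian VFP stores it at the higher address.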

The whole floating point situation on ARM seems too complicated, what
with fp emulators vs soft fp libraries, and 2 different double
precision memory layouts.

I think for now, only the old FPA memory layout is used in Debian
binaries.  I think this is the format you describe in your message
that's causing the problem.
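
In the quoted report, word1(d) comes out as 0x3ff00000 for d = 1.0.
That is what one would expect if the accessors index the two words the
way a little-endian IEEE build would, while the double is actually
stored in FPA order.  A minimal sketch of that mismatch, assuming that
indexing (the struct and function names below are made up for
illustration and are not the ones used in prdtoa.c):

```c
#include <stdint.h>

/* The two 32-bit words of a double in memory order:
 * w[0] is at the lower address, w[1] at the higher address. */
typedef struct { uint32_t w[2]; } dp_mem;

/* 1.0 has the bit pattern 0x3ff00000:00000000.  FPA stores the most
 * significant word at the lower address, whatever the endianness. */
static dp_mem one_in_fpa_layout(void)
{
    dp_mem m = { { 0x3ff00000u, 0x00000000u } };
    return m;
}

/* Accessors as a purely little-endian layout would index them
 * (assumption: high word at the higher address). */
static uint32_t hi_word_le(const dp_mem *m) { return m->w[1]; }
static uint32_t lo_word_le(const dp_mem *m) { return m->w[0]; }

/* Accessors that match the FPA layout (high word at the lower address). */
static uint32_t hi_word_fpa(const dp_mem *m) { return m->w[0]; }
static uint32_t lo_word_fpa(const dp_mem *m) { return m->w[1]; }

/* The 11-bit biased exponent sits in bits 20..30 of the high word. */
static unsigned biased_exponent(uint32_t hi) { return (hi >> 20) & 0x7ffu; }
```

With the little-endian accessors, the "low" word of 1.0 reads as
0x3ff00000 and the exponent field reads as 0 instead of 1023, so any
code that manipulates the bits directly goes badly wrong.  With the FPA
accessors the exponent field reads as 1023, as expected.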

In the longer term I wonder how the transition to the new memory layout
and VFP instructions would be accomplished...

Hope this helps.

> -----Original Message-----
> From: Adam C Powell IV [mailto:hazelsct@debian.org] 
> Sent: Thursday, October 23, 2003 8:41 AM
> To: Mark Howard
> Cc: Debian ARM; 215067-forwarded@bugs.debian.org
> Subject: Re: #215067 mozilla FTBFS
> 
> 
> Okay ARM hackers, as a user/neophyte, I need your help.
> 
> As discussed below (the good stuff is at the end), I've 
> traced the mozilla segfault to their PR_dtoa function, which 
> converts doubles to strings.  Because it directly manipulates 
> the bits of doubles, it is making some gross errors, like 
> apparently converting 1.0 to 5^1242306295!  This overflows a 
> table, which I've worked around to fix the segfault (patch 
> attached to previous post), but the workaround results in 
> insane use of time and memory to analyze these huge numbers.
> 
> Does ARM store doubles in a non-IEEE way?  Is it secretly 
> big-endian for its float/double emulation, and little-endian 
> for ints?  What else could be wrong with mozilla's 
> assumptions about double format?
> 
> Please help out here, I think this may have been the cause of 
> a lot of related problems, and when it's gone I think we'll 
> have a decently working mozilla/galeon/epiphany on ARM -- or 
> at least, one that builds and installs!
> 
> On Tue, 2003-10-21 at 11:15, Adam C Powell IV wrote:
> > On Mon, 2003-10-20 at 22:24, Adam C Powell IV wrote:
> > > Okay, just a bit more "manual backtrace" info:
> > > 
> > > On Mon, 2003-10-20 at 21:06, Adam C Powell IV wrote:
> > > > During the call to NSS_Init, nss_makeFlags(1,0,0,0,0,1) returns 
> > > > 0x219a8, and the resulting moduleSpec is:
> > > > 
> > > > name="NSS Internal Module" 
> > > > parameters="configdir='/home/hazelsct/.netscape' certPrefix='' 
> > > > keyPrefix='' secmod='secmod.db' flags=readOnly,optimizeSpace " 
> > > > NSS="flags=internal,moduleDB,moduleDBOnly,critical"
> > > > 
> > > > > Then SECMOD_LoadModule() returns something non-null, but
> > > > > apparently ->loaded is zero because nss_Init returns -1.
> > > > 
> > > > > During the call to NSS_NoDB_Init, nss_makeFlags(1,1,1,1,0,1)
> > > > > returns 0x25268 (okay, maybe this is an address whose value is
> > > > > meaningless, not what I thought), and the resulting moduleSpec is:
> > > > 
> > > > > name="NSS Internal Module" parameters="configdir='' certPrefix=''
> > > > > keyPrefix='' secmod=''
> > > > > flags=readOnly,noCertDB,noModDB,forceOpen,optimizeSpace "
> > > > > NSS="flags=internal,moduleDB,moduleDBOnly,critical"
> > > > 
> > > > > Then ->loaded seems to work, because it calls secoid_Init(), then
> > > > > segfaults in the call to STAN_LoadDefaultNSS3TrustDomain().  Which
> > > > > in turn segfaults in NSSTrustDomain_Create(), which segfaults in
> > > > > NSSArena_Create().  (God, how I wish I could just "backtrace"!!)
> > > 
> > > > This calls nss_ClearErrorStack() in nss/lib/base/arena.c, which
> > > > calls error_get_my_stack(), and since error_stack_index=0, it calls
> > > > PR_CallOnce() in nsprpub/pr/src/misc/prinit.c; that's where the
> > > > segfault is.
> > 
> > Okay, now I can't tear myself away...
> > 
> > > PR_CallOnce() calls the function passed to it by error_get_my_stack(),
> > > which is error_once_function(); that calls
> > > nss_NewThreadPrivateIndex(), which calls set_whatnspr(), which calls
> > > PR_dtoa() in nsprpub/pr/src/misc/prdtoa.c, which is supposed to print
> > > a double value to an ASCII string.
> > 
> > > Its pow5mult() is EXTREMELY slow on ARM, taking about 30 seconds to
> > > segfault, with most of the time spent in the very slow mult() function
> > > it calls, which is where the segfault is.  (Is there really no better
> > > way to print double values?)  It segfaults in mult() on "c =
> > > Balloc(k)" with k=16 (after succeeding in about 20 previous mult()
> > > calls with k=1 to 15).
> > 
> > > Okay.  So Balloc() uses a freelist to find a small chunk of memory of
> > > size k.  Most of the time (in fact, all but one other time when k=1),
> > > freelist[k] is NULL, so it allocates
> > > (2^k-1)*sizeof(Long)+sizeof(Bigint)
> > > bytes.  For some reason, when k=16, freelist[k] is non-null, and
> > > PR_Unlock(freelist_lock) segfaults, perhaps because it sets freelist[16]
> > > to rv->next.  [Doesn't glibc already use something like this freelist to
> > > handle small malloc/free entries efficiently?]
> 
> Scratch the glibc comment, the freelist seems an efficient 
> way to handle repeated allocs of a small set of fixed sizes.
> 
> > > Akhaa!  The freelist is just Kmax=15 entries long, explaining the
> > > bogus non-null freelist[16], and the segfault.  The attached patch
> > > thus cures the segfault, and should be sent upstream; note however
> > > that it leaks memory, as I couldn't figure out how to use PR_Free
> > > (compiler error 'missing binary operator before token "PR_Free"' even
> > > though it's void).
> 
> This does indeed fix the segfault.  But it takes about 30 
> seconds to reach the point where k>Kmax, so a better fix 
> would be to just throw some kind of overflow error when it 
> > reaches this point -- or better yet, when pow5mult gets an 
> > argument large enough to push k above Kmax.
> 
> > So why does it take so long (25 minutes and counting) and so much 
> > memory (1400K and counting) to represent double numbers -- like 1.0 
> > (which PR_dtoa is called with here) -- as strings?
> 
> Make that 45 hours and 19 MB and counting...  (Yes, it's 
> rather pointless at this point, but just for fun. :-)
> 
> > There may be a deeper ARM
> > issue involved, which is why it calls pow5mult(i2b(1),1242306295).
> > > Perhaps it's trying to print 5^1242306295?  So d=1.0, but d2=4.29078e+9,
> > > and d2 -> k -> s5 is the argument to pow5mult.  d2 is set using the
> > > macros:
> > > 
> > > dval(d2) = x = word1(d) << (32-i), where i=30 and word1(d)=0x3ff00000,
> > > word0(d2) -= 31*Exp_msk1, where Exp_msk1=0x80,
> > 
> > > where word0() and word1() are unsigned longs representing the first
> > > and second sub-words of d (the dtoa argument, double 1.0).  I'd hate
> > > to see what this does on a 64-bit arch, where a double doesn't have
> > > two long sub-words!  I'd speculate it's that word1(d) which is causing
> > > the problem here.
> > 
> > > I'm out of time to work on this today, but needless to say, PR_dtoa is
> > > quite broken on ARM, and almost certainly needlessly duplicates
> > > something which is done very well by glibc.
> -- 
> -Adam P.
> 
> GPG fingerprint: D54D 1AEE B11C CE9B A02B  C5DD 526F 01E8 564E E4B6
> 
> > Welcome to the best software in the world today cafe!
> > http://lyre.mit.edu/~powell/The_Best_Stuff_In_The_World_Today_Cafe.ogg

