
Re: cortex / arm-hardfloat-linux-gnueabi (was Re: armelfp: new architecture name for an armel variant)



> > Switching to the hard-float ABI certainly does give some benefit. While
> > 20% isn't a trivial difference, it's important to keep this in context.
> >  This is on top of what I'd guess is a 10x (i.e. 1000%) speedup achieved
> > without breaking the ABI and requiring a whole new port.
> 
> How do you figure a 10x speedup?

A fairly conservative guess at the cost of software floating point. Even a 
dog-slow FPU like the one on the Cortex-A8 should be at least an order of 
magnitude faster than software.

> > about performance then a NEON optimized version of your critical code
> > should get you another 4x or so on a Cortex-A8.
> 
> Yes it's about 4x mathematically but 2x in practice because of the ABI
> fudging.

Theoretical peak gain is way more than 4x. VFP on the A8 has a peak single 
precision performance of about 0.1 FLOP/cycle, maybe 0.2 if you enable runfast 
mode. NEON peak performance is 4 FLOP/cycle.
I've seen 2-3x speedup on plain scalar code without even attempting 
vectorization, so 4x seems fairly realistic given a bit of effort.

> >> What would not be so great is that even if it was fixed, the option to
> >> use a faster floating point ABI drags in a clone of
> >> every package on your system (at the very least, libc, libm, and all
> >> the system library dependencies) increasing the
> >> size of the installed system.
> > 
> > What you're describing here is multiarch.
> 
> Yes, which is needed anyway to support NEON where it's available. 

A new port (or arch) is only required if you break the ABI. Enabling NEON has 
no effect on the ABI.

Paul

