Re: cortex / arm-hardfloat-linux-gnueabi (was Re: armelfp: new architecture name for an armel variant)
> > Switching to the hard-float ABI certainly does give some benefit. While
> > 20% isn't a trivial difference, it's important to keep this in context.
> > This is on top of what I'd guess is a 10x (i.e. 1000%) speedup achieved
> > without breaking the ABI and requiring a whole new port.
> How do you figure a 10x speedup?
A fairly conservative guess at the cost of software floating point. Even a
dog-slow FPU like the one on the Cortex-A8 should be at least an order of
magnitude faster than soft-float.
> > about performance then a NEON optimized version of your critical code
> > should get you another 4x or so on a Cortex-A8.
> Yes it's about 4x mathematically but 2x in practice because of the ABI
Theoretical peak gain is way more than 4x. VFP on the A8 has a peak single
precision performance of about 0.1 FLOP/cycle, maybe 0.2 if you enable runfast
mode. NEON peak performance is 4 FLOP/cycle, so the peak ratio is somewhere
between 20x and 40x.
I've seen 2-3x speedup on plain scalar code without even attempting
vectorization, so 4x seems fairly realistic given a bit of effort.
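
To make "a bit of effort" concrete, here is a minimal sketch of what a NEON
version of a hot loop can look like. The kernel (a single precision
y += a*x) and the name saxpy are my own illustration, not code from any
package discussed here. It uses the standard arm_neon.h intrinsics when the
compiler reports NEON and falls back to plain scalar C otherwise; the speedup
you actually get depends on the code and is not something this sketch
demonstrates.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical kernel: y[i] += a * x[i] for i in [0, n).
 * The prototype is identical whether or not NEON is used. */
void saxpy(float a, const float *x, float *y, size_t n)
{
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Process four floats per iteration in 128-bit NEON registers. */
    float32x4_t va = vdupq_n_f32(a);
    for (; i + 4 <= n; i += 4) {
        float32x4_t vx = vld1q_f32(x + i);
        float32x4_t vy = vld1q_f32(y + i);
        vy = vmlaq_f32(vy, va, vx);      /* vy + va * vx */
        vst1q_f32(y + i, vy);
    }
#endif

    /* Scalar tail, and the whole loop when NEON is not available. */
    for (; i < n; i++)
        y[i] += a * x[i];
}
```

With GCC, something like -mfpu=neon -mfloat-abi=softfp should enable the
NEON path on an armel system; callers see the same function either way.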
> >> What would not be so great is that even if it was fixed, the option to
> >> use a faster floating point ABI drags in a clone of every package on
> >> your system (at the very least, libc, libm, and all the system library
> >> dependencies) increasing the size of the installed system.
> > What you're describing here is multiarch.
> Yes, which is needed anyway to support NEON where it's available.
A new port (or arch) is only required if you break the ABI. Enabling NEON has
no effect on the ABI.
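
One way to see that the two are independent is to look at what the compiler
reports at build time. The sketch below is only an illustration: the function
names are made up, and it assumes a toolchain that defines the ACLE macros
__ARM_PCS_VFP (hard-float calling convention), __ARM_PCS (base calling
convention, i.e. soft or softfp) and __ARM_NEON. Older compilers may not
define all of them, in which case it just reports "unknown".

```c
/* Hypothetical helpers: report the float calling convention and NEON
 * availability as two separate compile-time facts. */

const char *float_abi(void)
{
#if defined(__ARM_PCS_VFP)
    return "hard";                 /* float arguments passed in VFP registers */
#elif defined(__ARM_PCS)
    return "base (soft/softfp)";   /* float arguments passed in core registers */
#else
    return "unknown";              /* not ARM, or macros not provided */
#endif
}

int have_neon(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}
```

Building with -mfloat-abi=softfp and toggling -mfpu=neon should change
have_neon() but leave float_abi() alone; only -mfloat-abi=hard changes the
calling convention, and that is the case that needs a new port.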