
Re: cortex / arm-hardfloat-linux-gnueabi (was Re: armelfp: new architecture name for an armel variant)



On Thursday 15 July 2010 19:19:13 Paul Brook wrote:
> However changing the ABI doesn't solve many of the underlying problems.
> Specifically how to provide optimized binaries that take advantage of new
> features on modern CPUs while still supporting older hardware.

You can't bridge that gap. There are many optimized distros out there, each of 
them requiring a specific minimum to run, for a reason. It would be impossible, 
or at the very least extremely hard, for them to maintain backwards 
compatibility, and in the end, what for? To have yet another port with 
substandard performance and to read articles about how Gentoo or even Ubuntu 
beats the crap out of base Debian in speed? (I know Debian is not about speed, 
but seriously, it doesn't have to be *slow*.)

> Switching to the hard-float ABI certainly does give some benefit. While 20%
> isn't a trivial difference, it's important to keep this in context.  This
>  is on top of what I'd guess is a 10x (i.e. 1000%) speedup achieved without
>  breaking the ABI and requiring a whole new port.  If you're really serious
>  about performance then a NEON optimized version of your critical code
>  should get you another 4x or so on a Cortex-A8.

Yes, and we're working on NEON optimizations as well, but that's much harder, 
as it requires algorithm optimization and not just a simple recompile; 
autovectorization is a joke. E.g. optimizing NEON took less than a day, but I 
already knew how it worked as I had done the AltiVec port before, and because 
it was such a beautiful design. Stuff like zlib took longer (I had to rework 
the algorithm to parallelize it), and it was useless after all because the 
license prevents altered binary releases.

> What you're describing here is multiarch.

Multiarch still isn't there, at least 5 years after I saw the first 
presentation. It may have been hard on x86/x86_64 or ppc/ppc64, where there 
were only 2 variants; here we have what? 5? 10? I seriously think it's not 
worth it.

It's much easier, and less intrusive to the end user, to have a lean system 
that does what it promises to do than to force all the variants upon him.

Regards

Konstantinos

