
Re: Fwd: Bug#738981: Switch to use generic_fpu for ARM



Reinhard Tartler wrote:
Dear ARM porters,

Please see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=738981
for full context. I've uploaded a patch proposed by Riku that AFAIUI
makes mpg123 really slow on all arm targets, while unbreaking it on
some others.

As one of the maintainers of the mpg123 without familiarity about the
scope of the Debian arm port

Debian has TWO official arm ports.

Armel has a minimum CPU requirement of armv4t, with no guarantee that there will be any floating-point hardware at all. Armhf has a minimum CPU requirement of armv7-a with vfpv3-d16; NEON is not guaranteed to be available (but it is very likely to be in practice).

As well as the two official arm ports, there is also Raspbian, which targets armv6 with vfpv2. Raspbian also uses the armhf architecture name but can be distinguished by using dpkg-vendor.

---------- Forwarded message ----------
From: Thomas Orgis <thomas-forum@orgis.org>
Date: Sun, Feb 16, 2014 at 5:46 AM
Subject: Bug#738981: Switch to use generic_fpu for ARM
To: 738981@bugs.debian.org


Sorry for being late to the party, but I have to say that this is a
rather unfortunate situation now. Not using the assembly-optimized
fixed-point ARM code of the arm_nofpu decoder and resorting to the
generic_fpu one (all plain C) will make mpg123 really slow in
comparison. I'm not sure what hardware we are targeting here ... is it
armel with softfloat? With gcc -mfpu=vfp, generic_fpu might be fine,
although using the neon decoder is still preferred on supporting CPUs.

There is no runtime detection in mpg123 for this and at least for the
decision of fixed or floating point decoding, it likely will never be
as that is a very basic decision on the whole decoder code, not just
some optimization. I can imagine combining generic_fpu and neon builds
with run-time detection,

That seems like a good approach for Debian armhf, if it were implemented.

 but this still assumes a hardware floating
point unit to make sense. We have arm_nofpu and generic_nofpu for the
cases without one.
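
To illustrate the kind of run-time choice between generic_fpu and neon discussed above, here is a minimal sketch in C. It is not mpg123 code: every `sketch_` name is hypothetical, and it assumes a 32-bit ARM Linux system where the kernel reports CPU features through AT_HWCAP, with NEON on bit 12 (HWCAP_NEON in the kernel's ARM headers). On any other platform it simply falls back to generic_fpu. It only covers the float-capable case; the fixed-point versus floating-point decision would still have to be made at build time.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#endif

/* Assumption: NEON is bit 12 of AT_HWCAP on 32-bit ARM Linux.
 * Defined locally so the sketch also compiles on non-ARM hosts. */
#define SKETCH_HWCAP_NEON (1UL << 12)

/* Hypothetical identifiers for the two float-capable decoders. */
enum sketch_decoder { SKETCH_GENERIC_FPU, SKETCH_NEON };

/* Pure selection logic: choose NEON only when the hwcap word reports it. */
enum sketch_decoder sketch_pick_decoder(unsigned long hwcap)
{
    return (hwcap & SKETCH_HWCAP_NEON) ? SKETCH_NEON : SKETCH_GENERIC_FPU;
}

/* Query the running system; anywhere but 32-bit ARM Linux, fall back. */
enum sketch_decoder sketch_detect_decoder(void)
{
#if defined(__linux__) && defined(__arm__)
    return sketch_pick_decoder(getauxval(AT_HWCAP));
#else
    return SKETCH_GENERIC_FPU;
#endif
}

/* Map the choice to the decoder names used in this thread. */
const char *sketch_decoder_name(enum sketch_decoder d)
{
    assert(d == SKETCH_GENERIC_FPU || d == SKETCH_NEON);
    return d == SKETCH_NEON ? "neon" : "generic_fpu";
}
```

A caller would invoke `sketch_detect_decoder()` once at start-up and dispatch to the matching synthesis routines. Keeping the bit test in a separate pure function makes the selection testable on machines without NEON.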

Something which is possible right now is to produce one libmpg123.so with
the standard build to please users using slow ARM machines who just
want plain 16 bit playback and produce one libmpg123_float.so for people
using beefy machines and who are using audacious as a media player.

Seems like a good approach if the .so files can be guaranteed to be ABI compatible. There is already a mechanism for loading different versions of libraries based on hardware capabilities.

I could implement a conversion step to floating point with the
arm_nofpu decoder. That would make audacious work (although wasting
precision on machines that have hardware floating point, or even NEON)
and have the benefit of the command-line mpg123 still being fast with
16 bit output.

Seems like a reasonable idea for armel; frankly, people who have vfp hardware probably shouldn't be running armel anyway.
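
For reference, the conversion step proposed above amounts to rescaling each signed 16-bit sample into a float. The following is a minimal sketch, not mpg123's implementation: the function name is hypothetical, and it assumes the common convention of mapping the 16-bit range onto roughly [-1.0, 1.0) by dividing by 32768.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert n signed 16-bit PCM samples to float.
 * Assumption: full scale maps to [-1.0, 1.0), i.e. divide by 32768. */
void sketch_s16_to_float(const int16_t *in, float *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        out[i] = (float)in[i] / 32768.0f;
}
```

The output carries no more than 16 bits of real precision, which is the "wasting precision" trade-off mentioned above: the float format satisfies a consumer that requires it, but the samples are no more accurate than the fixed-point decoder produced.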

