ffmpeg 3.4 vfp optimization issues



Hi,

I have been attempting to debug the armhf build failure of ffmpeg 3.4
[1] and have encountered some strange issues. The test which fails is
the "checkasm-float_dsp" test - specifically in the VFP optimizations.
This part of the test was added in 3.4, so there are no previous versions
of ffmpeg to compare it with.

On harris.d.o the test passes.

On amdahl.d.o the test fails with wrong results. I think this is because
the optimizations try to use the short-vector ("SIMD") mode of VFP, which
is no longer available on ARMv8 (even in 32-bit mode).

On abel.d.o (and probably the buildd) the code causes a buffer overflow
and the test is aborted by gcc's stack-smashing protector. The relevant
function is "ff_vector_fmul_vfp" which is the first function in
libavutil/arm/float_dsp_vfp.S [2]. If I step through this function, I
see in gdb that the inner loop is incorrectly executed 17 times instead
of 16. However, by inserting a nop before the bgt at the end of the
loop, the loop is correctly executed 16 times and the tests pass:

>  ittt            ge
>  vmulge.f32      s8,  s0,  s8
>  vstmiage        r0!, {s24-s27}
>  vstmiage        r0!, {s28-s31}
>  nop
>  bgt             1b

Can anyone explain how adding a nop fixes the bug, and why the test only
fails on this machine?
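
One way to confirm the overrun independently of the stack protector is a
guard-region check around the destination buffer. The following is only a
rough sketch under assumptions: it treats the function as an elementwise
product (dst[i] = src0[i] * src1[i]), and the names vector_fmul_ref,
vector_fmul_overrun, count_clobbered_guard_bytes, LEN and GUARD are
hypothetical and not taken from ffmpeg or checkasm. The deliberately broken
variant merely stands in for a loop that runs one iteration too often; the
8 extra floats are an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LEN   256  /* assumed buffer length, not taken from checkasm */
#define GUARD 16   /* guard floats placed after the destination */

typedef void (*vector_fmul_fn)(float *dst, const float *src0,
                               const float *src1, int len);

/* Plain C reference: dst[i] = src0[i] * src1[i]. */
static void vector_fmul_ref(float *dst, const float *src0,
                            const float *src1, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[i];
}

/* Deliberately broken variant that writes 8 floats too many, standing in
 * for a loop that is executed once more than it should be. */
static void vector_fmul_overrun(float *dst, const float *src0,
                                const float *src1, int len)
{
    for (int i = 0; i < len + 8; i++)
        dst[i] = src0[i] * src1[i];
}

/* Runs fn on len floats and returns how many guard bytes located right
 * after dst[len - 1] were modified (0 means no overrun was detected). */
static int count_clobbered_guard_bytes(vector_fmul_fn fn, int len)
{
    static float src0[LEN + GUARD], src1[LEN + GUARD], dst[LEN + GUARD];
    const unsigned char *guard;
    int clobbered = 0;

    assert(len >= 0 && len <= LEN);

    for (int i = 0; i < LEN + GUARD; i++) {
        src0[i] = (float)i;
        src1[i] = 0.5f;
    }

    /* Fill the destination and its guard area with a known byte pattern. */
    memset(dst, 0xA5, sizeof(dst));

    fn(dst, src0, src1, len);

    guard = (const unsigned char *)(dst + len);
    for (size_t i = 0; i < GUARD * sizeof(float); i++)
        if (guard[i] != 0xA5)
            clobbered++;

    return clobbered;
}
```

Passing the assembly routine as fn on the affected machine should then
report a non-zero count if the extra iteration really writes past the end
of dst, while the plain C reference should report 0.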

[1]
https://buildd.debian.org/status/fetch.php?pkg=ffmpeg&arch=armhf&ver=7%3A3.4-1&stamp=1508621764&raw=0

[2]
https://anonscm.debian.org/cgit/pkg-multimedia/ffmpeg.git/tree/libavutil/arm/float_dsp_vfp.S?h=debian/7%253.4-1

Thanks,
James
