Hi,

I have been attempting to debug the armhf build failure of ffmpeg 3.4 [1] and have encountered some strange issues.

The test which fails is the "checkasm-float_dsp" test - specifically in the VFP optimizations. This part of the test was added in 3.4 so there are no previous versions of ffmpeg to compare it with.

On harris.d.o the test passes.

On amdahl.d.o the test fails with wrong results. I think this is because the optimizations try to use the SIMD parts of VFP which don't exist in ARMv8 anymore (even in 32-bit mode).

On abel.d.o (and probably the buildd) the code causes a buffer overflow and the test is aborted by gcc's stack smashing code.

The relevant function is "ff_vector_fmul_vfp" which is the first function in libavutil/arm/float_dsp_vfp.S [2]. If I step through this function, I see in gdb that the inner loop is incorrectly executed 17 times instead of 16. However, by inserting a nop before the bgt at the end of the loop, the loop is correctly executed 16 times and the tests pass:

> ittt ge
> vmulge.f32 s8, s0, s8
> vstmiage r0!, {s24-s27}
> vstmiage r0!, {s28-s31}
> nop
> bgt 1b

Can anyone explain how adding a nop fixes the bug, and why the test only fails on this machine?

[1] https://buildd.debian.org/status/fetch.php?pkg=ffmpeg&arch=armhf&ver=7%3A3.4-1&stamp=1508621764&raw=0
[2] https://anonscm.debian.org/cgit/pkg-multimedia/ffmpeg.git/tree/libavutil/arm/float_dsp_vfp.S?h=debian/7%253.4-1

Thanks,
James