Re: Bug#956324: Clustalo bus error on mipsel (Was: Bug#956324: python-biopython: FTBFS on mipsel)
> On Apr 29, 2020, at 02:12, Andreas Tille <andreas@an3as.eu> wrote:
>
> Hi,
>
> On Wed, Apr 29, 2020 at 10:30:35AM +0800, 黄佳文 wrote:
>> I am a developer from Loongson company (R & D CPU/mip64el), I've been
>> looking at this recently.
>
> Very nice to see mips developers to care for biological software. :-)
>
>> I did two experiments, and I found that when I used Python 3.7 to compile
>> python-biopython, the build succeeded.
>> In the same environment, I just upgraded Python 3.7 to Python 3.8 and then
>> compiled python-biopython; the build fails, but not with a bus error.
>> I found through online query that some symbol tables were added and deleted
>> in upgrading Python 3.7 to 3.8. The following are symbol tables:
>
> Sorry to insist here - I do not think that it is a Python version problem
> at all. The issue can be reproduced in clustalo only which is pure C code.
> May be you have a look at
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=956324#59
>
> and the following discussion. Although Matthew has found some issues inside
> the C code, it did not help to prevent:
>
>
> Starting program: /home/tille/clustalo/src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal --force
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/mipsel-linux-gnu/libthread_db.so.1".
>
> Program received signal SIGBUS, Bus error.
> 0x5556a1b8 in PairDistances (distmat=0x7fff278c, mseq=0x55692a30, pairdist_type=<optimized out>, bPercID=<optimized out>, istart=0, iend=3, jstart=0, jend=3, fdist_in=0x0,
> fdist_out=0x0) at pair_dist.c:346
> 346 NewProgress(&prProgress, LogGetFP(&rLog, LOG_INFO),
>
>
> That's the issue we need to care about here.
To add another data point to this discussion: one other (fruitless) thing I tried previously was cross-compiling Clustal Omega. From an amd64 host, it’s possible to target mipsel using the GCC cross-compilers in the standard Debian repositories, and then run the resulting binary under QEMU’s user-mode emulation. Built and run this way, the f002 test runs to completion with no bus error. This is not really surprising: AFAIK, unaligned accesses that would trigger a bus error on mipsel hardware are silently allowed in this configuration (QEMU doesn’t faithfully emulate this hardware behaviour, and amd64 allows unaligned access).
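For anyone unfamiliar with this class of bug, here is a minimal C sketch of what an unaligned access looks like and how it is usually avoided. It is not taken from the Clustal Omega source, and it has not been established that the crash at pair_dist.c:346 is of this kind; the function names (`is_aligned`, `load_u32`, `roundtrip_at_offset`) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if p is a multiple of align (align must be a power of two). */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Reads a 32-bit value from any address. memcpy leaves it to the compiler
   to emit a load sequence that is valid for the target, so this does not
   rely on the address being 4-byte aligned. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical demonstration: store a value at a byte offset inside an
   aligned buffer and read it back through load_u32. */
uint32_t roundtrip_at_offset(size_t offset, uint32_t value)
{
    uint32_t storage[4] = {0};                 /* 4-byte aligned backing store */
    unsigned char *buf = (unsigned char *)storage;

    assert(offset + sizeof value <= sizeof storage);
    memcpy(buf + offset, &value, sizeof value);

    /* The risky variant would be:
         return *(uint32_t *)(buf + offset);
       With an odd offset that dereferences a misaligned pointer, which is
       undefined behaviour in C. It tends to go unnoticed on amd64 but can
       fault on hardware that enforces alignment. */
    return load_u32(buf + offset);
}
```

Calling `roundtrip_at_offset(1, 0xDEADBEEF)` exercises the odd-offset case without ever dereferencing a misaligned pointer. As far as I know, GCC and Clang can also report misaligned dereferences at run time with `-fsanitize=alignment` (part of UBSan), where the toolchain supports it on the target.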
Unfortunately, the repositories’ cross-compilers have been built without ASan enabled, and you can’t attach a native Valgrind to an emulated mipsel process, so debugging memory-safety issues is not straightforward. To go further with this approach, you would have to build a mipsel-targeting cross-compiler with ASan enabled, or cross-compile Valgrind to mipsel. For a true masochist, it may be possible to attach to the process with GDB or rr and reverse-step from the location Andreas has quoted, but I wouldn’t trust the debugger not to crash in this configuration. Even then, the issue may not be reproducible, because it may depend on transformations/optimizations performed only by the particular version of the native mipsel compiler used during packaging.
For those on this thread who have access to mipsel hardware, or who can shell in to one of the mipsel build machines, I would suggest running an ASan-instrumented test there (`export CFLAGS="-g -fsanitize=address"; export CXXFLAGS="-g -fsanitize=address"`) and seeing what we learn.