[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: Bug#956324: Clustalo bus error on mipsel (Was: Bug#956324: python-biopython: FTBFS on mipsel)




On Apr 17, 2020, at 13:18, Andreas Tille <tille@debian.org> wrote:

Hi Matthew,

On Fri, Apr 17, 2020 at 08:18:29AM -0700, Matthew Fernandez wrote:
Thanks for the patch which I applied to packaging Git.  I assume you
want to express that while these fixes are definitely good coding
practice the bus error problem is not fixed by it, right?

Thanks, Andreas. It may fix the bus error, but I don’t have a MIPS machine
to test on. Some of those logging calls had the potential to leave you with
a misaligned stack pointer. IIUC unaligned loads on MIPS could cause such a
bus error.

I tried with hope ... but failed:

(sid_mipsel-dchroot)tille@eller:~/clustalo$ gdb --args src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal --force
GNU gdb (Debian 9.1-3) 9.1
...
Reading symbols from src/clustalo...
(gdb) run
Starting program: /home/tille/clustalo/src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal --force
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/mipsel-linux-gnu/libthread_db.so.1".

Program received signal SIGBUS, Bus error.
0x5556a1b8 in PairDistances (distmat=0x7fff278c, mseq=0x55692a30, pairdist_type=<optimized out>, bPercID=<optimized out>, istart=0, iend=3, jstart=0, jend=3, fdist_in=0x0,
   fdist_out=0x0) at pair_dist.c:346
346                 NewProgress(&prProgress, LogGetFP(&rLog, LOG_INFO),

OK, let me try a little harder :)

    $ # enable debugging symbols and Address Sanitizer
    $ CFLAGS="-g -fsanitize=address" CXXFLAGS="-g -fsanitize=address" ./configure
    …
    $ make clean && make
    …
    $ ./src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal --force
    =================================================================
    ==30264==ERROR: AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7ffcfcbf5784 at pc 0x5620f0aa478c bp 0x7ffcfcbf56c0 sp 0x7ffcfcbf56b8
    WRITE of size 4 at 0x7ffcfcbf5784 thread T0
        #0 0x5620f0aa478b in PairDistances /home/matthew/clustal-omega-1.2.4/src/clustal/pair_dist.c:336
        #1 0x5620f0a91d9f in AlignmentOrder /home/matthew/clustal-omega-1.2.4/src/clustal-omega.c:835
        #2 0x5620f0a95c04 in Align /home/matthew/clustal-omega-1.2.4/src/clustal-omega.c:1221
        #3 0x5620f0a90d76 in MyMain /home/matthew/clustal-omega-1.2.4/src/mymain.c:1192
        #4 0x5620f0a88ca2 in main /home/matthew/clustal-omega-1.2.4/src/main.cpp:469
        #5 0x7f3773d9009a in __libc_start_main ../csu/libc-start.c:308
        #6 0x5620f0a89ad9 in _start (/home/matthew/clustal-omega-1.2.4/src/clustalo+0x2dad9)

    Address 0x7ffcfcbf5784 is located in stack of thread T0
    SUMMARY: AddressSanitizer: dynamic-stack-buffer-overflow /home/matthew/clustal-omega-1.2.4/src/clustal/pair_dist.c:336 in PairDistances
    Shadow bytes around the buggy address:
      0x10001f976aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x10001f976ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x10001f976ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x10001f976ad0: 00 00 00 00 00 00 00 00 00 00 00 00 ca ca ca ca
      0x10001f976ae0: 04 cb cb cb cb cb cb cb 00 00 00 00 ca ca ca ca
    =>0x10001f976af0:[04]cb cb cb cb cb cb cb 00 00 00 00 00 00 00 00
      0x10001f976b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x10001f976b10: f1 f1 f1 f1 00 f2 f2 f2 f2 f2 f2 f2 00 f2 f2 f2
      0x10001f976b20: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 f2 f2 f2
      0x10001f976b30: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
      0x10001f976b40: 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2 f2 f2
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==30264==ABORTING

Looking at line 336 of pair_dist.c, it looks like the bound on the containing loop is wrong. So let’s try adjusting that:

    $ vim src/clustal/pair_dist.c
    $ git diff src/clustal/pair_dist.c
    diff --git a/src/clustal/pair_dist.c b/src/clustal/pair_dist.c
    index e6dbdc3..bb79e61 100644
    --- a/src/clustal/pair_dist.c
    +++ b/src/clustal/pair_dist.c
    @@ -321,7 +321,7 @@ PairDistances(symmatrix_t **distmat, mseq_t *mseq, int pairdist_type, bool bPerc

             /* FIXME: can get rid of iChunkStart, iChunkEnd now that we're using the arrays */
             iChunkStart = iend;
    -        for(iChunk = 0; iChunk <= iNumberOfThreads; iChunk++)
    +        for(iChunk = 0; iChunk < iNumberOfThreads; iChunk++)
             {
                 iChunkEnd = iChunkStart;
                 if (iChunk == iNumberOfThreads - 1){
    $ make
    …
    $ ./src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal --force
    =================================================================
    ==30601==ERROR: AddressSanitizer: global-buffer-overflow on address 0x561188847864 at pc 0x5611886da6e7 bp 0x7fffe6d77ef0 sp 0x7fffe6d77ee8
    READ of size 4 at 0x561188847864 thread T0
        #0 0x5611886da6e6 in FullAlignment::Build(HMM&, Hit&, char*) /home/matthew/clustal-omega-1.2.4/src/hhalign/hhfullalignment-C.h:250
        #1 0x5611886df3eb in HitList::PrintAlignments(char**, char**, char*, char*, HMM&, char*, char) /home/matthew/clustal-omega-1.2.4/src/hhalign/hhhitlist-C.h:197
        #2 0x5611886f379b in hhalign /home/matthew/clustal-omega-1.2.4/src/hhalign/hhalign.cpp:1211
        #3 0x56118863f848 in HHalignWrapper /home/matthew/clustal-omega-1.2.4/src/clustal/hhalign_wrapper.c:1342
        #4 0x561188637db1 in Align /home/matthew/clustal-omega-1.2.4/src/clustal-omega.c:1250
        #5 0x561188632d76 in MyMain /home/matthew/clustal-omega-1.2.4/src/mymain.c:1192
        #6 0x56118862aca2 in main /home/matthew/clustal-omega-1.2.4/src/main.cpp:469
        #7 0x7f6d857f109a in __libc_start_main ../csu/libc-start.c:308
        #8 0x56118862bad9 in _start (/home/matthew/clustal-omega-1.2.4/src/clustalo+0x2dad9)

    0x561188847864 is located 60 bytes to the left of global variable 'Sim' defined in 'hhdecl-C.h:234:7' (0x5611888478a0) of size 1764
    0x561188847864 is located 0 bytes to the right of global variable 'S' defined in 'hhdecl-C.h:235:7' (0x561188847180) of size 1764
    SUMMARY: AddressSanitizer: global-buffer-overflow /home/matthew/clustal-omega-1.2.4/src/hhalign/hhfullalignment-C.h:250 in FullAlignment::Build(HMM&, Hit&, char*)
    Shadow bytes around the buggy address:
      0x0ac2b1100eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0ac2b1100ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0ac2b1100ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0ac2b1100ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0ac2b1100ef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    =>0x0ac2b1100f00: 00 00 00 00 00 00 00 00 00 00 00 00[04]f9 f9 f9
      0x0ac2b1100f10: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
      0x0ac2b1100f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0ac2b1100f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0ac2b1100f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0ac2b1100f50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==30601==ABORTING

Looking at line 250 of hhfullalignment-C.h, we can see it’s reading the array S’ out of bounds here. Someone has helpfully left a debugging line below this, so let’s shuffle it ahead of the faulting access and remove the part where it is also performing the faulting access:

    $ vim src/hhalign/hhfullalignment-C.h
    $ git diff src/hhalign/hhfullalignment-C.h
    diff --git a/src/hhalign/hhfullalignment-C.h b/src/hhalign/hhfullalignment-C.h
    index 8f40fd1..fd759f9 100644
    --- a/src/hhalign/hhfullalignment-C.h
    +++ b/src/hhalign/hhfullalignment-C.h
    @@ -247,8 +247,8 @@ FullAlignment::Build(HMM& q, Hit& hit, char zcError[])
                 char qc=qa->seq[  q.nfirst][ qa->m[  q.nfirst][hit.i[step]] ];
                 char tc=ta->seq[hit.nfirst][ ta->m[hit.nfirst][hit.j[step]] ];
                 if (qc==tc) identities++;  // count identical amino acids
    +            fprintf(stderr,"%3i %3i  %3i %3i  %3i %1c %1c %6.2f %6.2f %6.2f  \n",step,hit.nsteps,hit.i[step],hit.j[step],int(state),qc,tc,score_sim,hit.P_posterior[step],hit.sum_of_probs); //DEBUG
                 score_sim += S[(int)aa2i(qc)][(int)aa2i(tc)];
    -            //         fprintf(stderr,"%3i %3i  %3i %3i  %3i %1c %1c %6.2f %6.2f %6.2f %6.2f  \n",step,hit.nsteps,hit.i[step],hit.j[step],int(state),qc,tc,S[(int)aa2i(qc)][(int)aa2i(tc)],score_sim,hit.P_posterior[step],hit.sum_of_probs); //DEBUG
             }
       }

    $ make
    …
    $ ./src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal —force
    …
     28 582  386 559   10 N - 127.25   0.01   2.91
    =================================================================
    ==30936==ERROR: AddressSanitizer: global-buffer-overflow on address 0x563d5b2258a4 at pc 0x563d5b0b79e8 bp 0x7ffd269e0e40 sp 0x7ffd269e0e38
    READ of size 4 at 0x563d5b2258a4 thread T0
        #0 0x563d5b0b79e7 in FullAlignment::Build(HMM&, Hit&, char*) /home/matthew/clustal-omega-1.2.4/src/hhalign/hhfullalignment-C.h:251
    …

So the values of qc and tc at this point are 'N' and '-', respectively. This results in an access to S[20][21], which is indeed out of range as S is a 21x21 array. To go further, I think I need some guidance from a domain expert. Is aa2i() ever expected to be called with a value that maps to GAP or ANY? Maybe S is actually meant to be a 22x22 array? Maybe the loop in hhfullalignment-C.h is meant to skip any iteration for which qc or tc map to GAP?

By the way, Andreas, I am doing this debugging on the upstream 1.2.4 release on an x86-64 machine so I still have no certainty that this is related to the root cause of your observed problem on MIPS.

Reply to: