
Re: Trying to crack the Firefox crashing issue



Hi Damien,

thanks for looking into this!

On Fri, 2025-05-09 at 17:02 +1000, Damien Stewart wrote:
> So this is really a follow-up to the "What are the current available 
> browser options for debian-ppc64?" thread, where there was a technical 
> discussion on why Firefox was crashing which ended up being rather 
> anticlimactic. But I wanted to check myself since I'm aware the last few 
> years all Firefox does is crash on load. At the time PPC was last 
> officially supported on Jessie, Firefox was becoming unstable then. It 
> loaded and worked but would easily crash and exit. Now it's much worse.

It's probably more an issue with Firefox crashing on big-endian systems rather
than Firefox crashing on PowerPC as it's known to work on little-endian PowerPC.
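
To make the byte-order point concrete, here is a minimal sketch in C. It
assumes, as far as I understand it, that rlbox.wasm.c is generated from
WebAssembly, whose linear memory is defined as little-endian. The helper
names load_le32() and load_host32() are made up for illustration and are not
taken from the Firefox or wasm2c sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value stored in little-endian order, independent of the
 * byte order of the machine running the code. */
static uint32_t load_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Read the same four bytes in host byte order, as a plain load would. */
static uint32_t load_host32(const uint8_t *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

On a little-endian host both helpers return the same value. On a big-endian
host they differ, so any code path that skips the conversion would read
byte-reversed integers and pointers.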

> So unlike most of the PPC people out there I don't have a quad G5 power 
> horse. I do have a rather rare X1000 with a PASemi PA6T. Only dual but 
> 64 bit and does the job. I soon found out running Firefox under GDB 
> needs over 6GB RAM and I only had 4GB with HDD swap space. I rarely need 
> swap on PPC, unlike my laptop. But I had some spare backup RAM and 
> decided to max it out. After wrestling with DDR2 RAM slots I managed to 
> get it working. A 64 bit PowerPC machine with 8GB RAM and Debian 64 
> installed on SSD. Okay I've broken the 32 bit barrier and now I'm 
> talking. :-D

Good idea. FWIW, you can also use the POWER8 machine available in the GCC
Compile Farm after applying for an account. It runs Debian unstable and
has 64 GB of RAM and 64 cores. It should allow for faster debugging.

I am one of the admins of this machine and can install build dependencies
for Firefox if necessary.

To get an account for the GCC Compile Farm, please see this page:

https://gcc.gnu.org/wiki/CompileFarm

> My results to summarise it are that it is the same crash. Different day, 
> same code. That streqci() function again. This time in firefox_138.0.1. 
> But here's some info I picked up that may help to close in on it. With a 
> running commentary. :-)
> 
> damien@ubuntu:~$ gdb firefox.real
> GNU gdb (Debian 16.3-1) 16.3
> 
> ...
> Reading symbols from firefox.real...
> Reading symbols from 
> /usr/lib/debug/.build-id/fd/6adabdb8b6655f970f65deffcea09f8d7dac41.debug...
> 
> (gdb) run
> Starting program: /usr/bin/firefox.real
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library 
> "/lib/powerpc64-linux-gnu/libthread_db.so.1".
> 
> ... A minute or two filling up 6GB of RAM...
> 
> Thread 1 "firefox.real" received signal SIGSEGV, Segmentation fault.
> w2c_rlbox_streqci (var_p0=var_p0@entry=262000, var_p1=2016478208,
>      instance=<optimized out>) at rlbox.wasm.c:55615
> warning: 55615    rlbox.wasm.c: No such file or directory
> 
> As you can see different day, same code. Same function but without that  
> i32_load8_u. I don't like the look of that instance. Why is instance 
> optimized out? The frame is omitted.

Might be because Firefox was built with -O2. To debug the problem, it might
be better to build Firefox from git. It's actually not that difficult, see:

https://firefox-source-docs.mozilla.org/setup/linux_build.html

> Back trace...
> 
> (gdb) bt
> #0  w2c_rlbox_streqci (var_p0=var_p0@entry=262000, var_p1=2016478208,
>      instance=<optimized out>) at rlbox.wasm.c:55615
> #1  0x00003fffe8e1e268 in w2c_rlbox_getEncodingIndex (
>      instance=<optimized out>, var_p0=<optimized out>) at rlbox.wasm.c:55548
> #2  w2c_rlbox_getEncodingIndex (instance=0x3fffda90f000, var_p0=262000)
>      at rlbox.wasm.c:55531
> #3  w2c_rlbox_MOZ_XmlInitEncodingNS_0 (instance=0x3fffda90f000, 
> var_p0=325428,
>      var_p1=325424, var_p2=262000) at rlbox.wasm.c:57164
> #4  0x00003fffe8e4ce1c in w2c_rlbox_initializeEncoding (
>      instance=instance@entry=0x3fffda90f000, var_p0=var_p0@entry=325280)
>      at rlbox.wasm.c:37816
> 
> The hit...
> 
> (gdb) disas
> Dump of assembler code for function w2c_rlbox_streqci:
>     0x00003fffe8e1e150 <+0>:    ld      r3,0(r3)
>     0x00003fffe8e1e154 <+4>:    subf    r4,r5,r4
>     0x00003fffe8e1e158 <+8>:    nop
>     0x00003fffe8e1e15c <+12>:    nop
> => 0x00003fffe8e1e160 <+16>:    lbzx    r9,r3,r5
>     0x00003fffe8e1e164 <+20>:    add     r10,r4,r5
>     0x00003fffe8e1e168 <+24>:    clrlwi  r9,r9,24
>     0x00003fffe8e1e16c <+28>:    clrldi  r10,r10,32
>     0x00003fffe8e1e170 <+32>:    lbzx    r10,r3,r10
> 
> Why is there nop? Does it mean ori? PPC doesn't have nop. Why doesn't 
> gdb list the machine code as standard? Supposed to be a debugger. This 
> code looks sus.
> 
> Registers...
> (gdb) info r
> r0             0x3fffe8e4ce1c      70368356519452
> r1             0x3fffffffbe90      70368744160912
> r2             0x3ffff413c500      70368544146688
> r3             0x3ffb00000000      70347269341184
> r4             0xffffffff87d2fb70  18446744071693335408
> r5             0x78310400          2016478208
> 
> Ok so it doesn't like r9 = [r3 + r5]. What's wrong with 3FFB78310400? 
> Apart from r5 being a large 32 bit integer.
> 
> I had apt sourced the source but gdb couldn't see it so I needed to do 
> some digging...
> 
> damien@ubuntu:~/Applications/firefox-debug/firefox-138.0.1$ grep -ir 
> "streqci" .
> ./parser/expat/expat/lib/xmltok.c:streqci(const char *s1, const char *s2) {
> ./parser/expat/expat/lib/xmltok.c:      /* The following line will never 
> get executed.  streqci() is
> ./parser/expat/expat/lib/xmltok.c:    if (streqci(name, encodingNames[i]))
> ./parser/expat/expat/lib/xmltok_ns.c:  if (streqci(buf, KW_UTF_16) && 
> enc->minBytesPerChar == 2)
> 
> The source:
> static int FASTCALL
> streqci(const char *s1, const char *s2) {
>    for (;;) {
>      char c1 = *s1++;
>      char c2 = *s2++;
>      if (ASCII_a <= c1 && c1 <= ASCII_z)
>        c1 += ASCII_A - ASCII_a;
>      if (ASCII_a <= c2 && c2 <= ASCII_z)
>        /* The following line will never get executed.  streqci() is
>         * only called from two places, both of which guarantee to put
>         * upper-case strings into s2.
>         */
>        c2 += ASCII_A - ASCII_a; /* LCOV_EXCL_LINE */
>      if (c1 != c2)
>        return 0;
>      if (! c1)
>        break;
>    }
>    return 1;
> }
> 
> This code appears to be poor quality. It doesn't validate the input 
> strings nor check for null bytes. Not to mention that icky for ever. 
> That if test is in a strange order making some sort of coding 
> palindrome. Funny. :-)

It seems that this code is supposed to compare two strings while ignoring
their case by converting both to uppercase. At first sight, I don't
see anything suspicious, but it might be an idea to replace calls to this
function with a wrapper around the corresponding functions of the C++ standard
library.
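
As a rough sketch of what such a wrapper could look like, here is a version
built on toupper() from the C header <ctype.h> (C rather than C++, since
expat itself is C). The name streqci_wrapper() is hypothetical. Note that
toupper() depends on the current locale, whereas the original only folds
ASCII a-z, so this is not a guaranteed drop-in replacement.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for streqci(): returns 1 if the two NUL-terminated
 * strings are equal when case is ignored, 0 otherwise. */
static int streqci_wrapper(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    for (;;) {
        /* Cast through unsigned char: passing a negative char value to
         * toupper() is undefined behaviour. */
        int c1 = toupper((unsigned char)*s1++);
        int c2 = toupper((unsigned char)*s2++);
        if (c1 != c2)
            return 0;   /* mismatch, including one string ending early */
        if (c1 == '\0')
            return 1;   /* both strings ended at the same position */
    }
}
```

For example, streqci_wrapper("utf-16", "UTF-16") returns 1. Like the
original, the loop stops at the first mismatch or at the terminating NUL, so
it cannot run past the end of a properly terminated string.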

> Perhaps for an internal API, not checking input when input must be given 
> is acceptable, but these are the reasons C lib str*() functions are 
> criticised nowadays. This streqci() is uncommon in my search and 
> particular to XML parsing. Ok, so what is going wrong with it? Given 
> it's embedded into this rlbox.wasm.c where is it generated from? The 
> build itself? I don't see that exact file in source.

It's part of the expat library, see:

https://sources.debian.org/src/expat/2.7.1-1/expat/lib/xmltok.c/?hl=1007#L1007

>  From what I can tell it wants to use r5 as an index with lbzx but 
> instead does something funky with r5 instead of zeroing it. Before doing 
> nothing twice. It would have been better off using lbzu! So what kind of 
> contraption caused the C compiler to generate asm code like that? The C 
> code looks straightforward enough for a C compiler to understand but 
> the binary code is corrupted. Is this a result of the build process 
> wrecking it? Or does PPC GCC have some rare bug causing a code side 
> effect of broken code? I know this is old news by now but I just don't 
> know how it ended up generating broken code that is only broken on PPC. :-?

Could be a bug in the LLVM PowerPC backend. Try building with GCC instead.
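
The register values quoted above can also be checked with a few lines of C.
This is only arithmetic on the numbers in the gdb output, under my assumption
that r3 holds the sandbox memory base, r4 held var_p0 on entry and r5 holds
var_p1. The function names are made up for illustration. It is speculation,
not a confirmed diagnosis.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Host address touched by "lbzx r9,r3,r5": base plus 32-bit offset. */
static uint64_t effective_address(uint64_t base, uint32_t offset)
{
    return base + offset;
}

/* Re-derive the values shown in the gdb session. */
static void check_trace_values(void)
{
    const uint64_t r3 = UINT64_C(0x3ffb00000000);     /* assumed memory base */
    const uint64_t r4 = UINT64_C(0xffffffff87d2fb70); /* after subf */
    const uint32_t r5 = 0x78310400u;                  /* var_p1 = 2016478208 */
    const uint32_t p0 = 262000u;                      /* var_p0 */

    /* The faulting address worked out by hand in the quoted mail. */
    assert(effective_address(r3, r5) == UINT64_C(0x3ffb78310400));

    /* "subf r4,r5,r4" leaves var_p0 - var_p1 in r4 as a 64-bit difference. */
    assert((uint64_t)((int64_t)p0 - (int64_t)r5) == r4);

    /* Byte-swapped, var_p1 becomes 0x00043178 (274808), which lies between
     * the other sandbox offsets in the backtrace (262000 and 325428). */
    assert(bswap32(r5) == 0x00043178u);
}
```

If these assumptions hold, the instructions themselves compute what the C
source asks for, and the odd input is var_p1. That it looks like a plausible
offset once byte-swapped would be consistent with a byte-order problem,
though it proves nothing by itself.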

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

