
Re: Trying to crack the Firefox crashing issue



On 9/5/25 8:55 pm, John Paul Adrian Glaubitz wrote:
Hi Damien,

thanks for looking into this!

No problem. Well, it is a problem to be solved, and possibly a big one, but I couldn't help trying to find out why. :-)

It's probably more an issue with Firefox crashing on big-endian systems rather
than Firefox crashing on PowerPC as it's known to work on little-endian PowerPC.

Yes, that's true. I tend to think of PPC as the last big-endian platform that anyone would still use as a desktop.

What I wonder about, though it might not reveal anything useful, is doing a binary comparison of the LE binaries against the BE binaries in the Firefox files, to see, while taking endian differences into account, whether there are any major differences in the code. I recall something about the machine code being in the same order, but that may have changed. Of course, if the difference is more than simply endians being swapped and the ELF being structured differently, that won't work. This is a rather raw way of thinking about it, and using objdump to dump the asm may be a better way of doing such a raw code comparison.
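
A rough sketch of how such an asm-level comparison could be done, in Python. It assumes the two disassemblies were already produced with something like objdump -d and that each instruction line has the usual tab-separated shape (address, raw bytes, instruction text). The function names and the sample lines are made up for illustration and have not been tried on the real Firefox binaries.

```python
import difflib
import re

# Assumed objdump -d line shape: "  address:<TAB>raw bytes<TAB>instruction text".
# Label lines such as "0000000000001000 <main>:" do not match and are skipped.
_INSN_LINE = re.compile(r"^\s*[0-9a-f]+:\t")


def instruction_text(disassembly: str) -> list[str]:
    """Keep only the instruction text, dropping addresses and raw bytes."""
    result = []
    for line in disassembly.splitlines():
        if not _INSN_LINE.match(line):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # row that carries only raw bytes, no instruction
        # Collapse the padding between mnemonic and operands.
        result.append(" ".join(" ".join(fields[2:]).split()))
    return result


def compare_dumps(be_dump: str, le_dump: str) -> list[str]:
    """Unified diff of the instruction streams; an empty list means identical."""
    return list(
        difflib.unified_diff(
            instruction_text(be_dump),
            instruction_text(le_dump),
            "big-endian",
            "little-endian",
            lineterm="",
        )
    )


# Made-up sample: same instructions, raw bytes stored in opposite order.
BE_SAMPLE = "   1000:\t7c 08 02 a6 \tmflr    r0\n   1004:\t94 21 ff e0 \tstwu    r1,-32(r1)\n"
LE_SAMPLE = "   1000:\ta6 02 08 7c \tmflr    r0\n   1004:\te0 ff 21 94 \tstwu    r1,-32(r1)\n"

if __name__ == "__main__":
    diff = compare_dumps(BE_SAMPLE, LE_SAMPLE)
    print("identical" if not diff else "\n".join(diff))
```

Branch targets and symbol offsets would still show up as differences, since the two builds are laid out differently, so in practice the output would need more filtering than this.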

Good idea. FWIW, you can also use the POWER8 machine available in the GCC
Compile Farm after applying for an account. It runs Debian unstable and
has 64 GB of RAM and 64 cores. It should allow for faster debugging.

I am one of the admins of this machine and can install build dependencies
for Firefox if necessary.

To get an account for the GCC Compile Farm, please see this page:

https://gcc.gnu.org/wiki/CompileFarm

I was aware of this. Funny as it is, I bet that machine could eat both my i5 laptop and X1000 for breakfast before it could even byte into lunch. :-D

I'm not experienced with powerful remote machines, only my local ones, so I haven't yet looked further into how to manage it all. But I did wonder if such a machine could be used (or abused) to pull down all the latest Ubuntu desktop source files and build a PPC version. ;-)

Might be because Firefox was built with -O2. To debug the problem, it might
be better to build Firefox from git. It's actually not that difficult, see:

https://firefox-source-docs.mozilla.org/setup/linux_build.html

I'd likely cross-compile on my i5 for starters. It's not exactly a workstation, but I use it as a workhorse at times.

It seems that this code is supposed to compare two strings while ignoring
their case by enforcing the strings to be uppercase. At first sight, I don't
see anything suspicious, but it might be an idea to replace calls to this
function to a wrapper to the corresponding functions of the C++ standard
library.

Yes, that would be it. I do wonder why a native XML library isn't used and why it's in a WASM box. But it may be that way so that all common operations go through the sandbox. I was surprised the PPC version has any sandbox, as I expected that to need a full JIT compiler or similar to generate machine code on the fly.

Perhaps for an internal API, not checking input when input must be given
It's part of the expat library, see:

https://sources.debian.org/src/expat/2.7.1-1/expat/lib/xmltok.c/?hl=1007#L1007

I did find some other links to other expat code. I just expected it to be some kind of C lib function, going by the name. So when I didn't see many examples of it on C tutorial sites, it looked rare. What I found was only in XML code, and I didn't see any examples using it outside of that. I suppose it is designed to be used internally.
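
For anyone who hasn't met this kind of helper, the behaviour described above (compare two strings while ignoring case by forcing both to uppercase) can be sketched like this in Python. This is not expat's code, just an illustration of the idea; the names ascii_upper and streqci_sketch are made up, and it only folds ASCII a-z, which is an assumption about what such a helper needs to handle.

```python
def ascii_upper(byte: int) -> int:
    """Map ASCII a-z to A-Z; leave every other byte value unchanged."""
    if ord("a") <= byte <= ord("z"):
        return byte - 0x20  # distance between 'a' and 'A' in ASCII
    return byte


def streqci_sketch(s1: bytes, s2: bytes) -> bool:
    """Hypothetical stand-in: case-insensitive equality for ASCII byte strings."""
    if len(s1) != len(s2):
        return False
    return all(ascii_upper(a) == ascii_upper(b) for a, b in zip(s1, s2))


if __name__ == "__main__":
    print(streqci_sketch(b"utf-8", b"UTF-8"))
    print(streqci_sketch(b"utf-8", b"utf-16"))
```

Nothing in there depends on byte order, since it works one byte at a time, which fits with the function itself not being the real problem.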

In any case, I don't see anything majorly wrong with the code. Casting my coding criticisms aside, I don't think the main source is the issue here. Going by the stack trace and reading up on how this WASM VM works, it looks like a pointer or reference is being corrupted. I read that WASM is a stack-based VM CPU with no registers. It's a rather old-fashioned-sounding design, but it wouldn't exactly use a real stack, and going by the code I examined, on PPC the code works similarly to the usual PPC ABI and uses registers for parameter passing. So it looks like there is a central instance pointer containing core VM private data, with the parameters passed by some kind of reference or offset.

Comparing var_p0=var_p0@entry=262000 with var_p1=2016478208, p1 looks corrupted. There could be some possibility of endian corruption. I reversed it and got $43178 or 274808, which is in the same range as the 262000 in p0. Why p0 looks fine and only p1 doesn't, I don't know. Being register based means PPC has less chance of endian errors if registers are used to pass parameters. So if a code block could pass all generic data in registers, process it natively, then return the result in registers, there would be less chance of endian errors. I've also read that WASM was originally endian agnostic, or processed in the native endian of the host CPU, and was designed that way, but that they later decided to make it non-portable and hack it to be little endian only because that's the de facto standard endianness nowadays. Or something like that.
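
The byte reversal is easy to reproduce. Here is a small Python sketch that swaps the 32-bit p1 value from the backtrace and also shows how four bytes written little endian turn into a very different number when read back big endian. The helper name bswap32 is made up, and treating p1 as an unsigned 32-bit value is an assumption on my part.

```python
import struct


def bswap32(value: int) -> int:
    """Reverse the byte order of an unsigned 32-bit integer."""
    return int.from_bytes(value.to_bytes(4, "big"), "little")


P0 = 262000      # var_p0 as shown in the backtrace
P1 = 2016478208  # var_p1 as shown in the backtrace

swapped = bswap32(P1)

# The same four bytes, written little endian and then read with each byte order.
raw = struct.pack("<I", swapped)
read_little = struct.unpack("<I", raw)[0]
read_big = struct.unpack(">I", raw)[0]

if __name__ == "__main__":
    print(f"p1            = {P1:#010x} ({P1})")
    print(f"p1 swapped    = {swapped:#010x} ({swapped})")
    print(f"gap to p0     = {abs(swapped - P0)}")
    print(f"read as LE    = {read_little}")
    print(f"read as BE    = {read_big}")
```

If a value near p0 were stored little endian and then loaded without a swap on a big-endian host, it would come back as exactly the kind of number seen in p1, which is why the endian theory looks plausible to me.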

Could be a bug in the LLVM PowerPC backend. Try building with GCC instead.

Having read the latest, the GCC build still crashing reveals there is still some issue. Not in exactly the same spot, but soon after that streqci call.


-- 
My regards,

Damien Stewart.
