Update on rlbox wasm sandbox crashes

To: debian-powerpc@lists.debian.org
Subject: Update on rlbox wasm sandbox crashes
From: Damien Stewart <hypexed@yahoo.com.au>
Date: Thu, 10 Jul 2025 16:06:03 +1000
Message-id: <8892c615-a14a-4b69-a7a5-4e88c751f7cb@yahoo.com.au>
References: <8892c615-a14a-4b69-a7a5-4e88c751f7cb.ref@yahoo.com.au>

Hi guys.

So those who are familiar with this will know I dug into what was crashing Firefox on startup, which was expat crashing inside a wasm compiled sandbox. This issue has been known about for years and worked around by disabling the sandboxing, an existing fix for s390x. That fixed the Firefox build on Debian ppc/64 so it worked again. (Thanks Adrian.)

I pretty much threw myself into the deep end, as I wasn't familiar with RLBox nor how it was integrated into Firefox, but did know about wasm. So I had to learn what was crashing in GDB and undo my confusion as to what was causing it and why sources for symbols were missing. I've spent time since investigating it and learning about all the processes involving this sandboxing.

(TL;DR, or afraid to but still reading this line, it looks like wasm2c did it.)



There are three main components involved:

* rlbox - The core sandbox.

* rlbox_wasm2c_sandbox - Sandboxing using wasm as a middle man.

* rlbox-book - Example code and tutorials for porting and testing sandboxing.

These come in different levels with their own sources. The first, rlbox, is small. It compiled easily on ppc64 and passed all tests. The logic appears fine. Next, rlbox_wasm2c_sandbox is more complicated. It is designed to compile the whole toolchain before you can even use it to compile and test any code for sandboxing. The last, rlbox-book, has code for a tutorial and examples: a basic "no-op" false sandbox for source testing and a real "wasm" sandbox example. This relies on rlbox_wasm2c_sandbox, but the sources are obsolete or mismatched, as the example code expects sources generated from the build that don't exist. So you cannot build it all without the examples still breaking. Somewhat time wasting if you need to hunt down what it's looking for. Which I did.

RLBox and WASM have been available for some time now, so there should be no need to build the entire toolchain just to do some simple test compiles. And there are prebuilt packages. In particular there is wabt which provides wasm2c, wasi-sdk which provides wasm build tools, and clang can actually compile source to wasm code itself. Of these, Debian ppc64 doesn't have wasi-sdk, but does have wabt and a wasi-libc. And the ppc64 clang build does support wasm. As expected there are no ppc/64 releases of wasi-sdk. However, the tools that building an rlbox wasm example needs are available here on ppc64, if you find where to look. :-)

The whole RLBox and WASM integration is one big convoluted monster. It will take a nice small source code and bloat it up into a big complicated mess. All for the purpose of a software VM to isolate code and protect the host. To simplify the process: a source needs to be compiled into wasm, then the wasm code is reverse engineered (or back compiled) into C, then finally wrapped in the rlbox API and compiled as C/C++ into the end binary. Sounds simple!

I wanted to compile this on ppc64, since this process breaks binaries on ppc64, and I had a devil of a time doing so. But I also wanted to avoid building the whole toolchain, since the tools do exist. First, I needed a generic clang to work for me, since the wasm clang build is custom. A search revealed clang needed --target=wasm32-wasi, so that was a simple fix.

Next I needed wasm2c to work normally. Unfortunately, the documentation is a bit sparse. For example, wasm2c from wabt kept generating source that caused clang (and gcc for that matter) to spew out errors everywhere. All the labels had been surrounded by 0x24 and 0x2e (dollar and dot if you know your ASCII hex) and it kept wrecking the whole build, which looked strange, as the sources it output were isolated to one C and header file. Turns out, with the wasm2c I was using, I needed to pass --no-debug-names. Looks simple, but this took me ages to find out. It also didn't help that some options give errors about being unimplemented, as if wasm2c was still an unfinished beta. I looked on the net for info on wasm2c putting 0x24 and 0x2e in labels and got no results at all! In the end I just added that debug option as a guess, since it looked related, and that was it. Well, they could have mentioned it! How about the manual actually mentioning that by default this wasm2c will put 0x24 and 0x2e in labels? Or wreck it so it won't compile? :-facepalm.

After getting over all that I could finally do some real testing in gdb. At this point, rlbox and an rlbox wasm noop-hello-example from rlbox-book compiled and worked fine. Now I had the wasm-hello-example from rlbox-book compiled, which was the real challenge. I ran the test binary and it crashed. Actually, unlike expat in Firefox, it didn't crash but aborted itself. Either way this was good, as I now knew what code was broken and what built the code. I've now run the code through gdb and also compared with running on x64. A few things I have ascertained:

* Code wasm2c generates is big endian aware. Despite lacking tier 1 support, the host code generated checks for big endian and runs alternate code to fix bytes. So regardless of host endian, the output of wasm2c is designed to and should work on big endian/ppc native. This is good, as not only does the upstream code take big endian into account, it is designed to run correctly on a big endian host. I checked, and both x64 and ppc64 builds of wasm2c produced big endian aware sources.

* Related to my first point, there is a quirky macro that fixes endian when run on big endian. It will actually slow down the code, so although it's designed to fix endian it should be fixed to be optimised. It's a load data routine that normally just does a mem copy. On big endian it does a mem copy and then reverses the bytes in place, almost neatly, by reading a byte at one end and swapping it with a byte at the opposite end in a half length array loop. Although it looks novel, I wonder why they didn't just stick a reverse mem copy in? But that's just how I would do it. Any built in mem copy is rendered useless by the whole operation.
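For illustration, the "reverse mem copy" idea can be sketched as a single pass that writes the bytes in reverse order straight away, instead of copying and then swapping in place. The name load_data_reversed is made up for this sketch and is not part of wasm2c. It assumes src and dest do not overlap, and whether it would actually be faster than the existing routine has not been measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alternative to the copy-then-swap routine: copy n bytes from
 * src to dest in reverse order in one pass. Assumes src and dest do not
 * overlap. Not taken from wasm2c; the name is invented for this sketch. */
static inline void load_data_reversed(void *dest, const void *src, size_t n) {
  uint8_t *d = dest;
  const uint8_t *s = src;
  for (size_t i = 0; i < n; i++) {
    d[i] = s[n - 1 - i]; /* last source byte lands first in dest */
  }
}
```

Called on the four bytes 78 56 34 12 it leaves 12 34 56 78 in the destination, the same result the copy followed by the in-place swap produces.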

I have found some issues. The example code I compiled from the wasm-hello-example breaks on big endian. I don't mean to repeat the obvious, but relating to my first point, code compiled on big endian will not automatically include the macro. It relies on a WABT_BIG_ENDIAN define being set to 1 when the final wasm2c C code is compiled. On my system this was not set, although I expected the upper headers included would test and set it. This could relate to my test build, but is one to watch out for upstream. It is tested for from a CMake list in rlbox_wasm2c_sandbox. But the example just has a simple Makefile with no checks for configuring.
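As a rough sketch of the kind of check a plain Makefile build could lean on, the define can be derived from the compiler when nothing else has set it. This assumes GCC or Clang, which predefine __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__. It is not what the rlbox or wabt headers do, just one way to guard against the define being silently missing. The helper host_is_big_endian is a made-up name for a runtime cross-check.

```c
/* Sketch: set WABT_BIG_ENDIAN from the compiler if the build did not.
 * __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__ are GCC/Clang predefined macros;
 * other compilers would need their own test. */
#ifndef WABT_BIG_ENDIAN
#  if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
      (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#    define WABT_BIG_ENDIAN 1
#  else
#    define WABT_BIG_ENDIAN 0
#  endif
#endif

#include <stdint.h>
#include <string.h>

/* Hypothetical runtime cross-check: returns 1 on a big endian host and 0 on
 * a little endian one, by looking at the first byte of a known value. */
static inline int host_is_big_endian(void) {
  const uint32_t probe = 0x01020304u;
  uint8_t first;
  memcpy(&first, &probe, 1);
  return first == 0x01;
}
```

Comparing host_is_big_endian() against WABT_BIG_ENDIAN at startup would at least flag a build where the two disagree.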

Next is loading data. The wasm2c code has data built in. There are both data tables and some function tables, all built from byte arrays. Plus what looks like hex encoded ASCII string tables. Looking at the data, it appears there are scalars embedded in little endian order. This would be consistent with wasm being little endian, but will obviously cause trouble on big endian. Although the code should be aware of any byte positions. I suspect this could be what is crashing expat, where just one offset among a few parameters to a function was byte swapped. I don't know how clean the code is, since it doesn't look clean from the outset, but if data load routines are doing tricks, such as reading the LSB or reading one byte from a bigger scalar, it will badly break. There could be big trouble in little endian here! :-D
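To make the hazard concrete, here is a small sketch of three ways to read a 32-bit scalar stored in little endian order in a byte table. The function names are invented for this example and do not come from the wasm2c output. Only the shift based read is independent of the host; the other two give different answers on ppc64 and x64.

```c
#include <stdint.h>
#include <string.h>

/* Portable: build the value from little endian bytes with shifts.
 * Same result on any host. */
static inline uint32_t read_le32(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Host dependent: copy the raw bytes into a native integer. On a big endian
 * host the bytes end up with the opposite significance. */
static inline uint32_t read_native32(const uint8_t *p) {
  uint32_t v;
  memcpy(&v, p, sizeof v);
  return v;
}

/* The "read one byte from a bigger scalar" trick: take the first byte in
 * memory. That is the LSB on little endian but the MSB on big endian. */
static inline uint8_t first_byte_in_memory(uint32_t v) {
  uint8_t b;
  memcpy(&b, &v, 1);
  return b;
}
```

With the table bytes 78 56 34 12, read_le32 gives 0x12345678 everywhere, while read_native32 gives 0x12345678 on little endian and 0x78563412 on big endian.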

The macro has more issues. It is designed for scalars but accepts any value. I don't know if they intended it to be like that; to me it looks like a bug. The routine is designed to load data but is abused/confused by a macro that reverses the loaded data. However, since it wasn't enabled in my build, this wasn't what was aborting the code. For fun I enabled the WABT_BIG_ENDIAN macro on x64, which I'm also doing comparative testing with. It broke it and caused the binary to abort. Lol. But, on ppc64, enabling WABT_BIG_ENDIAN almost made it worse. The code doesn't crash or abort but ends up in an endless loop! :-?
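A small sketch of what "accepts any value" means in practice. The routine below mirrors the copy-then-swap logic quoted at the end of this mail, with plain memcpy standing in for wasm_rt_memcpy and uint8_t for u8 (both substitutions are assumptions made so the sketch stands alone). Fed a string, or a block holding more than one scalar, it reverses the whole block, not each scalar. Whether wasm2c relies on that behaviour on purpose is not something established here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the quoted routine: copy n bytes, then reverse them in place.
 * memcpy replaces wasm_rt_memcpy and uint8_t replaces u8 for this sketch. */
static inline void load_data_sketch(void *dest, const void *src, size_t n) {
  if (!n) {
    return;
  }
  uint8_t *dest_chars = dest;
  memcpy(dest, src, n);
  for (size_t i = 0; i < (n >> 1); i++) {
    uint8_t cursor = dest_chars[i];
    dest_chars[i] = dest_chars[n - i - 1];
    dest_chars[n - i - 1] = cursor;
  }
}
```

The four characters "wasm" come out as "msaw". Two little endian 16-bit values 01 00 02 00 come out as 00 02 00 01, so each value is byte swapped but their order in the block is swapped as well.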

After examining more code I can see this getting more complex, due to it coding these tables in. This will result in mixed endian data throughout the code. There are fields the code assigns and reads back in native endian from source. There are scalars read from data. But since they are both set and read in the same data table, it's hard to see what endian will be what! Only wasm2c would know what the offset is. If it knows what an offset does in the first place. Then the code itself appears to be simulating a CPU. It assumes all these local variables, which it spends time reading and writing to, doing conditionals on with gotos, passing values around and doing what looks like the least optimised way of doing an operation. Doing operations atomically. It actually looks like a C version of RISC ASM. :-)

This is running deep. Of course, this is due to taking perfectly portable code, compiling it into little endian machine code, then reverse engineering that into an analogous C source. The result has ended up introducing a false dependence on little endian. Be good if wasm2c could be told to generate big endian or portable code. But since the world is now running on a little endian web, this problem won't go away now.

After gathering the evidence so far I can only conclude the entire process going on here has made an already convoluted process more complicated. That becomes even worse on an endian it isn't designed for. I've already dealt with code like this before that, based on examination, was generated by reverse engineering x86 code into a non portable convoluted mess of C. It wasn't pretty. Yet I managed to find every byte out of place and fix it to be portable. So now, I can see that disabling the sandboxing is for now likely the best fix, because if it worked the code would be even slower. But the compromise here is more security, so getting it working down the track is a good idea regardless, even for an "exotic endian" platform. In the meantime, I'll continue digging, now it's becoming clearer where to point the finger. And what I can use as an MVE.



Links:

https://github.com/PLSysSec/rlbox

https://github.com/PLSysSec/rlbox_wasm2c_sandbox

https://github.com/PLSysSec/rlbox-book


The wasm2c [excuse for a] man page example:

https://manpages.debian.org/bookworm/wabt/wasm2c.1.en.html


That quirky macro:

static inline void load_data(void* dest, const void* src, size_t n) {
  if (!n) {
    return;
  }
  size_t i = 0;
  u8* dest_chars = dest;
  wasm_rt_memcpy(dest, src, n);
  for (i = 0; i < (n >> 1); i++) {
    u8 cursor = dest_chars[i];
    dest_chars[i] = dest_chars[n - i - 1];
    dest_chars[n - i - 1] = cursor;
  }
}



--
My regards,

Damien Stewart.
