Re: gcc v4 interworking patch
> > Comments from people who actually understand GCC are very welcome -
> > so far we haven't got this to actually work.
> Yes, I'm afraid I'm totally stuck. I would certainly appreciate some
> help from anyone who understands gcc's code generation on arm. The bit
> that's failing is (much code snipped for brevity):
> If I comment out the first test (x == const_true_rtx), the second gets
> triggered as well.
I recommend disabling conditional call instructions altogether. There's a
fairly obvious place to do this in arm_final_prescan_insn. use_return_insn
should probably also be hardwired to zero.
> > I gather from some comments Paul Brook made a while back that the
> > above code is only part of the solution and that something similar is
> > needed for library code too (or something like that). Clues welcome.
> There is some library code included in the gcc/gcc/config/arm directory,
> which is written in assembler and probably needs to be patched to use
> Richard's original tricks. There are also two places in glibc (the NPTL
> code, again ARM-specific bits written in assembler) which need patching
> in this manner as well.
Plus all the third party libraries that have assembly code in them (gmp,
probably ffmpeg and others).
We really want to have the assembler enforce this for us, i.e. either:
(a) Invent a new v4+ architecture that looks a lot like v4t and magically
expands bx to the tst;mov;bx triplet. This breaks indirect calls from
assembly code. e.g.
mov lr, pc
On the upside it'll be broken on all hardware, so we're fairly likely to
notice it. We can also add assembler heuristics to catch this case. Libraries
that are already interworking-aware should DTRT and probably not need any
modification. There's a small chance we break code that's keeping the flags
live over a bx instruction (the expansion's tst clobbers them), but I've
never seen that happen in practice.
(b) Use the regular v4 assembler and fix everything in the source/gcc. This
requires manually fixing all hand-written assembly. The biggest problem is
that even interworking-aware libraries will break. They'll see that we're
building for v4t and generate non-interworked code. The breakage is also
subtle: it doesn't show up until you actually introduce Thumb libraries.
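To spell out why the mov lr, pc idiom in (a) breaks, here is a sketch of the
address arithmetic. It assumes the call goes through r0 and that the expansion
is the tst/moveq/bx sequence used further down; A is just a placeholder for
the address of the mov. In ARM state, reading pc yields the address of the
current instruction plus 8.

```
@ As written for v4t: lr ends up pointing just past the bx.
        mov     lr, pc          @ A+0:  lr = A+8
        bx      r0              @ A+4:  call through r0 (register assumed)
                                @ A+8:  callee returns here

@ Same source with bx expanded to three instructions:
        mov     lr, pc          @ A+0:  lr = A+8, unchanged
        tst     r0, #1          @ A+4:  is the target Thumb?
        moveq   pc, r0          @ A+8:  callee returns HERE, not past the call
        bx      r0              @ A+12: only reached for a Thumb target
                                @ A+16: intended return point
```

The return lands in the middle of the expansion, and what happens next depends
on the flags and r0 value the callee left behind.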
In both cases we probably want a pseudo-instruction to generate a base bx
instruction. For (a) this is handy when writing code that's aware of the
sneaky tricks, and for (b) it avoids accidentally accepting v4t/v5 code.
I'm currently leaning quite strongly towards (a). FTBFS is a whole lot easier
to fix than random runtime crashes when linked with third party libraries.
 I'm open to suggestions for the name. It could be an additional option
rather than a new architecture variant.
 My suggested alternatives are either (automagically DTRT when assembled for
v4t/v5 without needing two implementations, but needs a label):
addr lr, 1f
or (no label needed, but will still be crappy code when assembled for v4t):
addr lr, .+16
tst r0, #1
moveq pc, r0