Re: teaching gcc's optimiser new tricks
On Sat, Aug 03, 2002 at 06:41:04PM +0100, Philip Blundell wrote:
> On Sat, 2002-08-03 at 17:36, Nicholas Clark wrote:
> > cmp r4, #0
> > movne ip, #0
> > moveq ip, #1
> >
> > how does one teach the optimiser that this is equivalent and 33% faster:
> >
> > rsbs ip, r4, #1
> > movls ip, #0
>
> Oh, that's rather clever. Look at the compare_scc pattern in arm.md,
> that's where this stuff all happens.
It took a lot of going round and round in circles, thinking "it must be
possible to do this in 2 instructions", before I actually came up with them.
I think the appended patch will do it. I have to go to bed now (and
turn the machine off). Test results so far are that gcc compiles, and
produces this output at -O2:
@ Generated by gcc 2.95.2 20000516 (release) [Rebel.com] for ARM/elf
.file "lognot.c"
gcc2_compiled.:
.text
.align 2
.global lognot
.type lognot,function
lognot:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, current_function_anonymous_args = 0
mov ip, sp
stmfd sp!, {fp, ip, lr, pc}
sub fp, ip, #4
rsbs r0, r0, #1
movls r0, #0
ldmea fp, {fp, sp, pc}
.Lfe1:
.size lognot,.Lfe1-lognot
.ident "GCC: (GNU) 2.95.2 20000516 (release) [Rebel.com]"
from this source:
int lognot (int a) {
return !a;
}
The old code
lognot:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, current_function_anonymous_args = 0
mov ip, sp
stmfd sp!, {fp, ip, lr, pc}
sub fp, ip, #4
cmp r0, #0
movne r0, #0
moveq r0, #1
ldmea fp, {fp, sp, pc}
.Lfe1:
is 1 instruction longer.
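For anyone who wants to check the equivalence without an ARM to hand, here
is a minimal sketch in C that models the two sequences. It assumes the usual
ARM flag semantics for a subtract (C set when there is no borrow, Z set when
the result is zero, LS meaning "C clear or Z set"). The function names are
made up for illustration and are not part of gcc.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of:  cmp rn, #0 ; movne rd, #0 ; moveq rd, #1 */
static uint32_t lognot_cmp(uint32_t rn)
{
    return (rn == 0u) ? 1u : 0u;
}

/* Hypothetical model of:  rsbs rd, rn, #1 ; movls rd, #0 */
static uint32_t lognot_rsbs(uint32_t rn)
{
    uint32_t rd = 1u - rn;      /* rsbs: rd = 1 - rn (mod 2^32), flags updated */
    int c = (rn <= 1u);         /* assumed: C set when 1 - rn does not borrow */
    int z = (rd == 0u);         /* Z set when the result is zero */

    if (!c || z)                /* LS condition: C clear or Z set */
        rd = 0u;                /* movls rd, #0 */
    return rd;
}

/* Compare the two models over a sample of inputs; returns the mismatch count. */
static unsigned count_mismatches(void)
{
    static const uint32_t edge[] = {
        0u, 1u, 2u, 0x7fffffffu, 0x80000000u, 0xfffffffeu, 0xffffffffu
    };
    unsigned bad = 0;
    uint32_t v;
    size_t i;

    for (i = 0; i < sizeof edge / sizeof edge[0]; i++)
        if (lognot_cmp(edge[i]) != lognot_rsbs(edge[i]))
            bad++;
    for (v = 0; v < 65536u; v++)
        if (lognot_cmp(v) != lognot_rsbs(v))
            bad++;
    return bad;
}
```

If count_mismatches() returns 0, the two models agreed on every value tried.
It only samples inputs and says nothing about whether the patched compiler
itself is correct.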
I will test that the compiler bootstraps tomorrow, although anyone else with
CPU to burn (or, better still, gcc 3.1.1 sources readily to hand) is welcome
to beat me to it.
Nicholas Clark
--
Even better than the real thing: http://nms-cgi.sourceforge.net/
--- gcc-2.95.2-arm6/gcc/config/arm/arm.md Mon Aug 28 14:31:18 2000
+++ gcc-2.95.2-arm6-new/gcc/config/arm/arm.md Mon Aug 12 23:23:57 2002
@@ -4667,6 +4667,9 @@
if (GET_CODE (operands[1]) == GE && operands[3] == const0_rtx)
return \"mvn\\t%0, %2\;mov\\t%0, %0, lsr #31\";
+ if (GET_CODE (operands[1]) == EQ && operands[3] == const0_rtx)
+ return \"rsbs\\t%0, %2, #1\;movls\\t%0, #0\";
+
if (GET_CODE (operands[1]) == NE)
{
if (which_alternative == 1)