Bug#785475: arm64 shift+rotate optimization bug

To: submit@bugs.debian.org
Subject: Bug#785475: arm64 shift+rotate optimization bug
From: Magnus Holmgren <holmgren@debian.org>
Date: Sat, 16 May 2015 20:47:34 +0200
Message-id: <1448436.14ne3StXKU@johansson>
Reply-to: Magnus Holmgren <holmgren@debian.org>, 785475@bugs.debian.org

Package: gcc-4.9
Version: 4.9.2-16

I think I may have discovered an optimizer bug that results in incorrect code 
when building nettle. The Camellia cipher contains code similar to the 
following, which reproduces the bug:

#include <stdint.h>
#define ROTL32(n,x) (((x)<<(n)) | ((x)>>(-(n)&31)))
#define ROTR32(n,x) (((x)>>(n)) | ((x)<<(-(n)&31)))

uint64_t func(uint64_t x1, uint64_t x2) {

  uint32_t dw;

  dw = (x1 & x2) >> 32; x1 ^= ROTL32(1, dw);
  return x1;
}

The above results in the following machine code with -O1 or greater:

   0:   8a010001        and     x1, x0, x1
   4:   9361fc21        asr     x1, x1, #33
   8:   2a0103e1        mov     w1, w1
   c:   ca000020        eor     x0, x1, x0
  10:   d65f03c0        ret

which would be correct, I believe, if we substitute ROTR32 for ROTL32.
Note that if we use dw, for example by printing it, 

4.9.2-10 produces the correct result:

   0:   8a010001        and     x1, x0, x1
   4:   d360fc21        lsr     x1, x1, #32
   8:   13817c21        ror     w1, w1, #31
   c:   ca000020        eor     x0, x1, x0
  10:   d65f03c0        ret
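
The difference can also be checked at run time rather than by reading the disassembly. The following is only a sketch: func_expected, func_matches_expected and check_func are hypothetical names that are not part of nettle, and the reference is written with 64-bit arithmetic and an explicit mask on the assumption that this avoids the affected shift+rotate pattern.

```c
#include <assert.h>
#include <stdint.h>

#define ROTL32(n,x) (((x)<<(n)) | ((x)>>(-(n)&31)))

/* Function from the report, unchanged. */
uint64_t func(uint64_t x1, uint64_t x2) {

  uint32_t dw;

  dw = (x1 & x2) >> 32; x1 ^= ROTL32(1, dw);
  return x1;
}

/* Hypothetical reference: rotate the high word left by one bit using
   64-bit arithmetic, then mask the result back down to 32 bits. */
uint64_t func_expected(uint64_t x1, uint64_t x2) {
  uint64_t hi = (x1 & x2) >> 32;                          /* high 32 bits */
  uint64_t rot = ((hi << 1) | (hi >> 31)) & 0xffffffffu;  /* rotl by 1 */
  return x1 ^ rot;
}

/* Hypothetical helper: nonzero when func agrees with the reference. */
int func_matches_expected(uint64_t x1, uint64_t x2) {
  return func(x1, x2) == func_expected(x1, x2);
}

/* Hypothetical self-check: aborts if func is miscompiled for this input.
   High word 0x80000001 rotated left by one is 0x00000003. */
void check_func(void) {
  assert(func_expected(0x8000000100000000ULL, ~0ULL) == 0x8000000100000003ULL);
  assert(func_matches_expected(0x8000000100000000ULL, ~0ULL));
}
```

Calling check_func() from main in a binary built with -O1 should pass with 4.9.2-10. With 4.9.2-16 I would expect it to abort, assuming the miscompilation matches the disassembly above.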

Moreover, the following inline function

static inline uint32_t rotl32 (int n, uint32_t x)
{
  return (x << n) | (x >> (-n & 31));
}

results in equivalently incorrect, but much more compact, machine code:

   0:   8a010001        and     x1, x0, x1
   4:   ca818400        eor     x0, x0, x1, asr #33
   8:   d65f03c0        ret
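
For completeness, this is roughly how the inline variant fits into the reproducer. It is a sketch under the assumption that rotl32 simply replaces the ROTL32 macro in func; func_inline and check_func_inline are hypothetical names used here only for illustration.

```c
#include <assert.h>
#include <stdint.h>

static inline uint32_t rotl32 (int n, uint32_t x)
{
  return (x << n) | (x >> (-n & 31));
}

/* Hypothetical variant of func with the macro replaced by rotl32. */
uint64_t func_inline(uint64_t x1, uint64_t x2) {

  uint32_t dw;

  dw = (x1 & x2) >> 32;   /* high 32 bits of x1 & x2 */
  x1 ^= rotl32(1, dw);    /* xor the rotated word into the low half */
  return x1;
}

/* Hypothetical self-check: high word 0x00000001 rotated left by one
   is 0x00000002, so only bit 1 of the low word should change. */
void check_func_inline(void) {
  assert(func_inline(0x0000000100000000ULL, ~0ULL) == 0x0000000100000002ULL);
}
```

A correctly compiled build should pass check_func_inline(). Going by the three-instruction listing above, an affected build would leave the low word unchanged for this input, so the assertion should fail.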

-- 
Magnus Holmgren        holmgren@debian.org
Debian Developer
