Bug#892088: golang-1.10: FTBFS on mips when built on Octeon III buildds

To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: Bug#892088: golang-1.10: FTBFS on mips when built on Octeon III buildds
From: James Cowgill <jcowgill@debian.org>
Date: Mon, 5 Mar 2018 10:33:46 +0000
Message-id: <[🔎] 082fd79b-f3a0-bfa3-249c-0ec9af9586ca@debian.org>
Reply-to: James Cowgill <jcowgill@debian.org>, 892088@bugs.debian.org

Source: golang-1.10
Version: 1.10-1
Severity: serious
Tags: upstream patch
Forwarded: https://go-review.googlesource.com/c/go/+/97735
X-Debbugs-CC: debian-mips@lists.debian.org

[CC the list for the HW issue]

Hi,

golang-1.10 (and 1.9) FTBFS on mips big-endian with various floating
point test errors, but only when built on the Octeon III buildds.

After tearing my head out investigating this, I have concluded that
there is a hardware bug in the Octeon IIIs. The bug occurs when:
- You run a big floating point double operation (like div.d).
- Wait a few instructions (a store here seems to be important).
- Read the *odd* register which the above float operation stored into.
This ends up reading the value from before the double operation instead
of the result of the operation. I've attached a small C program which
reproduces this. It should happen 99% of the time (it will only fail if
you are unlucky and get an interrupt at the wrong time).

I think this does not affect the wider Debian archive because:
- MFHC1 (move from high float) seems to be unaffected (probably...)
- If mips32r2 and fpxx are enabled, GCC only reads the high part of a
float register using MFHC1 (in the few cases it needs to do this - like
integer <-> float conversion).

However, golang is affected because it uses FP32 and on big endian, all
double loads are split into word loads with the odd register loaded
first. I have attached a patch which fixes this which I have also
submitted upstream.

Thanks,
James

#include <stdint.h>
#include <stdio.h>

static void div_octeon_bug(double a, double b, uint32_t* buf)
{
	asm(
		"mtc1 %[fake_clobber], $f5\n"
		"div.d $f4, %[a], %[b]\n"
		"nop\n"
		"nop\n"
		"nop\n"
		"nop\n"
		"swc1 $f0, 0(%[buf])\n"
		"swc1 $f0, 0(%[buf])\n"
		"swc1 $f5, 0(%[buf])\n"
		"swc1 $f4, 4(%[buf])\n"
		"swc1 $f5, 8(%[buf])\n"
		"swc1 $f4, 12(%[buf])\n"
		:
		: [buf]"r"(buf), [fake_clobber]"r"(0xdeadbeef), [a]"f"(a), [b]"f"(b)
		: "memory", "f4", "$f5");
}

int main(void)
{
	uint32_t buf[4];
	div_octeon_bug(1.0, 21.63538985889851, buf);
	printf("%08x %08x\n", buf[0], buf[1]);
	printf("%08x %08x\n", buf[2], buf[3]);
	return 0;
}

From 5ab26b4e5996a3557a1d6f1af7f1e54104448a79 Mon Sep 17 00:00:00 2001
From: James Cowgill <james.cowgill@mips.com>
Date: Wed, 28 Feb 2018 16:10:14 +0000
Subject: [PATCH] cmd/internal/obj/mips: load/store even float registers first

There is a bug in Octeon III processors where storing an odd floating
point register after it has recently been written to by a double
floating point operation will store the old value from before the double
operation (there are some extra details - the operation and store
must be a certain number of cycles apart). However, this bug does not
occur if the even register is stored first. Currently the bug only
happens on big endian because go always loads the even register first on
little endian.

Workaround the bug by always loading / storing the even floating point
register first. Since this is just an instruction reordering, it should
have no performance penalty. This follows other compilers like GCC which
will always store the even register first (although you do have to set
the ISA level to MIPS I to prevent it from using SDC1).

Change-Id: I5e73daa4d724ca1df7bf5228aab19f53f26a4976
---
 src/cmd/internal/obj/mips/obj0.go | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/src/cmd/internal/obj/mips/obj0.go b/src/cmd/internal/obj/mips/obj0.go
index 2b9f18c942..c3bab5c48e 100644
--- a/src/cmd/internal/obj/mips/obj0.go
+++ b/src/cmd/internal/obj/mips/obj0.go
@@ -558,20 +558,22 @@ func preprocess(ctxt *obj.Link, cursym *obj.LSym, newprog obj.ProgAlloc) {
 			p.Link = q
 			p1 = q.Link

-			var regOff int16
+			var addrOff int64
 			if c.ctxt.Arch.ByteOrder == binary.BigEndian {
-				regOff = 1 // load odd register first
+				addrOff = 4 // swap load/save order
 			}
 			if p.From.Type == obj.TYPE_MEM {
 				reg := REG_F0 + (p.To.Reg-REG_F0)&^1
-				p.To.Reg = reg + regOff
-				q.To.Reg = reg + 1 - regOff
-				q.From.Offset += 4
+				p.To.Reg = reg
+				q.To.Reg = reg + 1
+				p.From.Offset += addrOff
+				q.From.Offset += 4 - addrOff
 			} else if p.To.Type == obj.TYPE_MEM {
 				reg := REG_F0 + (p.From.Reg-REG_F0)&^1
-				p.From.Reg = reg + regOff
-				q.From.Reg = reg + 1 - regOff
-				q.To.Offset += 4
+				p.From.Reg = reg
+				q.From.Reg = reg + 1
+				p.To.Offset += addrOff
+				q.To.Offset += 4 - addrOff
 			}
 		}
 	}
-- 
2.16.2

Attachment: signature.asc
Description: OpenPGP digital signature

Reply to:

Next by Date: OpenVPN static binary
Next by thread: OpenVPN static binary
Index(es):
- Date
- Thread