
Re: Linux Sparc FPU register corruption



James Y Knight <jyknight@google.com> writes:

> Anyone else reading this able to reproduce problems with my test program (or not?)

I ran your test program and got some interesting results. First though
I'd like to say that I've been using Debian sid on my Sun T5120 (T2
processor) for several months and I haven't had any of the problems
you're describing. I've compiled lots of large programs like gcc and
openjdk and I've never had gcc segfault. I'm running on the bare
hardware, so maybe the bugs you're experiencing are only exposed when
running in an LDom?

Anyway, about your test program: to try it out I first grabbed Linus's
kernel tree and built a kernel from it. Hopefully this makes my
observations comparable with David Miller's.

Once it was booted I ran your test program and within 15 seconds I got
an error:


FP regs xx: 0: 0xffff800fd404f860 0xffff800fc21e09e0
0xffff800fc2773680 0xffff800fcaabd5c0 0xffff800fca9b99a0
0xffff800fc1d13c20 0xffff800fc5100080 0xffff800fd11c2960
FP regs xx: 1: 0xffff800fc2773680 0xffff800fcaabd5c0
0xffff800fca9b99a0 0xffff800fc1d13c20 0xffff800fc5100080
0xffff800fd11c2960 0xffff800fd404f860 0xffff800fc21e09e0
FP regs xx: 2: 0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708
FP regs xx: 3: 0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708

FP regs yy: 0: 0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708
FP regs yy: 1: 0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708
FP regs yy: 2: 0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708
FP regs yy: 3: 0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708
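
For anyone comparing dumps like the one above, here is a minimal Python
sketch that parses this output and lists the blocks holding unexpected
values. It assumes 0x0102030405060708 is the fill pattern the test
program writes (inferred from the intact blocks, not confirmed from its
source); the function names are hypothetical and not part of the test
program.

```python
import re

# Assumed fill pattern, inferred from the intact blocks in the dump.
EXPECTED = 0x0102030405060708

# Two blocks copied from the output above, including the mail line wrapping.
SAMPLE = """\
FP regs xx: 0: 0xffff800fd404f860 0xffff800fc21e09e0
0xffff800fc2773680 0xffff800fcaabd5c0 0xffff800fca9b99a0
0xffff800fc1d13c20 0xffff800fc5100080 0xffff800fd11c2960
FP regs xx: 2: 0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708 0x102030405060708
0x102030405060708 0x102030405060708
"""


def parse_dump(text):
    """Return {(label, block_number): [values]} for each 'FP regs' block."""
    blocks = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"FP regs (\w+): (\d+):(.*)", line)
        if header:
            current = (header.group(1), int(header.group(2)))
            blocks[current] = []
            rest = header.group(3)
        else:
            rest = line  # continuation of a wrapped block
        if current is not None:
            blocks[current].extend(
                int(v, 16) for v in re.findall(r"0x[0-9a-fA-F]+", rest)
            )
    return blocks


def corrupted(blocks, expected=EXPECTED):
    """Return only the blocks that contain a value other than `expected`."""
    result = {}
    for key, values in blocks.items():
        bad = [v for v in values if v != expected]
        if bad:
            result[key] = bad
    return result


if __name__ == "__main__":
    for (label, number), bad in corrupted(parse_dump(SAMPLE)).items():
        print(f"{label} block {number}: {len(bad)} unexpected value(s)")
```

Run against the full output, this should flag only blocks 0 and 1 of
the xx set, which matches what can be seen by eye.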


I tried it a few times and got the same result. I then noticed that a
few things were weird in the config of the kernel I had just built. So I
changed a few settings, compiled a new kernel and booted it.

On a whim I ran your test program again and it ran for 2 hours without
an error!

I booted the first kernel again and it produced the error almost
instantly, just as before. All right! Same kernel source, same compiler,
so it has to be the config that makes the difference.

The only significant thing I changed was SLAB -> SLUB. To test this, I
took the second kernel config, the one that produced no error, and
changed just SLUB -> SLAB. That third kernel produces the error with
your program.

I have no idea why the choice of slab allocator (SLAB vs. SLUB) would
make a difference in this case, but you should try a kernel built with
SLUB and see whether the problem goes away.
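
To confirm which allocator a given kernel config selects before
rebuilding, something like the following Python sketch may help. It
relies only on the standard .config line format ("CONFIG_SLUB=y" versus
"# CONFIG_SLUB is not set"); the function name and the ".config" path in
the usage lines are assumptions for illustration.

```python
def slab_allocator(config_text):
    """Return 'SLAB' or 'SLUB' depending on which is set to y, else None."""
    for line in config_text.splitlines():
        line = line.strip()
        for name in ("SLAB", "SLUB"):
            # Disabled options appear as "# CONFIG_X is not set" and are skipped.
            if line == f"CONFIG_{name}=y":
                return name
    return None


if __name__ == "__main__":
    # Assumed location: the .config in the current kernel build tree.
    with open(".config") as f:
        print(slab_allocator(f.read()))
```

Running it against the config of each of the three kernels is a quick
way to double-check that the allocator choice is set as intended.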

-David Mattli

