Re: 2.6.5rc2 panics on ss5/110

To: debian-sparc@lists.debian.org
Subject: Re: 2.6.5rc2 panics on ss5/110
From: Martin Habets <errandir_news@mph.eclipse.co.uk>
Date: Mon, 14 Jun 2004 14:23:36 +0100
Message-id: <20040614132335.GA31849@palantir8>

In-Reply-To: 20040331161006.GA13251

I got the same error yesterday, booting 2.6.6 on a ss20.
Got around it with this patch:

--- sd.c.orig	2004-06-14 12:29:49.000000000 +0100
+++ sd.c	2004-06-14 13:30:55.000000000 +0100
@@ -1083,9 +1083,9 @@
 		sector_div(mb, 1950);
 
 		printk(KERN_NOTICE "SCSI device %s: "
-		       "%llu %d-byte hdwr sectors (%llu MB)\n",
-		       diskname, (unsigned long long)sdkp->capacity,
-		       hard_sector, (unsigned long long)mb);
+		       "%lu %d-byte hdwr sectors (%lu MB)\n",
+		       diskname, (unsigned long)sdkp->capacity,
+		       hard_sector, (unsigned long)mb);
 	}
 
 	/* Rescale capacity to 512-byte units */
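
A side note on the workaround above: the cast to unsigned long is only
lossless while the value fits in 32 bits, since unsigned long is 4 bytes on
sparc32. A minimal sketch of that, assuming capacity can be a 64-bit sector
count (that depends on the kernel configuration). as_sparc32_ulong and
demo_truncation are made-up names, and uint32_t stands in for sparc32's
unsigned long so it behaves the same on any host:

```c
#include <assert.h>
#include <stdint.h>

/* Made-up helper, not kernel code: what the (unsigned long) cast in the
 * patch does on a target where unsigned long is 32 bits. uint32_t stands
 * in for sparc32's unsigned long so the result is host-independent. */
static uint32_t as_sparc32_ulong(uint64_t value)
{
	return (uint32_t)value;		/* keeps the low 32 bits only */
}

/* Small self-check showing where the cast stops being lossless. */
static void demo_truncation(void)
{
	/* Fits in 32 bits: printed value is unchanged. */
	assert(as_sparc32_ulong(UINT64_C(4194304)) == UINT32_C(4194304));
	/* 2^32 + 5 sectors: printed value would wrap to 5. */
	assert(as_sparc32_ulong((UINT64_C(1) << 32) + 5) == UINT32_C(5));
}
```

So this is fine for getting the box to boot, but a disk with 2^32 or more
sectors would be reported wrong. It is a workaround, not a real fix.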


The "%llu" was causing the oops somehow...
Later I got another oops from the same area during poweroff:

Sending all processes the TERM signal...Unable to handle kernel NULL pointer dere
tsk->{mm,active_mm}->context = 000008b4
tsk->{mm,active_mm}->pgd = fc01b800
              \|/ ____ \|/
              "@'/ ,. \`@"
              /_| \__/ |_\
                 \__U_/
killall5(1138): Oops [#1]
PSR: 404000c4 PC: f00bb134 NPC: f00bb138 Y: 00000000    Not tainted
PC: <vsnprintf+0x4f0/0x604>
%G: 000225a1 00000008  00000000 00000022  f014fed6 0000004c  f08be000 efffe970
%O: 00000000 f08bfd98  00000008 00000000  fffffee8 00000000  f08bfc10 f00bb12c
RPC: <vsnprintf+0x4e8/0x604>
%L: ffffffff 00000000  ffffffff ffffffff  f0b3f000 00000000  f08be000 50171ddc
%I: f0b3f045 0f4c1000  0000000a f08bfd90  fc02524c fc014500  f08bfc88 f00bb31c
Caller[f00bb31c]: sprintf+0x1c/0x2c
Caller[f0090b28]: proc_pid_stat+0x330/0x354
Caller[f008dd0c]: proc_info_read+0x40/0x158
Caller[f00652cc]: vfs_read+0xd8/0x11c
Caller[f00654dc]: sys_read+0x2c/0x58
Caller[f00147d8]: syscall_is_too_hard+0x34/0x40
Caller[5009e168]: 0x5009e168
Instruction DUMP: 9210001b  40000eeb  9007bff0 <d41a0000> 1080002b  b606e008  12 

Sure enough, proc_pid_stat uses %llu too. But from the vsnprintf code I
can't see anything wrong with the handling of it.
BUT, I do have CONFIG_CC_OPTIMIZE_FOR_SIZE set.

My compiler:
$ gcc -v
Reading specs from /usr/lib/gcc-lib/sparc-linux/3.3.3/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --with-cpu=v7 --enable-objc-gc sparc-linux
Thread model: posix
gcc version 3.3.3 (Debian 20040422)

Let me know if you get any further with this. I think I'll first turn
off CONFIG_CC_OPTIMIZE_FOR_SIZE. If that doesn't do it, I guess I'll
try an older compiler...

Here's the relevant piece of code in vsnprintf (disassembly first, then the
matching C source):
f00bb118:       80 a1 60 4c     cmp  %g5, 0x4c
f00bb11c:       12 80 00 09     bne  f00bb140 <vsnprintf+0x4fc>
f00bb120:       80 a1 60 6c     cmp  %g5, 0x6c
f00bb124:       94 10 20 08     mov  8, %o2
f00bb128:       92 10 00 1b     mov  %i3, %o1
f00bb12c:       40 00 0e eb     call  f00becd8 <__memcpy>
f00bb130:       90 07 bf f0     add  %fp, -16, %o0
f00bb134:       d4 1a 00 00     ldd  [ %o0 ], %o2
f00bb138:       10 80 00 2b     b  f00bb1e4 <vsnprintf+0x5a0>
f00bb13c:       b6 06 e0 08     add  %i3, 8, %i3

                if (qualifier == 'L')
                        num = va_arg(args, long long);
                else if (qualifier == 'l') {
                        num = va_arg(args, unsigned long);
                        if (flags & SIGN)
                                num = (signed long) num;
                } else if (qualifier == 'Z' || qualifier == 'z') {

I'm curious about the 8 on line f00bb124. It is okay for long long
(which is what we use), but I think unsigned long is only 4 bytes
on sparc32. And there is no bne after the cmp on f00bb120. Odd...
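
One possible reading of the disassembly, offered as a guess rather than a
diagnosis: 0x4c is 'L' and 0x6c is 'l', and the cmp at f00bb120 sits in the
delay slot of the bne at f00bb11c, so its own bne is presumably at the
branch target f00bb140. That would make the 8 the size of long long, i.e.
the 'L' case: copy 8 bytes from the argument slot (%i3) to an aligned
temporary at %fp-16, then ldd through %o0, which is whatever __memcpy
returned. The sketch below shows that pattern in plain C. fetch_long_long
is a made-up name, not something from the kernel or gcc:

```c
#include <assert.h>
#include <string.h>

/* Made-up stand-in for what the generated code appears to do for
 * va_arg(args, long long): the argument slot may only be 4-byte aligned,
 * so 8 bytes are copied into an aligned temporary and loaded from there. */
static long long fetch_long_long(const void *arg_slot)
{
	long long tmp;		/* aligned temporary, like %fp-16 */

	/* mov 8, %o2 ; mov %i3, %o1 ; call __memcpy ; add %fp, -16, %o0 */
	void *ret = memcpy(&tmp, arg_slot, sizeof tmp);

	/* ISO C memcpy returns its destination. The load that follows
	 * goes through the returned pointer, so it depends on that. */
	assert(ret == (void *)&tmp);

	return *(long long *)ret;	/* ldd [%o0], %o2 */
}
```

In the oops above the PC is the ldd at f00bb134 and the first %O register
is 00000000, so if this reading is right, the value coming back from
__memcpy would be the thing to check.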

--
Martin
---------------------------------------------------------------------------
30 years from now GNU/Linux will be as redundant a term as UNIX/MERT is 
today. - Martin Habets
---------------------------------------------------------------------------
