
problem with smp and isolcpus




Hi

I seem to have a problem with one of the CPUs on an ES45. CPU 2 seems to be dying: I have had 3 lockups in 2 days.

Jul 26 12:26:23 keyzervega kernel: smp_call_function_on_cpu: initial timeout -- trying long wait

Jul 26 12:26:53 keyzervega kernel: lib/kernel_lock.c:229 spinlock stuck in nifd at fffffc00012c65f0(3) owner hald-addon-stor at fffffc00012c65f0(0) lib/kernel_lock.c:229

Jul 26 12:26:53 keyzervega kernel: lib/kernel_lock.c:229 spinlock stuck in automount at fffffc00012c65f0(1) owner hald-addon-stor at fffffc00012c65f0(0) lib/kernel_lock.c:229

Jul 26 12:26:53 keyzervega kernel: Kernel bug at arch/alpha/kernel/smp.c:858

Jul 26 12:26:53 keyzervega kernel: CPU 0 hald-addon-stor(1801): Kernel Bug 1

Jul 26 12:26:53 keyzervega kernel: pc = [<fffffc000101c4ac>]  ra = [<fffffc000101c404>]  ps = 0000    Not tainted

Jul 26 12:26:53 keyzervega kernel: pc is at smp_call_function_on_cpu+0x220/0x264,  ra is at smp_call_function_on_cpu+0x178/0x264

Jul 26 12:26:53 keyzervega kernel: v0 = 0000000000000041  t0 = 0000000000000001  t1 = 0000000000000001

Jul 26 12:26:53 keyzervega kernel: t2 = 0000000100728747  t3 = fffffc0008bbd108  t4 = 000000003b5f2d38

Jul 26 12:26:53 keyzervega kernel: t5 = 0000000000000089  t6 = fffffc03fe78d640  t7 = fffffc03f4118000

Jul 26 12:26:53 keyzervega kernel: a0 = 0000000000000000  a1 = 0000000000000000  a2 = 0000000000000001

Jul 26 12:26:53 keyzervega kernel: a3 = 0000000000000000  a4 = fffffc00012c6038  a5 = 0000000000000000

Jul 26 12:26:53 keyzervega kernel: t8 = 0000000000000200  t9 = 0000000000000020  t10= 0000000000000000

Jul 26 12:26:53 keyzervega kernel: t11= 0000000000000001  pv = fffffc000101ca78  at = 0000000000000000

Jul 26 12:26:53 keyzervega kernel: gp = fffffc00018b2d00  sp = fffffc03f411bde8

Jul 26 12:26:53 keyzervega kernel: Trace:

Jul 26 12:26:53 keyzervega kernel: [<fffffc000108ad04>] invalidate_bdev+0x3c/0x84

Jul 26 12:26:53 keyzervega kernel: [<fffffc000108ba9c>] invalidate_bh_lru+0x0/0x74

Jul 26 12:26:53 keyzervega kernel: [<fffffc000108ba9c>] invalidate_bh_lru+0x0/0x74

Jul 26 12:26:53 keyzervega kernel: [<fffffc0001093098>] kill_bdev+0x24/0x58

Jul 26 12:26:53 keyzervega kernel: [<fffffc0001094020>] blkdev_put+0xa8/0x26c

Jul 26 12:26:53 keyzervega kernel: [<fffffc00010898d8>] __fput+0x80/0x1bc

Jul 26 12:26:53 keyzervega kernel: [<fffffc0001087f64>] filp_close+0xb0/0xd4

Jul 26 12:26:53 keyzervega kernel: [<fffffc000108806c>] sys_close+0xe4/0x114

Jul 26 12:26:53 keyzervega kernel: [<fffffc0001010ff4>] entSys+0xa4/0xc0



I have had a look through the logs and I haven't seen anything from CPU 2, so I am presuming that it is that CPU that is dying.
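To make the "nothing from CPU 2" observation checkable, here is a rough Python sketch that pulls the CPU numbers out of the "spinlock stuck" lines. It assumes the number in parentheses after each lock address is a CPU id, which is my reading of the message rather than something I have confirmed. The names STUCK and cpus_in_stuck_lines are made up for this example, and the sample lines are the two messages above with the wrapped address rejoined.

```python
import re

# Assumption: in "at <address>(<n>)" the <n> is the CPU id of the waiter,
# and in "owner <task> at <address>(<n>)" it is the CPU id of the lock owner.
STUCK = re.compile(
    r"spinlock stuck in (\S+) at [0-9a-f]+\((\d+)\) "
    r"owner (\S+) at [0-9a-f]+\((\d+)\)"
)

def cpus_in_stuck_lines(lines):
    """Return the set of CPU ids mentioned in 'spinlock stuck' messages."""
    seen = set()
    for line in lines:
        match = STUCK.search(line)
        if match:
            seen.add(int(match.group(2)))  # CPU spinning on the lock
            seen.add(int(match.group(4)))  # CPU recorded as lock owner
    return seen

sample = [
    "Jul 26 12:26:53 keyzervega kernel: lib/kernel_lock.c:229 spinlock stuck "
    "in nifd at fffffc00012c65f0(3) owner hald-addon-stor at "
    "fffffc00012c65f0(0) lib/kernel_lock.c:229",
    "Jul 26 12:26:53 keyzervega kernel: lib/kernel_lock.c:229 spinlock stuck "
    "in automount at fffffc00012c65f0(1) owner hald-addon-stor at "
    "fffffc00012c65f0(0) lib/kernel_lock.c:229",
]

reporting = cpus_in_stuck_lines(sample)
silent = set(range(4)) - reporting  # the box has 4 CPUs
print("CPUs mentioned:", sorted(reporting))
print("CPUs never mentioned:", sorted(silent))
```

If that reading of the message is right, CPUs 0, 1 and 3 all show up and only CPU 2 is missing, which is what makes me suspect it.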

I thought I would isolate CPU 2 from the scheduler, but when I put isolcpus=2 on the kernel command line it doesn't seem to make any difference to the scheduler: the affinity mask for all the processes is still f, and less /var/log/dmesg still shows that it is using 4 CPUs!
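For anyone wanting to repeat the check, here is a small Python sketch of how the two values can be compared. It decodes a hexadecimal affinity mask into CPU numbers and pulls the isolcpus list out of a kernel command line string (on a running system, the contents of /proc/cmdline). The function names are hypothetical, and the parser only handles plain numbers and ranges such as 2 or 1-3; any other token in the list is skipped.

```python
def cpus_from_mask(mask_hex):
    """Decode a hex affinity mask (e.g. 'f') into a sorted list of CPU ids."""
    mask = int(mask_hex, 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]

def isolated_cpus(cmdline):
    """Return the CPU ids listed in an isolcpus= parameter, or an empty set."""
    for token in cmdline.split():
        if not token.startswith("isolcpus="):
            continue
        cpus = set()
        for part in token.split("=", 1)[1].split(","):
            if part.isdigit():
                cpus.add(int(part))
            elif "-" in part:
                low, _, high = part.partition("-")
                if low.isdigit() and high.isdigit():
                    cpus.update(range(int(low), int(high) + 1))
            # anything else (non-numeric tokens) is ignored in this sketch
        return cpus
    return set()

# Example values: the mask seen here, and a made-up command line.
allowed = cpus_from_mask("f")
isolated = isolated_cpus("root=/dev/sda2 isolcpus=2")
overlap = sorted(isolated.intersection(allowed))
print("allowed:", allowed)
print("isolated but still allowed:", overlap)
```

A mask of f decodes to CPUs 0 to 3, so CPU 2 is still in the allowed set; a mask of b would be what I expected to see with CPU 2 excluded.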

I would prefer to do it in Linux so I can test the CPU, rather than mask it out in SRM, which it looks like I am going to have to do.

Is this a known issue? Is there a fix? If not, where can I log a bug, and where is the bug tracker for it?

Alex

