
Re: debian on eServer 9111-520



Dear Alexandre,

On 3/29/19 23:46, Alexandre Bencz wrote:
> Hi Frank!
> Thanks for the awesome explanation!!
>
> I tried, using only the DASD of position 1, but an error occurred, the
> system was trying to detect the other DASDs
> I will try with other DASDs combination and with other DASDs :)
>
> I'm using this version:
> https://cdimage.debian.org/cdimage/ports/2019-01-27/debian-10.0-ppc64-NETINST-1.iso

Thanks for the clarification. I didn't have time to check with my
9131-52A last weekend, but I was able to fetch my 9111-520 from storage
at the beginning of the week and check things there.

From my testing it really looks like the ipr driver is involved here,
and possibly MP (multiprocessor) operation. I wondered why there were no
issues during installation (which also has to use the ipr driver to
access the disks!). My current assumption is that the installer runs in
SP (single-processor) mode, i.e. that the kernel on the installer ISO is
SP rather than MP, but I have to check.

I have a working NFS root FS for this machine so I can load the kernel
and initramfs from disk using the on-disk GRUB, blacklist the ipr module
and use the NFS root FS instead of the on-disk root FS.

The machine boots the OS (Debian GNU/Linux Sid from Debian Ports for
ppc64, updated on 2019-04-01) successfully this way.
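
For reference, a GRUB entry along the following lines is one way to get
such a setup. It is only a sketch, not the exact entry used here: the
kernel and initrd paths, the NFS server address (192.0.2.10) and the
export path are placeholders, and it assumes that the initramfs honours
the usual `root=/dev/nfs`, `nfsroot=` and `ip=` parameters and that
`modprobe.blacklist=` is picked up from the kernel command line by kmod.

```
# Hypothetical GRUB entry: NFS root with the ipr module blacklisted.
# Paths, server address and export path are placeholders.
linux  /boot/vmlinux-4.19.0-4-powerpc64 root=/dev/nfs nfsroot=192.0.2.10:/srv/nfs/p5-520 ip=dhcp modprobe.blacklist=ipr
initrd /boot/initrd.img-4.19.0-4-powerpc64
```

Note that a blacklist only stops automatic loading by alias; an explicit
`modprobe ipr` still loads the module, which is what the test below
relies on.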

But when I try to load the ipr driver after the OS has booted, things go
wrong pretty quickly:

```
root@p5-520:/tmp# lsblk
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0   11:0    1 1024M  0 rom
root@p5-520:/tmp# modprobe -v ipr
insmod /lib/modules/4.19.0-4-powerpc64/kernel/drivers/scsi/ipr.ko
[ 1010.077364] ipr: IBM Power RAID SCSI Device Driver version: 2.6.4 (March 14, 2017)
[ 1010.077437] ipr 0003:d0:01.0: Found IOA with IRQ: 151
[ 1010.080789] ipr 0003:d0:01.0: Starting IOA initialization sequence.
[ 1010.080804] scsi host2: IBM 0 Storage Adapter
[ 1010.084310] ipr 0003:d0:01.0: Adapter firmware version: 050D0090
[ 1010.144203] ipr 0003:d0:01.0: IOA initialized.
[ 1010.146769] scsi 2:255:255:255: No Device         IBM      5709001      0150 PQ: 0 ANSI: 0
[ 1010.150052] scsi 2:0:15:0: Enclosure         IBM      VSBPD4E1 U4SCSI 4610 PQ: 0 ANSI: 2
[ 1010.153197] scsi 2:1:15:0: Enclosure         IBM      VSBPD4E1 U4SCSI 4610 PQ: 0 ANSI: 2
[ 1010.175610] scsi 2:1:8:0: Direct-Access     IBM   H0 HUS103014FL3800 RPQR PQ: 0 ANSI: 4
[ 1010.203688] scsi 2:0:3:0: Direct-Access     IBM   H0 HUS103014FL3800 RPQR PQ: 0 ANSI: 4
[ 1010.217574] scsi 2:255:255:255: Attached scsi generic sg1 type 31
[ 1010.218039] scsi 2:0:15:0: Attached scsi generic sg2 type 13
[ 1010.218526] scsi 2:1:15:0: Attached scsi generic sg3 type 13
[ 1010.218968] scsi 2:1:8:0: Attached scsi generic sg4 type 0
[ 1010.219429] scsi 2:0:3:0: Attached scsi generic sg5 type 0
root@p5-520:/tmp#
root@p5-520:/tmp#
root@p5-520:/tmp# lsblk
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0   11:0    1 1024M  0 rom
root@p5-520:/tmp# [ 1031.229214] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1031.229235] rcu:     0-...0: (0 ticks this GP) idle=cda/1/0x4000000000000002 softirq=30239/30240 fqs=2568
[ 1031.229247] rcu:     (detected by 1, t=5252 jiffies, g=56957, q=1287)
[ 1031.229262] Sending NMI from CPU 1 to CPUs 0:
[ 1037.294854] CPU 0 didn't respond to backtrace IPI, inspecting paca.
[ 1037.294866] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 4963 (systemd-udevd)
[ 1037.294880] Back trace of paca->saved_r1 (0xc00000000fffb7c0) (possibly stale):
[ 1037.294890] Call Trace:
[ 1037.294900] [c00000000fffb7c0] [c00000000fffb870] 0xc00000000fffb870 (unreliable)
[ 1037.294918] [c00000000fffb850] [c000000000763f90] .scsi_eh_scmd_add+0x50/0x1a0
[ 1037.294932] [c00000000fffb8f0] [c000000000769868] .scsi_softirq_done+0xe8/0x210
[ 1037.294948] [c00000000fffb990] [c0000000005adb1c] .blk_mq_complete_request+0x11c/0x1e0
[ 1037.294962] [c00000000fffba20] [c00000000076798c] .scsi_mq_done+0x2c/0xe0
[ 1037.294982] [c00000000fffbaa0] [d00000000536e658] .ipr_scsi_done+0x88/0x820 [ipr]
[ 1037.294999] [c00000000fffbb50] [d00000000536c820] .ipr_isr+0x3c0/0x890 [ipr]
[ 1037.295015] [c00000000fffbc30] [c0000000001cbc60] .__handle_irq_event_percpu+0xa0/0x310
[ 1037.295029] [c00000000fffbd00] [c0000000001cbf04] .handle_irq_event_percpu+0x34/0xb0
[ 1037.295044] [c00000000fffbd90] [c0000000001cbfdc] .handle_irq_event+0x5c/0xc0
[ 1037.295057] [c00000000fffbe10] [c0000000001d1b30] .handle_fasteoi_irq+0xc0/0x200
[ 1037.295071] [c00000000fffbe90] [c0000000001ca300] .generic_handle_irq+0x50/0x80
[ 1037.295086] [c00000000fffbf10] [c00000000001b284] .__do_irq+0x64/0x200
[ 1037.295100] [c00000000fffbf90] [c00000000002f030] .call_do_irq+0x14/0x24
[ 1037.295114] [c000000003b2f6e0] [c00000000001b4b8] .do_IRQ+0x98/0x130
[ 1037.295128] [c000000003b2f780] [c000000000008e10] hardware_interrupt_common+0x170/0x180
[...]
```

Oh btw, this is with one disk in slot 1 (scsi 2:0:3:0) and one disk in
slot 8 (scsi 2:1:8:0).

If I can find out how to disable MP operation, I will check whether SP
operation makes a difference; there doesn't seem to be an SP kernel
available for ppc64. Hence I'm also not 100% sure that the installer
really operates in SP mode. I have to check `/proc/cpuinfo` during an
installation to be sure.
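
The `/proc/cpuinfo` check could look roughly like the sketch below. It
assumes that each online CPU shows up as a line starting with
`processor` (as it does on powerpc) and that `grep` is available in the
installer shell; the function name `count_cpus` is made up for
illustration.

```shell
# count_cpus [FILE]: print the number of "processor" entries in a
# cpuinfo-style file (defaults to /proc/cpuinfo).
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Only query the running system if /proc/cpuinfo is actually readable.
if [ -r /proc/cpuinfo ]; then
    count_cpus
fi
```

A result of 1 would mean only one CPU is online (SP operation), anything
larger means MP. For forcing SP operation with the regular MP kernel,
the generic `maxcpus=1` or `nosmp` kernel parameters might be worth a
try; whether they behave as expected on this machine is untested.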

Cheers
Frank

