Bug#758115: Disabled wait state X'32EE' on IPL of zIPL
On Thu, 21 Aug 2014 19:26:44 -0400 (EDT)
Stephen Powell <zlinuxman@wowway.com> wrote:
> Here are the last few instructions prior to the failure on the failing
> version, thanks to the CP TRACE facility under z/VM on a real IBM z/890:
>
> 0000000000002A78 STG E310F0A80024 >> 000000000000FEB0 CC 2
> 0000000000002A7E LG E32030000004 00000000000001B0 CC 2
> 0000000000002A84 LG E31030080004 00000000000001B8 CC 2
> 0000000000002A8A STG E34030000024 >> 00000000000001B0 CC 2
> 0000000000002A90 LA 4140F0A0 = 000000000000FEA8 CC 2
> 0000000000002A94 LARL C0500000000B CC 2
> 0000000000002A9A STG E35040080024 >> 000000000000FEB0 CC 2
> 0000000000002AA0 STG E35030080024 >> 00000000000001B8 CC 2
> 0000000000002AA6 LPSWE B2B2F0A0 000000000000FEA8 CC 0
> 0000000000002AAA LMG EBDFF0B00004 ???????????????? CC 0
> 0000000000002AB0 STG E32030000024 >> 00000000000001B0 CC 0
> 0000000000002AB6 STG E31030080024 >> 00000000000001B8 CC 0
> 0000000000002ABC BR 07FE -> 00000000000032E6 CC 0
> -> 00000000000032E6 LH 48100086 0000000000000086 CC 0
> 00000000000032EA BRU A7F40001 -> 00000000000032EC CC 0
> -> 00000000000032EC ???? 0001
> *** 00000000000032EC PROG 0001 -> 00000000000039A8
>
> And here is what appears to be the equivalent code on the working
> version, compiled under wheezy:
>
> 0000000000002A38 STG E310F0A80024 >> 000000000000FEA0 CC 2
> 0000000000002A3E LG E32030000004 00000000000001B0 CC 2
> 0000000000002A44 LG E31030080004 00000000000001B8 CC 2
> 0000000000002A4A STG E34030000024 >> 00000000000001B0 CC 2
> 0000000000002A50 LA 4140F0A0 = 000000000000FE98 CC 2
> 0000000000002A54 LARL C0500000000B CC 2
> 0000000000002A5A STG E35040080024 >> 000000000000FEA0 CC 2
> 0000000000002A60 STG E35030080024 >> 00000000000001B8 CC 2
> 0000000000002A66 LPSWE B2B2F0A0 000000000000FE98 CC 0
> 0000000000002A6A LMG EBDFF0B00004 ???????????????? CC 0
> 0000000000002A70 STG E32030000024 >> 00000000000001B0 CC 0
> 0000000000002A76 STG E31030080024 >> 00000000000001B8 CC 0
> 0000000000002A7C BR 07FE -> 00000000000032C0 CC 0
> -> 00000000000032C0 LLGH E31000860091 0000000000000086 CC 0
> 00000000000032C6 CHI A71E1004 CC 2
> 00000000000032CA BRZ A784000A 00000000000032DE CC 2
> ...
>
> And on we go from there. The BRU instruction in the first sequence
> is clearly bad. In assembler language format, the equivalent instruction
> would be "BRU *+2". This is a bad branch. The instruction branches
> into the middle of itself, picking up "0001" as the next machine instruction,
> which causes an operation exception. Since the failing "instruction"
> starts at storage address 32EC, and is two bytes long, that means that
> the updated instruction address in the PSW at the time of the program
> interruption will be 32EE, which is the value used in the disabled wait
> PSW.
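As a cross-check of the analysis quoted above, here is a minimal C sketch of the address arithmetic. The helper names (rel16_target, insn_length) are made up for illustration and are not part of s390-tools. The sketch assumes the usual z/Architecture rules: a 4-byte relative branch adds twice its signed 16-bit halfword immediate to its own address, and the top two bits of the first opcode byte give the instruction length.

```c
#include <stdint.h>

/* Hypothetical helper: target of a 4-byte relative branch such as
 * BRC/BRU (A7x4). The low 16 bits are a signed count of halfwords,
 * relative to the address of the branch instruction itself. */
static uint64_t rel16_target(uint64_t insn_addr, uint32_t insn)
{
    int64_t halfwords = (int64_t)(insn & 0xffffu);

    if (halfwords & 0x8000)          /* sign-extend the 16-bit field */
        halfwords -= 0x10000;
    return insn_addr + (uint64_t)(halfwords * 2);
}

/* Hypothetical helper: instruction length in bytes, taken from the
 * two high-order bits of the first opcode byte:
 * 00 -> 2 bytes, 01 or 10 -> 4 bytes, 11 -> 6 bytes. */
static unsigned insn_length(uint8_t first_byte)
{
    switch (first_byte >> 6) {
    case 0:
        return 2;
    case 3:
        return 6;
    default:
        return 4;
    }
}
```

With the values from the failing trace, rel16_target(0x32EA, 0xA7F40001) gives 0x32EC, and adding insn_length(0x00), which is 2, gives 0x32EE, the address reported in the disabled wait PSW.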
Hi Stephen,
You can get a disassembly of the eckd boot loader code by changing to
s390-tools/zipl/boot and running:
1) make
2) objdump -S eckd2.exec > eckd2.list
I think the corresponding code in zipl is load_wait_psw() in libc.c:
__attribute__ ((noinline)) void load_wait_psw(uint64_t psw_mask, struct psw_t *psw)
{
        struct psw_t wait_psw = { .mask = psw_mask, .addr = 0 };
    2df6:  e3 20 f0 a0 00 24    stg   %r2,160(%r15)
        struct psw_t old_psw, *wait_psw_ptr = &wait_psw;
        unsigned long addr;
        old_psw = *psw;
        psw->mask = 0x0000000180000000ULL;
    2dfc:  e3 10 30 00 00 24    stg   %r1,0(%r3)
        asm volatile(
    2e02:  41 20 f0 a0          la    %r2,160(%r15)
{
        struct psw_t wait_psw = { .mask = psw_mask, .addr = 0 };
        struct psw_t old_psw, *wait_psw_ptr = &wait_psw;
        unsigned long addr;
        old_psw = *psw;
    2e06:  e3 10 30 08 00 04    lg    %r1,8(%r3)
        psw->mask = 0x0000000180000000ULL;
        asm volatile(
    2e0c:  c0 50 00 00 00 0b    larl  %r5,2e22 <load_wait_psw+0x5a>
    2e12:  e3 50 20 08 00 24    stg   %r5,8(%r2)
    2e18:  e3 50 30 08 00 24    stg   %r5,8(%r3)
    2e1e:  b2 b2 f0 a0          lpswe 160(%r15)
                ".Lwait: \n"
                : [addr] "=&d" (addr)
                : [wait_psw] "Q" (wait_psw), [wait_psw_ptr] "a" (wait_psw_ptr),
                  [psw] "a" (psw)
                : "cc", "memory");
        *psw = old_psw;
    2e22:  e3 40 30 00 00 24    stg   %r4,0(%r3)
    2e28:  e3 10 30 08 00 24    stg   %r1,8(%r3)
}
    2e2e:  eb df f0 b0 00 04    lmg   %r13,%r15,176(%r15)
    2e34:  07 fe                br    %r14
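The larl target in this listing can be checked by hand in the same way. The following is a minimal sketch with a hypothetical helper name (rel32_target), assuming that 6-byte relative instructions such as LARL and BRASL carry a signed 32-bit halfword offset in their last four bytes, relative to the address of the instruction itself.

```c
#include <stdint.h>

/* Hypothetical helper: target address of a 6-byte relative
 * instruction such as LARL (C0x0) or BRASL (C0x5). Bytes 2..5 hold
 * a signed, big-endian count of halfwords relative to the address
 * of the instruction itself. */
static uint64_t rel32_target(uint64_t insn_addr, const uint8_t insn[6])
{
    uint32_t raw = ((uint32_t)insn[2] << 24) |
                   ((uint32_t)insn[3] << 16) |
                   ((uint32_t)insn[4] << 8)  |
                    (uint32_t)insn[5];
    int64_t halfwords = (int64_t)raw;

    if (raw & 0x80000000u)           /* sign-extend the 32-bit field */
        halfwords -= 0x100000000LL;
    return insn_addr + (uint64_t)(halfwords * 2);
}
```

For the bytes c0 50 00 00 00 0b at 2e0c this gives 2e22, the address right after lpswe, where the compiler placed .Lwait. The same bytes at 2A94 in the failing trace give 2AAA, which is likewise the instruction right after LPSWE there.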
load_wait_psw() is called from wait():
static inline int wait(void)
{
        do {
                load_wait_psw(0x0102000180000000ULL, &S390_lowcore.external_new_psw);
    33d0:  e3 20 d0 00 00 04    lg    %r2,0(%r13)
    33d6:  a7 39 01 b0          lghi  %r3,432
    33da:  c0 e5 ff ff fc f7    brasl %r14,2dc8 <load_wait_psw>
                if (S390_lowcore.ext_int_code == 0x1004)
    33e0:  e3 10 00 86 00 91    llgh  %r1,134
    33e6:  a7 1e 10 04          chi   %r1,4100
    33ea:  a7 74 00 06          jne   33f6 <sclp_wait_for_int+0x9a>
    33ee:  a7 28 00 02          lhi   %r2,2
    33f2:  a7 f4 00 08          j     3402 <sclp_wait_for_int+0xa6>
                        return ETIMEOUT;
        } while (S390_lowcore.ext_int_code != 0x2401);
    33f6:  a7 1e 24 01          chi   %r1,9217
    33fa:  a7 74 ff eb          jne   33d0 <sclp_wait_for_int+0x74>
    33fe:  a7 28 00 00          lhi   %r2,0
It would be interesting to see what the disassembly looks like on your system.
Michael