Re: core dump analysis, was Re: stack smashing detected
On Tue, 4 Apr 2023, I wrote:
> On Tue, 4 Apr 2023, I wrote:
>
> >
> > The actual corruption might offer a clue here. I believe the saved %a3
> > was clobbered with the value 0xefee1068 which seems to be a pointer into
> > some stack frame that would have come into existence shortly after
> > __GI___wait4_time64 was called.
>
> Wrong... it is a pointer to the location below the __wait3 stack frame.
>
> (gdb) info frame
> Stack level 8, frame at 0xefee10e0:
> pc = 0xc00e0172 in __wait3 (../sysdeps/unix/sysv/linux/wait3.c:41);
> saved pc = 0xd000c38e
> called by frame at 0xefee11dc, caller of frame at 0xefee106c
> source language c.
> Arglist at 0xefee10d8, args: stat_loc=<optimized out>,
> options=<optimized out>, usage=<optimized out>
> Locals at 0xefee10d8, Previous frame's sp is 0xefee10e0
> Saved registers:
> a2 at 0xefee106c, a3 at 0xefee1070, a5 at 0xefee1074, fp at 0xefee10d8,
> pc at 0xefee10dc
>
> That shows %a2 was saved at 0xefee106c, and the address of interest is the
> stack location immediately below that. But it has no particular
> significance: it holds a NULL pointer when the struct __rusage64 *usage
> argument to __wait4_time64() gets pushed there:
>
> 0xc00e8152 <__wait3+226>: clrl %sp@-
> 0xc00e8154 <__wait3+228>: movel %fp@(12),%sp@-
> 0xc00e8158 <__wait3+232>: movel %d0,%sp@-
> 0xc00e815a <__wait3+234>: pea 0xffffffff
> 0xc00e815e <__wait3+238>: bsrl 0xc00e8174 <__GI___wait4_time64>
>
> But it's no longer a NULL pointer at the time of the crash, though it
> should be, since that stack frame is still active.
>
> (gdb) x/16z 0xefee1068
> 0xefee1068: 0xc00e0172 0xd001e718 0xd001e498 0xd001b874
> 0xefee1078: 0x00170700 0x00170700 0x00170700 0x00005360
> 0xefee1088: 0x0000e920 0x00000006 0x00002000 0x00000002
> 0xefee1098: 0x00171f20 0x00171f20 0x00171f20 0x000000e0
>
> Beats me.
>
At the time of the crash, the corrupted %a3 was a pointer to a location in
__wait3's stack frame. That location held a NULL pointer (the *usage
parameter) when __GI___wait4_time64 was called, but it now holds
0xc00e0172, which is just after the __wait3 text and just before the
__GI___wait4_time64 text.

(gdb) disass __wait3
Dump of assembler code for function __wait3:
...
0xc00e015e <+238>: bsrl 0xc00e0174 <__GI___wait4_time64>
0xc00e0164 <+244>: lea %sp@(16),%sp
0xc00e0168 <+248>: braw 0xc00e00b2 <__wait3+66>
0xc00e016c <+252>: bsrl 0xc012a38c <__stack_chk_fail>
End of assembler dump.

(gdb) disass __GI___wait4_time64
Dump of assembler code for function __GI___wait4_time64:
0xc00e0174 <+0>: lea %sp@(-80),%sp
0xc00e0178 <+4>: moveml %d2-%d5/%a2-%a3/%a5,%sp@-
0xc00e017c <+8>: lea %pc@(0xc0198000),%a5
0xc00e0184 <+16>: movel %sp@(116),%d2
...

But I realize now that this stack location gets overwritten with the
return address for bsrl __stack_chk_fail, so there's nothing wrong there.
Perhaps it's just a coincidence that the saved %a3, once corrupted, ended
up pointing to the *usage parameter... I don't know what to make of that.
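
For anyone who wants to check the arithmetic, here is a rough sketch that
replays the pushes by hand. It is only a model, not anything gdb produced.
The instruction addresses are copied from the disassembly above. The
starting %sp of 0xefee106c is an assumption (taken from "a2 at 0xefee106c"
and "caller of frame at 0xefee106c" in the info frame output), and so is
the idea that %sp is back at that value when __stack_chk_fail is called.
The names push() and simulate() are made up for the illustration.

```python
# Addresses copied from the gdb listings in this thread.
BSRL_WAIT4 = 0xc00e015e        # bsrl __GI___wait4_time64
AFTER_BSRL_WAIT4 = 0xc00e0164  # next instruction: lea %sp@(16),%sp
BSRL_CHK_FAIL = 0xc00e016c     # bsrl __stack_chk_fail

# Assumption: %sp in __wait3 just before the argument pushes.
SP_BEFORE_PUSH = 0xefee106c

# Length of this bsrl encoding, inferred from the listing itself.
BSRL_LEN = AFTER_BSRL_WAIT4 - BSRL_WAIT4


def push(stack, sp, value):
    """Model a pre-decrement longword store such as 'movel X,%sp@-'."""
    sp -= 4
    stack[sp] = value
    return sp


def simulate():
    stack = {}
    sp = SP_BEFORE_PUSH

    # Argument pushes for __wait4_time64(-1, stat_loc, options, NULL).
    sp = push(stack, sp, 0)                # clrl %sp@-  (usage = NULL)
    usage_slot = sp
    sp = push(stack, sp, "options")        # movel %fp@(12),%sp@-
    sp = push(stack, sp, "stat_loc")       # movel %d0,%sp@-
    sp = push(stack, sp, 0xffffffff)       # pea 0xffffffff
    sp = push(stack, sp, BSRL_WAIT4 + BSRL_LEN)  # return address from bsrl

    sp += 4    # rts pops the return address
    sp += 16   # lea %sp@(16),%sp drops the four arguments

    # Assumption: %sp is unchanged when the canary check fails.
    sp = push(stack, sp, BSRL_CHK_FAIL + BSRL_LEN)
    return usage_slot, sp, stack[usage_slot]


usage_slot, chk_ret_slot, value = simulate()
print(hex(usage_slot), hex(chk_ret_slot), hex(value))
# prints: 0xefee1068 0xefee1068 0xc00e0172
```

Under those assumptions the slot that held the NULL usage argument and the
slot that receives the return address for bsrl __stack_chk_fail are the
same longword, 0xefee1068, which matches the first word of the x/16z dump.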