Re: Fixed strace [ was Re: ls -l is broken ]
> On Wed, May 06, 2009 at 01:39:49PM -0400, John David Anglin wrote:
> > > The tombstone is:
> > >
> > > do_page_fault() pid=10205 command='strace' type=15 address=0x407d2f18
> > > vm_start = 0x4068d000, vm_end = 0x4068f000
> >
> > So, the pointer passed to __canonicalize_funcptr_for_compare is outside
> > the vm range.
> >
> > Maybe "info sharedlib" will show something. Need to find out why the
> > address of the function descriptor is outside the vm range.
> >
>
> 405c0000-405c2000 rwxp 405c0000 00:00 0
>
> is the output of /proc/maps there... No idea wtf this is. :/
What are vm_start and vm_end?
Another segv last night.
Core was generated by `rm -f ada/bldtools/nmake_s/sinfo.ads ada/bldtools/nmake_s/nmake.adt ada/bldtool'.
Program terminated with signal 11, Segmentation fault.
[New process 12287]
#0 _dl_relocate_object (scope=0x40000db8, lazy=<value optimized out>,
consider_profiling=0) at do-rel.h:119
119 do-rel.h: No such file or directory.
in do-rel.h
(gdb) p/x $pc
$1 = 0x402759ac
0x40275980 <_dl_relocate_object+728>: ldw 4(ret0),r17
0x40275984 <_dl_relocate_object+732>: ldw 4(r9),r21
0x40275988 <_dl_relocate_object+736>: extrw,u r21,23,24,r22
0x4027598c <_dl_relocate_object+740>: depw,z r22,27,28,r13
0x40275990 <_dl_relocate_object+744>: ldw 0(r9),r20
0x40275994 <_dl_relocate_object+748>: add,l r16,r13,r8
0x40275998 <_dl_relocate_object+752>: stw r8,8(r3)
0x4027599c <_dl_relocate_object+756>: add,l r12,r20,r10
0x402759a0 <_dl_relocate_object+760>: ldb c(r8),ret0
0x402759a4 <_dl_relocate_object+764>: extrw,u r21,31,8,r6
0x402759a8 <_dl_relocate_object+768>: extrw,u ret0,27,28,ret0
0x402759ac <_dl_relocate_object+772>: ldh,s r22(r17),r31
0x402759b0 <_dl_relocate_object+776>: ldw 170(r11),ret1
0x402759b4 <_dl_relocate_object+780>: cmpib,= 0,ret0,0x40275a64 <_dl_relocate_object+956>
0x402759b8 <_dl_relocate_object+784>: copy r11,r5
End of assembler dump.
(gdb) p/x $r11
$3 = 0x40000c00
(gdb) p/x $r11 + 0xe4
$4 = 0x40000ce4
(gdb) x/x 0x40000ce4
0x40000ce4: 0x4071e828
(gdb) x/x 0x4071e828 + 4
0x4071e82c <.LC2+200>: 0xa38cf763
(gdb) p/x $r22
$6 = 0x1
May 8 22:39:41 mx3210 kernel: do_page_fault() pid=12287 command='rm' type=15 address=0xa38cf765
May 8 22:39:41 mx3210 kernel: vm_start = 0x40724000, vm_end = 0x40726000
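So, the effective address for the ldh,s (base 0xa38cf763 plus index 1 scaled
by 2 for a halfword, i.e. 0xa38cf765) matches the fault address in the tombstone.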
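For anyone checking the arithmetic, here is a minimal sketch of the effective-address computation for `ldh,s r22(r17)`. It assumes r17 holds the word gdb shows at 0x4071e82c (loaded by the `ldw 4(ret0),r17` earlier in the dump) and that the `,s` completer scales the index by the operand size, 2 bytes for a halfword. The function name is made up for illustration.

```python
def ldh_s_effective_address(base: int, index: int) -> int:
    """Effective address of a PA-RISC `ldh,s index(base)`.

    The ,s completer shifts the index left by 1 (halfword = 2 bytes)
    before adding it to the base; the result wraps at 32 bits.
    """
    return (base + (index << 1)) & 0xFFFFFFFF


# Values taken from the gdb session; r17 is an assumption (see lead-in).
r17 = 0xA38CF763  # word found at 0x4071e82c in the core dump
r22 = 0x1         # index register as printed by gdb

fault_address = ldh_s_effective_address(r17, r22)
print(hex(fault_address))
```

The result can be compared directly against the `address=` field in the do_page_fault() line.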
...
002ae000-005f5000 rwxp 002ae000 00:00 0 [heap]
40000000-4000c000 rw-p 40000000 00:00 0
4000c000-40011000 r-xp 00000000 08:03 640603 /lib/libthread_db-1.0.so
...
4066d000-40670000 rwxp 0007a000 08:03 641636 /lib/libm-2.9.so
40670000-4073d000 rw-p 4028d000 00:00 0
...
40b26000-410fe000 rw-p 40b26000 00:00 0
c0215000-c022a000 rwxp c0215000 00:00 0 [stack]
I don't see anything in /proc/maps that matches the vm range in the tombstone.
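A rough sketch of how that lookup could be scripted rather than done by eye. It parses maps-style lines and reports an exact range match, an enclosing mapping, or the mapping holding a single address. The excerpt is only the lines quoted in this message (the "..." elisions are skipped), so a miss here says nothing about the lines that were cut. All helper names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    name: str


def parse_maps(text: str) -> List[Mapping]:
    """Parse /proc/<pid>/maps style lines, skipping anything else."""
    mappings = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or "-" not in fields[0]:
            continue  # "..." elisions, blank lines
        start_s, end_s = fields[0].split("-")
        name = fields[5] if len(fields) > 5 else ""
        mappings.append(Mapping(int(start_s, 16), int(end_s, 16), fields[1], name))
    return mappings


def exact_match(maps: List[Mapping], start: int, end: int) -> Optional[Mapping]:
    """Mapping whose bounds equal [start, end), if any."""
    return next((m for m in maps if m.start == start and m.end == end), None)


def enclosing(maps: List[Mapping], start: int, end: int) -> Optional[Mapping]:
    """Mapping that fully contains [start, end), if any."""
    return next((m for m in maps if m.start <= start and end <= m.end), None)


def find_address(maps: List[Mapping], addr: int) -> Optional[Mapping]:
    """Mapping that contains a single address, if any."""
    return next((m for m in maps if m.start <= addr < m.end), None)


MAPS_EXCERPT = """\
002ae000-005f5000 rwxp 002ae000 00:00 0 [heap]
40000000-4000c000 rw-p 40000000 00:00 0
4000c000-40011000 r-xp 00000000 08:03 640603 /lib/libthread_db-1.0.so
...
4066d000-40670000 rwxp 0007a000 08:03 641636 /lib/libm-2.9.so
40670000-4073d000 rw-p 4028d000 00:00 0
...
40b26000-410fe000 rw-p 40b26000 00:00 0
c0215000-c022a000 rwxp c0215000 00:00 0 [stack]
"""

maps = parse_maps(MAPS_EXCERPT)
vm_start, vm_end = 0x40724000, 0x40726000  # from the tombstone
print("exact:    ", exact_match(maps, vm_start, vm_end))
print("enclosing:", enclosing(maps, vm_start, vm_end))
print("fault addr:", find_address(maps, 0xA38CF765))
```

Exact and enclosing matches are reported separately because a range sitting inside a larger anonymous region is not the same thing as a vma with those bounds.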
Comparing memory for the core dump with a normal start to main:
Core dump
(gdb) x/64x 0x4071e800
0x4071e800 <.LC2+156>: 0x6ffffffc 0x000129a8 0x6ffffffd 0x00000013
0x4071e810 <.LC2+172>: 0x0000001e 0x00000014 0x6ffffffe 0x00012c3c
0x4071e820 <.LC2+188>: 0x6fffffff 0x00000001 0x6ffffff0 0xa38cf763
0x4071e830 <.LC2+204>: 0x6ffffff9 0x00000d5b 0x00000000 0x00000000
0x4071e840 <.LC2+220>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e850 <.LC2+236>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e860 <__libc_multiple_libcs>: 0x00000001 0x00000000 0x00000000 0x00000000
0x4071e870 <__gconv_lock>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e880 <__gconv_lock+16>: 0x00000001 0x00000001 0x00000001 0x00000001
0x4071e890 <__gconv_lock+32>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e8a0 <lock.11041>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e8b0 <lock.11041+16>: 0x00000001 0x00000001 0x00000001 0x00000001
0x4071e8c0 <lock.11041+32>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e8d0 <lock>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e8e0 <lock+16>: 0x00000001 0x00000001 0x00000001 0x00000001
0x4071e8f0 <lock+32>: 0x00000000 0x00000000 0x00000000 0x00000000
Normal:
(gdb) x/64x 0x4071e800
0x4071e800 <.LC2+156>: 0x6ffffffc 0x000129a8 0x6ffffffd 0x00000013
0x4071e810 <.LC2+172>: 0x0000001e 0x00000014 0x6ffffffe 0x00012c3c
0x4071e820 <.LC2+188>: 0x6fffffff 0x00000001 0x6ffffff0 0x405ea8cc
0x4071e830 <.LC2+204>: 0x6ffffff9 0x00000d5b 0x00000000 0x00000000
0x4071e840 <.LC2+220>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e850 <.LC2+236>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e860 <__libc_multiple_libcs>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e870 <__gconv_lock>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e880 <__gconv_lock+16>: 0x00000001 0x00000001 0x00000001 0x00000001
0x4071e890 <__gconv_lock+32>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e8a0 <lock.11041>: 0x00000000 0x00000000 0x00000000 0x00000000
0x4071e8b0 <lock.11041+16>: 0x00000001 0x00000001 0x00000001
It sure looks as if memory has been stomped, specifically the word that
caused the segv. The surrounding values are the same.
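The comparison can also be done mechanically. A small sketch, assuming gdb `x/x`-style output with a `<symbol>:` label on each row; only three rows around the faulting word are reproduced here, and the helper names are invented for the example.

```python
from typing import Dict, List, Tuple


def parse_dump(text: str) -> Dict[int, int]:
    """Map each word's address to its value from gdb x/x output rows."""
    words = {}
    for line in text.splitlines():
        head, sep, tail = line.partition(">:")
        if not sep:
            continue  # not a dump row
        base = int(head.split()[0], 16)
        for i, word in enumerate(tail.split()):
            words[base + 4 * i] = int(word, 16)
    return words


def diff_words(a: Dict[int, int], b: Dict[int, int]) -> List[Tuple[int, int, int]]:
    """(address, value_in_a, value_in_b) for every shared address that differs."""
    return [(addr, a[addr], b[addr])
            for addr in sorted(a.keys() & b.keys())
            if a[addr] != b[addr]]


CORE = """\
0x4071e810 <.LC2+172>: 0x0000001e 0x00000014 0x6ffffffe 0x00012c3c
0x4071e820 <.LC2+188>: 0x6fffffff 0x00000001 0x6ffffff0 0xa38cf763
0x4071e830 <.LC2+204>: 0x6ffffff9 0x00000d5b 0x00000000 0x00000000
"""

NORMAL = """\
0x4071e810 <.LC2+172>: 0x0000001e 0x00000014 0x6ffffffe 0x00012c3c
0x4071e820 <.LC2+188>: 0x6fffffff 0x00000001 0x6ffffff0 0x405ea8cc
0x4071e830 <.LC2+204>: 0x6ffffff9 0x00000d5b 0x00000000 0x00000000
"""

for addr, core_val, normal_val in diff_words(parse_dump(CORE), parse_dump(NORMAL)):
    print(f"{addr:#010x}: core={core_val:#010x} normal={normal_val:#010x}")
```

Feeding it the full x/64x output from both sessions would list every differing word, not just the one in these rows.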
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)