Re: dpkg-1.1.1elf dumps core
(This was originally private email, but it's so puzzling I'd like some
of the other people from debian-devel to take a look and see if they
can spot anything I can't. I hope Chris doesn't mind.)
Christopher J. Fearnley writes ("dpkg-1.1.1elf dumps core"):
> I forget which file I was trying to install (gpm?). I did some other
> dpkg maintenance and then, eventually I was able to install this and
> the other packages that dpkg bombed on. I was using simple dpkg -i
> pkg-name commands. Here is the core dump:
Thanks for sending me that (but, btw, my address for dpkg maintenance
is this one, or the one in the Maintainer field, not ijackson@chiark).
I'm very puzzled by this coredump. I've examined it in detail and
AFAICT[1] it looks like a hardware problem or some such !
The state of the registers at the time of the coredump does not appear
to be consistent with the state of the memory and the code immediately
preceding the program counter. Look:
(gdb) disassemble 0x800ecb5 0x800ecd3
Dump of assembler code from 0x800ecb5 to 0x800ecd3:
0x800ecb5 <parsedb+2837>: movl 0xffffff8c(%ebp),%eax
0x800ecb8 <parsedb+2840>: testl %eax,%eax
0x800ecba <parsedb+2842>: je 0x800ecd9 <parsedb+2873>
0x800ecbc <parsedb+2844>: cmpb $0x0,(%eax)
0x800ecbf <parsedb+2847>: je 0x800ecd9 <parsedb+2873>
0x800ecc1 <parsedb+2849>: testb $0x4,0xc(%ebp)
0x800ecc5 <parsedb+2853>: je 0x800ecd3 <parsedb+2867>
0x800ecc7 <parsedb+2855>: movl 0x1c(%edx),%eax
0x800ecca <parsedb+2858>: testl %eax,%eax
0x800eccc <parsedb+2860>: je 0x800ecd3 <parsedb+2867>
0x800ecce <parsedb+2862>: cmpb $0x0,(%eax)
0x800ecd1 <parsedb+2865>: jne 0x800ecd9 <parsedb+2873>
End of assembler dump.
(gdb) print $pc
$34 = (void *) 0x800ecce
(gdb) p/x $eax
$35 = 0x2000
(gdb) p/x $edx+0x1c
$38 = 0x80395b8
(gdb) p/x *(void**)((char*)$edx+0x1c)
$36 = 0x0
(gdb) print *(void**)0x80395b8
$39 = (void *) 0x0
(gdb)
[ There's a full register dump below. ]
As you can see, pc is set to 0x800ecce, the address of the final cmpb,
and indeed eax contains 0x2000, an invalid address, so it's not
surprising that the program should dump core then. However, three
instructions previously eax was loaded from edx+0x1c, which contains
zero in the core image, according to gdb at least !
If we look at it at a higher level, according to the debugging symbols
and so forth, the program stopped at line 257, where the code looks
like this:
if (newpig.section && *newpig.section &&
!((flags & pdb_weakclassification) && pigp->section && *pigp->section))
pigp->section= newpig.section;
This looks quite consistent with the disassembly, and if I print *pigp
I see that ->section is indeed null:
(gdb) print *pigp
$40 = {next = 0x0, name = 0x8039590 "xtron", want = want_unknown,
eflag = eflagv_ok, status = stat_notinstalled, priority = pri_unknown,
otherpriority = 0x0, section = 0x0, configversion = 0x0,
configrevision = 0x0, files = 0x0, installed = {valid = 1, depends = 0x0,
depended = 0x0, essential = 0, description = 0x0, maintainer = 0x0,
version = 0x0, revision = 0x0, source = 0x0, architecture = 0x0,
conffiles = 0x0, arbs = 0x0}, available = {valid = 0, depends = 0x0,
depended = 0x0, essential = 0, description = 0x0, maintainer = 0x0,
version = 0x0, revision = 0x0, source = 0x0, architecture = 0x0,
conffiles = 0x0, arbs = 0x0}, clientdata = 0x0}
(gdb)
So, I conclude that there is something wrong with either the
transmission of the core file or the hardware in your system.
Here's the MD5 of the core file:
05c5d6099ee83bc615fbb2742923da7e core
Could you please check the MD5 of your copy (I hope you still have
it) ?
Thanks,
Ian.
(gdb) info r
eax 0x2000 8192
ecx 0x8039600 134452736
edx 0x803959c 134452636
ebx 0xa 10
esp 0xbffffc34 0xbffffc34
ebp 0xbffffd68 0xbffffd68
esi 0xbffffc6c -1073742740
edi 0x80395c8 134452680
eip 0x800ecce 0x800ecce
ps 0x10206 66054
cs 0x100023 1048611
ss 0x9002b 589867
ds 0xffff002b -65493
es 0x803002b 134414379
fs 0x2b 43
gs 0x2b 43
(gdb)