
Re: apt-get crashed 2.6.10



Justin Pryzby wrote:

On Thu, Feb 24, 2005 at 10:39:02AM +0100, yves geunes wrote:
yves geunes wrote:
Justin Pryzby wrote:
On Thu, Feb 24, 2005 at 12:30:02AM +0100, yves geunes wrote:

Hi,
I installed sarge 2.4.27-1-386 and downloaded 2.6.10. I compiled it for 686 and RAID and installed it. Whenever I run apt-get (or dselect) under the 2.6.10 kernel, the system crashes and everything dies.
When I reboot into 2.4.27, apt works normally.

Does anybody have a hint? Can I screw up the kernel so that apt crashes the system?

Is it comparable to bug #296274?

No, I think it is a new one
Okay; debian tools causing kernel crashes seemed potentially related.
Oops: 0002 [#1]
PREEMPT SMP
Modules linked in: md5 ipv6 af_packet eepro100 hw_random shpchp pci_hotplug intel_agp agpgart evdev ehci_hcd usbcore e100 mii e1000 dm_mod raid1 ide_cd cdrom rtc unix ext3 jbd ide_generic via82cxxx trm290 triflex slc90e66 sis5513 siimage serverworks sc1200 rz1000 piix pdc202xx_old pdc202xx_new opti621 ns87415 hpt366 ide_disk hpt34x generic cy82c693 cs5530 cmd64x atiixp amd74xx alim15x3 aec62xx ide_core sd_mod
CPU:    1
EIP:    0060:[<f890610c>]    Not tainted VLI
EFLAGS: 00010202   (2.6.10)
EIP is at ext3_block_truncate_page+0x12c/0x330 [ext3]
eax: 00000000   ebx: 00000000   ecx: 000002d0   edx: 00000b40
esi: 00000000   edi: fa7c14c0   ebp: c174f820   esp: f559bddc
ds: 007b   es: 007b   ss: 0068
Process dpkg (pid: 4709, threadinfo=f559a000 task=f78d3520)
Stack: f587d6a0 00001000 000000d2 00000000 00000000 f14314f4 00000b40 000004c0
       f1bf0ddc 00000001 00000000 f587d6a0 00000000 f8906a7d f587d6a0 c174f820
       f1431598 000004c0 00000000 f1431598 c174f820 f1431598 00000400 f1431430
Call Trace:
[<f8906a7d>] ext3_truncate+0x14d/0x5e0 [ext3]
[<f8906930>] ext3_truncate+0x0/0x5e0 [ext3]
[<c014858d>] vmtruncate+0xbd/0x150
[<c0173186>] inode_setattr+0x176/0x190
[<f8907bae>] ext3_setattr+0x13e/0x290 [ext3]
[<c01733b5>] notify_change+0x1b5/0x1f0
[<c0155dec>] do_truncate+0x6c/0xb0
[<c0158dd9>] fget+0x49/0x60
[<c015648c>] sys_ftruncate64+0xcc/0x130
[<c010318f>] syscall_call+0x7/0xb

I recompiled the kernel WITHOUT SMP. Afterwards I had to run dpkg --configure -a.

I tried dselect afterwards and it ran fine. I'll test it further this afternoon.

Some strange things:
- The crash only happened when using apt. I recompiled the kernel while running the 'faulty' kernel, and it compiled flawlessly. What could be the relation between apt and the crash? I haven't seen anything else crash.
Without knowing much kernel stuff, it looks like a concurrency problem
in truncate().
Ehbe.... I'll have to look that up to see what you mean
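
To make "a concurrency problem in truncate()" a bit more concrete: the trace shows dpkg inside sys_ftruncate64 when the oops hit, so a workload that issues many truncates at the same time is the kind of thing that would exercise that path on an SMP kernel. Below is a rough sketch of such a workload, not a confirmed reproducer for this oops. The function names, the file size, and the `base_dir` parameter are all made up for illustration; `base_dir` would need to point at a directory on the ext3 filesystem in question.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical target length: one full 4096-byte block plus a partial block,
# so every truncate ends in the middle of a block.
TARGET_SIZE = 4096 + 1216


def hammer_truncate(path, rounds=200, size=TARGET_SIZE):
    """Repeatedly grow a file, then truncate it to a non-block-aligned length."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for _ in range(rounds):
            os.pwrite(fd, b"x" * (2 * size), 0)  # grow past the target length
            os.ftruncate(fd, size)               # cut back to mid-block
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


def run(workers=4, rounds=200, base_dir=None):
    """Run several truncate loops at once, each on its own scratch file."""
    with tempfile.TemporaryDirectory(dir=base_dir) as tmp:
        paths = [os.path.join(tmp, "f%d" % i) for i in range(workers)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(lambda p: hammer_truncate(p, rounds), paths))


if __name__ == "__main__":
    # Each entry is the final size of one scratch file.
    print(run())
```

If the kernel is healthy, every worker simply reports the target size. On a kernel with a race in the truncate path, a loop like this might provoke the fault sooner than waiting for apt to do it, but that is an assumption and not something verified here.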

- Although I only have 1 CPU, under SMP I had 2 CPUs. Maybe it's my lack of knowledge.
Are you using an Intel Pentium 4 with "Hyperthreading Technology"?
That's what HT does: it gives you two virtual CPUs.
Yes I am.
So what's the big advantage here? Should I try to use SMP to gain performance? (The box is already spitting fast!)

Is there something like an endurance test that really tests system stability?
Compiling the kernel should be a pretty good test.
I did that, but nothing showed up. I've been recompiling kernels all day, but the system is still alive.

If you want a longer one, compile gcc:)

Seriously though, there's "crashme", which I don't think was intended
for quite this purpose, but for stress testing of another kind.
And memtest (several of them, in fact), bonnie, cpuburn, and I think
another important tool whose name escapes me..

You said the problem only occurs while using apt.  Is apt downloading
files, or forking dpkg, or ... ?
It happened right after downloading, while reading the database or installing packages. I first thought of a network problem, but I ruled that one out (different controller, different route, different address, you name it, I tried it).

Justin

Thanks for your thoughts,
yves


