Re: apt-get crashed 2.6.10
Justin Pryzby wrote:
On Thu, Feb 24, 2005 at 10:39:02AM +0100, yves geunes wrote:
 
yves geunes wrote:
   
Justin Pryzby wrote:
     
On Thu, Feb 24, 2005 at 12:30:02AM +0100, yves geunes wrote:
       
 
Hi,
I installed sarge 2.4.27-1-386 and downloaded 2.6.10. I compiled it 
for 686 and RAID and installed it.
Whenever I run apt-get  (or dselect) under the 2.6.10 kernel, the 
system crashes and everything dies.
When I reboot to 2.4.27 apt works normal.
Does anybody have a hint? Can I screw up the kernel so that apt 
crashes the system?
         
 
Is it comparable to bug #296274?
       
No, I think it is a new one
     
Okay; debian tools causing kernel crashes seemed potentially related.
 
 
Oops: 0002 [#1]
PREEMPT SMP Modules linked in: md5 ipv6 af_packet eepro100 hw_random 
shpchp pci_hotplug intel_agp agpgart evdev ehci_hcd usbcore e100 mii 
e1000 dm_mod raid1 ide_cd cdrom rtc unix ext3 jbd ide_generic 
via82cxxx trm290 triflex slc90e66 sis5513 siimage serverworks sc1200 
rz1000 piix pdc202xx_old pdc202xx_new opti621 ns87415 hpt366 ide_disk 
hpt34x generic cy82c693 cs5530 cmd64x atiixp amd74xx alim15x3 aec62xx 
ide_core sd_mod
CPU:    1
EIP:    0060:[<f890610c>]    Not tainted VLI
EFLAGS: 00010202   (2.6.10) EIP is at 
ext3_block_truncate_page+0x12c/0x330 [ext3]
eax: 00000000   ebx: 00000000   ecx: 000002d0   edx: 00000b40
esi: 00000000   edi: fa7c14c0   ebp: c174f820   esp: f559bddc
ds: 007b   es: 007b   ss: 0068
Process dpkg (pid: 4709, threadinfo=f559a000 task=f78d3520)
Stack: f587d6a0 00001000 000000d2 00000000 00000000 f14314f4 00000b40 
000004c0       f1bf0ddc 00000001 00000000 f587d6a0 00000000 f8906a7d 
f587d6a0 c174f820       f1431598 000004c0 00000000 f1431598 c174f820 
f1431598 00000400 f1431430 Call Trace:
[<f8906a7d>] ext3_truncate+0x14d/0x5e0 [ext3]
[<f8906930>] ext3_truncate+0x0/0x5e0 [ext3]
[<c014858d>] vmtruncate+0xbd/0x150
[<c0173186>] inode_setattr+0x176/0x190
[<f8907bae>] ext3_setattr+0x13e/0x290 [ext3]
[<c01733b5>] notify_change+0x1b5/0x1f0
[<c0155dec>] do_truncate+0x6c/0xb0
[<c0158dd9>] fget+0x49/0x60
[<c015648c>] sys_ftruncate64+0xcc/0x130
[<c010318f>] syscall_call+0x7/0xb
       
I recompiled the kernel WITHOUT SMP. I had to run         dpkg 
--configure -a
I tried dselect afterwards and it ran fine. I'll test it further this 
afternoon.
Some strange things:
-The crash only happened using apt. I recompiled the kernel while 
running the 'faulty' kernel, but it compiled flawless.  What could be 
the relation between apt and the crash. I haven't seen  another thing 
crashing.
   
Without knowing much kernel stuff, it looks like a concurrency problem
in truncate().
 
Ehbe.... I'll have to look that up to see what you mean
 
- Allthough I only have 1 CPU, under SMP I had 2 CPU's. Maybe it's my 
lack og knowledge.
   
Are you using an Intel Pentium 4 with "Hyperthreading Technology"?
Thats what HT does: gives you two virtual CPUs.
 
Yes I am.
So what's the big advantage here? Should I try to use SMP to gain 
perfoprmance? (The box is allready spitting fast!)
 
Is there something like an endurance test that really tests system 
stability?
   
Compiling the kernel should be a pretty good test.  
I did that, but nothing showed up. I've been recompiling kernels all 
day, but the system is still alive.
If you want a longer one, compile gcc:)
Seriously though, there's "crashme", which I don't think was intended
for quite this purpose, but for stress testing of another kind.
And memtest (several of them, in fact), bonnie, cpuburn, and I think
another important tool whose name escapes me..
You said the problem only occurs while using apt.  Is apt downloading
files, or forking dpkg, or ... ?
 
It happened right after downloading, whil reading the database or 
installing packages. I first thought of a network problem, but I ruled 
that one out (different controller, different route, different address, 
you  name it, I tried it.)
Justin
 
Thanks for your thoughts,
yves
Reply to: