
Re: server hangs :/



On Tue, Jul 05, 2005 at 12:50:15PM +0200, Wojciech Babicz wrote:
> I have a new Gigabyte GA-K8NSC-939 with amd64 3000+ and Kingston 2GB
> RAM (dual channel)
> 
> It is meant to be a router for a network with NAT (and HTB), with bandwidth of
> about 8-10 Mbps.
> 
> I have my own kernel 2.6.12 with patch-o-matic.
> 
> This router hangs after 1.5 hours, sometimes after 2.5 hours, and sometimes
> after 10 hours....
> 
> In the logs there is:
> 
> Jul  4 20:48:46 router kernel: Call Trace:<ffffffff8034953a>{schedule+122} <ffffffff8010b820>{default_idle+0}
> Jul  4 20:48:46 router kernel:        <ffffffff8010b972>{cpu_idle+82} <ffffffff804b280e>{start_kernel+350}
> Jul  4 20:48:46 router kernel:        <4>warning: many lost ticks.
> Jul  4 20:48:46 router kernel: Your time source seems to be instable or some driver is hogging interupts

That reports the kernel losing timer ticks.  That is not good.

> Jul  4 20:48:46 router kernel: rip release_console_sem+0x3e/0xb0
> Jul  4 20:48:46 router kernel: <ffffffff804b226a>{x86_64_start_kernel+362}
> Jul  4 20:48:46 router kernel: scheduling while atomic: swapper/0x00000100/0
> Jul  4 20:48:46 router kernel:
> Jul  4 20:48:46 router kernel: Call Trace:<ffffffff8034953a>{schedule+122} <ffffffff8010b820>{default_idle+0}
> Jul  4 20:48:46 router kernel:        <ffffffff8010b972>{cpu_idle+82} <ffffffff804b280e>{start_kernel+350}
> Jul  4 20:48:46 router kernel:        <ffffffff804b226a>{x86_64_start_kernel+362}
> Jul  4 20:48:46 router kernel: scheduling while atomic: swapper/0x00000100/0
> 
> [...]
> Jul  4 22:20:45 router kernel: scheduling while atomic: swapper/0x00000100/0
> Jul  4 22:20:45 router kernel:
> Jul  4 22:20:45 router kernel: Call Trace:<ffffffff8034953a>{schedule+122} <ffffffff8010b820>{default_idle+0}
> Jul  4 22:20:45 router kernel:        <ffffffff8010b972>{cpu_idle+82} <ffffffff804b280e>{start_kernel+350}
> Jul  4 22:20:45 router kernel:        <ffffffff804b226a>{x86_64_start_kernel+362}
> Jul  4 22:20:45 router shutdown[23018]: shutting down for system halt
> Jul  4 22:20:45 router kernel: scheduling while atomic: swapper/0x00000100/0
> Jul  4 22:20:45 router kernel:
> Jul  4 22:20:45 router kernel: Call Trace:<ffffffff8034953a>{schedule+122} <ffffffff8010b820>{default_idle+0}
> Jul  4 22:20:45 router kernel:        <ffffffff8010b972>{cpu_idle+82} <ffffffff804b280e>{start_kernel+350}
> Jul  4 22:20:45 router kernel:        <ffffffff804b226a>{x86_64_start_kernel+362}
> Jul  4 22:20:45 router kernel: scheduling while atomic: swapper/0x00000100/0
> Jul  4 22:20:45 router kernel:
> Jul  4 22:20:45 router kernel: Call Trace:<ffffffff8034953a>{schedule+122} <ffffffff8010b820>{default_idle+0}
> Jul  4 22:20:45 router kernel:        <ffffffff8010b972>{cpu_idle+82} <ffffffff804b280e>{start_kernel+350}
> Jul  4 22:20:45 router kernel:        <ffffffff804b226a>{x86_64_start_kernel+362}
> Jul  4 22:20:45 router kernel: scheduling while atomic: swapper/0x00000100/0
> Jul  4 22:20:45 router kernel:
> Jul  4 22:20:45 router kernel: Call Trace:<ffffffff8034953a>{schedule+122} <ffffffff8010b820>{default_idle+0}
> Jul  4 22:20:45 router kernel:        <ffffffff8010b972>{cpu_idle+82} <ffffffff804b280e>{start_kernel+350}
> Jul  4 22:20:45 router kernel:        <ffffffff804b226a>{x86_64_start_kernel+362}
> Jul  4 22:20:45 router kernel: scheduling while atomic: swapper/0x00000100/0
> Jul  4 22:20:45 router kernel:
> 
> 
> [...]
> 
> and when the server starts there is (IMHO the only suspicious entry):
> 
> Jul  4 22:38:41 router kernel: PCI: Probing PCI hardware (bus 00)
> Jul  4 22:38:41 router kernel: Boot video device is 0000:02:06.0
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
> Jul  4 22:38:41 router kernel: ACPI: Power Resource [ISAV] (on)
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGPB._PRT]
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 6 7 9 10 11 *12 14 15)
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LNK3] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LUBA] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LAPU] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LFIR] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [L3CM] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LSID] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [LFID] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APC1] (IRQs *16), disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APC2] (IRQs *17), disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APC3] (IRQs *18), disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APC4] (IRQs *19), disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APC5] (IRQs *16), disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APCI] (IRQs 20 21 22) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22) *0, disabled.
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APCS] (IRQs *23), disabled.
> 
> 
> and other:
> Jul  4 22:38:41 router kernel: PCI: Using ACPI for IRQ routing
> Jul  4 22:38:41 router kernel: PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
> 
> and other:
> Jul  4 22:38:41 router kernel: 8139too Fast Ethernet driver 0.9.27
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
> Jul  4 22:38:41 router kernel: ACPI: PCI Interrupt 0000:02:0a.0[A] -> Link [APC3] -> GSI 18 (level, low) -> IRQ 18
> Jul  4 22:38:41 router kernel: eth1: RealTek RTL8139 at 0xffffc20000058000, 00:30:4f:3b:25:06, IRQ 18
> 
> What could be the reason for the server problems?

Was this machine ever stable in the past or is it a new install on a new
machine?

One possibility is a bug in the BIOS (or even the hardware) making the timer
interrupt unreliable.  It could also be a bug in the kernel you are trying
to use.

If it is a BIOS bug, perhaps turning off ACPI or the APIC will help.  There
are kernel command-line options to do that.
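
As a rough sketch of what that could look like: acpi=off and noapic are
standard kernel boot parameters, but the boot loader (GRUB legacy assumed
here), the titles, the kernel path and the root device below are only
placeholders for whatever your system actually uses.  Try one option at a
time so you can tell which one makes a difference.

```
# /boot/grub/menu.lst (assumed boot loader; paths and devices are placeholders)
title  Linux 2.6.12 (ACPI disabled)
root   (hd0,0)
kernel /boot/vmlinuz-2.6.12 root=/dev/hda1 ro acpi=off

title  Linux 2.6.12 (IO-APIC disabled)
root   (hd0,0)
kernel /boot/vmlinuz-2.6.12 root=/dev/hda1 ro noapic
```

After booting, /proc/cmdline shows which options the running kernel was
actually started with.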

Does it happen with older kernels too?  Does it happen without the
patch-o-matic stuff applied?
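
If you do test several kernels, it may help to compare how often the two
warnings from your log show up under each one.  Below is a minimal sketch
in Python, assuming the messages end up in an ordinary syslog text file;
the function name count_warnings and the log path in the usage note are
made up for illustration.

```python
from collections import Counter

# Substrings taken from the log excerpt quoted earlier in this mail.
PATTERNS = ("many lost ticks", "scheduling while atomic")


def count_warnings(lines):
    """Return a Counter of how many lines contain each pattern."""
    counts = Counter()
    for line in lines:
        for pattern in PATTERNS:
            if pattern in line:
                counts[pattern] += 1
    return counts
```

Usage would be something like count_warnings(open("/var/log/kern.log")),
with the path adjusted to wherever your syslog writes kernel messages.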

Len Sorensen


