
Bug#656196: [2.6.39 -> 3.0 regression] kernel stalls every few days (rcu_sched_state detected stall on CPU x)



Michael Below wrote:

> Same problem, see attached syslog.

Thanks.

[...]
> spamd[10981]: prefork: sysread(8) failed after 300 secs at /usr/share/perl5/Mail/SpamAssassin/SpamdForkScaling.pm line 654.
[...]
> INFO: rcu_sched detected stall on CPU 3 (t=74991 jiffies)

Grasping at straws: if you boot with idle=mwait appended to the kernel
command line, does that help?
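
To confirm after the reboot that the option actually reached the kernel, something like the following could be used. It is only a minimal sketch: it reads /proc/cmdline, which holds the running kernel's command line, and looks for idle=mwait as a whole token. The helper names and the sample boot line in the comment are made up for illustration and are not taken from the report.

```python
#!/usr/bin/env python3
"""Check whether the running kernel was booted with a given parameter."""


def has_kernel_param(cmdline: str, name: str, value: str) -> bool:
    """Return True if `name=value` appears as a whole token in cmdline.

    Tokens are split on whitespace, so "idle=mwait" does not match
    a different token such as "idle=mwaitx".
    """
    return f"{name}={value}" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path, encoding="ascii", errors="replace") as f:
        return f.read().strip()


if __name__ == "__main__":
    # Hypothetical example of what /proc/cmdline might contain:
    #   BOOT_IMAGE=/vmlinuz-3.2.0-1-amd64 root=/dev/sda1 ro idle=mwait quiet
    cmdline = read_cmdline()
    if has_kernel_param(cmdline, "idle", "mwait"):
        print("idle=mwait is active")
    else:
        print("idle=mwait not found in:", cmdline)
```

If the parameter is missing, the bootloader configuration probably was not regenerated after editing it.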

Puzzled,
Jonathan

> sending NMI to all CPUs:
> NMI backtrace for cpu 3
> CPU 3 
> Modules linked in: powernow_k8 mperf cpufreq_conservative cpufreq_userspace cpufreq_powersave cpufreq_stats parport_pc ppdev lp parport binfmt_misc fuse smsc47b397 loop dm_crypt tpm_infineon arc4 snd_hda_codec_analog radeon ttm drm_kms_helper drm i2c_algo_bit power_supply rt73usb crc_itu_t snd_hda_intel snd_hda_codec rt2x00usb rt2x00lib snd_hwdep shpchp pci_hotplug snd_pcm_oss snd_mixer_oss snd_pcm mac80211 cfg80211 snd_seq_midi joydev snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc processor k10temp usbhid hid sp5100_tco usb_storage edac_core uas edac_mce_amd hp_wmi i2c_piix4 evdev i2c_core sparse_keymap pcspkr rfkill psmouse serio_raw button wmi thermal_sys tpm_tis tpm tpm_bios ext3 jbd mbcache dm_mod sr_mod sd_mod cdrom crc_t10dif ohci_hcd tg3 libphy floppy ahci libahci libata ehci_hcd scsi_mod usbcore usb_common [last unloaded: scsi_wait_scan]
>
> Pid: 0, comm: swapper/3 Not tainted 3.2.0-1-amd64 #1 Hewlett-Packard HP Compaq dc5850 Microtower/3029h
> RIP: 0010:[<ffffffff811afc59>]  [<ffffffff811afc59>] __const_udelay+0x17/0x20
[...]
> Call Trace:
>  <IRQ> 
>  [<ffffffff810248c1>] ? arch_trigger_all_cpu_backtrace+0x6c/0x7b
>  [<ffffffff810953f8>] ? __rcu_pending+0x82/0x337
>  [<ffffffff81011885>] ? arch_local_irq_save+0x5/0x13
>  [<ffffffff8106ba90>] ? tick_nohz_handler+0xd0/0xd0
>  [<ffffffff810959da>] ? rcu_check_callbacks+0x90/0xcc
>  [<ffffffff810526f7>] ? update_process_times+0x31/0x63
>  [<ffffffff8106bafa>] ? tick_sched_timer+0x6a/0x90
>  [<ffffffff81061e0e>] ? __run_hrtimer+0xac/0x135
>  [<ffffffff8106253e>] ? hrtimer_interrupt+0xdb/0x195
>  [<ffffffff8106af1b>] ? tick_do_broadcast.constprop.4+0x3f/0x85
[...]
> NMI backtrace for cpu 0
[...]
> Pid: 0, comm: swapper/0 Not tainted 3.2.0-1-amd64 #1 Hewlett-Packard HP Compaq dc5850 Microtower/3029h
> RIP: 0010:[<ffffffff8102b2c4>]  [<ffffffff8102b2c4>] native_safe_halt+0x2/0x3
[...]
> Call Trace:
>  [<ffffffff8101448c>] ? default_idle+0x47/0x7f
>  [<ffffffff81014583>] ? amd_e400_idle+0xbf/0xe4
>  [<ffffffff8100d25f>] ? cpu_idle+0xaf/0xf2
>  [<ffffffff816a9b3d>] ? start_kernel+0x3bd/0x3c8
>  [<ffffffff816a9140>] ? early_idt_handlers+0x140/0x140
>  [<ffffffff816a93c4>] ? x86_64_start_kernel+0x104/0x111
> Code: 89 e1 89 ee 48 c7 c7 6f 60 4c 81 31 c0 e8 16 ef 30 00 48 83 c4 18 89 d8 5b 5d 41 5c 41 5d c3 9c 58 c3 57 9d c3 fa c3 fb c3 fb f4 <c3> f4 c3 66 66 66 90 66 66 90 c3 66 66 66 90 66 66 90 c3 0f 06 
[...]
> NMI backtrace for cpu 2
[...]
> Pid: 0, comm: swapper/2 Not tainted 3.2.0-1-amd64 #1 Hewlett-Packard HP Compaq dc5850 Microtower/3029h
> RIP: 0010:[<ffffffff8102b2c4>]  [<ffffffff8102b2c4>] native_safe_halt+0x2/0x3
[...]
> Call Trace:
>  [<ffffffff8101448c>] ? default_idle+0x47/0x7f
>  [<ffffffff81014583>] ? amd_e400_idle+0xbf/0xe4
>  [<ffffffff8100d25f>] ? cpu_idle+0xaf/0xf2
>  [<ffffffff81332cda>] ? start_secondary+0x1d5/0x1db
> Code: 89 e1 89 ee 48 c7 c7 6f 60 4c 81 31 c0 e8 16 ef 30 00 48 83 c4 18 89 d8 5b 5d 41 5c 41 5d c3 9c 58 c3 57 9d c3 fa c3 fb c3 fb f4 <c3> f4 c3 66 66 66 90 66 66 90 c3 66 66 66 90 66 66 90 c3 0f 06 


