
Bug#645771: marked as done (soft lockup - CPU#0 stuck for 235s! [md1_raid1:1038] )



Your message dated Sun, 07 Feb 2016 22:43:08 +0000
with message-id <E1aSY2u-0006H4-Q9@deadeye>
and subject line Closing bugs assigned to linux-2.6 package
has caused the Debian Bug report #645771,
regarding soft lockup - CPU#0 stuck for 235s! [md1_raid1:1038] 
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
645771: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=645771
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---

Package: linux-image
Version: 2.6.26-2-amd64
Severity: important

Hardware: A machine with a pair of 2.5” SATA disks, configured with three RAID1 devices (/dev/md1, /dev/md2, /dev/md3), reports a soft lockup as shown below.

The system is running the Debian 2.6.26-2-amd64 kernel with libc6 2.7. It has a custom card (requiring the mfb and memmgmnt drivers as shown in the log). This card shouldn’t affect the disks in any way.

The system contains a hardware watchdog, and the long soft lockup caused the watchdog to expire, resetting the system. On reboot, we couldn’t detect any signs of damage on the disks, so our diagnosis is a software lockup in md1_raid1 that took the box down through a watchdog expiry (i.e. this problem is critical for the system in question).

2011-07-03T00:57:02-07:00 testhost kernel: [2500677.138796] md: delaying data-check of md1 until md0 has finished (they share one or more physical units)

2011-07-03T00:57:02-07:00 testhost kernel: [2500677.142796] md: delaying data-check of md2 until md1 has finished (they share one or more physical units)

2011-07-03T00:57:02-07:00 testhost kernel: [2500677.142796] md: delaying data-check of md1 until md0 has finished (they share one or more physical units)

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] BUG: soft lockup - CPU#0 stuck for 235s! [md1_raid1:1038] <=========

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] Modules linked in: mfb_driver(P) memmgmnt_driver(P) acpi_cpufreq cpufreq_ondemand cpufreq_conservative cpufreq_stats freq_table cpufreq_powersave cpufreq_userspace ipv6 xt_limit xt_tcpudp xt_state ipt_LOG nf_conntrack_ftp iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables coretemp dme1737 lm85 hwmon_vid i2c_i801 serio_raw i2c_core container pcspkr psmouse snd_pcm snd_timer snd soundcore snd_page_alloc button joydev evdev ext3 jbd mbcache raid1 md_mod sd_mod usbhid hid ff_memless ahci libata scsi_mod dock e1000e ehci_hcd uhci_hcd thermal processor fan thermal_sys [last unloaded: mfb_driver]

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] CPU 0:

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] Modules linked in: mfb_driver(P) memmgmnt_driver(P) acpi_cpufreq cpufreq_ondemand cpufreq_conservative cpufreq_stats freq_table cpufreq_powersave cpufreq_userspace ipv6 xt_limit xt_tcpudp xt_state ipt_LOG nf_conntrack_ftp iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables coretemp dme1737 lm85 hwmon_vid i2c_i801 serio_raw i2c_core container pcspkr psmouse snd_pcm snd_timer snd soundcore snd_page_alloc button joydev evdev ext3 jbd mbcache raid1 md_mod sd_mod usbhid hid ff_memless ahci libata scsi_mod dock e1000e ehci_hcd uhci_hcd thermal processor fan thermal_sys [last unloaded: mfb_driver]

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] Pid: 1038, comm: md1_raid1 Tainted: P          2.6.26-2-amd64 #1

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] RIP: 0010:[<ffffffffa005bb7a>]  [<ffffffffa005bb7a>] :scsi_mod:scsi_request_fn+0x2ad/0x395

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] RSP: 0018:ffff81020bcbfb18  EFLAGS: 00000202

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] RAX: ffff81020bc8c050 RBX: ffff81020c9f1800 RCX: ffff81020c9f1848

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] RDX: ffff81020c9f1848 RSI: ffff81020bc023f8 RDI: ffff81020bc8c050

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] RBP: ffff81020c9f1800 R08: 0000000000000000 R09: ffff8100379f4ac0

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] R10: ffff81020bc020d8 R11: ffff81020bc020b8 R12: 00001d4c0bc020b8

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] R13: ffff81020bc020b8 R14: ffff81020bc023f8 R15: ffff81020bc023f8

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] FS:  0000000000000000(0000) GS:ffffffff8053c000(0000) knlGS:0000000000000000

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] CR2: 0000000001a8bfc0 CR3: 000000020c9a8000 CR4: 00000000000006e0

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841]

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] Call Trace:

2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841]  [<ffffffffa005bb7a>] ? :scsi_mod:scsi_request_fn+0x2ad/0x395

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8030b8d6>] ? elv_insert+0xf2/0x220

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8030e179>] ? __make_request+0x3af/0x3fb

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8030c9ef>] ? generic_make_request+0x2fe/0x339

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff802bddf2>] ? bio_alloc_bioset+0x89/0xd9

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8030ddc3>] ? submit_bio+0xdb/0xe2

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffffa00e31d2>] ? :md_mod:write_page+0x199/0x2a7

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffffa00e406d>] ? :md_mod:bitmap_unplug+0xae/0x18b

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8023d2f7>] ? del_timer+0x56/0x5f

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffffa00efbc5>] ? :raid1:flush_pending_writes+0x56/0x8d

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffffa00f0119>] ? :raid1:raid1d+0x6d/0xc8f

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff80428e9f>] ? thread_return+0x6b/0xac

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8023cf95>] ? lock_timer_base+0x26/0x4b

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8023d00b>] ? try_to_del_timer_sync+0x51/0x5a

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8023d020>] ? del_timer_sync+0xc/0x16

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8042916a>] ? schedule_timeout+0x92/0xad

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8023cc88>] ? process_timeout+0x0/0x5

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8042915d>] ? schedule_timeout+0x85/0xad

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffffa00e1bc1>] ? :md_mod:md_thread+0xd7/0xed

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff80246221>] ? autoremove_wake_function+0x0/0x2e

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffffa00e1aea>] ? :md_mod:md_thread+0x0/0xed

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff802460fb>] ? kthread+0x47/0x74

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff802301e9>] ? schedule_tail+0x27/0x5c

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8020cf28>] ? child_rip+0xa/0x12

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff802460b4>] ? kthread+0x0/0x74

2011-07-03T01:17:23-07:00 testhost kernel: [2502418.212841]  [<ffffffff8020cf1e>] ? child_rip+0x0/0x12
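
For anyone checking their own logs for the same symptom, a minimal Python sketch (not part of the original report) that picks soft lockup lines out of syslog text might look like the following. The function name `find_soft_lockups` and the regular expression are assumptions based only on the line format seen in the log above; other kernel versions may word the message differently.

```python
import re

# Assumed pattern, derived from the "BUG: soft lockup" line in the log above.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per soft lockup report found in an iterable of log lines."""
    reports = []
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            reports.append({
                "cpu": int(match.group("cpu")),
                "seconds": int(match.group("secs")),
                "comm": match.group("comm"),
                "pid": int(match.group("pid")),
            })
    return reports


# The soft lockup line from the log above, used as sample input.
sample = [
    "2011-07-03T01:17:22-07:00 testhost kernel: [2502418.212841] "
    "BUG: soft lockup - CPU#0 stuck for 235s! [md1_raid1:1038]",
]
print(find_soft_lockups(sample))
```

In practice the lines would come from a log file, for example by passing an open file object instead of `sample`.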

 


--- End Message ---
--- Begin Message ---
Version: 3.4.1-1~experimental.1+rm

Debian 6.0 Long Term Support has now ended, and the 'linux-2.6' source
package will no longer be updated.  This bug is being closed on the
assumption that it does not affect the kernel versions in newer Debian
releases.

If you can still reproduce this bug in a newer release, please reopen
the bug report and reassign it to 'src:linux' and the affected version
of the package.  You can find the package version for the running
kernel by running:

    uname -v

or the versions of all installed kernel packages by running:

    dpkg -l 'linux-image-[34]*' | grep ^.i

and looking at the third column.
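
As a rough illustration of what "the third column" means, the Python sketch below applies the same filter as `grep ^.i` to `dpkg -l` output and returns each package name with its version. The helper name `installed_kernel_packages` is hypothetical, and the sample text is made up for illustration; real output will list different packages and versions.

```python
def installed_kernel_packages(dpkg_output):
    """Mimic `grep ^.i`: keep lines whose second character is 'i',
    then return (package name, version) pairs from columns two and three."""
    packages = []
    for line in dpkg_output.splitlines():
        if len(line) >= 2 and line[1] == "i":
            fields = line.split()
            if len(fields) >= 3:
                packages.append((fields[1], fields[2]))
    return packages


# Made-up sample in the general shape of `dpkg -l` output (illustration only).
sample = """\
||/ Name                       Version   Architecture Description
+++-==========================-=========-============-===========
ii  linux-image-3.2.0-4-amd64  3.2.78-1  amd64        Linux 3.2 for 64-bit PCs
rc  linux-image-3.16.0-4-amd64 3.16.7-1  amd64        Linux 3.16 for 64-bit PCs
"""
print(installed_kernel_packages(sample))
```

The `rc` line is skipped because its second status character is not `i`, which is the same reason `grep ^.i` drops it.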

I apologise that we weren't able to provide a specific resolution for
this bug.

Ben.

-- 
Ben Hutchings - Debian developer, member of Linux kernel and LTS teams

--- End Message ---
