Your message dated Sat, 08 May 2021 08:25:29 -0700 (PDT)
with message-id <6096ad69.1c69fb81.8d5a4.d3f3@mx.google.com>
and subject line Closing this bug (BTS maintenance for src:linux bugs)
has caused the Debian Bug report #774702,
regarding linux-image-3.16.0-4-amd64: Regression in topology for multi-NUMA-node Haswell Xeon CPUs
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)

-- 
774702: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=774702
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
- To: <submit@bugs.debian.org>
- Subject: linux-image-3.16.0-4-amd64: Regression in topology for multi-NUMA-node Haswell Xeon CPUs
- From: Mehdi Dogguy <mehdi@dogguy.org>
- Date: Tue, 06 Jan 2015 15:39:35 +0100
- Message-id: <be3325d6d18afcabbe24c431237b5242@dogguy.org>
Package: src:linux
Version: 3.16.7-ckt2-1
Severity: normal

Dear Maintainer,

On a machine with 2 Intel Haswell Xeon E5-2697 v3 CPUs, we are observing a
regression in how topology is detected. Using Wheezy, Linux detects 2 sockets
and outputs the following text:

====><===============
Jan 6 15:15:11 pocn001 kernel: [ 0.450629] Booting Node 0, Processors #1
Jan 6 15:15:11 pocn001 kernel: [ 0.455199] smpboot cpu 1: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 0.567069] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 0.573406] #2
Jan 6 15:15:11 pocn001 kernel: [ 0.575160] smpboot cpu 2: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 0.686818] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 0.693158] #3
Jan 6 15:15:11 pocn001 kernel: [ 0.694911] smpboot cpu 3: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 0.806473] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 0.812809] #4
Jan 6 15:15:11 pocn001 kernel: [ 0.814562] smpboot cpu 4: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 0.926220] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 0.932548] #5
Jan 6 15:15:11 pocn001 kernel: [ 0.934302] smpboot cpu 5: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 1.045959] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 1.052293] #6
Jan 6 15:15:11 pocn001 kernel: [ 1.054047] smpboot cpu 6: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 1.165709] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 1.172099] Ok.
Jan 6 15:15:11 pocn001 kernel: [ 1.174143] Booting Node 1, Processors #7
Jan 6 15:15:11 pocn001 kernel: [ 1.178712] smpboot cpu 7: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 1.289472] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 1.295830] #8
Jan 6 15:15:11 pocn001 kernel: [ 1.297584] smpboot cpu 8: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 1.409242] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 1.415599] #9
Jan 6 15:15:11 pocn001 kernel: [ 1.417354] smpboot cpu 9: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 1.529010] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 1.535350] #10
Jan 6 15:15:11 pocn001 kernel: [ 1.537201] smpboot cpu 10: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 1.648655] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 1.654984] #11
Jan 6 15:15:11 pocn001 kernel: [ 1.656835] smpboot cpu 11: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 1.768484] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 1.774815] #12
Jan 6 15:15:11 pocn001 kernel: [ 1.776667] smpboot cpu 12: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 1.888219] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 1.894552] #13
Jan 6 15:15:11 pocn001 kernel: [ 1.896403] smpboot cpu 13: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 2.008055] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 2.014445] Ok.
Jan 6 15:15:11 pocn001 kernel: [ 2.016491] Booting Node 2, Processors #14
Jan 6 15:15:11 pocn001 kernel: [ 2.021156] smpboot cpu 14: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 2.131722] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 2.138096] #15
Jan 6 15:15:11 pocn001 kernel: [ 2.139948] smpboot cpu 15: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 2.251343] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 2.257713] #16
Jan 6 15:15:11 pocn001 kernel: [ 2.259564] smpboot cpu 16: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 2.371119] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 2.377469] #17
Jan 6 15:15:11 pocn001 kernel: [ 2.379320] smpboot cpu 17: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 2.490874] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 2.497218] #18
Jan 6 15:15:11 pocn001 kernel: [ 2.499070] smpboot cpu 18: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 2.610525] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 2.616866] #19
Jan 6 15:15:11 pocn001 kernel: [ 2.618717] smpboot cpu 19: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 2.730272] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 2.736616] #20
Jan 6 15:15:11 pocn001 kernel: [ 2.738468] smpboot cpu 20: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 2.850025] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 2.856412] Ok.
Jan 6 15:15:11 pocn001 kernel: [ 2.858455] Booting Node 3, Processors #21
Jan 6 15:15:11 pocn001 kernel: [ 2.863122] smpboot cpu 21: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 2.973884] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 2.980261] #22
Jan 6 15:15:11 pocn001 kernel: [ 2.982113] smpboot cpu 22: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 3.093568] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 3.099939] #23
Jan 6 15:15:11 pocn001 kernel: [ 3.101791] smpboot cpu 23: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 3.213261] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 3.219631] #24
Jan 6 15:15:11 pocn001 kernel: [ 3.221483] smpboot cpu 24: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 3.332984] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 3.339329] #25
Jan 6 15:15:11 pocn001 kernel: [ 3.341181] smpboot cpu 25: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 3.452836] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 3.459188] #26
Jan 6 15:15:11 pocn001 kernel: [ 3.461040] smpboot cpu 26: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 3.572499] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 3.578847] #27 Ok.
Jan 6 15:15:11 pocn001 kernel: [ 3.581277] smpboot cpu 27: start_ip = 89000
Jan 6 15:15:11 pocn001 kernel: [ 3.692337] NMI watchdog enabled, takes one hw-pmu counter.
Jan 6 15:15:11 pocn001 kernel: [ 3.698561] Brought up 28 CPUs
Jan 6 15:15:11 pocn001 kernel: [ 3.701962] Total of 28 processors activated (145597.28 BogoMIPS).
====><===============

lscpu gives:

====><===============
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 28
On-line CPU(s) list: 0-27
Thread(s) per core: 1
Core(s) per socket: 14
Socket(s): 2
NUMA node(s): 4
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Stepping: 2
CPU MHz: 2601.000
BogoMIPS: 5199.94
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 17920K
NUMA node0 CPU(s): 0-6
NUMA node1 CPU(s): 7-13
NUMA node2 CPU(s): 14-20
NUMA node3 CPU(s): 21-27
====><===============

Booting the same machine, or one with the exact same hardware, using Jessie's
kernel leads to a different result:

====><===============
Jan 6 13:58:55 pocn501 kernel: [ 0.444912] x86: Booting SMP configuration:
Jan 6 13:58:55 pocn501 kernel: [ 0.449579] .... node #0, CPUs: #1
Jan 6 13:58:55 pocn501 kernel: [ 0.468345] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
Jan 6 13:58:55 pocn501 kernel: [ 0.477630] #2 #3 #4 #5 #6
Jan 6 13:58:55 pocn501 kernel: [ 0.551311] .... node #1, CPUs: #7
Jan 6 13:58:55 pocn501 kernel: [ 0.567061] ------------[ cut here ]------------
Jan 6 13:58:55 pocn501 kernel: [ 0.572421] WARNING: CPU: 7 PID: 0 at /build/linux-CMiYW9/linux-3.16.7-ckt2/arch/x86/kernel/smpboot.c:310 topology_sane.isra.2+0x7b/0x90()
Jan 6 13:58:55 pocn501 kernel: [ 0.586304] sched: CPU #7's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
Jan 6 13:58:55 pocn501 kernel: [ 0.597176] Modules linked in:
Jan 6 13:58:55 pocn501 kernel: [ 0.600591] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt2-1
Jan 6 13:58:55 pocn501 kernel: [ 0.610011] Hardware name: IBM IBM NeXtScale nx360 M5 -[5465FT1]-/00KG122, BIOS -[THE104FUS-1.03]- 11/26/2014
Jan 6 13:58:55 pocn501 kernel: [ 0.621079] 0000000000000009 ffffffff81507263 ffff88046f9f7e58 ffffffff81065847
Jan 6 13:58:55 pocn501 kernel: [ 0.629367] 0000000000000001 ffff88046f9f7ea8 ffff88087fc12980 0000000000012980
Jan 6 13:58:55 pocn501 kernel: [ 0.637657] 000000000000a060 ffffffff810658ac ffffffff8170f760 ffff880400000030
Jan 6 13:58:55 pocn501 kernel: [ 0.645948] Call Trace:
Jan 6 13:58:55 pocn501 kernel: [ 0.648678] [<ffffffff81507263>] ? dump_stack+0x41/0x51
Jan 6 13:58:55 pocn501 kernel: [ 0.654607] [<ffffffff81065847>] ? warn_slowpath_common+0x77/0x90
Jan 6 13:58:55 pocn501 kernel: [ 0.661505] [<ffffffff810658ac>] ? warn_slowpath_fmt+0x4c/0x50
Jan 6 13:58:55 pocn501 kernel: [ 0.668112] [<ffffffff810027ae>] ? calibrate_delay+0xbe/0x910
Jan 6 13:58:55 pocn501 kernel: [ 0.674622] [<ffffffff8104236b>] ? topology_sane.isra.2+0x7b/0x90
Jan 6 13:58:55 pocn501 kernel: [ 0.681519] [<ffffffff81042844>] ? set_cpu_sibling_map+0x484/0x500
Jan 6 13:58:55 pocn501 kernel: [ 0.688515] [<ffffffff81042a04>] ? start_secondary+0x144/0x2d0
Jan 6 13:58:55 pocn501 kernel: [ 0.695123] ---[ end trace 7f2af1a99481016b ]---
Jan 6 13:58:55 pocn501 kernel: [ 0.720515] #8 #9 #10 #11 #12 #13
Jan 6 13:58:55 pocn501 kernel: [ 0.808491] .... node #2, CPUs: #14 #15 #16 #17 #18 #19 #20
Jan 6 13:58:55 pocn501 kernel: [ 1.011650] .... node #3, CPUs: #21 #22 #23 #24 #25 #26 #27
Jan 6 13:58:55 pocn501 kernel: [ 1.135087] x86: Booted up 4 nodes, 28 CPUs
Jan 6 13:58:55 pocn501 kernel: [ 1.139961] smpboot: Total of 28 processors activated (145614.25 BogoMIPS)
====><===============

and lscpu gives:

====><===============
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 28
On-line CPU(s) list: 0-27
Thread(s) per core: 1
Core(s) per socket: 7
Socket(s): 4
NUMA node(s): 4
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
Stepping: 2
CPU MHz: 1272.679
CPU max MHz: 3600,0000
CPU min MHz: 1200,0000
BogoMIPS: 5201.29
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 17920K
NUMA node0 CPU(s): 0-6
NUMA node1 CPU(s): 7-13
NUMA node2 CPU(s): 14-20
NUMA node3 CPU(s): 21-27
====><===============

I attach the relevant log files and bug script output for your convenience.
Please let me know if you need more details.

Looking at recent changes in Linux 3.18, it might be resolved using:
- cebf15eb09a2fd2fa73ee4faa9c4d2f813cf0f09
- 728e5653e6fdb2a0892e94a600aef8c9a036c7eb

(We intend to test this during the week.)

Regards

-- 
Mehdi

Attachment: kern_log_3.2.0-4-amd64.gz
Description: Binary data

Attachment: kern_log_3.16.0-4-amd64.gz
Description: Binary data

Attachment: reportbug-linux-image-3.2.0-4-amd64.gz
Description: Binary data

Attachment: reportbug-linux-image-3.16.0-4-amd64.gz
Description: Binary data
--- End Message ---
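
Archive note: the difference between the two lscpu listings in the report above can be checked mechanically. The following is a minimal sketch and not part of the original report. It assumes lscpu output in the plain "Key: value" form quoted in the report; the helper names parse_lscpu and summarize are hypothetical, and the sample strings contain only the fields copied from the report that matter for the comparison.

```python
# Excerpts of the two lscpu listings quoted in the report (values copied as-is).
WHEEZY = """\
CPU(s): 28
Thread(s) per core: 1
Core(s) per socket: 14
Socket(s): 2
NUMA node(s): 4
"""

JESSIE = """\
CPU(s): 28
Thread(s) per core: 1
Core(s) per socket: 7
Socket(s): 4
NUMA node(s): 4
"""


def parse_lscpu(text):
    """Turn 'Key: value' lines into a dict, splitting on the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def summarize(text):
    """Extract the topology counts and recompute the CPU total from them."""
    f = parse_lscpu(text)
    sockets = int(f["Socket(s)"])
    cores = int(f["Core(s) per socket"])
    threads = int(f["Thread(s) per core"])
    return {
        "cpus": int(f["CPU(s)"]),
        "sockets": sockets,
        "numa_nodes": int(f["NUMA node(s)"]),
        # sockets x cores x threads should equal the reported CPU count
        "computed_cpus": sockets * cores * threads,
    }


if __name__ == "__main__":
    for name, text in (("wheezy", WHEEZY), ("jessie", JESSIE)):
        print(name, summarize(text))
```

Both listings account for all 28 CPUs. What changes is the reported socket count (2 under Wheezy, 4 under Jessie), which under Jessie matches the number of NUMA nodes rather than the 2 physical CPUs named in the report.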
--- Begin Message ---
- To: 774702-done@bugs.debian.org
- Cc: 774702-submitter@bugs.debian.org
- Subject: Closing this bug (BTS maintenance for src:linux bugs)
- From: carnil@debian.org
- Date: Sat, 08 May 2021 08:25:29 -0700 (PDT)
- Message-id: <6096ad69.1c69fb81.8d5a4.d3f3@mx.google.com>
Hi

This bug was filed for a very old kernel or the bug is old itself
without resolution.

If you can reproduce it with

- the current version in unstable/testing
- the latest kernel from backports

please reopen the bug, see https://www.debian.org/Bugs/server-control
for details.

Regards,
Salvatore
--- End Message ---