Bug#628444: iwlagn - "MAC is in deep sleep", cannot restore wifi operation
found 628444 linux-2.6/3.2.9-1
tags 628444 + upstream patch moreinfo
quit
Hi Dafydd,
Dafydd Harries wrote:
> I've been seeing similar problems with my "Intel Corporation Centrino
> Ultimate-N 6300".
>
> Like others, the problems seemed to start around 2.6.39.
Odd. What kernel did you use before then? (/var/log/dpkg.log might
tell.)
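For instance, something like the following lists the kernel image packages that dpkg has installed or upgraded, oldest first. This is only a sketch: the function name is made up, and it assumes the usual dpkg.log line format and the Debian linux-image-* package naming. Rotated logs (dpkg.log.1 and compressed older ones) would have to be searched separately.

```shell
# list_kernel_installs is a hypothetical helper name, not an existing tool.
# It prints "date time action package new-version" for every linux-image
# package that was installed or upgraded according to the given dpkg log.
list_kernel_installs() {
    log=${1:-/var/log/dpkg.log}
    # Relevant dpkg.log lines have the form:
    #   <date> <time> install <package> <old version or <none>> <new version>
    #   <date> <time> upgrade <package> <old version> <new version>
    awk '($3 == "install" || $3 == "upgrade") && $4 ~ /^linux-image-/ {
             print $1, $2, $3, $4, $NF
         }' "$log"
}
```

Run it as "list_kernel_installs" for the current log, or pass another file name such as /var/log/dpkg.log.1 as the first argument.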
> Like others, the card flakes out a day or two after booting, and a reboot
> always fixes the problem. Occasionally it stays working for longer.
>
> Like others, I've added RAM. But as far as I can recall the upgrade
> happened well before any problems started appearing.
Interesting and useful.
> Any ASPM settings are at their default.
>
> I'll try wd_disable=1 as a workaround for now.
>
> Meenakshi, will the patch you mentioned be applied in 3.3?
Cc-ing her. The patch currently seems to be part of the wireless-next
tree but not davem's net tree.
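If you want to check this yourself, git can list the branches that already contain a given commit. A minimal sketch, assuming a clone in which the trees of interest have been added as remotes and fetched; the helper name is made up for illustration.

```shell
# branches_containing is a hypothetical helper name.
# It lists local and remote-tracking branches from which the given
# commit is reachable. Run it inside the git work tree.
branches_containing() {
    # -a: consider remote-tracking branches as well as local ones
    git branch -a --contains "$1"
}
```

With the patch in question that would be "branches_containing 342bbf3fee2fa9a18147e74b2e3c4229a4564912". A tree that has not picked up the commit simply does not show up in the output.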
> Below is a syslog excerpt from around the time of failure. It seems to
> support Meenakshi's suggestion that it's related to the queue getting
> stuck.
Well, that can be tested. Could you try the patch against current
"master"? It works like this:
 0. Prerequisites:
    apt-get install git build-essential
 1. Get the kernel history, if you don't already have it:
    git clone \
      git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
 2. Configure and build:
    cd linux
    git checkout origin/master
    cp /boot/config-$(uname -r) .config; # current configuration
    make localmodconfig; # optional: minimize configuration
    make deb-pkg; # optionally with -j<num> for parallel build
    dpkg -i ../<name of package>; # as root
    reboot
    ... test test test ...
 3. Hopefully that reproduces the problem. If so, try the attached patch:
    git am -3sc <the patch>
    make deb-pkg; # maybe with -j4
    dpkg -i ../<name of package>; # as root
    reboot
If it works, we can pass this to Dave with information about what
happened and your test result, to get the patch fast-tracked.
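While testing, a quick way to tell whether the failure has struck again is to search the log for the messages from your excerpt. A rough sketch with a made-up helper name; the patterns are taken from the log quoted below, and the default log path is an assumption about your syslog setup.

```shell
# count_stuck_queue_events is a hypothetical helper, not part of any package.
# It prints the number of log lines that match the failure signatures
# "Queue N stuck for M ms" or "MAC is in deep sleep" from the iwlwifi driver.
count_stuck_queue_events() {
    log=${1:-/var/log/syslog}
    # grep -c prints 0 (and returns non-zero) when nothing matches.
    grep -c -E 'iwlwifi .*(Queue [0-9]+ stuck for [0-9]+ ms|MAC is in deep sleep)' "$log"
}
```

A count of 0 after a few days of uptime with the patched kernel, compared with a non-zero count on the unpatched one, would be a useful data point to include in the report.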
Thanks,
Jonathan
> Below is a syslog excerpt from around the time of failure. It seems to
> support Meenakshi's suggestion that it's related to the queue getting
> stuck.
[...]
> iwlwifi 0000:02:00.0: Queue 4 stuck for 2000 ms.
> iwlwifi 0000:02:00.0: Current read_ptr 112 write_ptr 115
> iwlwifi 0000:02:00.0: On demand firmware reload
> iwlwifi 0000:02:00.0: Command REPLY_QOS_PARAM failed: FW Error
> iwlwifi 0000:02:00.0: Failed to update QoS
> iwlwifi 0000:02:00.0: fw recovery, no hcmd send
> iwlwifi 0000:02:00.0: Error sending REPLY_RXON: enqueue_hcmd failed: -5
> iwlwifi 0000:02:00.0: Error clearing ASSOC_MSK on BSS (-5)
> iwlwifi 0000:02:00.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0xFFFFFFFF
> iwlwifi 0000:02:00.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0xFFFFFFFF
[...]
> ieee80211 phy0: Hardware restart was requested
> wpa_supplicant[1472]: CTRL-EVENT-DISCONNECTED bssid=00:50:7f:cb:4b:58 reason=4
> ieee80211 phy0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardware (-2)
[...]
> iwlwifi 0000:02:00.0: Could not load the INST uCode section
> iwlwifi 0000:02:00.0: Failed to start RT ucode: -110
[...]
> iwlwifi 0000:02:00.0: MAC is in deep sleep!. CSR_GP_CNTRL = 0xFFFFFFFF
[...]
> I get some kind of OOPS but I'm guessing this is just because the driver can't
> communicate with the card when the module is being unloaded:
[...]
> WARNING: at /build/buildd-linux-2.6_3.2.9-1-amd64-KTPapN/linux-2.6-3.2.9/debian/build/source_amd64_none/drivers/net/wireless/iwlwifi/iwl-core.c:1330 iwlagn_mac_remove_interface+0x48/0xdd [iwlwifi]()
> Hardware name: 3249CTO
> Modules linked in: uvcvideo videodev v4l2_compat_ioctl32 media snd_usb_audio snd_usbmidi_lib pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) acpi_cpufreq mperf cpufreq_stats cpufreq_userspace cpu
> Mar 12 13:15:04 localhost kernel: sync_memcpy async_tx raid1 raid0 multipath linear md_mod sd_mod crc_t10dif usbhid hid ahci libahci ehci_hcd libata scsi_mod usbcore thermal thermal_sys usb_common e1000e [last unloaded: scsi_wait_scan]
> Mar 12 13:15:04 localhost kernel: [48290.674508] Pid: 1405, comm: NetworkManager Tainted: G O 3.2.0-2-amd64 #1
> Mar 12 13:15:04 localhost kernel: [48290.674511] Call Trace:
> Mar 12 13:15:04 localhost kernel: [48290.674520] [<ffffffff81046879>] ? warn_slowpath_common+0x78/0x8c
> Mar 12 13:15:04 localhost kernel: [48290.674531] [<ffffffffa03ea9af>] ? iwlagn_mac_remove_interface+0x48/0xdd [iwlwifi]
[...]
> Mar 12 13:15:04 localhost kernel: [48290.674647] [<ffffffff812a35a5>] ? netlink_rcv_skb+0x36/0x7a
[...]
> iwlwifi 0000:02:00.0: ctx->vif = (null), vif = ffff8801b1c72df0
> iwlwifi 0000:02:00.0: ID = 0: ctx = ffff8801b1a834b0 ctx->vif = (null)
From: Johannes Berg <johannes.berg@intel.com>
Date: Sun, 4 Mar 2012 08:50:46 -0800
Subject: iwlwifi: always monitor for stuck queues
commit 342bbf3fee2fa9a18147e74b2e3c4229a4564912 upstream.
If we only monitor while associated, the following
can happen:
 - we're associated, and the queue stuck check
   runs, setting the queue "touch" time to X
 - we disassociate, stopping the monitoring,
   which leaves the time set to X
 - almost 2s later, we associate, and enqueue
   a frame
 - before the frame is transmitted, we monitor
   for stuck queues, and find the time set to
   X, although it is now later than X + 2000ms,
   so we decide that the queue is stuck and
   erroneously restart the device
It happens more with P2P because there we can
go between associated/unassociated frequently.
Cc: stable@vger.kernel.org
Reported-by: Ben Cahill <ben.m.cahill@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
drivers/net/wireless/iwlwifi/iwl-core.c | 18 ++++--------------
1 file changed, 4 insertions(+), 14 deletions(-)
diff --git a/drivers/net/wireless/iwlwifi/iwl-core.c b/drivers/net/wireless/iwlwifi/iwl-core.c
index 7bcfa781e0b9..3abe9ede6990 100644
--- a/drivers/net/wireless/iwlwifi/iwl-core.c
+++ b/drivers/net/wireless/iwlwifi/iwl-core.c
@@ -1465,20 +1465,10 @@ void iwl_bg_watchdog(unsigned long data)
 	if (timeout == 0)
 		return;
 
-	/* monitor and check for stuck cmd queue */
-	if (iwl_check_stuck_queue(priv, priv->shrd->cmd_queue))
-		return;
-
-	/* monitor and check for other stuck queues */
-	if (iwl_is_any_associated(priv)) {
-		for (cnt = 0; cnt < hw_params(priv).max_txq_num; cnt++) {
-			/* skip as we already checked the command queue */
-			if (cnt == priv->shrd->cmd_queue)
-				continue;
-			if (iwl_check_stuck_queue(priv, cnt))
-				return;
-		}
-	}
+	/* monitor and check for stuck queues */
+	for (cnt = 0; cnt < hw_params(priv).max_txq_num; cnt++)
+		if (iwl_check_stuck_queue(priv, cnt))
+			return;
 
 	mod_timer(&priv->watchdog, jiffies +
 		  msecs_to_jiffies(IWL_WD_TICK(timeout)));
--
1.7.9.2