Re: iwlagn: associating with AP causes kernel hiccup



On Sun, Oct 19, 2008 at 5:18 PM, Andy Lutomirski <luto@xxxxxxxxxxxxx> wrote:
Richard Scherping wrote:

Tomas Winkler schrieb:

On Fri, Oct 17, 2008 at 5:27 PM, Richard Scherping
<richard-YwAN3MSemFZ6lmGzAMPh1A@xxxxxxxxxxxxxxxx> wrote:

When I associate with an AP, Linux 2.6.27 seems to "hang" for a few
seconds. During that time, all sound stops playing and keyboard and
mouse input is impossible.

I'm using Mandriva 2009.0 x86_64 with wpa_supplicant and Mandriva's
wireless network configuration tool (drakroam).

Actually I just found out that running # ifconfig wlan0 down
is enough to trigger the sound and mouse hanging for a few seconds.

And shortly after I wrote that, while getting an IP address with dhclient
during association with a WPA-encrypted AP, I got this backtrace in my
logs:
[...]

I have a similar problem here. No crash up to now, but the very same
"hang" for a few seconds on "ifconfig wlan0 down". Interestingly this does
only happen after a normal boot - once I did a suspend and resume (S3),
there is no hang anymore.

Hardware: Thinkpad T61p with Intel 4965 agn
Software: Debian Lenny x86_64 with vanilla 2.6.27 kernel

Driver in 2.6.27 is not stable, please try to reproduce this in
current wireless-testing.git.

I do not have the time to compile and test wireless-testing ATM, sorry.

In fact I am annoyed by the fact that iwlagn is "known to be unstable" in
a stable kernel release and that this even seems to be a totally normal
thing...


Amen.
Stable doesn't mean all components are stable. To quote Linus's blog:
"It doesn't have to be perfect (and obviously no release ever is), but
it needs to be in reasonable shape"

The fact is that some critical patches were rejected during the rc
cycle because they were not regressions, and they probably need to be
pushed to the stable tree now, or distributions will merge them
themselves.
We gave higher priority to testing the 32-bit version, so it is more
stable than the 64-bit one, which got much less in-house testing, and
we missed many issues there. The driver doesn't get full exposure until
it reaches the public in a stable release, so no bugs are opened during
the rc cycle and they are therefore not fixed in the stable version
either. Unfortunately, there is also not much system testing at all of
what goes into the merge window.
Second, the whole mac80211 stack hasn't fully addressed the MQ rewrite,
so it's a bit shaky as well, and this will also be the case in 2.6.28.

This driver has been available and more-or-less working for ages.
What kernel am I supposed to run if I just want a stable system? Haven't
found one yet, other than distro kernels...

In any case, I've seen these complete system hiccups with iwl4965 and iwlagn
since at least 2.6.25 and through quite a few wireless-testing versions. I
bet that this, along with things like it, is the culprit:

I haven't seen that you've filed a bug for it.


In many, many functions:
spin_lock_irqsave(&priv->lock, flags);
...
ret = iwl_grab_nic_access(priv);

In iwl-io.h (2.6.26.something):

This code has been here since at least version 2.6.18; it was just moved around.

static inline int _iwl_grab_nic_access(struct iwl_priv *priv)
{
        ...
        ret = _iwl_poll_bit(priv, CSR_GP_CNTRL,
                            CSR_GP_CNTRL_REG_VAL_MAC_ACCESS_EN,
                            (CSR_GP_CNTRL_REG_FLAG_MAC_CLOCK_READY |
                             CSR_GP_CNTRL_REG_FLAG_GOING_TO_SLEEP), 50);
        ...
}

static inline int _iwl_poll_bit(struct iwl_priv *priv, u32 addr,
                                u32 bits, u32 mask, int timeout)
{
        int i = 0;

        do {
                if ((_iwl_read32(priv, addr) & mask) == (bits & mask))
                        return i;
                mdelay(10);
                i += 10;
        } while (i < timeout);

        return -ETIMEDOUT;
}

Polling the hardware waiting for firmware to do something *with IRQs
disabled*? I'd really rather the drivers on my system didn't do this.
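
To put a number on that loop, here is a rough userspace model of the polling logic quoted above. It is only a sketch: fake_read32(), fake_mdelay(), reset_model() and the counters are hypothetical stand-ins invented for illustration, not driver code, and the loop body is copied from the _iwl_poll_bit() listing with the register read and the delay swapped for those stand-ins. Assuming the hardware never reports ready, a timeout of 50 works out to five register reads and 50 ms of busy-waiting per call, all of it spent inside whatever spin_lock_irqsave() section the caller holds.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model state; none of these names exist in the driver. */
static int ready_after_reads; /* reads that fail before "ready"; -1 = never */
static int reads_done;        /* register reads performed so far */
static int ms_spun;           /* total milliseconds of simulated mdelay() */

/* Stand-in for _iwl_read32(): reports bit 0 set once the device is "ready". */
static uint32_t fake_read32(void)
{
        reads_done++;
        if (ready_after_reads >= 0 && reads_done > ready_after_reads)
                return 0x1;
        return 0x0;
}

/* Stand-in for mdelay(): records the time instead of spinning. */
static void fake_mdelay(int ms)
{
        ms_spun += ms;
}

/* Same control flow as the quoted _iwl_poll_bit(), minus priv and addr. */
static int poll_bit(uint32_t bits, uint32_t mask, int timeout)
{
        int i = 0;

        do {
                if ((fake_read32() & mask) == (bits & mask))
                        return i;
                fake_mdelay(10);
                i += 10;
        } while (i < timeout);

        return -ETIMEDOUT;
}

/* Reset the counters and choose when the fake device becomes ready. */
static void reset_model(int ready_after)
{
        ready_after_reads = ready_after;
        reads_done = 0;
        ms_spun = 0;
}
```

Calling reset_model(-1) and then poll_bit(0x1, 0x1, 50) models the worst case: the call returns -ETIMEDOUT with ms_spun at 50. A path that grabs NIC access several times in a row under the lock would stack those delays, which would be consistent with the stalls described in this thread, though the model says nothing about how often the real hardware actually times out.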

I'd attempt to fix this myself, but I have no clue what the locking rules
are supposed to be.

Locking needs to be seriously revised, but so far I haven't seen any
show-stopper issues, so it hasn't been given priority.

Would I be out of line for wishing the iwlwifi developers would fix
longstanding issues (latency and maybe horkage after resume, although the
latter seems much improved lately) before adding fancy new things?

Patches are always welcome.

There are also problems in mac80211 itself, and we have done some work
to improve it a bit as well.
Tomas


