Re: 2.6.18-mm2 - oops in cache_alloc_refill()



On Fri, 29 Sep 2006 20:01:54 -0400
Valdis.Kletnieks@xxxxxx wrote:

On Fri, 29 Sep 2006 12:45:58 PDT, Andrew Morton said:

(Adding a bunch of people to the cc: list now that I have a clue what is
going on....)

I'd expect it's the same bug - slab data structures have gone bad.

*bing*! We have a winner. A quick check showed the kernel wasn't built with
slab debugging enabled, so I turned on the more obvious options, and got
rewarded with a traceback..

doh. I'd assumed that CONFIG_DEBUG_SLAB was enabled :(

Again: how come nobody else is hitting this? Something's different.

gkrellm and wireless (specifically, gkrellm-wifi-0.9.12-3.fc6 from Fedora
Core extras-development). Kernel is still a 2.6.18 with *only* the
origin.patch from -mm2 applied. Note that the gkrellm plugin hasn't had
a change in the code since 01/03/2004 - hopefully there's been no unintentional
API change on the kernel side since then...

Here's the traceback I got:

slab error in verify_redzone_free(): cache `size-32': memory outside object was overwritten
[<c0103ad2>] dump_trace+0x64/0x1cd
[<c0103c4d>] show_trace_log_lvl+0x12/0x25
[<c010415f>] show_trace+0xd/0x10
[<c01041fc>] dump_stack+0x19/0x1b
[<c014c796>] __slab_error+0x17/0x1c
[<c014cdac>] cache_free_debugcheck+0xaf/0x230
[<c014d43e>] kfree+0x59/0x8c
[<c02dc04a>] ioctl_standard_call+0x1da/0x218
[<c02dc275>] wireless_process_ioctl+0x55/0x312
[<c02d3750>] dev_ioctl+0x45f/0x49a
[<c02c92aa>] sock_ioctl+0x1b3/0x1c6
[<c0160322>] do_ioctl+0x22/0x67
[<c01605a5>] vfs_ioctl+0x23e/0x251
[<c01605ff>] sys_ioctl+0x47/0x64
[<c0102cd3>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb

Leftover inexact backtrace:

=======================
de57e16c: redzone 1:0x170fc2a5, redzone 2:0x170fc200.

Repeated, over and over, just about once a second.

A quick strace of gkrellm finds these likely ioctl's causing the problem:

% grep ioctl /tmp/foo2 | sort -u | more
ioctl(13, SIOCGIWESSID, 0xbfbcdb9c) = 0
ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc) = 0
ioctl(13, SIOCGIWRATE, 0xbfbcdbbc) = 0

Yes. The main thing which those WE-21 patches do is to shorten the size of
various buffers which are used in wireless ioctls.

Since I'm using an orinoco-based card, these 2 look like the most likely
candidates. WE-21 was merged between -mm1 and -mm2, which is why -mm1 was
stable for me.

The WE-21 patches weren't in Jeff's tree for -mm1 or for -mm2. They
appeared there transiently then quickly went mainline. They _might_ have
been in the wireless git tree, although I often drop that due to git woes.
But that hasn't happened recently....

I'll let somebody else argue over what path these took that
I never tripped over them in an earlier -mm before they hit Linus's tree...

commit baef186519c69b11cf7e48c26e75feb1e6173baa
Author: John W. Linville <linville@xxxxxxxxxxxxx>
Date: Fri Sep 8 16:04:05 2006 -0400

[PATCH] WE-21 support (core API)

This is version 21 of the Wireless Extensions. Changelog :
o finishes migrating the ESSID API (remove the +1)
o netdev->get_wireless_stats is no more
o long/short retry

This is a redacted version of a patch originally submitted by Jean
Tourrilhes. I removed most of the additions, in order to minimize
future support requirements for nl80211 (or other WE successor).

CC: Jean Tourrilhes <jt@xxxxxxxxxx>
Signed-off-by: John W. Linville <linville@xxxxxxxxxxxxx>

commit eeec9f1a931262d69811135092c8447d6dccc3e6
Author: Jean Tourrilhes <jt@xxxxxxxxxx>
Date: Tue Aug 29 18:02:31 2006 -0700

[PATCH] WE-21 for orinoco

Signed-off-by: Jean Tourrilhes <jt@xxxxxxxxxx>
Signed-off-by: John W. Linville <linville@xxxxxxxxxxxxx>


Try reverting those?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • KB958644 keeps kicking me off line.
    ... Is anyone else having trouble with XP's latest critical service patch, ... On my wireless laptop last night, a Dell 1505 running XP media SP3, as soon ... broken up connection that lasted a minute or so, ... I tried rebooting several times and never got on again. ...
    (microsoft.public.windowsupdate)
  • [RFT] hp-wmi: improved rfkill support for wifi
    ... confused about hard v.s. soft blocks on the wifi, so I fixed it based on ... The wireless is toggled by a hardware button. ... There are some other side-effects which the patch should fix. ... Without the patch, it seemed that hp-wmi ...
    (Linux-Kernel)
  • Re: KB958644 keeps kicking me off line.
    ... E.g. that's the type of message a third-party firewall might produce ... On issue II, my main machine running win xp64 SPII, the KB958644 patch was ... I run a cable modem with a Linksys WRT54GS wireless router, ... I tried rebooting several times and never got on again. ...
    (microsoft.public.windowsupdate)
  • Re: [ath9k-devel] ath9k: massive unexplained latency in 2.6.27 (rc5, rc6, probably others)
    ... I was using wireless normally this evening. ... following patch seems to be fixing the issue. ... However this patch can be applied on top of latest wireless testing ... It was removed in the following commit ...
    (Linux-Kernel)
  • Re: Wireless Zero Configuration
    ... XPe SP2 will allow you to use the Wirelss Provisioning Service API to ... configure wireless. ... > I guess you would need an utility or control panel applet from your WLAN ...
    (microsoft.public.windowsxp.embedded)

Loading