Re: [E1000-devel] BUG: bad unlock balance detected! e1000e
- From: "Jeff Kirsher" <jeffrey.t.kirsher@xxxxxxxxx>
- Date: Wed, 10 Dec 2008 16:37:46 -0800
On Tue, Dec 9, 2008 at 3:56 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
On Wed, 10 Dec 2008 00:43:46 +0100
Frederik Deweerdt <frederik.deweerdt@xxxxxxxx> wrote:
On Tue, Dec 09, 2008 at 03:08:01PM -0800, Andrew Morton wrote:
On Tue, 9 Dec 2008 12:03:37 +0100Yep, sorry.
Frederik Deweerdt <frederik.deweerdt@xxxxxxxx> wrote:
It some error checking is missing in e1000e: debug contention on NVM
SWFLAG
On Mon, Dec 08, 2008 at 12:24:09PM +0100, Zdenek Kabelac wrote:
Hi
During occasional scan of message log - I've found out this BUG which
happened on Dec3 with the -rc7 from that day.
(So if it's now fixed in current git feel free to ignore :))
My machine T61 - C2D, 2GB, 64bit kernel - message appeared during
shutdown and was actually not noticed by me...
NetworkManager: <WARN> nm_signal_handler(): Caught signal 15,
shutting down normally.
NetworkManager: <info> (eth0): now unmanaged
NetworkManager: <info> (eth0): device state change: 3 -> 1
NetworkManager: <info> (eth0): cleaning up...
NetworkManager: <info> (eth0): taking down device.
=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
(top-posting repaired. Please don't do that!!!).
I don't think so, e1000_acquire_swflag_ich8lan() locks and
Hello Zdenek,
This could be due to 717d438d1fde94decef874b9808379d1f4523453
"e1000e: debug contention on NVM SWFLAG"
Error handling is missing from e1000_reset_hw_ich8lan so it may happen
that we don't acquire the nvm_mutex if the card times out.
Adding Thomas to CC.
yup. 2.6.27 needs fixing also.
Like this?
e1000_release_swflag_ich8lan() unlocks.
urgh, OK, I made the mistake of reading the comments.
I think it is more along the
lines of:
diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c
index 523b971..f971b83 100644
--- a/drivers/net/e1000e/ich8lan.c
+++ b/drivers/net/e1000e/ich8lan.c
@@ -1892,7 +1892,13 @@ static s32 e1000_reset_hw_ich8lan(struct e1000_hw *hw)
*/
ctrl |= E1000_CTRL_PHY_RST;
}
+
ret_val = e1000_acquire_swflag_ich8lan(hw);
+ if (ret_val) {
+ hw_dbg(hw, "Failed to acquire NVM swflag");
+ return ret_val;
+ }
+
hw_dbg(hw, "Issuing a global reset to ich8lan");
ew32(CTRL, (ctrl | E1000_CTRL_RST));
msleep(20);
But I'm not sure we should cancel the ongoing reset if the card times
out...
Yes, something like that. Or something like
--- a/drivers/net/e1000e/ich8lan.c~a
+++ a/drivers/net/e1000e/ich8lan.c
@@ -1940,12 +1940,14 @@ static s32 e1000_reset_hw_ich8lan(struct
ctrl |= E1000_CTRL_PHY_RST;
}
ret_val = e1000_acquire_swflag_ich8lan(hw);
- hw_dbg(hw, "Issuing a global reset to ich8lan\n");
- ew32(CTRL, (ctrl | E1000_CTRL_RST));
- msleep(20);
+ if (!ret_val) {
+ hw_dbg(hw, "Issuing a global reset to ich8lan\n");
+ ew32(CTRL, (ctrl | E1000_CTRL_RST));
+ msleep(20);
- /* release the swflag because it is not reset by hardware reset */
- e1000_release_swflag_ich8lan(hw);
+ /* release the swflag because it is not reset by hardware reset */
+ e1000_release_swflag_ich8lan(hw);
+ }
ret_val = e1000e_get_auto_rd_done(hw);
if (ret_val) {
_
Dunno. It's e1000-developer-summoning-dance time.
Actually, if we time out trying to acquire the swflag, we still want
to reset the part because we are most likely in an unrecoverable
state.
So I would suggest the following
--- a/drivers/net/e1000e/ich8lan.c~a
+++ a/drivers/net/e1000e/ich8lan.c
@@ -1940,9 +1940,10 @@ static s32 e1000_reset_hw_ich8lan(struct
ctrl |= E1000_CTRL_PHY_RST;
}
ret_val = e1000_acquire_swflag_ich8lan(hw);
hw_dbg(hw, "Issuing a global reset to ich8lan\n");
ew32(CTRL, (ctrl | E1000_CTRL_RST));
msleep(20);
+ if (!ret_val) {
-
- /* release the swflag because it is not reset by hardware reset */
- e1000_release_swflag_ich8lan(hw);
+ /* release the swflag because it is not reset by
hardware reset */
+ e1000_release_swflag_ich8lan(hw);
+ }
Of course, we will want to add a comment to the fact that we still
want to reset the part, even if we have not acquired the lock because
we are in an unrecoverable state.
I can provide a patch in a few minutes.
--
Cheers,
Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- References:
- BUG: bad unlock balance detected! e1000e
- From: Zdenek Kabelac
- Re: BUG: bad unlock balance detected! e1000e
- From: Frederik Deweerdt
- Re: BUG: bad unlock balance detected! e1000e
- From: Andrew Morton
- Re: BUG: bad unlock balance detected! e1000e
- From: Frederik Deweerdt
- Re: BUG: bad unlock balance detected! e1000e
- From: Andrew Morton
- BUG: bad unlock balance detected! e1000e
- Prev by Date: Re: [PATCH 1/1] Userspace I/O (UIO): Add support for userspace DMA (corrected)
- Next by Date: Re: [RFC][PATCH 1/3] CGroups: Add a per-subsystem hierarchy_mutex
- Previous by thread: Re: BUG: bad unlock balance detected! e1000e
- Next by thread: [PATCH 00/45] KVM Updates for 2.6.29 (Part 1 of 3)
- Index(es):
Relevant Pages
|