Re: memory corruptor in .18rc1-git



On Thu, 13 Jul 2006 18:13:30 -0400
Dave Jones <davej@xxxxxxxxxx> wrote:

Three times in the last week, I've had a box running the -git-du-jour
spontaneously reboot. It just happened again, this time I had a serial
console hooked up, but it rebooted before transferring much data.
The one thing it did spew however was "List corruption. prev->ne",
which came from the patch below which I had in my tree.
(The latter half is likely irrelevant, and came from chasing a different bug)

Things in common at all three times it happened..
reading email, and listening to oggs with rhythmbox.
Another ALSA bug maybe ?

I've up'd the speed of the serial console, in the hope that more chars
make it over the wire before reboot should this happen again.

Are you using SMP? We have a known slab locking bug.

There have been a couple of slab.c patches committed today, but neither of
them appear to actually fix the bug.

The below should fix it, and testing this (disable lockdep) would be
useful.

It's going to take a bit of work to unpickle it all now.

diff -puN mm/slab.c~revert-slabc-lockdep-locking-change mm/slab.c
--- a/mm/slab.c~revert-slabc-lockdep-locking-change
+++ a/mm/slab.c
@@ -3100,16 +3100,7 @@ static void free_block(struct kmem_cache
if (slabp->inuse == 0) {
if (l3->free_objects > l3->free_limit) {
l3->free_objects -= cachep->num;
- /*
- * It is safe to drop the lock. The slab is
- * no longer linked to the cache. cachep
- * cannot disappear - we are using it and
- * all destruction of caches must be
- * serialized properly by the user.
- */
- spin_unlock(&l3->list_lock);
slab_destroy(cachep, slabp);
- spin_lock(&l3->list_lock);
} else {
list_add(&slabp->list, &l3->slabs_free);
}
_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages