Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected



Linus Torvalds wrote:

On Mon, 25 Aug 2008, Linus Torvalds wrote:
Could you make your kernel image available somewhere, and we can take a
look at it? Some versions of gcc are total pigs when it comes to stack
usage, and your exact configuration matters too. But yes, module loading
is a bad case, for me "sys_init_module()" contains

subq $392, %rsp #,

which is probably mostly because of the insane inlining gcc does (ie it
will likely have inlined every single function in that file that is only
called once, and then it will make all local variables of all those
functions alive over the whole function and allocate stack-space for them
ALL AT THE SAME TIME).

Mine has:

Dump of assembler code for function sys_init_module:
0xffffffff802688c0 <sys_init_module+0>: push %rbp
0xffffffff802688c1 <sys_init_module+1>: mov %rsp,%rbp
0xffffffff802688c4 <sys_init_module+4>: sub $0x1c0,%rsp
0xffffffff802688cb <sys_init_module+11>: mov %r12,-0x20(%rbp)
0xffffffff802688cf <sys_init_module+15>: mov %rdi,%r12

so 448 bytes.

The kernel is up at: http://free.linux.hp.com/~adb/bug.11342/vmlinux (if
you would let me know when you are through with it so I can free up some
space there I'd appreciate it...)

By doing the patch you provided, sys_init_module now looks like:

Dump of assembler code for function sys_init_module:
0xffffffff8026aa20 <sys_init_module+0>: push %rbp
0xffffffff8026aa21 <sys_init_module+1>: mov %rsp,%rbp
0xffffffff8026aa24 <sys_init_module+4>: sub $0x20,%rsp
0xffffffff8026aa28 <sys_init_module+8>: mov %r14,0x18(%rsp)
0xffffffff8026aa2d <sys_init_module+13>: mov %rdi,%r14


So only 32 bytes. (But of course, load_module() now exists as a separate
function, and its frame is 0x1d0 (464) bytes...)

With the patch you provided, I /was/ able to repeatedly boot OK (latest
tree, and I also applied the patch to the 2.6.27-rc3-based kernel I was
having problems with initially, and that booted OK as well).

Alan


I bet this one-liner will probably make your kernel work. It's not a full
solution, but it will make the module-loading path lose _all_ of the above
stack slots by just not inlining "load_module()" - the stack slots will
still be used when the module is _loaded_, but by the time we actually
call the ->init function they will have been released since it's not all
in the same crazy function any more.
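
As a rough userspace illustration of the effect described above (not the kernel code itself), the sketch below keeps a large scratch buffer inside a helper that is marked noinline, so its stack frame is released before the follow-up call runs. The names heavy_setup(), run_init() and load_and_init() are hypothetical, and the noinline macro here is just assumed to map to the GCC/Clang function attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for the kernel's "noinline" macro (GCC/Clang attribute). */
#define noinline __attribute__((noinline))

/*
 * Hypothetical "heavy" helper, playing the role of load_module():
 * it needs a large scratch buffer, but only while it is running.
 */
static noinline size_t heavy_setup(const char *src, size_t len)
{
	char scratch[400];	/* lives only in heavy_setup()'s own frame */

	assert(len < sizeof(scratch));
	memcpy(scratch, src, len);
	scratch[len] = '\0';
	return strlen(scratch);
}

/* Hypothetical stand-in for the module's ->init function. */
static int run_init(size_t n)
{
	return (int)n * 2;
}

/*
 * Plays the role of sys_init_module(): because heavy_setup() is not
 * inlined, the 400-byte buffer is not part of this function's frame,
 * so run_init() is called with only the small frame on the stack.
 */
int load_and_init(const char *src, size_t len)
{
	size_t n = heavy_setup(src, len);	/* big frame pushed and popped */

	return run_init(n);			/* big frame already gone */
}
```

Calling load_and_init("abc", 3) copies three bytes in the helper and then runs the init stand-in. With gcc, building with -O2 -fstack-usage (or reading the -S output) with and without the attribute should show whether the caller's frame grows; the exact numbers will depend on compiler version and flags.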

I _seriously_ believe that we were better off back when gcc only inlined
what we told it to inline, and never inlined on its own. The gcc inlining
logic is pure and utter sh*t in an environment like the kernel where stack
space is a valuable resource.

Anyway, Alan, even if this solves your particular problem, I'd still like
to see your kernel image, so that I can hunt for other problems like
this..

Linus

---
kernel/module.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 08864d2..9db1191 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1799,7 +1799,7 @@ static void *module_alloc_update_bounds(unsigned long size)

 /* Allocate and load the module: note that size of section 0 is always
    zero, and we rely on this for optional sections. */
-static struct module *load_module(void __user *umod,
+static noinline struct module *load_module(void __user *umod,
 				  unsigned long len,
 				  const char __user *uargs)
 {
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



