CONFIG_PREEMPT causes corruption of application's FPU stack



I am running the Einstein@home application (version 4.35,
http://einstein.phys.uwm.edu). This application does lots of computations,
mostly with FPU and SSE instructions.
After I started experimenting with real-time optimized kernels, the
application began to crash with floating point errors such as the
following:

APP DEBUG: Application caught signal 8.

FPU status word ffffa0e1, flags: ERR_SUMM STACK_FAULT PRECISION INVALID
Obtained 6 stack frames for this thread.
Use gdb command: 'info line *0xADDRESS' to print corresponding line
numbers.
einstein_S5R3_4.35_i686-pc-linux-gnu[0x8069e7e]
einstein_S5R3_4.35_i686-pc-linux-gnu[0x818d436]
einstein_S5R3_4.35_i686-pc-linux-gnu[0x805db8f]
einstein_S5R3_4.35_i686-pc-linux-gnu[0x806b11c]
/lib/libc.so.6(__libc_start_main+0xe0)[0xb7e14fe0]
einstein_S5R3_4.35_i686-pc-linux-gnu(shmat+0x59)[0x804bda1]
Stack trace of LAL functions in worker thread:
GetSemiCohToplist at line 3177 of
file /home/bema/einsteinathome/HierarchicalSearch/EaH_build_release_einstein_S5R3_4.35/extra_sources/lalapps-CVS/src/pulsar/hough/src2/HierarchicalSearch.c
At lowest level status code = 0, description: NO LAL ERROR REGISTERED
called boinc_finish

I tracked this down to a single kernel configuration option. If
CONFIG_PREEMPT is set to 'y' the application will start crashing. If
CONFIG_PREEMPT is replaced by CONFIG_PREEMPT_VOLUNTARY, the application
will run without errors.

The problem is reproducible insofar as the error always occurs when
CONFIG_PREEMPT is set, but the time to the first occurrence varies greatly,
from a few minutes to more than 10 CPU hours.

I first found this error on the openSUSE kernel 2.6.22.17-0.1-rt. I verified
the problem on the following kernel versions:

openSUSE 2.6.22.17-0.1-default
openSUSE 2.6.23.17-ccj64-rt
kernel.org 2.6.26-rc1
kernel.org 2.6.26-rc2-git5

My CPU is an Intel Core2Duo 6420, running two of the Einstein applications
in 32-bit mode. From a discussion on the Einstein message boards I know
that other users of the application are also affected.

Please let me know if you need any additional information to track this
down.
Jürgen