Re: [opensuse] Novell Bugzilla - At it Again - Bugs Apparently Dismissed Without Sufficient Investigation



On Fri, Apr 4, 2008 at 9:20 AM, Anders Johansson <ajh@xxxxxxxxxx> wrote:

> You seem to be misunderstanding what "mce" is. A machine check exception is
> the hardware itself telling you that something has gone badly wrong. There is
> no interpretation involved in the software. The software just logs the
> message.
>
> If the mce says it is a hardware problem, you can count on its being a
> hardware problem.
>
> Anders


No, you can't count on that, Anders.

Do some research on MCE errors and you will find that these errors are
often reported when there is absolutely nothing wrong with the machine.
In fact, Dell had a huge thread on their internal blog about MCE errors
reported by Linux users when the Core 2 Duo machines arrived.
They were more than a little miffed at getting calls because some
developer of the mce package with a swollen head put in language insisting
it was a hardware problem, when others had clearly demonstrated that you
could reach that part of the code with no hardware error at all.

It's quite possible for software bugs to hose things so badly that the
mce modules think there was an error.

Further, part of the mce software's job is to filter out the bogus MCE
errors (or so says someone who shall remain nameless but whose email
address is ak@xxxxxxx). Now, if the software's job is to filter out
bogus machine check events, that is a de facto assertion that lots of
these events are bogus.

I've seen these in the past as well. Mine had to do with runaway
keys, and the clue was the bit about TSC. Dual cores can get their
timers to disagree to the point that it forces a failure. You would often
see this with SpeedStep or PowerNow! enabled, but simply locking
the machine at the high-power setting would avoid the problem.
For me, the nohpet kernel command-line parameter was required under
SUSE 10.1. That solved all my instances. But that was on a Core 2 Duo.
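
To check whether a running system actually booted with nohpet, and which
clocksource the kernel ended up using, something like the following Python
sketch may help. It assumes a Linux kernel that exposes /proc/cmdline and
the sysfs clocksource file shown below. The helper names are made up for
illustration, and older kernels may not provide the sysfs file at all.

```python
from pathlib import Path

# Assumed locations; the sysfs file may be absent on older kernels.
CMDLINE_PATH = "/proc/cmdline"
CLOCKSOURCE_PATH = "/sys/devices/system/clocksource/clocksource0/current_clocksource"


def has_kernel_param(cmdline, name):
    """Return True if `name` appears as a bare flag or as name=value."""
    return any(
        tok == name or tok.startswith(name + "=")
        for tok in cmdline.split()
    )


def read_text_or_none(path):
    """Read a small text file, returning None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def report():
    """Print whether nohpet was passed and which clocksource is active."""
    cmdline = read_text_or_none(CMDLINE_PATH) or ""
    print("nohpet on command line:", has_kernel_param(cmdline, "nohpet"))
    print("current clocksource:", read_text_or_none(CLOCKSOURCE_PATH) or "unknown")
```

Calling report() on the affected machine prints both values. If nohpet is
missing after a reboot, the boot loader entry most likely was not updated.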




--
----------JSA---------
--
To unsubscribe, e-mail: opensuse+unsubscribe@xxxxxxxxxxxx
For additional commands, e-mail: opensuse+help@xxxxxxxxxxxx


