Re: [ RFC, PATCH - 1/2, v2 ] x86-microcode: refactor microcode output messages



On Thu, Nov 05, 2009 at 07:40:53PM +0100, Dmitry Adamushko wrote:
2009/11/5 Andreas Herrmann <herrmann.der.user@xxxxxxxxxxxxxx>:
The patches don't work properly here.

(1) For instance, I got the following log entries after doing
   suspend/resume, a CPU offline/online test, and reloading the
   module:

To avoid possible misunderstandings, I'd like to clarify the output below.

 microcode: original microcode versions...
 microcode: CPU0-3: patch_level=0x1000065

So this is the 1st time you have loaded a module.

 platform microcode: firmware: requesting amd-ucode/microcode_amd.bin
 ...
 microcode: CPU0-1,3: patch_level=0x1000083

before or after loading a module? CPU2 is down, isn't it?

No, no CPU was offline at this moment. They all were brought back
online after some CPU hotplug and/or suspend/resume tests.

 microcode: CPU2-3: patch_level=0x1000065

Both messages showed up after the same ucode-update process.

same question as above.

Same answer as above: all CPUs are online.

Here, either CPUs 0 and 1 are down or have a
different version. Both above messages don't make sense taken together

See, and that's the problem.

(CPU3 belongs to both sets) unless summarize_cpu_info() is utterly
broken.

I didn't check that yet.

 Microcode Update Driver: v2.00 <tigran@xxxxxxxxxxxxxxxxxxxx>, Peter Oruba

The patch levels are:

 # for i in `seq 0 3`; do lsmsr -c $i PATCH_LEVEL; done
 PATCH_LEVEL          = 0x0000000001000083
 PATCH_LEVEL          = 0x0000000001000083
 PATCH_LEVEL          = 0x0000000001000065
 PATCH_LEVEL          = 0x0000000001000065

this is after your test has been stopped and all the CPUs are up, right?

Yes.
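
As a rough illustration of what a consistent summary would look like for the MSR values above, here is a minimal Python sketch. It is not the kernel's summarize_cpu_info(); the function names format_ranges() and summarize() are hypothetical, and it assumes exactly one patch level per online CPU. The point is only that each CPU should appear in exactly one group.

```python
from collections import defaultdict


def format_ranges(cpus):
    """Collapse CPU numbers into a range string, e.g. [0, 1, 3] -> '0-1,3'."""
    parts = []
    start = prev = None
    for cpu in sorted(cpus):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = cpu
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


def summarize(levels):
    """Group CPUs by patch level; every CPU lands in exactly one group.

    `levels` maps CPU number -> patch level (hypothetical input format).
    """
    groups = defaultdict(list)
    for cpu, level in levels.items():
        groups[level].append(cpu)
    # Order the groups by their lowest CPU number for stable output.
    ordered = sorted(groups.items(), key=lambda item: min(item[1]))
    return [
        f"microcode: CPU{format_ranges(cpus)}: patch_level={level:#x}"
        for level, cpus in ordered
    ]


# Patch levels as read back via lsmsr in this report.
for line in summarize({0: 0x1000083, 1: 0x1000083, 2: 0x1000065, 3: 0x1000065}):
    print(line)
```

With disjoint grouping like this, CPU3 cannot show up in both "CPU0-1,3" and "CPU2-3", which is why the two logged messages look inconsistent.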

(2) During suspend/resume the ucode is not updated:

 hadburg linux # for i in `seq 0 3`; do lsmsr -c $i PATCH_LEVEL; done
 PATCH_LEVEL          = 0x0000000001000083
 PATCH_LEVEL          = 0x0000000001000083
 PATCH_LEVEL          = 0x0000000001000083
 PATCH_LEVEL          = 0x0000000001000083
 hadburg linux # pm-suspend
 hadburg linux # for i in `seq 0 3`; do lsmsr -c $i PATCH_LEVEL; done
 PATCH_LEVEL          = 0x0000000001000065
 PATCH_LEVEL          = 0x0000000001000065
 PATCH_LEVEL          = 0x0000000001000065
 PATCH_LEVEL          = 0x0000000001000065


That used to work w/o your patches. I didn't have time to look into why
this is now failing. You've changed mc_cpu_callback() -- most likely that
is causing this regression.
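
The before/after comparison above can be automated. The following is a minimal Python sketch, assuming the lsmsr output format shown in this mail (one "PATCH_LEVEL = 0x..." line per CPU, in CPU order); parse_patch_levels() and regressed_cpus() are hypothetical helper names, not part of any existing tool.

```python
def parse_patch_levels(text):
    """Extract patch levels from lsmsr-style output, one entry per CPU."""
    levels = []
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == "PATCH_LEVEL":
            levels.append(int(value.strip(), 16))
    return levels


def regressed_cpus(before, after):
    """Return the CPUs whose patch level is lower after the event."""
    return [
        cpu
        for cpu, (old, new) in enumerate(zip(before, after))
        if new < old
    ]


before = parse_patch_levels(
    "PATCH_LEVEL          = 0x0000000001000083\n" * 4
)
after = parse_patch_levels(
    "PATCH_LEVEL          = 0x0000000001000065\n" * 4
)
print(regressed_cpus(before, after))
```

Run against the values from case (2), this would flag all four CPUs as having fallen back to the original patch level after pm-suspend.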

Hmm, cpu-event-callbacks seem to be working on my (Intel) setup. I
have enabled pr_debug messages and also used a little trick to allow
ucode of the same version to be loaded (my CPU already runs the most
recent ucode), and I can see cpu-callback events for both the resume
and cpu-up cases.

(firstly, upgraded with microcode_ctl as I only have a .dat file)

suspend-resume
...
[ 584.506371] microcode: CPU1 removed
[ 584.516018] microcode: CPU0 updated to revision 0x57, date = 2007-03-15
[ 584.597326] microcode: CPU1 updated upon resume
[ 584.597562] microcode: CPU1 updated to revision 0x57, date = 2007-03-15
[ 584.597565] microcode: CPU1 added
...

and now cpu1 : down -> up

[ 1616.932249] microcode: CPU1 removed
[ 1633.942502] platform microcode: firmware: requesting intel-ucode/06-0f-02
[ 1633.954638] microcode: data file intel-ucode/06-0f-02 load failed
[ 1633.954642] microcode: CPU1 added


as I understand, you don't see " platform microcode: firmware:
requesting intel-ucode" messages upon 'upping' a cpu, do you?

Sure, no intel-ucode messages as I tested with AMD CPUs ;-)
But otherwise no, no messages.

sure, my test is somewhat limited... anyway, first of all I'd like to
get a clear understanding of your logs. Thanks for your test btw. :-))

I'll send you full logs asap.


Regards,
Andreas