Re: [PATCH 0/4] Linux Kernel Markers



Hi Richard,

* Richard J Moore (richardj_moore@xxxxxxxxxx) wrote:


Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote on 20/12/2006 23:52:16:

Hi,

You will find, in the following posts, the latest revision of the Linux
Kernel Markers. Because some tracing projects (LTTng, SystemTAP) need this
kind of mechanism, it would be nice to consider it for mainstream inclusion.

The following patches apply on 2.6.20-rc1-git7.

Signed-off-by : Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx>

Mathieu, FWIW I like this idea. A few years ago I implemented something
similar, but that had no explicit clients. Consequently I made my hooks
code more generalized than is needed in practice. I do remember that Karim
reworked the LTT instrumentation to use hooks and it worked fine.


Yes, I think some features you implemented in GKHI, like chained calls to
multiple probes, should be implemented in a "probe management module" which
would be built on top of the marker infrastructure. One of my goals is to
concentrate on getting the core right so that, afterward, building on top of it
will be easy.

You've got the same optimizations for x86 by modifying an instruction's
immediate operand and thus avoiding a d-cache hit. The only real caveat is
the need to avoid the unsynchronised cross modification erratum, which
means that all processors will need to issue a serializing operation before
executing a Marker whose state has changed. How is that handled?


Good catch. I thought that modifying only 1 byte would spare us from this
erratum, but looking at it in detail tells me that it's not the case.

I see three different ways to address the problem:
1 - Adding some synchronization code in the marker and using
    synchronize_sched().
2 - Using an IPI to make other CPUs busy loop while we change the code and
    then execute a serializing instruction (iret, cpuid...).
3 - First, write an int3 in place of the instruction's first byte. The
    handler would do the following:

      int3_handler:
        single-step the original instruction.
        iret

    Secondly, we send an IPI that does a smp_processor_id() on each CPU and
    wait for them to complete. This makes sure we execute a synchronizing
    instruction on every CPU even if we do not execute the trap handler.

    Then, we atomically write the new 2-byte instruction over the int3 and
    the immediate value.


I exclude (1) because of the performance impact and (2) because it does not
deal with NMIs. That leaves (3). Does it make sense?
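
To make the ordering in (3) concrete, here is a minimal userspace sketch,
not kernel code. It only simulates the three steps on a byte buffer. The
names struct marker_site, sync_all_cpus() and marker_set_imm() are
hypothetical, and the 0xB0 opcode (mov $imm8, %al) is just an assumption
about what the 2-byte marker instruction could look like.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE 0xCC  /* x86 breakpoint instruction */
#define MOV_AL_IMM8 0xB0  /* assumed marker encoding: mov $imm8, %al */

/* Hypothetical 2-byte marker site: opcode followed by an 8-bit immediate. */
struct marker_site {
    uint8_t code[2];
    unsigned int sync_rounds;    /* number of simulated IPI rounds */
    uint8_t first_byte_at_sync;  /* what other CPUs would see during the IPI */
};

/*
 * Stand-in for the IPI round: in the kernel, every CPU would run a trivial
 * function and return through a serializing instruction. Here we only
 * record that the round happened and which first byte was visible.
 */
static void sync_all_cpus(struct marker_site *site)
{
    site->sync_rounds++;
    site->first_byte_at_sync = site->code[0];
}

/* Change the immediate of the marker instruction following approach (3). */
static void marker_set_imm(struct marker_site *site, uint8_t new_imm)
{
    const uint8_t insn[2] = { MOV_AL_IMM8, new_imm };

    assert(site != NULL);

    /* Step 1: put an int3 over the first byte of the instruction. */
    site->code[0] = INT3_OPCODE;

    /* Step 2: make every CPU serialize before the final write. */
    sync_all_cpus(site);

    /*
     * Step 3: write opcode and immediate together. On real hardware this
     * would have to be a single aligned 16-bit store; memcpy() only stands
     * in for it in this simulation.
     */
    memcpy(site->code, insn, sizeof(insn));
}
```

A caller would initialize a site with { MOV_AL_IMM8, 0 } and call
marker_set_imm(&site, 1) to enable it. The int3 handler itself
(single-stepping the original instruction) is not modeled here.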


One additional thing we did, which might be useful at some future point,
was adding a /proc interface. We reflected the current instrumentation
through /proc and gave the status of each hook. We even talked about being
able to enable or disable instrumentation by writing to /proc but I don't
think we ever implemented this.


Adding a /proc output to list the active probes and their callbacks will be
trivial with the markers. I think the probe management module should have its
own /proc file too, to list the chains of connected handlers once we get there.
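
As a rough idea of what such a listing could print, here is a userspace
sketch that formats one line per marker into a buffer. Everything in it
(struct marker_info, marker_list_format() and the "name state probe" line
layout) is hypothetical and only for illustration; the real thing would
sit behind a /proc file in the kernel.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one marker for listing purposes. */
struct marker_info {
    const char *name;   /* marker name */
    const char *probe;  /* name of the connected callback, NULL if none */
    int enabled;        /* non-zero when the marker is active */
};

/*
 * Format one "name state probe" line per marker into buf.
 * Returns the number of bytes written (excluding the trailing NUL),
 * or -1 if the buffer is too small.
 */
static int marker_list_format(char *buf, size_t len,
                              const struct marker_info *m, size_t n)
{
    size_t off = 0;

    if (len == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        int w = snprintf(buf + off, len - off, "%s %s %s\n",
                         m[i].name,
                         m[i].enabled ? "enabled" : "disabled",
                         m[i].probe ? m[i].probe : "-");
        /* snprintf() reports the untruncated length: detect overflow. */
        if (w < 0 || (size_t)w >= len - off)
            return -1;
        off += (size_t)w;
    }
    return (int)off;
}
```

A marker with no probe connected shows up with "-" in the last column, so
the status of every instrumentation site stays visible.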

It's high time we settled the issue of instrumentation. It gets my vote,

Good luck!

Richard


Thanks,

Mathieu

- -
Richard J Moore
IBM Linux Technology Centre


--
OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68


