Re: ptrace API extensions for BTS



On Friday 07 December 2007 10:11:04 Metzger, Markus T wrote:
> Roland, Andi,
>
> I would like to discuss the ptrace user interface for the BTS extension.
> In previous emails,
> Andi suggested a stream-like interface, but is also OK with an
> array-like interface (as far as I understood).
> Roland is dubious about the ptrace API additions.
>
> I would like to settle the discussion and find an interface that
> everybody can agree to, so I can implement that interface and we can
> move forward with the patch.

The most efficient interface would be zero copy: the tracer user process
supplies memory that the kernel pins (get_user_pages()), subject to the
mlock rlimit, and the kernel then tells the CPU to log directly into
that.

Kernel buffers would then only be needed for the per-CPU kernel
logging.

The only information that would still need to be passed with
system calls would be the wakeup, the tail position and perhaps a
wrap counter.
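
To make the tail position / wrap counter idea concrete, here is a rough
user-space sketch of how a tracer could consume such a buffer. The record
layout, the struct names and the helper functions are all hypothetical
(nothing like this exists in the patch), and it ignores the races with a
CPU that is still logging, which a real reader would have to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout; the real BTS record format may differ. */
struct bts_rec {
	uint64_t from;
	uint64_t to;
};

/* Hypothetical state the kernel would hand back to the tracer. */
struct bts_pos {
	uint64_t tail;   /* index of the next slot the CPU will write */
	uint64_t wraps;  /* how many times the buffer has wrapped */
};

/* Total number of records logged so far. */
static uint64_t bts_total(const struct bts_pos *p, uint64_t nslots)
{
	assert(nslots > 0 && p->tail < nslots);
	return p->wraps * nslots + p->tail;
}

/*
 * Copy the records the reader has not consumed yet, oldest first.
 * *seen is the reader's running count of consumed records.
 * Returns the number of records copied; *lost is set to the number
 * of records that were overwritten before they could be read.
 */
static size_t bts_read_new(const struct bts_rec *buf, uint64_t nslots,
			   const struct bts_pos *p, uint64_t *seen,
			   struct bts_rec *out, size_t max, uint64_t *lost)
{
	uint64_t total = bts_total(p, nslots);
	uint64_t first = *seen;
	size_t n = 0;

	*lost = 0;
	if (total - first > nslots) {
		/* The writer lapped the reader; skip what is gone. */
		*lost = total - first - nslots;
		first = total - nslots;
	}
	while (first < total && n < max) {
		out[n++] = buf[first % nslots];
		first++;
	}
	*seen = first;
	return n;
}
```

The point is that the tracer never needs a copying read call: it only
has to learn the two counters and can then walk the pinned memory itself.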

> Regarding 1, we currently provide scheduling timestamps, which are arch

That's actually broken because you don't log the CPU number.
sched_clock() without the associated CPU number is meaningless
on systems without a synchronized, P-state invariant TSC
[that is, older Intel systems or some larger current systems].

And even if you log the CPU number it is unclear how user space
would make sense of that. It can't in general; even the kernel
can't. Perhaps better to just not supply any time stamps for this.

Even on systems that don't have the unsynchronized TSC problem above
it can be tricky to convert the TSC into real time. For one, right now
we don't report the TSC frequency. Usually it tends
to be that of the highest P-state, but finding that out is also
difficult and unreliable (rounding errors) and might not
always be true in the future. Anyway, this could be solved
by reporting the frequency separately in /proc/cpuinfo, but given all
the other problems I have my doubts it is really worth it. I would
suggest dropping the time stamp.
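
For illustration, this is roughly the conversion user space would have
to do if it did get a TSC frequency from somewhere. The function name
and the tsc_hz parameter are made up for this sketch; the kernel exports
no such value today, which is exactly the problem.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a TSC delta to nanoseconds for an assumed TSC frequency.
 * tsc_hz is hypothetical input; user space has no reliable source for it.
 * Whole seconds and the remainder are converted separately so that the
 * multiplication does not overflow 64 bits for realistic frequencies
 * (below roughly 18 GHz).
 */
static uint64_t tsc_delta_to_ns(uint64_t cycles, uint64_t tsc_hz)
{
	uint64_t secs, rem;

	assert(tsc_hz > 0);
	secs = cycles / tsc_hz;
	rem = cycles % tsc_hz;
	return secs * 1000000000ull + rem * 1000000000ull / tsc_hz;
}
```

If the TSC really ticks at 2.394 GHz but user space assumes a rounded
2.4 GHz, one real second comes out as 997500000 ns, so the result
drifts by 2.5 ms every second.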

> Additional architectures may want to (re)use and extend the x86 bts
> record, or they may want to invent their own format. In the former case,

I think that's actually not a good goal. If the code is so complicated
that sharing it makes sense then you did something wrong :)

-Andi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


