races when reserving an event in the unified trace buffer



Hi Steve,

I am doing some load testing with our kernel tracing prototype
that uses the unified trace buffer for managing its data. I sometimes
saw kernel stack dump caused by the following checking in
function __rb_reserve_next:
if (unlikely(next_page == cpu_buffer->commit_page)) {
WARN_ON_ONCE(1);
goto out_unlock;
}
The comments above the code say the problem is caused by
"an interrupt storm that made it all the way around the buffer".
But I think there is race here that a single interrupt can cause
the check to fail. Suppose this is what happens:
An event is traced and calls __rb_reserve_next. Right after it
gets the current tail_page (line tail_page = cpu_buffer->tail_page;),
an interrupt happens that is also traced. The interrupt also takes
the same tail_page. The interrupt event moves the tail_page
forward if the tail_page is full. Note that the interrupt event gets
the old 'write' value because the first event has not updated that yet.
So the interrupt event may also update the commit_page if it is
the same as the tail_page. As a result, the above check would
fail after the interrupt finishes and the first event resumes its execution.

I have seen the problem happens frequently under heavy loads
on a multi-core machine. Interestingly, I also saw the above
warning that might actually be caused by an interrupt storm.
I was using 64k buffer size and am not sure whether it is possible
for so many interrupts to happen in a short time window.

I think we can use the time_stamp to distinguish the two cases.
Also, in either case, it seems bad to leave the tail_page->write with
an invalid value because it can cause problem when a reader
reads the page. Here is my proposed fix for the problem:

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 7f69cfe..1500f78 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -982,8 +982,11 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
* it all the way around the buffer, bail, and warn
* about it.
*/
- if (unlikely(next_page == cpu_buffer->commit_page)) {
+ if (unlikely(next_page == cpu_buffer->commit_page) &&
+ tail_page->time_stamp > next_page->time_stamp) {
WARN_ON_ONCE(1);
+ if (tail <= BUF_PAGE_SIZE)
+ local_set(&tail_page->write, tail);
goto out_unlock;
}


Jiaying
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: problem with dyn. extending/shrinking array
    ... I have an interrupt function that kicks in when I can receive a byte ... // buffer was allocated with calloc ... measurement at a rate of once a second, you can use a timer interrupt ... If you have a clock with millisecond ...
    (comp.lang.c)
  • Re: races when reserving an event in the unified trace buffer
    ... I am doing some load testing with our kernel tracing prototype ... "an interrupt storm that made it all the way around the buffer". ... The interrupt event moves the tail_page ...
    (Linux-Kernel)
  • Re: [PATCH 1/2] misc: add CARMA DATA-FPGA Access Driver
    ... * There are three lists which are used to keep track of the different states ... * are cleared, the interrupt is de-asserted. ... * Data Buffer Allocation Helpers ... the driver's private data structure ...
    (Linux-Kernel)
  • Re: 2.6.11-rc1-mm1
    ... the only way this will fail is if an interrupt occurred ... index value needs to get reloaded from memory each time around the loop. ... > Frankly this is legacy code for when ltt only supported one trace buffer, ... As Roman suggested, relayfs ...
    (Linux-Kernel)
  • Re: Handling SSC interrupts in Applesoft BASIC
    ... He has written an assembly-language program that reads and writes to the ... A little background on the concept of an interrupt. ... the initial interrupt handler does some ... No idea where the rest of the 256 char buffer is (John's ...
    (comp.sys.apple2.programmer)