• Steven Rostedt (VMware)'s avatar
    tracing: Do no increment trace_clock_global() by one · 89529d8b
    Steven Rostedt (VMware) authored
    The trace_clock_global() tries to make sure the events between CPUs is
    somewhat in order. A global value is used and updated by the latest read
    of a clock. If one CPU is ahead by a little, and is read by another CPU, a
    lock is taken, and if the timestamp of the other CPU is behind, it will
    simply use the other CPUs timestamp.
    
    The lock is also only taken with a "trylock" due to tracing, and strange
    recursions can happen. The lock is not taken at all in NMI context.
    
    In the case where the lock is not able to be taken, the non synced
    timestamp is returned. But it will not be less than the saved global
    timestamp.
    
    The problem arises because when the time goes "backwards" the time
    returned is the saved timestamp plus 1. If the lock is not taken, and the
    plus one to the timestamp is returned, there's a small race that can cause
    the time to go backwards!
    
    	CPU0				CPU1
    	----				----
    				trace_clock_global() {
    				    ts = clock() [ 1000 ]
    				    trylock(clock_lock) [ success ]
    				    global_ts = ts; [ 1000 ]
    
    				    <interrupted by NMI>
     trace_clock_global() {
        ts = clock() [ 999 ]
        if (ts < global_ts)
    	ts = global_ts + 1 [ 1001 ]
    
        trylock(clock_lock) [ fail ]
    
        return ts [ 1001]
     }
    				    unlock(clock_lock);
    				    return ts; [ 1000 ]
    				}
    
     trace_clock_global() {
        ts = clock() [ 1000 ]
        if (ts < global_ts) [ false 1000 == 1000 ]
    
        trylock(clock_lock) [ success ]
        global_ts = ts; [ 1000 ]
        unlock(clock_lock)
    
        return ts; [ 1000 ]
     }
    
    The above case shows to reads of trace_clock_global() on the same CPU, but
    the second read returns one less than the first read. That is, time when
    backwards, and this is not what is allowed by trace_clock_global().
    
    This was triggered by heavy tracing and the ring buffer checker that tests
    for the clock going backwards:
    
     Ring buffer clock went backwards: 20613921464 -> 20613921463
     ------------[ cut here ]------------
     WARNING: CPU: 2 PID: 0 at kernel/trace/ring_buffer.c:3412 check_buffer+0x1b9/0x1c0
     Modules linked in:
     [..]
     [CPU: 2]TIME DOES NOT MATCH expected:20620711698 actual:20620711697 delta:6790234 before:20613921463 after:20613921463
       [20613915818] PAGE TIME STAMP
       [20613915818] delta:0
       [20613915819] delta:1
       [20613916035] delta:216
       [20613916465] delta:430
       [20613916575] delta:110
       [20613916749] delta:174
       [20613917248] delta:499
       [20613917333] delta:85
       [20613917775] delta:442
       [20613917921] delta:146
       [20613918321] delta:400
       [20613918568] delta:247
       [20613918768] delta:200
       [20613919306] delta:538
       [20613919353] delta:47
       [20613919980] delta:627
       [20613920296] delta:316
       [20613920571] delta:275
       [20613920862] delta:291
       [20613921152] delta:290
       [20613921464] delta:312
       [20613921464] delta:0 TIME EXTEND
       [20613921464] delta:0
    
    This happened more than once, and always for an off by one result. It also
    started happening after commit aafe104a was added.
    
    Cc: stable@vger.kernel.org
    Fixes: aafe104a ("tracing: Restructure trace_clock_global() to never block")
    Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
    89529d8b
trace_clock.c 4.22 KB