    perf_events: Fix event scheduling issues introduced by transactional API · 90151c35
    Stephane Eranian authored
    The transactional API patch between the generic and model-specific
    code introduced several important bugs with event scheduling, at
    least on X86. If you had pinned events, e.g., watchdog, and were
    over-committing the PMU, you would get bogus counts. The bug was
    showing up on Intel CPUs because events would move around more
    often than on AMD. But the problem also existed on AMD, though
    it was harder to expose.
    
    The issues were:
    
     - group_sched_in() was missing a cancel_txn() in the error path
    
     - cpuc->n_added was not properly maintained, leading to missing
       actions in hw_perf_enable(), i.e., n_running being 0. You cannot
       update n_added until you know the transaction has succeeded. In
       case of a failed transaction, n_added was not adjusted back.
    
     - in case of failed transactions, event_sched_out() was called
       and eventually invoked x86_disable_event() to touch the HW reg.
       But with transactions, on X86, event_sched_in() does not touch
       HW registers, it simply collects events into a list. Thus, you
       could end up calling x86_disable_event() on a counter which
       did not correspond to the current event when idx != -1.
    
    The patch modifies the generic and X86 code to avoid all those problems.
    
    First, we keep track of the number of events added last. In case the
    transaction fails, we subtract them from n_added. This approach is
    necessary (as opposed to delaying updates to n_added) because not all
    event updates use the transaction API, e.g., single events.
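
    The bookkeeping described above can be sketched in a toy model. This
    is not the kernel code: struct toy_cpuc, the toy_* helpers and the
    n_txn field name are assumptions made for illustration; only n_added
    is taken from the text. Single events bump n_added directly, and a
    failed transaction subtracts only what it added itself.

```c
#include <assert.h>

/* Toy stand-in for the per-CPU PMU state (hypothetical layout). */
struct toy_cpuc {
	int n_events;	/* events collected so far */
	int n_added;	/* events added since the last enable */
	int n_txn;	/* events added by the current transaction (assumed name) */
	int in_txn;	/* non-zero while a transaction is open */
};

static void toy_start_txn(struct toy_cpuc *c)
{
	c->in_txn = 1;
	c->n_txn = 0;
}

/* Used by both single events and grouped events. */
static void toy_add_event(struct toy_cpuc *c)
{
	c->n_events++;
	c->n_added++;		/* updated eagerly, even inside a transaction */
	if (c->in_txn)
		c->n_txn++;	/* remember how much to undo on failure */
}

/* Failed transaction: roll back only what this transaction added. */
static void toy_cancel_txn(struct toy_cpuc *c)
{
	assert(c->n_added >= c->n_txn);
	c->n_added -= c->n_txn;
	c->n_events -= c->n_txn;
	c->n_txn = 0;
	c->in_txn = 0;
}

/* Successful transaction: keep the counts, just close the transaction. */
static void toy_commit_txn(struct toy_cpuc *c)
{
	c->n_txn = 0;
	c->in_txn = 0;
}
```

    In this sketch a single event added before the transaction survives a
    later cancel, which is the reason for subtracting rather than
    delaying the update.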
    
    Second, we encapsulate the event_sched_in() and event_sched_out() in
    group_sched_in() inside the transaction. That makes the operations
    symmetrical and you can also detect that you are inside a transaction
    and skip the HW reg access by checking cpuc->group_flag.
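
    A minimal sketch of that symmetry, again as a toy model rather than
    the real implementation: struct toy_pmu, TOY_GROUP_FLAG_TXN, the
    capacity check and all toy_* functions are hypothetical; only the
    idea of testing group_flag before touching hardware comes from the
    text. The error path undoes the events while the transaction is
    still open and only then cancels it.

```c
#include <assert.h>

#define TOY_GROUP_FLAG_TXN 1	/* hypothetical "inside a transaction" bit */

struct toy_pmu {
	unsigned int group_flag;	/* transaction state */
	int n_collected;		/* events collected into the list */
	int capacity;			/* counters available (toy constraint) */
	int hw_writes;			/* counts simulated HW register accesses */
};

/* Inside a transaction, scheduling in only collects the event. */
static void toy_event_sched_in(struct toy_pmu *p)
{
	p->n_collected++;
}

static void toy_event_sched_out(struct toy_pmu *p)
{
	p->n_collected--;
	if (p->group_flag & TOY_GROUP_FLAG_TXN)
		return;		/* nothing was programmed yet: skip the HW */
	p->hw_writes++;		/* stands in for disabling the counter */
}

/* Toy schedulability test: succeed only if everything fits. */
static int toy_commit_txn(struct toy_pmu *p)
{
	return p->n_collected <= p->capacity ? 0 : -1;
}

static int toy_group_sched_in(struct toy_pmu *p, int n_group)
{
	int i, added = 0;

	p->group_flag |= TOY_GROUP_FLAG_TXN;		/* start_txn */
	for (i = 0; i < n_group; i++) {
		toy_event_sched_in(p);
		added++;
	}
	if (toy_commit_txn(p) == 0) {
		p->group_flag &= ~TOY_GROUP_FLAG_TXN;
		return 0;
	}
	/* Error path: undo inside the transaction, then cancel it. */
	while (added-- > 0)
		toy_event_sched_out(p);
	p->group_flag &= ~TOY_GROUP_FLAG_TXN;		/* cancel_txn */
	return -1;
}
```

    With this ordering, a group that does not fit leaves the collected
    list as it was and never reaches the simulated hardware write.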
    
    With this patch, you can now overcommit the PMU even with pinned
    system-wide events present and still get valid counts.
    
    Signed-off-by: Stephane Eranian <eranian@google.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    LKML-Reference: <1274796225.5882.1389.camel@twins>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>