    x86/UV: Update UV support for external NMI signals · 0d12ef0c
    Mike Travis authored
    The current UV NMI handler has not been updated for the changes
    in the system NMI handler and the perf operations.  The UV NMI
    handler reads an MMR in the UV Hub to check whether the NMI
    event was caused by the external 'system NMI' that the operator
    can initiate on the System Mgmt Controller.
    
    The problem arises when the perf tools are running, causing
    millions of perf events per second on very large CPU count
    systems.  Previously this was okay because the perf NMI handler
    ran at a higher priority on the NMI call chain and, if the NMI
    was a perf event, the remaining handlers on the chain were not
    called.
    
    Now the system NMI handler calls all the handlers on the NMI
    call chain including the UV NMI handler.  This causes the UV NMI
    handler to read the MMRs at the same millions-per-second rate.
    This can lead to significant performance loss and possible
    system failures.  It can also cause thousands of 'Dazed and
    Confused' messages to be sent to the system console.  This
    effectively makes perf tools unusable on UV systems.
    
    To avoid this excessive overhead when perf tools are running,
    this code has been optimized to minimize reading of the MMRs as
    much as possible, by moving to the NMI_UNKNOWN notifier chain.
    This chain is called only when all the users on the standard
    NMI_LOCAL call chain have been called and none of them have
    claimed this NMI.
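    
    The effect of moving between chains can be illustrated with a
    small userspace model.  This is only a sketch, not the kernel
    code: every sketch_* name is hypothetical, the MMR read is a
    counter, and the dispatcher merely imitates the rule described
    above (the unknown chain runs only when no handler on the local
    chain claimed the NMI).

```c
#include <assert.h>
#include <stddef.h>

/* Return values modeled on the idea of "not mine" vs "handled". */
enum { SKETCH_NMI_DONE = 0, SKETCH_NMI_HANDLED = 1 };

typedef int (*sketch_nmi_handler_t)(void);

static unsigned long sketch_mmr_reads;   /* simulated MMR read counter */
static int sketch_perf_pending;          /* nonzero: this NMI is a perf event */
static int sketch_external_pending;      /* nonzero: this NMI is the external one */

/* Stand-in for the perf handler: claims the NMI if it is a perf event. */
static int sketch_perf_handler(void)
{
	if (sketch_perf_pending) {
		sketch_perf_pending = 0;
		return SKETCH_NMI_HANDLED;
	}
	return SKETCH_NMI_DONE;
}

/* Stand-in for the UV handler: every call costs one simulated MMR read. */
static int sketch_uv_handler(void)
{
	sketch_mmr_reads++;
	if (sketch_external_pending) {
		sketch_external_pending = 0;
		return SKETCH_NMI_HANDLED;
	}
	return SKETCH_NMI_DONE;
}

/* Call every handler on a chain; return how many claimed the NMI. */
static int sketch_run_chain(const sketch_nmi_handler_t *chain, size_t n)
{
	int handled = 0;

	assert(chain != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		handled += chain[i]();
	return handled;
}

/* The unknown chain runs only if nobody on the local chain claimed the NMI. */
static int sketch_dispatch(const sketch_nmi_handler_t *local, size_t nl,
			   const sketch_nmi_handler_t *unknown, size_t nu)
{
	int handled = sketch_run_chain(local, nl);

	if (handled)
		return handled;
	return sketch_run_chain(unknown, nu);
}

/*
 * Deliver perf_events perf NMIs followed by one external NMI and return
 * the number of simulated MMR reads.
 */
unsigned long sketch_simulate(unsigned long perf_events, int uv_on_unknown_chain)
{
	const sketch_nmi_handler_t local_old[] = { sketch_perf_handler, sketch_uv_handler };
	const sketch_nmi_handler_t local_new[] = { sketch_perf_handler };
	const sketch_nmi_handler_t unknown_new[] = { sketch_uv_handler };

	sketch_mmr_reads = 0;
	for (unsigned long i = 0; i <= perf_events; i++) {
		sketch_perf_pending = (i < perf_events);
		sketch_external_pending = (i == perf_events);
		if (uv_on_unknown_chain)
			sketch_dispatch(local_new, 1, unknown_new, 1);
		else
			sketch_dispatch(local_old, 2, NULL, 0);
	}
	return sketch_mmr_reads;
}
```

    In this model, with the UV handler on the local chain every perf
    NMI also costs an MMR read; with it on the unknown chain only
    the unclaimed external NMI does.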
    
    There is an exception where the NMI_LOCAL notifier chain is
    used.  When the perf tools are in use, it's possible that the UV
    NMI was captured by some other NMI handler and then either
    ignored or mistakenly processed as a perf event.  We set a
    per_cpu ('ping') flag for those CPUs that ignored the initial
    NMI, and then send them an IPI NMI signal.  The NMI_LOCAL
    handler on each CPU does not need to read the MMR, but instead
    checks the in-memory flag indicating it was pinged.  There are
    two module variables: 'ping_count', indicating how many requested
    NMI events occurred, and 'ping_misses', indicating how many stray
    NMI events occurred.  The stray events are most likely perf
    events, so this count shows the overhead of the perf NMI
    interrupts and how many MMR reads were avoided.
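    
    The ping mechanism can be sketched in userspace C11 as follows.
    This is an illustration under assumptions, not the patch itself:
    the sketch_* names and the fixed CPU count are hypothetical, a
    plain array stands in for per_cpu data, and sending the IPI is
    left as a comment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 8   /* hypothetical CPU count for the demo */

static atomic_bool sketch_pinging[SKETCH_NR_CPUS];  /* per-CPU 'ping' flag */
static atomic_ulong sketch_ping_count;              /* requested NMIs seen */
static atomic_ulong sketch_ping_misses;             /* stray NMIs seen */

/* Sender side: mark the CPU as pinged before signalling it. */
void sketch_ping_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	atomic_store(&sketch_pinging[cpu], true);
	/* A real implementation would send the NMI IPI to 'cpu' here. */
}

/*
 * Receiver side: called for every NMI on 'cpu'.  Only the in-memory
 * flag is tested; no MMR is read.  Returns true if the NMI was a ping.
 */
bool sketch_local_nmi(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	if (!atomic_exchange(&sketch_pinging[cpu], false)) {
		atomic_fetch_add(&sketch_ping_misses, 1);
		return false;   /* stray NMI, most likely a perf event */
	}
	atomic_fetch_add(&sketch_ping_count, 1);
	return true;
}

unsigned long sketch_get_ping_count(void)
{
	return atomic_load(&sketch_ping_count);
}

unsigned long sketch_get_ping_misses(void)
{
	return atomic_load(&sketch_ping_misses);
}
```

    Each miss in this sketch corresponds to an NMI that was
    dismissed by testing a flag in memory rather than by reading an
    MMR.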
    
    This patch also minimizes the reads of the MMRs by having the
    first CPU entering the NMI handler on each node set a per-HUB
    in-memory atomic value.  (Having a per-HUB value avoids sending
    lock traffic over NumaLink.)  Both types of UV NMIs from the SMI
    layer are supported.
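    
    A minimal userspace sketch of the per-HUB idea, assuming C11
    atomics: the first CPU to arrive performs the (simulated) MMR
    read and publishes the result in a per-hub atomic, and later
    CPUs on the same hub only load that value.  The struct layout
    and all sketch_* names are hypothetical and are not taken from
    the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_hub {
	atomic_int in_nmi;         /* nonzero once an NMI was seen on this hub */
	atomic_bool busy;          /* simple try-lock guarding the MMR read */
	unsigned long mmr_reads;   /* simulated MMR reads, written under 'busy' */
	bool mmr_nmi_pending;      /* simulated content of the hub's MMR */
};

void sketch_hub_init(struct sketch_hub *hub, bool nmi_pending)
{
	atomic_init(&hub->in_nmi, 0);
	atomic_init(&hub->busy, false);
	hub->mmr_reads = 0;
	hub->mmr_nmi_pending = nmi_pending;
}

/* Called by each CPU of a hub on NMI entry; true means "external NMI". */
bool sketch_check_nmi(struct sketch_hub *hub)
{
	/* Fast path: another CPU on this hub already saw the NMI. */
	if (atomic_load(&hub->in_nmi))
		return true;

	/* Only the CPU that wins the try-lock touches the MMR. */
	if (!atomic_exchange(&hub->busy, true)) {
		hub->mmr_reads++;
		if (hub->mmr_nmi_pending)
			atomic_store(&hub->in_nmi, 1);
		atomic_store(&hub->busy, false);
	}
	return atomic_load(&hub->in_nmi) != 0;
}
```

    Because each hub has its own structure in this sketch, the
    atomic operations touch only memory local to that hub, which is
    the property the per-HUB value is meant to provide.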
    
    Signed-off-by: Mike Travis <travis@sgi.com>
    Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
    Reviewed-by: Hedi Berriche <hedi@sgi.com>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
    Cc: Jason Wessel <jason.wessel@windriver.com>
    Link: http://lkml.kernel.org/r/20130923212500.353547733@asylum.americas.sgi.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
x2apic_uv_x.c 25.9 KB