    x86/mce/therm_throt: Optimize notifications of thermal throttle · f6656208
    Srinivas Pandruvada authored
    Some modern systems have very tight thermal tolerances. Because of this
    they may cross thermal thresholds when running normal workloads (even
    during boot). The CPU hardware will react by limiting power/frequency
    and using duty cycles to bring the temperature back into normal range.
    
    Thus users may see a "critical" message about the "temperature above
    threshold" which is soon followed by "temperature/speed normal". These
    messages are rate-limited, but still may repeat every few minutes.
    
    This issue became worse starting with the Ivy Bridge generation of
    CPUs because they include a TCC activation offset in the MSR
    IA32_TEMPERATURE_TARGET. OEMs use this to provide alerts long before
    critical temperatures are reached.
    
    A two-hour test run with a workload on a laptop with an Intel 8th Gen
    Core i5 resulted in 20K+ thermal interrupts per CPU at core level and
    another 20K+ interrupts at package level. The kernel logs were full of
    throttling messages.
    
    The real value of these threshold interrupts is in debugging problems
    with external cooling solutions and performance issues caused by
    excessive throttling.
    
    So the solution here is the following:
    
      - In the existing thermal_throttle sysfs folder, show:
        - the maximum time for one throttling event, and
        - the total amount of time the system was in throttling state.
    
      - Do not log short excursions.
    
      - Log only when, in spite of thermal throttling, the temperature is rising.
      On the high threshold interrupt, trigger a delayed workqueue that
      monitors the threshold violation log bit (THERM_STATUS_PROCHOT_LOG). When
      the log bit is set, the workqueue callback calculates a three-point moving
      average and logs a warning message when the temperature trend is rising.
    
      When the log bit is clear and the temperature is below the threshold
      temperature, the workqueue callback logs a "Normal" message. Once a
      high threshold event is logged, the logging is rate-limited.
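
    The trend check described above can be sketched in plain userspace C.
    This is only an illustration under stated assumptions, not the code in
    therm_throt.c: struct temp_trend, temp_trend_rising() and TREND_WINDOW
    are hypothetical names, and readings are assumed to be in degrees
    Celsius, where a larger value means hotter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TREND_WINDOW 3	/* three-point moving average, as in the text */

/* Hypothetical state; field names are illustrative only. */
struct temp_trend {
	int samples[TREND_WINDOW];	/* ring buffer of recent readings */
	size_t next;			/* slot the next reading overwrites */
	size_t count;			/* readings stored, capped at TREND_WINDOW */
	int prev_avg;			/* moving average from the previous call */
	bool have_prev;			/* false until prev_avg is valid */
};

/*
 * Record one reading and report whether the moving average rose compared
 * with the previous average. Returns false until the window is full and
 * a previous average exists to compare against.
 */
static bool temp_trend_rising(struct temp_trend *t, int temp_c)
{
	int sum = 0;
	int avg;
	bool rising;
	size_t i;

	assert(t != NULL);

	t->samples[t->next] = temp_c;
	t->next = (t->next + 1) % TREND_WINDOW;
	if (t->count < TREND_WINDOW)
		t->count++;
	if (t->count < TREND_WINDOW)
		return false;	/* not enough samples for an average yet */

	for (i = 0; i < TREND_WINDOW; i++)
		sum += t->samples[i];
	avg = sum / TREND_WINDOW;

	rising = t->have_prev && avg > t->prev_avg;
	t->prev_avg = avg;
	t->have_prev = true;
	return rising;
}
```

    A caller would zero-initialize the state, feed it one reading per poll
    while the log bit is set, and emit the warning only when the function
    returns true.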
    
    With this patch on the same test laptop, no warnings are printed in the
    logs, because the longest time the processor needed to bring the
    temperature back under control was only 280 ms.
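
    A figure like the 280 ms above is the kind of value the maximum and
    total throttling time counters would report. A minimal sketch of such
    accounting follows, assuming millisecond timestamps supplied by the
    caller; struct throttle_stats, throttle_begin() and throttle_end() are
    hypothetical names and do not come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters; field names are illustrative only. */
struct throttle_stats {
	bool active;		/* currently inside a throttling event */
	uint64_t start_ms;	/* timestamp when the current event began */
	uint64_t max_ms;	/* longest single throttling event seen */
	uint64_t total_ms;	/* accumulated time spent throttled */
};

/* Mark the start of a throttling event. */
static void throttle_begin(struct throttle_stats *s, uint64_t now_ms)
{
	assert(s != NULL);

	if (s->active)
		return;		/* repeated "above threshold" event, ignore */
	s->active = true;
	s->start_ms = now_ms;
}

/* Mark the end of a throttling event and update both counters. */
static void throttle_end(struct throttle_stats *s, uint64_t now_ms)
{
	uint64_t elapsed;

	assert(s != NULL);

	if (!s->active)
		return;		/* no event in progress */
	s->active = false;

	elapsed = now_ms - s->start_ms;
	s->total_ms += elapsed;
	if (elapsed > s->max_ms)
		s->max_ms = elapsed;
}
```

    With these counters, a short excursion shows up as a small maximum time
    rather than as a log message.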
    
    This implementation was done with input from Alan Cox and Tony Luck.
    
     [ bp: Touchups. ]
    Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: bberg@redhat.com
    Cc: ckellner@redhat.com
    Cc: hdegoede@redhat.com
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: linux-edac <linux-edac@vger.kernel.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Tony Luck <tony.luck@intel.com>
    Cc: x86-ml <x86@kernel.org>
    Link: https://lkml.kernel.org/r/20191111214312.81365-1-srinivas.pandruvada@linux.intel.com
therm_throt.c 20.7 KB