    thermal: core: Don't update trip points inside the hysteresis range · cf3986f8
    Nícolas F. R. A. Prado authored
    When searching for the trip points that need to be set, the nearest
    higher trip point's temperature is used for the high trip, while the
    nearest lower trip point's temperature minus the hysteresis is used for
    the low trip. The issue with this logic is that when the current
    temperature is inside a trip point's hysteresis range, both high and low
    trips will come from the same trip point. As a consequence, instability
    can still occur like this:
    * the temperature rises slightly and enters the hysteresis range of a
      trip point
    * polling happens and updates the trip points to the hysteresis range
    * the temperature falls slightly, exiting the hysteresis range, crossing
      the trip point and triggering an IRQ, and the trip points are updated
    * repeat
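
The selection rule described above can be sketched as follows. This is a
simplified illustration, not the kernel code: struct trip_point,
struct boundaries and nearest_boundaries() are hypothetical names, and
temperatures are assumed to be in millidegrees Celsius as in the logs below.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical, simplified trip point: not the kernel's structure. */
struct trip_point {
	int temperature;	/* trip temperature, millidegrees Celsius */
	int hysteresis;		/* hysteresis width, millidegrees Celsius */
};

struct boundaries {
	int low;
	int high;
};

/*
 * Pick the nearest higher trip temperature as the high boundary and the
 * nearest lower (temperature - hysteresis) as the low boundary.
 */
static struct boundaries nearest_boundaries(const struct trip_point *trips,
					    size_t n, int temp)
{
	struct boundaries b = { -INT_MAX, INT_MAX };

	for (size_t i = 0; i < n; i++) {
		int trip_low = trips[i].temperature - trips[i].hysteresis;

		if (trip_low < temp && trip_low > b.low)
			b.low = trip_low;

		if (trips[i].temperature > temp && trips[i].temperature < b.high)
			b.high = trips[i].temperature;
	}

	return b;
}
```

With a trip at 40000 and a hysteresis of 3000 (values inferred from the
logs below), a reading of 37979 yields low = 37000 and high = 40000, so
both boundaries come from the same trip point.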
    
    So even though the current hysteresis implementation prevents
    instability caused by IRQs triggering on the same temperature value in
    both directions, it doesn't prevent instability caused by an IRQ in one
    direction and polling in the other.
    
    To properly implement hysteresis behavior, don't update the trip points
    while the temperature is inside the hysteresis range. This way, the
    previously set trip points stay in effect, which effectively remembers
    the previous state (whether the temperature entered the range from above
    or from below), so the right trip point is already set.
    
    The exception is if there was no previous trip point set, in which case
    a previous state doesn't exist, and so it's sensible to allow the
    hysteresis range as trip points.
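
A minimal sketch of this rule, including the first-time exception, could
look like the code below. It is an illustration under stated assumptions,
not the patch itself: struct trip, struct zone and zone_update_boundaries()
are hypothetical names, and "no previous trip point set" is assumed to be
represented by the boundaries -INT_MAX and INT_MAX.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the trip and zone state. */
struct trip {
	int temperature;	/* millidegrees Celsius */
	int hysteresis;		/* millidegrees Celsius */
};

struct zone {
	int temperature;	/* latest reading */
	int prev_low;		/* currently programmed low boundary */
	int prev_high;		/* currently programmed high boundary */
};

/* Returns true if new boundaries were programmed, false if they were kept. */
static bool zone_update_boundaries(struct zone *z, const struct trip *trips,
				   size_t n)
{
	int low = -INT_MAX, high = INT_MAX;
	bool same_trip = false;

	for (size_t i = 0; i < n; i++) {
		int trip_low = trips[i].temperature - trips[i].hysteresis;
		bool low_set = false;

		if (trip_low < z->temperature && trip_low > low) {
			low = trip_low;
			low_set = true;
			same_trip = false;
		}

		if (trips[i].temperature > z->temperature &&
		    trips[i].temperature < high) {
			high = trips[i].temperature;
			/* Both boundaries come from this trip's hysteresis range. */
			same_trip = low_set;
		}
	}

	/* Nothing to change. */
	if (z->prev_low == low && z->prev_high == high)
		return false;

	/* Inside a hysteresis range: keep the old boundaries unless none were set. */
	if (same_trip && (z->prev_low != -INT_MAX || z->prev_high != INT_MAX))
		return false;

	z->prev_low = low;
	z->prev_high = high;
	return true;
}
```

In this sketch, a zone whose boundaries are still -INT_MAX and INT_MAX
accepts a hysteresis range as its boundaries, while a zone with boundaries
already programmed keeps them until the temperature leaves the range.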
    
    The following logs show the current behavior when running on a real
    machine:
    
    [  202.524658] thermal thermal_zone0: new temperature boundaries: -2147483647 < x < 40000
       203.562817: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=36986 temp=37979
    [  203.562845] thermal thermal_zone0: new temperature boundaries: 37000 < x < 40000
       204.176059: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=37979 temp=40028
    [  204.176089] thermal thermal_zone0: new temperature boundaries: 37000 < x < 100000
       205.226813: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=40028 temp=38652
    [  205.226842] thermal thermal_zone0: new temperature boundaries: 37000 < x < 40000
    
    And with this patch applied:
    
    [  184.933415] thermal thermal_zone0: new temperature boundaries: -2147483647 < x < 40000
       185.981182: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=36986 temp=37872
       186.744685: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=37872 temp=40058
    [  186.744716] thermal thermal_zone0: new temperature boundaries: 37000 < x < 100000
       187.773284: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=40058 temp=38698
    
    Fixes: 060c034a ("thermal: Add support for hardware-tracked trip points")
    Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
    Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
    Co-developed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
thermal_trip.c 4.65 KB