1. 14 Dec, 2022 9 commits
  2. 02 Dec, 2022 1 commit
    • Merge Intel thermal control drivers changes for v6.2 · 7d4b19ab
      Rafael J. Wysocki authored
       - Add Raptor Lake-S support to the intel_tcc_cooling driver (Zhang
         Rui).
      
       - Make the intel_tcc_cooling driver detect TCC locking (Zhang Rui).
      
       - Address Coverity warning in intel_hfi_process_event() (Ricardo Neri).
      
       - Prevent accidental clearing of HFI in the package thermal interrupt
         status (Srinivas Pandruvada).
      
       - Protect the clearing of status bits in MSR_IA32_PACKAGE_THERM_STATUS
         and MSR_IA32_THERM_STATUS (Srinivas Pandruvada).
      
       - Allow the HFI interrupt handler to ACK an event for the same
         timestamp (Srinivas Pandruvada).
      
      * thermal-intel:
        thermal: intel: hfi: ACK HFI for the same timestamp
        thermal: intel: Protect clearing of thermal status bits
        thermal: intel: Prevent accidental clearing of HFI status
        thermal: intel: intel_tcc_cooling: Add TCC cooling support for RaptorLake-S
        thermal: intel: intel_tcc_cooling: Detect TCC lock bit
        thermal: intel: hfi: Improve the type of hfi_features::nr_table_pages
  3. 25 Nov, 2022 1 commit
  4. 23 Nov, 2022 3 commits
    • thermal: intel: hfi: ACK HFI for the same timestamp · c0e3acdc
      Srinivas Pandruvada authored
      Some processors issue more than one HFI interrupt with the same
      timestamp. Each interrupt must be acknowledged to let the hardware issue
      new HFI interrupts. But this can't be done without some additional flow
      modification in the existing interrupt handling.
      
      For background, the HFI interrupt is a package-level thermal interrupt
      delivered via an LVT. This LVT is common to both the CPU-level and
      package-level interrupts. Hence, all CPUs receive the HFI interrupt, but
      only one CPU should process it; the others simply exit by issuing an EOI
      to the LAPIC.
      
      The current HFI interrupt processing flow:
      
        1. Receive Thermal interrupt
        2. Check if there is an active HFI status in MSR_IA32_THERM_STATUS
        3. Try and get spinlock, one CPU will enter spinlock and others
           will simply return from here to issue EOI.
          (Let's assume CPU 4 is processing interrupt)
        4. Check the stored time-stamp from the HFI memory time-stamp
        5. if same
        6.      ignore interrupt, unlock and return
        7. Copy the HFI message to local buffer
        8. unlock spinlock
        9. ACK HFI interrupt
       10. Queue the message for processing in a work-queue
      
      It is tempting to simply acknowledge all the interrupts even if they
      have the same timestamp. This may cause some interrupts to not be
      processed.
      
      Let's say CPU5 is slightly late and reaches step 4 while CPU4 is
      between steps 8 and 9.
      
      Currently we simply ignore interrupts with the same timestamp. No
      issue here for CPU5. When CPU4 acknowledges the interrupt, the next
      HFI interrupt can be delivered.
      
      If we acknowledge interrupts with the same timestamp (at step 6), there
      is a race condition. Under the same scenario, CPU 5 will acknowledge
      the HFI interrupt. This lets the hardware generate another HFI interrupt
      before CPU 4 starts executing step 9. Once CPU 4 completes step 9, it
      will have acknowledged the newly arrived HFI interrupt without actually
      processing it.
      
      Acknowledge the interrupt while holding the spinlock. This avoids
      races on the interrupt acknowledgment.
      
      Updated flow:
      
        1. Receive HFI Thermal interrupt
        2. Check if there is an active HFI status in MSR_IA32_THERM_STATUS
        3. Try and get spin-lock
           Let's assume CPU 4 is processing interrupt
        4.1 Read MSR_IA32_PACKAGE_THERM_STATUS and check HFI status bit
        4.2 If HFI status is 0
        4.3      unlock spinlock
        4.4      return
        4.5 Check the stored time-stamp from the HFI memory time-stamp
        5. if same
        6.1      ACK HFI interrupt
        6.2      unlock spinlock
        6.3      return
        7. Copy the HFI message to local buffer
        8. ACK HFI interrupt
        9. unlock spinlock
       10. Queue the message for processing in a work-queue
      
      To avoid taking the lock unnecessarily, intel_hfi_process_event() checks
      the status of the HFI interrupt before taking the lock. If CPU5 is late,
      when it starts processing the interrupt there are two scenarios:
      
       a) CPU4 acknowledged the HFI interrupt before CPU5 read
          MSR_IA32_THERM_STATUS. CPU5 exits.
      
       b) CPU5 reads MSR_IA32_THERM_STATUS before CPU4 has acknowledged the
          interrupt. CPU5 will take the lock if CPU4 has released it. It then
          re-reads MSR_IA32_THERM_STATUS. If there is not a new interrupt,
          the HFI status bit is clear and CPU5 exits. If a new HFI interrupt
          was generated it will find that the status bit is set and it will
          continue to process the interrupt. In this case, even if the timestamp
          has not changed, the ACK can be issued as this is a new interrupt.
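
      The updated flow can be illustrated with a minimal single-threaded C
      model. This is a sketch, not the driver code: the struct, the enum, the
      helper names and the early_status parameter are all hypothetical, the
      two status MSRs are collapsed into one variable, and the spinlock is
      replaced by a flag. The early_status argument stands for the value a CPU
      read before trying to take the lock (step 2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* HFI status bit in the package thermal status (bit 26). */
#define MODEL_HFI_STATUS_BIT (1ULL << 26)

/* Hypothetical model state; none of these names exist in the kernel. */
struct hfi_model {
	uint64_t status;           /* stands in for the thermal status MSR */
	uint64_t hw_timestamp;     /* timestamp currently in the HFI memory */
	uint64_t stored_timestamp; /* timestamp of the last copied update */
	bool lock_held;            /* stands in for the spinlock */
	int acks;                  /* number of ACKs issued */
	int queued;                /* number of updates handed to the workqueue */
};

enum hfi_result {
	HFI_NO_EVENT, HFI_LOCK_BUSY, HFI_ALREADY_ACKED, HFI_ACK_ONLY, HFI_QUEUED,
};

static bool model_trylock(struct hfi_model *m)
{
	if (m->lock_held)
		return false;
	m->lock_held = true;
	return true;
}

static void model_unlock(struct hfi_model *m)
{
	assert(m->lock_held);
	m->lock_held = false;
}

/* The ACK clears the status bit and is only issued with the lock held. */
static void model_ack(struct hfi_model *m)
{
	assert(m->lock_held);
	m->status &= ~MODEL_HFI_STATUS_BIT;
	m->acks++;
}

static enum hfi_result model_process_event(struct hfi_model *m,
					   uint64_t early_status)
{
	/* Step 2: check the status read before taking the lock. */
	if (!(early_status & MODEL_HFI_STATUS_BIT))
		return HFI_NO_EVENT;
	/* Step 3: only one CPU gets the lock; the others return. */
	if (!model_trylock(m))
		return HFI_LOCK_BUSY;
	/* Steps 4.1-4.4: re-read the status under the lock. */
	if (!(m->status & MODEL_HFI_STATUS_BIT)) {
		model_unlock(m);
		return HFI_ALREADY_ACKED;
	}
	/* Steps 4.5-6.3: same timestamp, so ACK without copying. */
	if (m->hw_timestamp == m->stored_timestamp) {
		model_ack(m);
		model_unlock(m);
		return HFI_ACK_ONLY;
	}
	/* Steps 7-10: copy, ACK under the lock, unlock, queue the work. */
	m->stored_timestamp = m->hw_timestamp;
	model_ack(m);
	model_unlock(m);
	m->queued++;
	return HFI_QUEUED;
}
```

      In this model, scenario b) corresponds to calling model_process_event()
      with a stale early_status after another CPU has already issued the ACK:
      the re-read under the lock finds the bit clear and no second ACK is sent.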
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
      Tested-by: Arshad, Adeel <adeel.arshad@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • thermal: intel: Protect clearing of thermal status bits · 930d06bf
      Srinivas Pandruvada authored
      The package thermal status is cleared with a read-modify-write
      operation. This may clear new status bits which are being processed or
      are about to be processed.

      For example, while clearing the HFI status, the hardware may set a new
      thermal status bit after the thermal status register has been read.
      During the write-back, the newly set status bit is then written as 0,
      that is, cleared. So, it is not safe to do read-modify-write.

      Since the read-write bits of the thermal status can only be written to
      0, not to 1, it is safe to write 1 to all the bits which are not being
      cleared.
      
      Create a common interface for clearing package thermal status bits. Use
      this interface to replace existing code to clear thermal package status
      bits.
      
      It is safe to call from different CPUs without protection as there is no
      read-modify-write. Also, wrmsrl results in just a single instruction. For
      example, suppose CPU 0 and CPU 3 are clearing bits 1 and 3 respectively.
      If CPU 3 wins the race, it will write 0x4000aa2, then CPU 0 will write
      0x4000aa8. The bits which are not being cleared are set to 1. The default
      mask of bits which can be written here is 0x4000aaa.
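
      The arithmetic in the example above can be checked with a short C
      sketch. The macro and function names are hypothetical and are not the
      interface added by this commit; only the mask value 0x4000aaa and the
      bit numbers come from the text. The model_apply_write() helper is a
      simplified model of write-0-to-clear behavior for the writable bits.

```c
#include <assert.h>
#include <stdint.h>

/* Default mask of writable status bits quoted in the commit message. */
#define MODEL_PKG_CLEAR_MASK 0x4000aaaULL

/*
 * Value to write in order to clear one status bit: every other writable
 * bit is written as 1, so it is left untouched. No register read is needed.
 */
static uint64_t model_clear_value(unsigned int bit)
{
	assert(bit < 64);
	return MODEL_PKG_CLEAR_MASK & ~(1ULL << bit);
}

/*
 * Simplified hardware model: a writable bit stays set only if it is
 * currently set and written as 1. Bits outside the mask are not affected.
 */
static uint64_t model_apply_write(uint64_t status, uint64_t written)
{
	return status & (written | ~MODEL_PKG_CLEAR_MASK);
}
```

      Because each write is computed without reading the register, the two
      writes give the same final state in either order, and a status bit set
      between them (bit 26 in the check below) is not lost.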
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • thermal: intel: Prevent accidental clearing of HFI status · 6fe1e64b
      Srinivas Pandruvada authored
      When there is a package thermal interrupt with a PROCHOT log, it is
      processed and cleared. There may be an active HFI event status at the
      same time, which is about to be processed or is being processed.
      Clearing the PROCHOT log bit then also clears the HFI status bit. This
      means that the hardware is free to update the HFI memory.

      When clearing a package thermal interrupt, some processors will generate
      a "general protection fault" if any of the read-only bits is set to 1.

      The driver maintains a mask of all read-write bits which can be set.

      This mask doesn't include the HFI status bit, so the bit is treated as
      read-only and is written as 0, which clears it. So, add HFI status
      bit 26 to the mask.
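
      The effect of the missing mask bit can be shown with a small C sketch.
      All names are hypothetical. The mask value without bit 26 (0xaaa) is an
      assumption, taken from the 0x4000aaa default quoted in the related
      commit with bit 26 removed, and the bit passed as bit_to_clear is only
      illustrative. The status update is a simplified write-0-to-clear model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_HFI_STATUS_BIT   26
/* Assumed writable-bit mask before the fix (no HFI status bit). */
#define MODEL_MASK_WITHOUT_HFI 0xaaaULL
/* Mask after adding HFI status bit 26. */
#define MODEL_MASK_WITH_HFI    (MODEL_MASK_WITHOUT_HFI | (1ULL << MODEL_HFI_STATUS_BIT))

/* Value written to clear one bit: all other mask bits are written as 1. */
static uint64_t model_value_to_write(uint64_t mask, unsigned int bit)
{
	assert(bit < 64);
	return mask & ~(1ULL << bit);
}

/*
 * Start with both the bit to clear and the HFI status bit set, perform the
 * clearing write, and report whether the HFI status bit is still set.
 */
static bool model_hfi_survives(uint64_t mask, unsigned int bit_to_clear)
{
	uint64_t status = (1ULL << bit_to_clear) | (1ULL << MODEL_HFI_STATUS_BIT);

	status &= model_value_to_write(mask, bit_to_clear);
	return (status >> MODEL_HFI_STATUS_BIT) & 1;
}
```

      With the old mask the HFI status bit is written as 0 and lost; with bit
      26 in the mask it is written as 1 and preserved.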
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  5. 14 Nov, 2022 9 commits
    • thermal/core: Protect thermal device operations against thermal device removal · b778b4d7
      Guenter Roeck authored
      Thermal device operations may be called after thermal zone device removal.
      After thermal zone device removal, thermal zone device operations must
      no longer be called. To prevent such calls from happening, ensure that
      the thermal device is registered before executing any thermal device
      operations.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • thermal/core: Remove thermal_zone_set_trips() · 91b3aafc
      Guenter Roeck authored
      Since no callers of thermal_zone_set_trips() are left, remove the function.
      Document __thermal_zone_set_trips() instead. Explicitly state that the
      thermal zone lock must be held when calling the function, and that the
      pointer to the thermal zone must be valid.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • thermal/core: Protect sysfs accesses to thermal operations with thermal zone mutex · 05eeee2b
      Guenter Roeck authored
      Protect access to thermal operations against thermal zone removal by
      acquiring the thermal zone device mutex. After acquiring the mutex, check
      if the thermal zone device is registered and abort the operation if not.
      
      With this change, we can call __thermal_zone_device_update() instead of
      thermal_zone_device_update() from trip_point_temp_store() and from
      emul_temp_store(). Similarly, we can call __thermal_zone_set_trips() instead
      of thermal_zone_set_trips() from trip_point_hyst_store().
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex · ea37bec5
      Guenter Roeck authored
      In preparation for protecting access to thermal operations against thermal
      zone device removal, protect hwmon accesses to thermal zone operations
      with the thermal zone mutex. After acquiring the mutex, ensure that the
      thermal zone device is registered before proceeding.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • thermal/core: Introduce locked version of thermal_zone_device_update · 1c439dec
      Guenter Roeck authored
      In thermal_zone_device_set_mode(), the thermal zone mutex is released only
      to be reacquired in the subsequent call to thermal_zone_device_update().
      
      Introduce __thermal_zone_device_update(), which is similar to
      thermal_zone_device_update() but has to be called with the thermal device
      mutex held. Call the new function from thermal_zone_device_set_mode()
      to avoid the extra thermal device mutex release/acquire sequence in that
      function.
      
      With the new function in place, re-implement thermal_zone_device_update()
      as a wrapper around __thermal_zone_device_update() that acquires and
      releases the thermal device mutex.
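
      The locked/unlocked split described above follows a common pattern,
      sketched here in user-space C with a pthread mutex. The struct and the
      function names are hypothetical stand-ins (a _locked suffix is used
      instead of the kernel's __ prefix), and the update itself is reduced to
      a counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a thermal zone device. */
struct model_zone {
	pthread_mutex_t lock;
	bool enabled;
	int updates; /* counts how often the update body ran */
};

/*
 * Counterpart of __thermal_zone_device_update() in this model.
 * The caller must hold z->lock.
 */
static void model_zone_update_locked(struct model_zone *z)
{
	assert(z != NULL);
	z->updates++;
}

/* Wrapper that takes and releases the lock around the locked variant. */
static void model_zone_update(struct model_zone *z)
{
	pthread_mutex_lock(&z->lock);
	model_zone_update_locked(z);
	pthread_mutex_unlock(&z->lock);
}

/*
 * Changes the mode and updates the zone in one critical section, instead of
 * unlocking only to have the update path take the lock again.
 */
static void model_zone_set_mode(struct model_zone *z, bool enabled)
{
	pthread_mutex_lock(&z->lock);
	z->enabled = enabled;
	model_zone_update_locked(z);
	pthread_mutex_unlock(&z->lock);
}
```

      Callers that do not hold the lock keep using the wrapper, while code
      that already holds it calls the locked variant directly.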
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp · ed97d10a
      Guenter Roeck authored
      All callers of __thermal_zone_get_temp() already validated the
      thermal zone parameters. Move validation to thermal_zone_get_temp()
      where it is actually needed. Also add kernel documentation for
      __thermal_zone_get_temp(), listing the requirement that the
      function must be called with validated parameters and with thermal
      device mutex held.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp · 1c6b3006
      Guenter Roeck authored
      Calls to thermal_zone_get_temp() are not protected against thermal zone
      device removal. As a result, it is possible that the thermal zone operations
      callbacks are no longer valid when thermal_zone_get_temp() is called.
      This may result in crashes such as
      
      BUG: unable to handle page fault for address: ffffffffc04ef420
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
      PGD 5d60e067 P4D 5d60e067 PUD 5d610067 PMD 110197067 PTE 0
      Oops: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 1 PID: 3209 Comm: cat Tainted: G        W         5.10.136-19389-g615abc6eb807 #1 02df41ac0b12f3a64f4b34245188d8875bb3bce1
      Hardware name: Google Coral/Coral, BIOS Google_Coral.10068.92.0 11/27/2018
      RIP: 0010:thermal_zone_get_temp+0x26/0x73
      Code: 89 c3 eb d3 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 53 48 85 ff 74 50 48 89 fb 48 81 ff 00 f0 ff ff 77 44 48 8b 83 98 03 00 00 <48> 83 78 10 00 74 36 49 89 f6 4c 8d bb d8 03 00 00 4c 89 ff e8 9f
      RSP: 0018:ffffb3758138fd38 EFLAGS: 00010287
      RAX: ffffffffc04ef410 RBX: ffff98f14d7fb000 RCX: 0000000000000000
      RDX: ffff98f17cf90000 RSI: ffffb3758138fd64 RDI: ffff98f14d7fb000
      RBP: ffffb3758138fd50 R08: 0000000000001000 R09: ffff98f17cf90000
      R10: 0000000000000000 R11: ffffffff8dacad28 R12: 0000000000001000
      R13: ffff98f1793a7d80 R14: ffff98f143231708 R15: ffff98f14d7fb018
      FS:  00007ec166097800(0000) GS:ffff98f1bbd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffc04ef420 CR3: 000000010ee9a000 CR4: 00000000003506e0
      Call Trace:
       temp_show+0x31/0x68
       dev_attr_show+0x1d/0x4f
       sysfs_kf_seq_show+0x92/0x107
       seq_read_iter+0xf5/0x3f2
       vfs_read+0x205/0x379
       __x64_sys_read+0x7c/0xe2
       do_syscall_64+0x43/0x55
       entry_SYSCALL_64_after_hwframe+0x61/0xc6
      
      if a thermal device is removed while accesses to its device attributes
      are ongoing.
      
      The problem is exposed by code in iwl_op_mode_mvm_start(), which registers
      a thermal zone device only to unregister it shortly afterwards if an
      unrelated failure is encountered while accessing the hardware.
      
      Check if the thermal zone device is registered after acquiring the
      thermal zone device mutex to ensure this does not happen.
      
      The code was tested by triggering the failure in iwl_op_mode_mvm_start()
      on purpose. Without this patch, the kernel crashes reliably. The crash
      is no longer observed after applying this and the preceding patches.
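
      The check described above can be sketched in user-space C as follows.
      This is a model under stated assumptions, not the kernel code: the
      struct, the ops table, the function names and the choice of -ENODEV as
      the error code are all hypothetical, and a plain flag stands in for the
      device registration state.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback table, standing in for the zone operations. */
struct model_zone_ops {
	int (*get_temp)(int *temp);
};

struct model_zone {
	pthread_mutex_t lock;
	bool registered;                  /* cleared on unregistration */
	const struct model_zone_ops *ops; /* invalid once unregistered */
};

/* Reads the temperature only while the zone is still registered. */
static int model_zone_get_temp(struct model_zone *z, int *temp)
{
	int ret;

	if (!z || !temp)
		return -EINVAL;

	pthread_mutex_lock(&z->lock);
	if (!z->registered)
		ret = -ENODEV; /* assumed error code for a removed zone */
	else
		ret = z->ops->get_temp(temp);
	pthread_mutex_unlock(&z->lock);
	return ret;
}

/* Marks the zone as gone under the same lock the readers take. */
static void model_zone_unregister(struct model_zone *z)
{
	pthread_mutex_lock(&z->lock);
	z->registered = false;
	z->ops = NULL;
	pthread_mutex_unlock(&z->lock);
}

/* Example callback returning a fixed value in millidegrees Celsius. */
static int model_fixed_temp(int *temp)
{
	*temp = 42000;
	return 0;
}

static const struct model_zone_ops model_example_ops = {
	.get_temp = model_fixed_temp,
};
```

      Because the flag is tested under the lock that unregistration also
      takes, a reader either runs before removal and sees valid callbacks or
      runs after it and returns an error without touching them.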
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • thermal/core: Delete device under thermal device zone lock · 30b2ae07
      Guenter Roeck authored
      Thermal device attributes may still be open after unregistering
      the thermal zone and deleting the thermal device.
      
      Currently there is no protection against accessing thermal device
      operations after unregistering a thermal zone. To enable adding
      such protection, protect the device delete operation with the
      thermal zone device mutex. This requires splitting the call to
      device_unregister() into its components, device_del() and put_device().
      Only the first call can be executed under mutex protection, since
      put_device() may result in releasing the thermal zone device memory.
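
      The ordering constraint can be illustrated with a user-space C model of
      a reference-counted object. Everything here is hypothetical: the flag
      flip stands in for device_del(), model_zone_put() stands in for
      put_device(), and the released pointer exists only so the example can
      observe when the memory is freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_zone {
	pthread_mutex_t lock;
	atomic_int refcount;
	bool registered;
	bool *released; /* set to true when the object is freed */
};

static struct model_zone *model_zone_create(bool *released)
{
	struct model_zone *z = calloc(1, sizeof(*z));

	if (!z)
		return NULL;
	pthread_mutex_init(&z->lock, NULL);
	atomic_init(&z->refcount, 1);
	z->registered = true;
	z->released = released;
	return z;
}

/* Models a user, such as an open sysfs attribute, taking a reference. */
static void model_zone_get(struct model_zone *z)
{
	atomic_fetch_add(&z->refcount, 1);
}

/* Drops a reference; the last one destroys the mutex and frees the zone. */
static void model_zone_put(struct model_zone *z)
{
	if (atomic_fetch_sub(&z->refcount, 1) != 1)
		return;
	*z->released = true;
	pthread_mutex_destroy(&z->lock);
	free(z);
}

static bool model_zone_is_registered(struct model_zone *z)
{
	bool registered;

	pthread_mutex_lock(&z->lock);
	registered = z->registered;
	pthread_mutex_unlock(&z->lock);
	return registered;
}

static void model_zone_unregister(struct model_zone *z)
{
	/* "device_del()" step: safe to run under the lock. */
	pthread_mutex_lock(&z->lock);
	assert(z->registered);
	z->registered = false;
	pthread_mutex_unlock(&z->lock);

	/* "put_device()" step: may free z, so it must run after unlocking. */
	model_zone_put(z);
}
```

      If the final put ran while the lock was held, the unlock that follows
      would touch freed memory, which is why only the first half of the
      unregistration is done under the mutex.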
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • thermal/core: Destroy thermal zone device mutex in release function · d35f29ed
      Guenter Roeck authored
      Accesses to thermal zones, and with it the thermal zone device mutex,
      are still possible after the thermal zone device has been unregistered.
      For example, thermal_zone_get_temp() can be called from temp_show()
      in thermal_sysfs.c if the sysfs attribute was opened before the thermal
      device was unregistered.
      
      Move the call to mutex_destroy() from thermal_zone_device_unregister()
      to thermal_release() to ensure that the mutex is destroyed only after it
      is guaranteed to be no longer accessed.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  6. 09 Nov, 2022 2 commits
  7. 28 Oct, 2022 2 commits
  8. 25 Oct, 2022 2 commits
  9. 23 Oct, 2022 9 commits
  10. 22 Oct, 2022 2 commits