-
Chris Wilson authored
We monitor the health of the system via periodic heartbeat pulses. The pulses also provide the opportunity to perform garbage collection. However, we interpret an incomplete pulse (a missed heartbeat) as an indication that the system is no longer responsive, i.e. hung, and perform an engine or full GPU reset. Given that the preemption granularity can be very coarse on a system, we let the sysadmin override our legacy timeouts which were "optimised" for desktop applications. The heartbeat interval can be adjusted per-engine using, /sys/class/drm/card?/engine/*/heartbeat_interval_ms Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Steve Carbonari <steven.carbonari@intel.com> Tested-by: Steve Carbonari <steven.carbonari@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200228131716.3243616-7-chris@chris-wilson.co.uk
9a40bddd