    drm/i915: Modify error handler for per engine hang recovery · 142bc7d9
    Michel Thierry authored
    This is a preparatory patch which modifies the error handler to do
    per-engine hang recovery. The patch that implements the actual reset
    sequence follows later in the series. The aim is to prepare the existing
    recovery function to try the new per-engine path where applicable (it
    fails at this point because the core implementation is lacking) and then
    continue recovery using the legacy full GPU reset.
    
    A helper function is also added to query the availability of engine
    reset. A subsequent patch will add the capability to query which type
    of reset is present (engine -> full -> no-reset) via the get-param
    ioctl.
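
    As an illustration of the query described above, here is a minimal
    userspace sketch in C. It is not the driver code: struct device_info,
    enum reset_level and query_reset_level() are hypothetical names, and the
    meaning assigned to parameter values 0 and 1 is an assumption. Only the
    ">= 2" threshold for engine reset and the engine -> full -> no-reset
    ordering are taken from this message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a platform definition; not an i915 structure. */
struct device_info {
    bool has_reset_engine;  /* platform advertises per-engine reset */
};

/* Strongest reset type available, ordered no-reset < full < engine. */
enum reset_level {
    RESET_NONE = 0,
    RESET_FULL = 1,
    RESET_ENGINE = 2,
};

/*
 * reset_param models a module parameter. Assumed meaning:
 * 0 = reset disabled, 1 = full reset only, >= 2 = engine reset allowed.
 */
bool has_reset_engine(const struct device_info *info, int reset_param)
{
    return info->has_reset_engine && reset_param >= 2;
}

/* Report the strongest reset type that can currently be used. */
enum reset_level query_reset_level(const struct device_info *info,
                                   int reset_param)
{
    if (reset_param <= 0)
        return RESET_NONE;
    if (has_reset_engine(info, reset_param))
        return RESET_ENGINE;
    return RESET_FULL;
}
```

    In this sketch a platform without the capability bit reports a full
    reset even when the parameter asks for engine reset, which mirrors the
    fallback order given above.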
    
    It has been decided that the error events used to notify userspace of a
    reset will only be sent in case of a full chip reset. When only a single
    engine (or several engines) is reset, userspace won't be notified by
    these events.
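
    The fallback to full reset and the notification rule above can be
    sketched together as a small userspace model in C. Everything here is
    hypothetical (struct gpu_model, try_reset_engine(), full_reset(),
    handle_hang(), NUM_ENGINES); the engine_reset_ok mask simply simulates
    which engines can be recovered individually and is not how the driver
    decides.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_ENGINES 4  /* hypothetical engine count for the model */

/* Hypothetical model of the device; counters exist only for observation. */
struct gpu_model {
    uint32_t engine_reset_ok;    /* engines whose per-engine reset succeeds */
    unsigned int engine_resets;  /* completed per-engine resets */
    unsigned int full_resets;    /* completed full chip resets */
    unsigned int events_sent;    /* notifications delivered to userspace */
};

/* Try to recover one engine; returns false if engine reset is not possible. */
bool try_reset_engine(struct gpu_model *gpu, unsigned int id)
{
    if (!(gpu->engine_reset_ok & (UINT32_C(1) << id)))
        return false;
    gpu->engine_resets++;
    return true;
}

/* Full chip reset: the only path that notifies userspace. */
void full_reset(struct gpu_model *gpu)
{
    gpu->events_sent++;
    gpu->full_resets++;
}

/* Two-level recovery: per-engine first, full reset for whatever is left. */
void handle_hang(struct gpu_model *gpu, uint32_t hung_mask)
{
    for (unsigned int id = 0; id < NUM_ENGINES; id++) {
        uint32_t bit = UINT32_C(1) << id;

        if (!(hung_mask & bit))
            continue;
        if (try_reset_engine(gpu, id))
            hung_mask &= ~bit;  /* engine recovered, nothing more to do */
    }

    if (hung_mask)
        full_reset(gpu);
}
```

    With engine_reset_ok left at zero the model behaves like this
    preparatory patch: every engine reset attempt fails and recovery
    continues with the full reset.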
    
    Note that this implementation of engine reset is for i915 directly
    submitting to the ELSP, where the driver manages the hang detection,
    recovery and resubmission. With GuC submission these tasks are shared
    between driver and firmware; i915 will still be responsible for detecting
    a hang, and when it does it will have to request GuC to reset that engine
    and remind the firmware about the outstanding submissions. This will be
    added in a different patch.
    
    v2: rebase, advertise engine reset availability in platform definition,
    add note about GuC submission.
    v3: s/*engine_reset*/*reset_engine*/. (Chris)
    Handle reset as 2 level resets, by first going to engine only and
    falling back to full/chip reset as needed, i.e. reset_engine will need
    the struct_mutex.
    v4: Pass the engine mask to i915_reset. (Chris)
    v5: Rebase, update selftests.
    v6: Rebase, prepare for mutex-less reset engine.
    v7: Pass reset_engine mask as a function parameter, and iterate over the
    engine mask for reset_engine. (Chris)
    v8: Use i915.reset >=2 in has_reset_engine; remove redundant reset
    logging; add a reset-engine-in-progress flag to prevent concurrent
    resets, and avoid dual purposing of reset-backoff. (Chris)
    v9: Support reset of different engines in parallel (Chris)
    v10: Handle reset-engine flag locking better (Chris)
    v11: Squash in reporting of per-engine-reset availability.
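
    The per-engine in-progress flag mentioned in v8 to v10 can be pictured
    with the following userspace sketch. It assumes one bit per engine in a
    shared word and uses C11 atomics in place of the kernel's own bit
    operations (such as test_and_set_bit); struct reset_state,
    engine_reset_begin() and engine_reset_end() are hypothetical names, not
    the driver's.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_ENGINES 4  /* hypothetical engine count for the model */

/* Hypothetical reset bookkeeping: bit n is set while engine n is in reset. */
struct reset_state {
    atomic_uint engine_busy;
};

/*
 * Claim the reset of one engine. Returns true if the caller now owns the
 * reset, false if another reset of the same engine is already running.
 * Different engines use different bits, so they can be reset in parallel.
 */
bool engine_reset_begin(struct reset_state *st, unsigned int id)
{
    unsigned int bit = 1u << id;
    unsigned int old;

    assert(id < NUM_ENGINES);
    old = atomic_fetch_or(&st->engine_busy, bit);  /* atomic test-and-set */
    return !(old & bit);
}

/* Release the claim once the engine reset has finished. */
void engine_reset_end(struct reset_state *st, unsigned int id)
{
    assert(id < NUM_ENGINES);
    atomic_fetch_and(&st->engine_busy, ~(1u << id));
}
```

    A caller that loses the race in engine_reset_begin() would simply skip
    the engine, since a reset of it is already under way.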
    
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Signed-off-by: Ian Lister <ian.lister@intel.com>
    Signed-off-by: Tomas Elf <tomas.elf@intel.com>
    Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
    Signed-off-by: Michel Thierry <michel.thierry@intel.com>
    Link: http://patchwork.freedesktop.org/patch/msgid/20170615201828.23144-4-michel.thierry@intel.com
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Link: http://patchwork.freedesktop.org/patch/msgid/20170620095751.13127-5-chris@chris-wilson.co.uk