Commit 13e87536 authored by Chris Wilson

drm/i915: Uninterruptibly drain the timelines on unwedging

On wedging, we mark all executing requests as complete, and all pending
requests as completed as soon as they become ready. Before unwedging,
though, we wish to flush those pending requests prior to restoring default
execution, and so we must wait for them. Do so uninterruptibly, as we do
not propagate the EINTR gracefully back to userspace in this case but
would instead persist in the permanently wedged state without restarting
the syscall.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208153708.20023-4-chris@chris-wilson.co.uk
parent 0eb6a3f7
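
For context (not part of the commit), a minimal sketch of the dma_fence wait
semantics the change relies on; the helper names below are illustrative only,
not i915 code. With intr == true, dma_fence_default_wait() may return
-ERESTARTSYS when a signal arrives, which is the error path the old code had
to unwind while still wedged; with intr == false the call simply blocks until
the fence signals (or the timeout expires), so the drain loop can never bail
out early.

#include <linux/dma-fence.h>
#include <linux/sched.h>

/*
 * Illustrative helper (not from the patch): the pre-patch, interruptible
 * pattern.  A pending signal can abort the wait with -ERESTARTSYS, which
 * the caller must then propagate or unwind.
 */
static long drain_fence_interruptible(struct dma_fence *fence)
{
        long timeout;

        timeout = dma_fence_default_wait(fence, true, MAX_SCHEDULE_TIMEOUT);
        return timeout < 0 ? timeout : 0; /* < 0 => interrupted by a signal */
}

/*
 * Illustrative helper (not from the patch): the post-patch, uninterruptible
 * pattern.  The wait only returns once the fence has signalled, so there is
 * no error to report and nothing is left pending when we unwedge.
 */
static void drain_fence_uninterruptible(struct dma_fence *fence)
{
        dma_fence_default_wait(fence, false, MAX_SCHEDULE_TIMEOUT);
}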
@@ -862,7 +862,6 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 {
 	struct i915_gpu_error *error = &i915->gpu_error;
 	struct i915_timeline *tl;
-	bool ret = false;
 
 	if (!test_bit(I915_WEDGED, &error->flags))
 		return true;
@@ -887,30 +886,20 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	mutex_lock(&i915->gt.timelines.mutex);
 	list_for_each_entry(tl, &i915->gt.timelines.active_list, link) {
 		struct i915_request *rq;
-		long timeout;
 
 		rq = i915_active_request_get_unlocked(&tl->last_request);
 		if (!rq)
 			continue;
 
 		/*
-		 * We can't use our normal waiter as we want to
-		 * avoid recursively trying to handle the current
-		 * reset. The basic dma_fence_default_wait() installs
-		 * a callback for dma_fence_signal(), which is
-		 * triggered by our nop handler (indirectly, the
-		 * callback enables the signaler thread which is
-		 * woken by the nop_submit_request() advancing the seqno
-		 * and when the seqno passes the fence, the signaler
-		 * then signals the fence waking us up).
+		 * All internal dependencies (i915_requests) will have
+		 * been flushed by the set-wedge, but we may be stuck waiting
+		 * for external fences. These should all be capped to 10s
+		 * (I915_FENCE_TIMEOUT) so this wait should not be unbounded
+		 * in the worst case.
 		 */
-		timeout = dma_fence_default_wait(&rq->fence, true,
-						 MAX_SCHEDULE_TIMEOUT);
+		dma_fence_default_wait(&rq->fence, false, MAX_SCHEDULE_TIMEOUT);
 		i915_request_put(rq);
-		if (timeout < 0) {
-			mutex_unlock(&i915->gt.timelines.mutex);
-			goto unlock;
-		}
 	}
 	mutex_unlock(&i915->gt.timelines.mutex);
@@ -931,11 +920,10 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	smp_mb__before_atomic(); /* complete takeover before enabling execbuf */
 	clear_bit(I915_WEDGED, &i915->gpu_error.flags);
-	ret = true;
 
-unlock:
 	mutex_unlock(&i915->gpu_error.wedge_mutex);
 
-	return ret;
+	return true;
 }
 
 static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
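
The new comment notes that external fences should already be capped to roughly
10 seconds (I915_FENCE_TIMEOUT), which is why the uninterruptible wait is not
unbounded in practice. As a rough illustration only (not i915 code, helper
name hypothetical), such a capped wait on an external fence could look like
this:

#include <linux/dma-fence.h>
#include <linux/jiffies.h>

/*
 * Illustrative only: wait for an external fence, bounded at ~10 seconds.
 * dma_fence_wait_timeout() returns the remaining jiffies on success and
 * 0 if the timeout expired before the fence signalled.
 */
static bool wait_external_fence_capped(struct dma_fence *fence)
{
        long remaining;

        remaining = dma_fence_wait_timeout(fence, false,
                                           msecs_to_jiffies(10 * 1000));
        return remaining > 0; /* true if the fence signalled in time */
}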