• Linus Torvalds's avatar
    drm/i915: Check for NULL i915_vma in intel_unpin_fb_obj() · 39cb2c9a
    Linus Torvalds authored
    I've seen this trigger twice now, where the i915_gem_object_to_ggtt()
    call in intel_unpin_fb_obj() returns NULL, resulting in an oops
    immediately afterwards as the (inlined) call to i915_vma_unpin_fence()
    tries to dereference it.
    
    It seems to be some race condition where the object is going away at
    shutdown time, since both times happened when shutting down the X
    server.  The call chains were different:
    
     - VT ioctl(KDSETMODE, KD_TEXT):
    
        intel_cleanup_plane_fb+0x5b/0xa0 [i915]
        drm_atomic_helper_cleanup_planes+0x6f/0x90 [drm_kms_helper]
        intel_atomic_commit_tail+0x749/0xfe0 [i915]
        intel_atomic_commit+0x3cb/0x4f0 [i915]
        drm_atomic_commit+0x4b/0x50 [drm]
        restore_fbdev_mode+0x14c/0x2a0 [drm_kms_helper]
        drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper]
        drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper]
        intel_fbdev_set_par+0x18/0x70 [i915]
        fb_set_var+0x236/0x460
        fbcon_blank+0x30f/0x350
        do_unblank_screen+0xd2/0x1a0
        vt_ioctl+0x507/0x12a0
        tty_ioctl+0x355/0xc30
        do_vfs_ioctl+0xa3/0x5e0
        SyS_ioctl+0x79/0x90
        entry_SYSCALL_64_fastpath+0x13/0x94
    
     - i915 unpin_work workqueue:
    
        intel_unpin_work_fn+0x58/0x140 [i915]
        process_one_work+0x1f1/0x480
        worker_thread+0x48/0x4d0
        kthread+0x101/0x140
    
    and this patch purely papers over the issue by adding a NULL pointer
    check and a WARN_ON_ONCE() to avoid the oops that would then generally
    make the machine unresponsive.
    
    Other callers of i915_gem_object_to_ggtt() seem to also check for the
    returned pointer being NULL and warn about it, so this clearly has
    happened before in other places.
    
    [ Reported it originally to the i915 developers on Jan 8, applying the
      ugly workaround on my own now after triggering the problem for the
      second time with no feedback.
    
      This is likely to be the same bug reported as
    
         https://bugs.freedesktop.org/show_bug.cgi?id=98829
         https://bugs.freedesktop.org/show_bug.cgi?id=99134
    
      which has a patch for the underlying problem, but it hasn't gotten to
      me, so I'm applying the workaround. ]
    
    Cc: Daniel Vetter <daniel.vetter@intel.com>
    Cc: Jani Nikula <jani.nikula@linux.intel.com>
    Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Imre Deak <imre.deak@intel.com>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    39cb2c9a
intel_display.c 485 KB