Commits · 7741b547b6e000b08e20667bb3bef22e1a362661 · Kirill Smelkov / linux

10 Oct, 2017 9 commits

drm/i915: Preallocate our mmu notifier workequeu to unbreak cpu hotplug deadlock · 7741b547

Daniel Vetter authored Oct 09, 2017

4.14-rc1 gained the fancy new cross-release support in lockdep, which
seems to have uncovered a few more rules about what is allowed and
isn't.

This one here seems to indicate that allocating a work-queue while
holding mmap_sem is a no-go, so let's try to preallocate it.

Of course another way to break this chain would be somewhere in the
cpu hotplug code, since this isn't the only trace we're finding now
which goes through msr_create_device.

Full lockdep splat:

======================================================
WARNING: possible circular locking dependency detected
4.14.0-rc1-CI-CI_DRM_3118+ #1 Tainted: G     U
------------------------------------------------------
prime_mmap/1551 is trying to acquire lock:
 (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8109dbb7>] apply_workqueue_attrs+0x17/0x50

but task is already holding lock:
 (&dev_priv->mm_lock){+.+.}, at: [<ffffffffa01a7b2a>] i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #6 (&dev_priv->mm_lock){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       __mutex_lock+0x86/0x9b0
       mutex_lock_nested+0x1b/0x20
       i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]
       i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
       drm_ioctl_kernel+0x69/0xb0
       drm_ioctl+0x2f9/0x3d0
       do_vfs_ioctl+0x94/0x670
       SyS_ioctl+0x41/0x70
       entry_SYSCALL_64_fastpath+0x1c/0xb1

-> #5 (&mm->mmap_sem){++++}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       __might_fault+0x68/0x90
       _copy_to_user+0x23/0x70
       filldir+0xa5/0x120
       dcache_readdir+0xf9/0x170
       iterate_dir+0x69/0x1a0
       SyS_getdents+0xa5/0x140
       entry_SYSCALL_64_fastpath+0x1c/0xb1

-> #4 (&sb->s_type->i_mutex_key#5){++++}:
       down_write+0x3b/0x70
       handle_create+0xcb/0x1e0
       devtmpfsd+0x139/0x180
       kthread+0x152/0x190
       ret_from_fork+0x27/0x40

-> #3 ((complete)&req.done){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       wait_for_common+0x58/0x210
       wait_for_completion+0x1d/0x20
       devtmpfs_create_node+0x13d/0x160
       device_add+0x5eb/0x620
       device_create_groups_vargs+0xe0/0xf0
       device_create+0x3a/0x40
       msr_device_create+0x2b/0x40
       cpuhp_invoke_callback+0xa3/0x840
       cpuhp_thread_fun+0x7a/0x150
       smpboot_thread_fn+0x18a/0x280
       kthread+0x152/0x190
       ret_from_fork+0x27/0x40

-> #2 (cpuhp_state){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       cpuhp_issue_call+0x10b/0x170
       __cpuhp_setup_state_cpuslocked+0x134/0x2a0
       __cpuhp_setup_state+0x46/0x60
       page_writeback_init+0x43/0x67
       pagecache_init+0x3d/0x42
       start_kernel+0x3a8/0x3fc
       x86_64_start_reservations+0x2a/0x2c
       x86_64_start_kernel+0x6d/0x70
       verify_cpu+0x0/0xfb

-> #1 (cpuhp_state_mutex){+.+.}:
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       __mutex_lock+0x86/0x9b0
       mutex_lock_nested+0x1b/0x20
       __cpuhp_setup_state_cpuslocked+0x52/0x2a0
       __cpuhp_setup_state+0x46/0x60
       page_alloc_init+0x28/0x30
       start_kernel+0x145/0x3fc
       x86_64_start_reservations+0x2a/0x2c
       x86_64_start_kernel+0x6d/0x70
       verify_cpu+0x0/0xfb

-> #0 (cpu_hotplug_lock.rw_sem){++++}:
       check_prev_add+0x430/0x840
       __lock_acquire+0x1420/0x15e0
       lock_acquire+0xb0/0x200
       cpus_read_lock+0x3d/0xb0
       apply_workqueue_attrs+0x17/0x50
       __alloc_workqueue_key+0x1d8/0x4d9
       i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915]
       i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
       drm_ioctl_kernel+0x69/0xb0
       drm_ioctl+0x2f9/0x3d0
       do_vfs_ioctl+0x94/0x670
       SyS_ioctl+0x41/0x70
       entry_SYSCALL_64_fastpath+0x1c/0xb1

other info that might help us debug this:

Chain exists of:
  cpu_hotplug_lock.rw_sem --> &mm->mmap_sem --> &dev_priv->mm_lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&dev_priv->mm_lock);
                               lock(&mm->mmap_sem);
                               lock(&dev_priv->mm_lock);
  lock(cpu_hotplug_lock.rw_sem);

 *** DEADLOCK ***

2 locks held by prime_mmap/1551:
 #0:  (&mm->mmap_sem){++++}, at: [<ffffffffa01a7b18>] i915_gem_userptr_init__mmu_notifier+0x138/0x270 [i915]
 #1:  (&dev_priv->mm_lock){+.+.}, at: [<ffffffffa01a7b2a>] i915_gem_userptr_init__mmu_notifier+0x14a/0x270 [i915]

stack backtrace:
CPU: 4 PID: 1551 Comm: prime_mmap Tainted: G     U          4.14.0-rc1-CI-CI_DRM_3118+ #1
Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
Call Trace:
 dump_stack+0x68/0x9f
 print_circular_bug+0x235/0x3c0
 ? lockdep_init_map_crosslock+0x20/0x20
 check_prev_add+0x430/0x840
 __lock_acquire+0x1420/0x15e0
 ? __lock_acquire+0x1420/0x15e0
 ? lockdep_init_map_crosslock+0x20/0x20
 lock_acquire+0xb0/0x200
 ? apply_workqueue_attrs+0x17/0x50
 cpus_read_lock+0x3d/0xb0
 ? apply_workqueue_attrs+0x17/0x50
 apply_workqueue_attrs+0x17/0x50
 __alloc_workqueue_key+0x1d8/0x4d9
 ? __lockdep_init_map+0x57/0x1c0
 i915_gem_userptr_init__mmu_notifier+0x1fb/0x270 [i915]
 i915_gem_userptr_ioctl+0x222/0x2c0 [i915]
 ? i915_gem_userptr_release+0x140/0x140 [i915]
 drm_ioctl_kernel+0x69/0xb0
 drm_ioctl+0x2f9/0x3d0
 ? i915_gem_userptr_release+0x140/0x140 [i915]
 ? __do_page_fault+0x2a4/0x570
 do_vfs_ioctl+0x94/0x670
 ? entry_SYSCALL_64_fastpath+0x5/0xb1
 ? __this_cpu_preempt_check+0x13/0x20
 ? trace_hardirqs_on_caller+0xe3/0x1b0
 SyS_ioctl+0x41/0x70
 entry_SYSCALL_64_fastpath+0x1c/0xb1
RIP: 0033:0x7fbb83c39587
RSP: 002b:00007fff188dc228 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: ffffffff81492963 RCX: 00007fbb83c39587
RDX: 00007fff188dc260 RSI: 00000000c0186473 RDI: 0000000000000003
RBP: ffffc90001487f88 R08: 0000000000000000 R09: 00007fff188dc2ac
R10: 00007fbb83efcb58 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000003 R14: 00000000c0186473 R15: 00007fff188dc2ac
 ? __this_cpu_preempt_check+0x13/0x20

Note that this also has the minor benefit of slightly reducing the
critical section where we hold mmap_sem.

v2: Set ret correctly when we raced with another thread.

v3: Use Chris' diff. Attach the right lockdep splat.

v4: Repaint in Tvrtko's colors (aka don't report ENOMEM if we race and
some other thread managed to not also get an ENOMEM and successfully
install the mmu notifier. Note that the kernel guarantees that small
allocations succeed, so this never actually happens).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sasha Levin <alexander.levin@verizon.com>
Cc: Marta Lofstedt <marta.lofstedt@intel.com>
Cc: Tejun Heo <tj@kernel.org>
References: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3180/shard-hsw3/igt@prime_mmap@test_userptr.html
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102939Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009164401.16035-1-daniel.vetter@ffwll.ch

7741b547

drm/i915/bios: don't pass bdb to parsers that don't parse VBT directly · 0ead5f81

Jani Nikula authored Sep 28, 2017

Hint that you're not supposed to look at VBT in these functions.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b82c326be8c796a70bdc2bd1c479bbb6159f5cb0.1506586821.git.jani.nikula@intel.com

0ead5f81

drm/i915/bios: parse SDVO device mapping from pre-parsed child devices · 0ebdabe6

Jani Nikula authored Sep 28, 2017

We parse and store the child devices in
parse_general_definitions(). There is no need to parse the VBT block
again for SDVO device mapping. Do the same as we do in
parse_ddi_ports().

We no longer have access to child device size at this stage, but we also
don't need to worry about reading past the child device anymore. Instead
of a child device size check, do a mild optimization by limiting the
parsing to gens 3 through 7.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c918d4173dd38a165295f1270cb16c2c01bd8cd1.1506586821.git.jani.nikula@intel.com

0ebdabe6

drm/i915/bios: merge parse_device_mapping() into parse_general_definitions() · b3ca1f43

Jani Nikula authored Sep 28, 2017

They're both parsing the same block, and there's no need for them to be
split. The former also benefits from the range checks in the latter.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/64a292606ecbb0b8602e6c5523c5746573ec3944.1506586821.git.jani.nikula@intel.com

b3ca1f43

drm/i915/bios: cleanup comments and useless return · 53f6b243

Jani Nikula authored Sep 28, 2017

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4a95fb9d23d980830e3158d3c57258e6539965ce.1506586821.git.jani.nikula@intel.com

53f6b243

drm/i915/bios: remove an unnecessary temp variable · 127704f5

Jani Nikula authored Sep 28, 2017

Prepare for merging parse_device_mapping() into
parse_general_definitions(). No functional changes.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f1c3621e2622f4debdfb4a2f5c1959845754ac04.1506586821.git.jani.nikula@intel.com

127704f5

drm/i915/bios: don't initialize fields based on vbt version · 2d936f1c

Jani Nikula authored Sep 28, 2017

In theory, these might clobber information for older VBT versions.

We might have to store the BDB version for later parsing, but currently
all code accessing these fields will only use them on newer platforms
with new enough BDB versions.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/0232d9cb258e8f83c4180cdb8aad1459a312ec2a.1506586821.git.jani.nikula@intel.com

2d936f1c

drm/i915/bios: refactor parse general definitions · a87145ca

Jani Nikula authored Sep 28, 2017

Early return on failures. Rename the variable for later merging with
parse_device_mappings().
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/785abb904a572752fec68d90d34efeb67774dc1f.1506586821.git.jani.nikula@intel.com

a87145ca

drm/i915/bios: parse DDI ports also for CHV for HDMI DDC pin and DP AUX channel · 348e4058

Jani Nikula authored Sep 28, 2017

While technically CHV isn't DDI, we do look at the VBT based DDI port
info for HDMI DDC pin and DP AUX channel. (We call these "alternate",
but they're really just something that aren't platform defaults.)

In commit e4ab73a1 ("drm/i915: Respect alternate_ddc_pin for all DDI
ports") Ville writes, "IIRC there may be CHV system that might actually
need this."

I'm not sure why there couldn't be even more platforms that need this,
but start conservative, and parse the info for CHV in addition to DDI.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100553Reported-by: Marek Wilczewski <mw@3cte.pl>
Cc: stable@vger.kernel.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/d0815082cb98487618429b62414854137049b888.1506586821.git.jani.nikula@intel.com

348e4058

09 Oct, 2017 18 commits

drm/i915: avoid division by zero on cnl_calc_wrpll_link · 0e005888

Paulo Zanoni authored Oct 05, 2017

If for some unexpected reason the registers all read zero it's better
to WARN and return instead of dividing by zero and completely freezing
the machine.

I don't expect this to happen in the wild with the current code, but I
accidentally triggered the division by zero while doing some debugging
in an unusual environment.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171005213842.11423-2-paulo.r.zanoni@intel.com

0e005888

drm/i915: add the BXT and CNL DPLL registers to pipe_config_compare · 2de38138

Paulo Zanoni authored Sep 22, 2017

Looks like we were missing them.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170922205343.16006-2-paulo.r.zanoni@intel.com

2de38138

drm/i915: s/sg_mask/sg_page_sizes/ · 84e8978e

Matthew Auld authored Oct 09, 2017

It's a little unclear what the sg_mask actually is, so prefer the more
meaningful name of sg_page_sizes.
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110024.29114-1-matthew.auld@intel.comReviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

84e8978e

drm/i915: Early rejection of mappable GGTT pin attempts for large bo · 43ae70d9

Chris Wilson authored Oct 09, 2017

Currently, we reject attempting to pin a large bo into the mappable
aperture, but only after trying to create the vma. Under debug kernels,
repeatedly creating and freeing that vma for an oversized bo consumes
one-third of the runtime for pwrite/pread tests as it is spent on
kmalloc/kfree tracking. If we move the rejection to before creating that
vma, we lose some accuracy of checking against the fence_size as opposed
to object size, though the fence can never be smaller than the object.
Note that the vma creation itself will reject an attempt to create a vma
larger than the GTT so we can remove one redundant test.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-7-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

43ae70d9

drm/i915: Avoid evicting user fault mappable vma for pread/pwrite · a3259ca9

Chris Wilson authored Oct 09, 2017

Both pread/pwrite GTT paths provide a fast fallback in case we cannot
map the whole object at a time. Currently, we use the fallback for very
large objects and for active objects that would require remapping, but
we can also add active fault mappable objects to the list that we want
to avoid evicting. The rationale is that such fault mappable objects are
in active use and to evict requires tearing down the CPU PTE and forcing
a page fault on the next access; more costly, and intefers with other
processes, than our per-page GTT fallback.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-6-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

a3259ca9

drm/i915: Try a minimal attempt to insert the whole object for relocations · 3c755c5b

Chris Wilson authored Oct 09, 2017

As we have a lightweight fallback to insert a single page into the
aperture, try to avoid any heavier evictions when attempting to insert
the entire object.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-5-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

3c755c5b

drm/i915: Check PIN_NONFAULT overlaps in evict_for_node · f34a93bb

Chris Wilson authored Oct 09, 2017

If the caller says that he doesn't want to evict any other faulting
vma, honour that flag. The logic was used in evict_something, but not
the more specific evict_for_node, now being used as a preliminary probe
since commit 606fec95 ("drm/i915: Prefer random replacement before
eviction search").

Fixes: 606fec95 ("drm/i915: Prefer random replacement before eviction search")
Fixes: 82118877 ("drm/i915: Choose not to evict faultable objects from the GGTT")
References: https://bugs.freedesktop.org/show_bug.cgi?id=102490Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-4-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

f34a93bb

drm/i915: Track user GTT faulting per-vma · a65adaf8

Chris Wilson authored Oct 09, 2017

We don't wish to refault the entire object (other vma) when unbinding
one partial vma. To do this track which vma have been faulted into the
user's address space.

v2: Use a local vma_offset to tidy up a multiline unmap_mapping_range().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-3-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

a65adaf8

drm/i915: Consolidate get_fence with pin_fence · 3bd40735

Chris Wilson authored Oct 09, 2017

Following the pattern now used for obj->mm.pages, use just pin_fence and
unpin_fence to control access to the fence registers. I.e. instead of
calling get_fence(); pin_fence(), we now just need to call pin_fence().
This will make it easier to reduce the locking requirements around
fence registers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-2-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

3bd40735

drm/i915: Pin fence for iomap · b4563f59

Chris Wilson authored Oct 09, 2017

Acquire the fence register for the iomap in i915_vma_pin_iomap() on
behalf of the caller.

We probably want for the caller to specify whether the fence should be
pinned for their usage, but at the moment all callers do want the
associated fence, or none, so take it on their behalf.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-1-chris@chris-wilson.co.uk

b4563f59

drm/i915: Provide an assert for when we expect forcewake to be held · 67e64564

Chris Wilson authored Oct 09, 2017

Add assert_forcewakes_active() (the complementary function to
assert_forcewakes_inactive) that documents the requirement of a
function for its callers to be holding the forcewake ref (i.e. the
function is part of a sequence over which RC6 must be prevented).

One such example is during ringbuffer reset, where RC6 must be held
across the whole reinitialisation sequence.

v2: Include debug information in the WARN so we know which fw domain is
missing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110301.21705-5-chris@chris-wilson.co.uk

67e64564

drm/i915/selftests: Hold the rpm wakeref for the reset tests · ff97d3ae

Chris Wilson authored Oct 09, 2017

The lowlevel reset functions expect the caller to be holding the rpm
wakeref for the device access across the reset. We were not explicitly
doing this in the sefltest, so for simplicity acquire the wakeref for
the duration of all subtests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110301.21705-4-chris@chris-wilson.co.ukReviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

ff97d3ae

drm/i915: Hold forcewake for the duration of reset+restart · 1749d90f

Chris Wilson authored Oct 09, 2017

Resetting the engine requires us to hold the forcewake wakeref to
prevent RC6 trying to happen in the middle of the reset sequence. The
consequence of an unwanted RC6 event in the middle is that random state
is then saved to the powercontext and restored later, which may
overwrite the mmio state we need to preserve (e.g. PD_DIR_BASE in the
legacy ringbuffer reset_ring_common()).

This was noticed in the live_hangcheck selftests when Haswell would
sporadically fail to restart during igt_reset_queue().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110301.21705-3-chris@chris-wilson.co.ukReviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

1749d90f

drm/i915/selftests: Pretty print engine state when requests fail to start · 95a19ab4

Chris Wilson authored Oct 09, 2017

During hangcheck testing, we try to execute requests following the GPU
reset, and in particular want to try and debug when those fail.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110301.21705-2-chris@chris-wilson.co.ukReviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

95a19ab4

drm/i915: Make i915_engine_info pretty printer to standalone · f636edb2

Chris Wilson authored Oct 09, 2017

We can use drm_printer to hide the differences between printk and
seq_printf, and so make the i915_engine_info pretty printer able to be
called from different contexts and not just debugfs. For instance, I
want to use the pretty printer to debug kselftests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110301.21705-1-chris@chris-wilson.co.ukReviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

f636edb2

drm/i915: Assert we do not try to expand VMA for hugepage inside GGTT · bef27bdb

Chris Wilson authored Oct 09, 2017

We only apply the hugepage PD redirection inside the ppGTT, so during
i915_vma_insert() we want to exclude the GGTT from the additional
alignment constraints (thereby avoiding the extra GTT pressure from
fragmentation). Add an assert to document that intention alongside the
comment.

v2: After discussion with Matthew, make it a blanket GGTT ban
(previously we allowed the expansion for appgtt, and so indirectly
ggtt). There are issues we need to fix before allowing the current
appgtt to be used with hugepages, and if we do, we probably want more
care over when to expand/align, as the mappable aperture inside the ggtt
is precious.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009092019.20747-1-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

bef27bdb

drm/i915: Use intel_get_pipe_timings() and intel_mode_from_pipe_config() in intel_crtc_mode_get() · d0d37254

Ville Syrjälä authored Apr 01, 2016

Eliminate the duplicate code for pipe timing readout in
intel_crtc_mode_get() by using the functions we use for the normal state
readout.

v2: Store dotclock in adjusted_mode instead of the final mode

Cc: dri-devel@lists.freedesktop.org
Cc: Rob Kramer <rob@solution-space.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1459536530-17754-1-git-send-email-ville.syrjala@linux.intel.com

d0d37254

drm/i915: Read timings from the correct transcoder in intel_crtc_mode_get() · e30a154b

Ville Syrjälä authored Apr 01, 2016

intel_crtc->config->cpu_transcoder isn't yet filled out when
intel_crtc_mode_get() gets called during output probing, so we should
not use it there. Instead intel_crtc_mode_get() figures out the correct
transcoder on its own, and that's what we should use.

If the BIOS boots LVDS on pipe B, intel_crtc_mode_get() would actually
end up reading the timings from pipe A instead (since PIPE_A==0),
which clearly isn't what we want.

It looks to me like this may have been broken by
commit eccb140b ("drm/i915: hw state readout&check support for cpu_transcoder")
as that one removed the early initialization of cpu_transcoder from
intel_crtc_init().

Cc: stable@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: Rob Kramer <rob@solution-space.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reported-by: Rob Kramer <rob@solution-space.com>
Fixes: eccb140b ("drm/i915: hw state readout&check support for cpu_transcoder")
References: https://lists.freedesktop.org/archives/dri-devel/2016-April/104142.htmlSigned-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1459525046-19425-1-git-send-email-ville.syrjala@linux.intel.com

e30a154b

07 Oct, 2017 13 commits

drm/i915: enable platform support for 2M pages · a883241c

Matthew Auld authored Oct 06, 2017

For gen8+ platforms which support the 48b PPGTT, enable platform level
support for 2M pages. Also enable for mock testing.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-22-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-21-chris@chris-wilson.co.uk

a883241c

drm/i915: enable platform support for 64K pages · f1f3f982

Matthew Auld authored Oct 06, 2017

For gen9+ enable platform level support for 64K pages. Also enable for
mock testing.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-21-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-20-chris@chris-wilson.co.uk

f1f3f982

drm/i915: disable platform support for vGPU huge gtt pages · da9fe3f3

Matthew Auld authored Oct 06, 2017

Currently gvt gtt handling doesn't support huge page entries, so disable
for now.

v2: remove useless 48b PPGTT check
Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-20-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-19-chris@chris-wilson.co.uk

da9fe3f3

drm/i915/selftests: mix huge pages · 7924d9d4

Matthew Auld authored Oct 06, 2017

Try to mix sg page sizes for 4K, 64K and 2M pages.

v2: s/BIT(x) >> 12/BIT(x) >> PAGE_SHIFT/
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-19-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-18-chris@chris-wilson.co.uk

7924d9d4

drm/i915/selftests: huge page tests · 4049866f

Matthew Auld authored Oct 06, 2017

v2: mock test page support configurations and add MI_STORE_DWORD test

v3: run all mockable huge page tests on all platforms via the mock_device

v4: add pin_update regression test
    various improvements suggested by Chris

v5: fix issues reported by kbuild
    test single sg spanning multiple page sizes
    don't explode when running the live-tests through the appgtt

v6: lots of improvements from Chris

v7: run on each engine for igt_write_huge
    add simple tmpfs fallback test

v8: size_t is bad
    don't break the i386 build
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-18-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-17-chris@chris-wilson.co.uk

4049866f

drm/i915/debugfs: include some gtt page size metrics · 7393b7ee

Matthew Auld authored Oct 06, 2017

Good to know, mostly for debugging purposes.

v2: some improvements from Chris
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-17-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-16-chris@chris-wilson.co.uk

7393b7ee

drm/i915: accurate page size tracking for the ppgtt · d9ec12f8

Matthew Auld authored Oct 06, 2017

Now that we support multiple page sizes for the ppgtt, it would be
useful to track the real usage for debugging purposes.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-16-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-15-chris@chris-wilson.co.uk

d9ec12f8

drm/i915: support 64K pages for the 48b PPGTT · 17a00cf7

Matthew Auld authored Oct 06, 2017

Support inserting 64K pages into the 48b PPGTT.

v2: check for 64K scratch

v3: we should only have to re-adjust maybe_64K at every sg interval
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-15-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-14-chris@chris-wilson.co.uk

17a00cf7

drm/i915: add support for 64K scratch page · aa095871

Matthew Auld authored Oct 06, 2017

Before we can fully enable 64K pages, we need to first support a 64K
scratch page if we intend to support the case where we have object sizes
< 2M, since any scratch PTE must also point to a 64K region.  Without
this our 64K usage is limited to objects which completely fill the
page-table, and therefore don't need any scratch.

v2: add reminder about why 48b PPGTT
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-14-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-13-chris@chris-wilson.co.uk

aa095871

drm/i915: support 2M pages for the 48b PPGTT · 0a03852e

Matthew Auld authored Oct 06, 2017

Support inserting 2M gtt pages into the 48b PPGTT.

v2: sanity check sg->length against page_size

v3: don't recalculate rem on each loop
    whitespace breakup
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-13-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-12-chris@chris-wilson.co.uk

0a03852e

drm/i915: disable GTT cache for 2M pages · 8cb09836

Matthew Auld authored Oct 06, 2017

When SW enables the use of 2M/1G pages, it must disable the GTT cache.

v2: don't disable for Cherryview which doesn't even support 48b PPGTT!

v3: explicitly check that the system does support 2M/1G pages

v4: split WA and decision logic
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-12-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-11-chris@chris-wilson.co.uk

8cb09836

drm/i915: enable IPS bit for 64K pages · 9a6330cf

Matthew Auld authored Oct 06, 2017

Before we can enable 64K pages through the IPS bit, we must first enable
it through MMIO, otherwise the page-walker will simply ignore it.

v2: add comment mentioning that 64K is BDW+

v3: move to more suitable home
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-11-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-10-chris@chris-wilson.co.uk

9a6330cf

drm/i915: align 64K objects to 2M · 855822be

Matthew Auld authored Oct 06, 2017

We can't mix 64K and 4K pte's in the same page-table, so for now we
align 64K objects to 2M to avoid any potential mixing. This is
potentially wasteful but in reality shouldn't be too bad since this only
applies to the virtual address space of a 48b PPGTT.

v2: don't separate logically connected ops
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-10-matthew.auld@intel.comSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-9-chris@chris-wilson.co.uk

855822be