Commits · 5ea3998d56346975c2701df18fb5b6e3ab5c8d9e · Kirill Smelkov / linux

11 Feb, 2019 3 commits

Merge tag 'drm-intel-next-2019-02-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next · 5ea3998d

Dave Airlie authored Feb 11, 2019

UAPI Changes:

- Expose RPCS (SSEU) configuration to userspace for Ice Lake
in order to allow userspace to reconfigure the subslice config
per context basis. (Tvrtko, Lionel)

Driver Changes:

- Execbuf and preemption improvements including selftests (Chris)
- Rename HAS_GMCH_DISPLAY/HAS_GMCH (Rodrigo)
- Debugfs error handling fix for robustness (Greg)
- Improve reg_rw traces (Ville)
- Push clear_intel_crtc_state onto the heap (Chris)
- Watermark fixes for Ice Lake (Ville)
- Fix enable count array size and bounds checking (Tvrtko)
- MST Fixes (Lyude)
- Prevent race and handle error on I915_GEM_MMAP (Joonas)
- Initial rework for an full atomic gamma mode (Ville)
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208165000.GA30314@intel.com

5ea3998d

Merge tag 'drm/tegra/for-5.1-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next · 38f070eb

Dave Airlie authored Feb 11, 2019

drm/tegra: Changes for v5.1-rc1

This set of changes starts of with some refactoring of the CEC support
to make it reusable on Tegra210 and later. Following are a couple of
fixes for HDMI audio support (via HDA).

The bulk here is a set of preparatory patches working towards enabling
Tegra186 support for host1x and VIC. Additional patches will be needed
to fully enable this, but they're not quite ready yet.

To round things off, this also adds support for configuring the SOR
crossbar using device tree, and fixes a couple of job-related issues in
the host1x code.
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208144721.25830-1-thierry.reding@gmail.com

38f070eb

Merge tag 'du-next-20190208' of git://linuxtv.org/pinchartl/media into drm-next · 0ad7fb7c

Dave Airlie authored Feb 11, 2019

Renesas display drivers changes for v5.1 (2nd part):

- R8A7744 LVDS support
- DPAD0 output support on D3/E3
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208003355.GG10386@pendragon.ideasonboard.com

0ad7fb7c

08 Feb, 2019 8 commits

Merge tag 'exynos-drm-next-for-v5.1' of... · 1e92a226

Dave Airlie authored Feb 08, 2019

Merge tag 'exynos-drm-next-for-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next

- Add rotator support for s5pv210
  . With this patch series, s5pv210 SoC can use rotator module but
    only NV12 and XRGB8888 formats are supported.
- Modify e-mail address
  . It changes email address of scaler module author.
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/079a9586-9d85-7d38-2658-ce312b6d71e8@samsung.com

1e92a226

drm: rcar-du: Turn LVDS clock output on/off for DPAD0 output on D3/E3 · a6cc417d

Laurent Pinchart authored Jan 17, 2019

On the D3 and E3 SoCs the LVDS PLL clock output provides the dot clock
to the DU channels, even when the LVDS outputs are not in use. Enable
and disable the LVDS clock output when enabling or disabling a CRTC
connected to the DPAD0 output.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>

a6cc417d

drm: rcar-du: lvds: Add API to enable/disable clock output · 02f2b300

Laurent Pinchart authored Jan 17, 2019

On the D3 and E3 platforms, the LVDS internal PLL supplies the pixel
clock to the DU. This works automatically for LVDS outputs as the LVDS
encoder is enabled through the bridge API, enabling the internal PLL and
clock output. However, when using the DU DPAD output with the LVDS
outputs turned off, the LVDS PLL needs to be controlled manually. Add an
API to do so, to be called by the DU driver.

The drivers/gpu/drm/rcar-du/ directory has to be treated as obj-y
unconditionally, as the LVDS driver could be built-in while the DU
driver is compiled as a module.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>

02f2b300

drm: rcar-du: lvds: Don't fail probe if output is not connected on D3/E3 · 6e1f8557

Laurent Pinchart authored Jan 17, 2019

On the D3 and E3 SoCs the LVDS encoder has an extended internal PLL and
supplies a clock to the DU. That clock is used not only for the LVDS
outputs but also for the DPAD output. The LVDS encoder thus needs to be
available to the DU even when its output is disabled. Don't fail probe
in that case on D3 and E3.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>

6e1f8557

drm: rcar-du: Simplify encoder registration · 5aebc852

Laurent Pinchart authored Jan 17, 2019

Before the driver fully moved to drm_bridge and drm_panel, it was
necessary to parse DT and locate encoder and connector nodes. The
connector node is now unused and can be removed as a parameter to
rcar_du_encoder_init(). As a consequence rcar_du_encoders_init_one() can
be greatly simplified, removing most of the DT parsing.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>

5aebc852

drm: rcar-du: lvds: Add r8a7744 support · fc59d7d4

Biju Das authored Jan 22, 2019

The LVDS encoders on RZ/G1N SoC is similar to RZ/G1M. Add support for
RZ/G1N (R8A7744) SoC to the LVDS encoder driver.
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>

fc59d7d4

dt-bindings: display: renesas: lvds: Document r8a7744 bindings · 8a2fe6c0

Biju Das authored Jan 22, 2019

Document the RZ/G1N (R8A7744) LVDS bindings.
Signed-off-by: Biju Das <biju.das@bp.renesas.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Rob Herring <robh@kernel.org>

8a2fe6c0

drm: rcar-du: add missing of_node_put · 4c6d8fc2

Julia Lawall authored Jan 14, 2019

Add an of_node_put when the result of of_graph_get_remote_port_parent is
not available.

Add a second of_node_put if no encoder is selected (encoder remains NULL).

The semantic match that finds the first problem is as follows
(http://coccinelle.lip6.fr):

// <smpl>
@r exists@
local idexpression e;
expression x;
@@
e = of_graph_get_remote_port_parent(...);
... when != x = e
    when != true e == NULL
    when != of_node_put(e)
    when != of_fwnode_handle(e)
(
return e;
|
*return ...;
)
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>

4c6d8fc2

07 Feb, 2019 29 commits

drm/i915: Update DRIVER_DATE to 20190207 · c09d3916
Rodrigo Vivi authored Feb 07, 2019
```
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
```
c09d3916

drm/i915: Move LUT programming to happen after vblank waits · 051a6d8d

Ville Syrjälä authored Feb 05, 2019

The LUTs are single buffered so we should program them after
the double buffered pipe updates have been latched by the
hardware.

We'll also fix up the IPS vs. split gamma w/a to do the IPS
disable like everyone else. Note that this is currently dead
code as we don't use the split gamma mode on HSW, but that
will be fixed up shortly.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205160848.24662-7-ville.syrjala@linux.intel.comReviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

051a6d8d

drm/i915: Split color mgmt based on single vs. double buffered registers · 4d8ed54c

Ville Syrjälä authored Feb 05, 2019

Split the color management hooks along the single vs. double
buffered registers line. Of the currently programmed registers
GAMMA_MODE and the ilk+ pipe CSC are double buffered, the
LUTS and CHV CGM block are single buffered.

The double buffered register will be programmed during the
normal pipe update with evasion, and also during pipe enable
so that the settings will already be correct when the pipe
starts up before the planes are enabled.

The single buffered registers are currently programmed before
the vblank evade. Which is totally wrong, but we'll correct
that later.

v2: Add some docs to explain the two vfuncs (Matt,Uma)
    Rebase
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205160848.24662-6-ville.syrjala@linux.intel.comReviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

4d8ed54c

drm/i915: Pull GAMMA_MODE write out from haswell_load_luts() · 87cefd57

Ville Syrjälä authored Feb 05, 2019

For bdw+ let's move the GAMMA_MODE write for the legacy LUT
mode into the .load_luts() funciton directly, rather than
relying on haswell_load_luts(). We'll be getting rid of
haswell_load_luts() entirely soon, and it's anyway cleaner
to have the GAMMA_MODE write in a single place.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205160848.24662-5-ville.syrjala@linux.intel.comReviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

87cefd57

drm/i915: Constify the state arguments to the color management stuff · 23b03a27

Ville Syrjälä authored Feb 05, 2019

Pass the crtc state etc. as const to the color management commit
functions. And while at it polish some of the local variables.

v2: Rebase
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205160848.24662-4-ville.syrjala@linux.intel.comReviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

23b03a27

drm/i915: Precompute gamma_mode · 5f4f3e38

Ville Syrjälä authored Feb 05, 2019

We shouldn't be computing gamma mode during the commit phase.
Move it to the check phase.

v2: Reword comments a bit (Matt)
    Rebase
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205160848.24662-3-ville.syrjala@linux.intel.comReviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

5f4f3e38

drm/i915: Split the gamma/csc enable bits from the plane_ctl() function · 7eb31a0b

Ville Syrjälä authored Feb 05, 2019

On g4x+ the pipe gamma enable bit for the primary plane affects
the pipe bottom color as well. The same for the pipe csc enable
bit on ilk+. Thus we must configure those bits correctly even
when the primary plane is disabled.

To make the feasible let's split those settings from the
plane_ctl() function into a seprate funciton that we can
call from the ->disable_plane() hook as well.

For consistency we'll do that on all the plane types. While
that has no real benefits at this time, it'll become useful
when we start to control the pipe gamma/csc enable bits
dynamically when we overhaul the color management code.

On pre-g4x there doesn't appear to be any way to gamma
correct the pipe bottom color, but sticking to the same
pattern doesn't hurt. And it'll still help us to do
crtc state readout correctly for the pipe gamma enable
bit for the color management overhaul.

An alternative apporach would be to still precompute these
bits into plane_state->ctl, but that would require that we
run through the plane check even when the plane isn't logically
enabled on any crtc. Currently that condition causes us to
short circuit the entire thing and not call ->check_plane().
There would also be some chicken and egg problems with
->check_plane() vs. crtc color state check that would
requite splitting certain things into multiple steps.
So all in all this seems like the easier route.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205160848.24662-2-ville.syrjala@linux.intel.comReviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

7eb31a0b

gpu: host1x: Continue CDMA execution starting with a next job · 79930baf

Dmitry Osipenko authored Aug 07, 2018

Currently gathers of a hung job are getting NOP'ed and a restarted CDMA
executes the NOP'ed gathers. There shouldn't be a reason to not restart
CDMA execution starting with a next job, avoiding the unnecessary churning
with gathers NOP'ing.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

79930baf

gpu: host1x: Don't complete a completed job · 5d6f0436

Dmitry Osipenko authored Aug 07, 2018

There is a chance that the last job has been completed at the time of
CDMA timeout handler invocation. In this case there is no need to complete
the completed job.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

5d6f0436

gpu: host1x: Cancel only job that actually got stuck · e8bad659

Dmitry Osipenko authored Aug 07, 2018

Host1x doesn't have information about jobs inter-dependency, that is
something that will become available once host1x will get a proper
jobs scheduler implementation. Currently a hang job causes other unrelated
jobs to be canceled, that is a relic from downstream driver which is
irrelevant to upstream. Let's cancel only the hanging job and not to touch
other jobs in queue.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

e8bad659

drm/tegra: sor: Support device tree crossbar configuration · 6d6c815d

Thierry Reding authored Jan 25, 2019

The crossbar configuration is usually the same across all designs for a
given SoC generation. But sometimes there are designs that require some
other configuration.

Implement support for parsing the crossbar configuration from a device
tree. If the crossbar configuration is not present in the device tree,
fall back to the default crossbar configuration.
Signed-off-by: Thierry Reding <treding@nvidia.com>

6d6c815d

dt-bindings: display: tegra: Support SOR crossbar configuration · 6c2b3881

Thierry Reding authored Jan 25, 2019

The SOR has a crossbar that can map each lane of the SOR to each of the
SOR pads. The mapping is usually the same across designs for a specific
SoC generation, but every now and then there's a design that doesn't.

Allow the crossbar configuration to be specified in device tree to make
it possible to support these designs.
Signed-off-by: Thierry Reding <treding@nvidia.com>

6c2b3881

drm/tegra: vic: Support stream ID register programming · f3779cb1

Thierry Reding authored Feb 01, 2019

The version of VIC found in Tegra186 and later incorporates improvements
with regards to context isolation. As part of those improvements, stream
ID registers were added that allow to specify separate stream IDs for
the Falcon microcontroller and the VIC memory interface.

While it is possible to also set the stream ID dynamically at runtime to
allow userspace contexts to be completely separated, this commit doesn't
implement that yet. Instead, the static VIC stream ID is programmed when
the Falcon is booted. This ensures that memory accesses by the Falcon or
the VIC are properly translated via the SMMU.
Signed-off-by: Thierry Reding <treding@nvidia.com>

f3779cb1

drm/tegra: vic: Do not clear driver data · 3ff41673

Thierry Reding authored Feb 01, 2019

Upon driver failure, the driver core will take care of clearing the
driver data, so there's no need to do so explicitly in the driver.
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

3ff41673

drm/tegra: Restrict IOVA space to DMA mask · 02be8e4f

Thierry Reding authored Feb 01, 2019

On Tegra186 and later, the ARM SMMU provides an input address space that
is 48 bits wide. However, memory clients can only address up to 40 bits.
If the geometry is used as-is, allocations of IOVA space can end up in a
region that cannot be addressed by the memory clients.

To fix this, restrict the IOVA space to the DMA mask of the host1x
device. Note that, technically, the IOVA space needs to be restricted to
the intersection of the DMA masks for all clients that are attached to
the IOMMU domain. In practice using the DMA mask of the host1x device is
sufficient because all host1x clients share the same DMA mask.
Signed-off-by: Thierry Reding <treding@nvidia.com>

02be8e4f

drm/tegra: Setup shared IOMMU domain after initialization · b9f8b09c

Thierry Reding authored Feb 01, 2019

Move initialization of the shared IOMMU domain after the host1x device
has been initialized. At this point all the Tegra DRM clients have been
attached to the shared IOMMU domain.

This is important because Tegra186 and later use an ARM SMMU, for which
the driver defers setting up the geometry for a domain until a device is
attached to it. This is to ensure that the domain is properly set up for
a specific ARM SMMU instance, which is unknown at allocation time.
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>

b9f8b09c

drm/tegra: vic: Load firmware on demand · 77a0b09d

Thierry Reding authored Feb 01, 2019

Loading the firmware requires an allocation of IOVA space to make sure
that the VIC's Falcon microcontroller can read the firmware if address
translation via the SMMU is enabled.

However, the allocation currently happens at a time where the geometry
of an IOMMU domain may not have been initialized yet. This happens for
example on Tegra186 and later where an ARM SMMU is used. Domains which
are created by the ARM SMMU driver postpone the geometry setup until a
device is attached to the domain. This is because IOMMU domains aren't
attached to a specific IOMMU instance at allocation time and hence the
input address space, which defines the geometry, is not known yet.

Work around this by postponing the firmware load until it is needed at
the time where a channel is opened to the VIC. At this time the shared
IOMMU domain's geometry has been properly initialized.

As a byproduct this allows the Tegra DRM to be created in the absence
of VIC firmware, since the VIC initialization no longer fails if the
firmware can't be found.

Based on an earlier patch by Dmitry Osipenko <digetx@gmail.com>.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>

77a0b09d

drm/tegra: Store parent pointer in Tegra DRM clients · 8e5d19c6

Thierry Reding authored Feb 01, 2019

Tegra DRM clients need access to their parent, so store a pointer to it
upon registration. It's technically possible to get at this by going via
the host1x client's parent and getting the driver data, but that's quite
complicated and not very transparent. It's much more straightforward and
natural to let the children know about their parent.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>

8e5d19c6

gpu: host1x: Optimize CDMA push buffer memory usage · e1f338c0

Thierry Reding authored Feb 01, 2019

The host1x CDMA push buffer is terminated by a special opcode (RESTART)
that tells the CDMA to wrap around to the beginning of the push buffer.
To accomodate the RESTART opcode, an extra 4 bytes are allocated on top
of the 512 * 8 = 4096 bytes needed for the 512 slots (1 slot = 2 words)
that are used for other commands passed to CDMA. This requires that two
memory pages are allocated, but most of the second page (4092 bytes) is
never used.

Decrease the number of slots to 511 so that the RESTART opcode fits
within the page. Adjust the push buffer wraparound code to take into
account push buffer sizes that are not a power of two.
Signed-off-by: Thierry Reding <treding@nvidia.com>

e1f338c0

gpu: host1x: Use correct semantics for HOST1X_CHANNEL_DMAEND · 0e43b8da

Thierry Reding authored Feb 01, 2019

The HOST1X_CHANNEL_DMAEND is an offset relative to the value written to
the HOST1X_CHANNEL_DMASTART register, but it is currently treated as an
absolute address. This can cause SMMU faults if the CDMA fetches past a
pushbuffer's IOMMU mapping.

Properly setting the DMAEND prevents the CDMA from fetching beyond that
address and avoid such issues. This is currently not observed because a
whole (almost) page of essentially scratch space absorbs any excessive
prefetching by CDMA. However, changing the number of slots in the push
buffer can trigger these SMMU faults.
Signed-off-by: Thierry Reding <treding@nvidia.com>

0e43b8da

gpu: host1x: Support 40-bit addressing on Tegra186 · 8de896eb

Thierry Reding authored Feb 01, 2019

The host1x and clients instantiated on Tegra186 support addressing 40
bits of memory.
Signed-off-by: Thierry Reding <treding@nvidia.com>

8de896eb

gpu: host1x: Restrict IOVA space to DMA mask · 38fabcc9

Thierry Reding authored Feb 01, 2019

On Tegra186 and later, the ARM SMMU provides an input address space that
is 48 bits wide. However, memory clients can only address up to 40 bits.
If the geometry is used as-is, allocations of IOVA space can end up in a
region that is not addressable by the memory clients.

To fix this, restrict the IOVA space to the DMA mask of the host1x
device.
Signed-off-by: Thierry Reding <treding@nvidia.com>

38fabcc9

gpu: host1x: Support 40-bit addressing · 67a82dbc

Thierry Reding authored Feb 01, 2019

Tegra186 and later support 40 bits of address space. Additional
registers need to be programmed to store the full 40 bits of push
buffer addresses.

Since command stream gathers can also reside in buffers in a 40-bit
address space, a new variant of the GATHER opcode is also introduced.
It takes two parameters: the first parameter contains the lower 32
bits of the address and the second parameter contains bits 32 to 39.
Signed-off-by: Thierry Reding <treding@nvidia.com>

67a82dbc

gpu: host1x: Introduce support for wide opcodes · 5a5fccbd

Thierry Reding authored Feb 01, 2019

The CDMA push buffer can currently only handle opcodes that take a
single word parameter. However, the host1x implementation on Tegra186
and later supports opcodes that require multiple words as parameters.

Unfortunately the way the push buffer is structured, these wide opcodes
cannot simply be composed of two regular opcodes because that could
result in the wide opcode being split across the end of the push buffer
and the final RESTART opcode required to wrap the push buffer around
would break the wide opcode.

One way to fix this would be to remove the concept of slots to simplify
push buffer operations. However, that's not entirely trivial and should
be done in a separate patch. For now, simply use a different function
to push four-word opcodes into the push buffer. Technically only three
words are pushed, with the fourth word used as padding to preserve the
2-word alignment required by the slots abstraction. The fourth word is
always a NOP opcode.

Additional care must be taken when the end of the push buffer is
reached. If a four-word opcode doesn't fit into the push buffer without
being split by the boundary, NOP opcodes will be introduced and the new
wide opcode placed at the beginning of the push buffer.
Signed-off-by: Thierry Reding <treding@nvidia.com>

5a5fccbd

gpu: host1x: Program the channel stream ID · de5469c2

Thierry Reding authored Feb 01, 2019

When processing command streams, make sure the host1x's stream ID is
programmed for the channel so that addresses are properly translated
through the SMMU.
Signed-off-by: Thierry Reding <treding@nvidia.com>

de5469c2

drm/i915: Don't set update_wm_post on g4x+ · 440e84a5

Ville Syrjälä authored Feb 06, 2019

update_wm_post is meant for pre-g4x only. Don't ever set
it on g4x+.

The only effect of a bogus update_wm_post on g4x+ could
be that we clear the legacy_cursor_update flag in
intel_atomic_commit(). Since legacy_cursor_update is
only set for legacy cursor updates (as the name suggests)
and we only set update_wm_post for a modeset the two
cases should never occur at the same time. But let's
be consistent in setting update_wm_post so we don't
end up confusing so many people.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190206185433.8116-1-ville.syrjala@linux.intel.comReviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

440e84a5

drm/i915: Hack and slash, throttle execbuffer hogs · d6f328bf

Chris Wilson authored Feb 07, 2019

Apply backpressure to hogs that emit requests faster than the GPU can
process them by waiting for their ring to be less than half-full before
proceeding with taking the struct_mutex.

This is a gross hack to apply throttling backpressure, the long term
goal is to remove the struct_mutex contention so that each client
naturally waits, preferably in an asynchronous, nonblocking fashion
(pipelined operations for the win), for their own resources and never
blocks another client within the driver at least. (Realtime priority
goals would extend to ensuring that resource contention favours high
priority clients as well.)

This patch only limits excessive request production and does not attempt
to throttle clients that block waiting for eviction (either global GTT or
system memory) or any other global resources, see above for the long term
goal.

No microbenchmarks are harmed (to the best of my knowledge).

Testcase: igt/gem_exec_schedule/pi-ringfull-*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190207071829.5574-1-chris@chris-wilson.co.uk

d6f328bf

drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set · ebfb6977

Joonas Lahtinen authored Feb 07, 2019

Add err goto label and use it when VMA can't be established or changes
underneath.

v2:
- Dropping Fixes: as it's indeed impossible to race an object to the
  error address. (Chris)
v3:
- Use IS_ERR_VALUE (Chris)
Reported-by: Adam Zabrocki <adamza@microsoft.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Adam Zabrocki <adamza@microsoft.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v2
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190207085454.10598-2-joonas.lahtinen@linux.intel.com

ebfb6977

drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC set · 5c4604e7

Joonas Lahtinen authored Feb 07, 2019

Make sure the underlying VMA in the process address space is the
same as it was during vm_mmap to avoid applying WC to wrong VMA.

A more long-term solution would be to have vm_mmap_locked variant
in linux/mmap.h for when caller wants to hold mmap_sem for an
extended duration.

v2:
- Refactor the compare function

Fixes: 1816f923 ("drm/i915: Support creation of unbound wc user mappings for objects")
Reported-by: Adam Zabrocki <adamza@microsoft.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.0+
Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Adam Zabrocki <adamza@microsoft.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20190207085454.10598-1-joonas.lahtinen@linux.intel.com

5c4604e7