Commit 71706590 authored by Akash Goel's avatar Akash Goel Committed by Tvrtko Ursulin

drm/i915: Use SSE4.1 movntdqa based memcpy for sampling GuC log buffer

To ensure that we always get the up-to-date data from log buffer, its
better to access the buffer through an uncached CPU mapping. Also the way
buffer is accessed from GuC & Host side, manually doing cache flush may
not be effective always if cached CPU mapping is used. In order to avoid
any performance drop & have fast reads from the GuC log buffer, used SSE4.1
movntdqa based memcpy function i915_memcpy_from_wc, as copying using
movntqda from WC type memory is almost as fast as reading from WB memory.
This way log buffer sampling time will not get increased and so would be
able to deal with the flush interrupt storm when GuC is generating logs at
a very high rate.
Ideally SSE 4.1 should be present on all chipsets supporting GuC based
submisssions, but if not then logging will not be enabled.

v2: Rebase.

v3: Squash the WC type vmalloc mapping patch with this patch. (Chris)
Suggested-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: default avatarAkash Goel <akash.goel@intel.com>
Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
parent 685534ef
...@@ -1148,18 +1148,16 @@ static void guc_read_update_log_buffer(struct intel_guc *guc) ...@@ -1148,18 +1148,16 @@ static void guc_read_update_log_buffer(struct intel_guc *guc)
/* Just copy the newly written data */ /* Just copy the newly written data */
if (read_offset > write_offset) { if (read_offset > write_offset) {
memcpy(dst_data, src_data, write_offset); i915_memcpy_from_wc(dst_data, src_data, write_offset);
bytes_to_copy = buffer_size - read_offset; bytes_to_copy = buffer_size - read_offset;
} else { } else {
bytes_to_copy = write_offset - read_offset; bytes_to_copy = write_offset - read_offset;
} }
memcpy(dst_data + read_offset, i915_memcpy_from_wc(dst_data + read_offset,
src_data + read_offset, bytes_to_copy); src_data + read_offset, bytes_to_copy);
src_data += buffer_size; src_data += buffer_size;
dst_data += buffer_size; dst_data += buffer_size;
/* FIXME: invalidate/flush for log buffer needed */
} }
if (log_buf_snapshot_state) if (log_buf_snapshot_state)
...@@ -1219,8 +1217,11 @@ static int guc_log_create_extras(struct intel_guc *guc) ...@@ -1219,8 +1217,11 @@ static int guc_log_create_extras(struct intel_guc *guc)
return 0; return 0;
if (!guc->log.buf_addr) { if (!guc->log.buf_addr) {
/* Create a vmalloc mapping of log buffer pages */ /* Create a WC (Uncached for read) vmalloc mapping of log
vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WB); * buffer pages, so that we can directly get the data
* (up-to-date) from memory.
*/
vaddr = i915_gem_object_pin_map(guc->log.vma->obj, I915_MAP_WC);
if (IS_ERR(vaddr)) { if (IS_ERR(vaddr)) {
ret = PTR_ERR(vaddr); ret = PTR_ERR(vaddr);
DRM_ERROR("Couldn't map log buffer pages %d\n", ret); DRM_ERROR("Couldn't map log buffer pages %d\n", ret);
...@@ -1263,6 +1264,16 @@ static void guc_log_create(struct intel_guc *guc) ...@@ -1263,6 +1264,16 @@ static void guc_log_create(struct intel_guc *guc)
vma = guc->log.vma; vma = guc->log.vma;
if (!vma) { if (!vma) {
/* We require SSE 4.1 for fast reads from the GuC log buffer and
* it should be present on the chipsets supporting GuC based
* submisssions.
*/
if (WARN_ON(!i915_memcpy_from_wc(NULL, NULL, 0))) {
/* logging will not be enabled */
i915.guc_log_level = -1;
return;
}
vma = guc_allocate_vma(guc, size); vma = guc_allocate_vma(guc, size);
if (IS_ERR(vma)) { if (IS_ERR(vma)) {
/* logging will be off */ /* logging will be off */
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment