    drm/i915: Support creation of unbound wc user mappings for objects · 1816f923
    Akash Goel authored
    This patch adds support for creating write-combining (WC) virtual mappings
    of a GEM object. It aims to provide the same functionality as the 'mmap_gtt'
    interface without the constraints and contention of a limited aperture
    space, but requires clients to handle the linear-to-tiled conversion on
    their own. The goal is to improve CPU write performance: with such a
    mapping, writes and reads are almost 50% faster than with mmap_gtt. Like
    the GTT mmapping, and unlike the regular CPU mmapping, it avoids the cache
    flush after an update from the CPU side when the object is passed on to the
    GPU. This type of mapping is especially useful for sub-region updates,
    i.e. when only a portion of the object is to be updated. Using a CPU mmap
    in such cases would normally incur a clflush of the whole object, and
    using a GTT mmapping would likely require eviction of an active object or
    fence and thus stall. The write-combining CPU mmap avoids both.
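
    Because a WC mapping bypasses the fence registers, a client writing to a
    tiled object has to compute tiled offsets itself. The following is only a
    rough sketch of that conversion, assuming X-tiling with 512-byte by 8-row
    tiles (4096 bytes each) and no bit-6 address swizzling. The helper name
    x_tile_offset and the constants are hypothetical and not part of this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed X-tile geometry: 512 bytes wide, 8 rows tall, 4096 bytes per tile. */
enum { TILE_W = 512, TILE_H = 8, TILE_SIZE = TILE_W * TILE_H };

/*
 * Hypothetical helper: map a linear position (x_bytes, y) to a byte offset
 * inside an X-tiled surface. 'stride' is the surface pitch in bytes and must
 * be a multiple of the tile width. Bit-6 swizzling is NOT applied here.
 */
static size_t x_tile_offset(size_t x_bytes, size_t y, size_t stride)
{
	assert(stride % TILE_W == 0 && x_bytes < stride);

	size_t tiles_per_row = stride / TILE_W;
	/* Tiles are laid out row-major across the surface. */
	size_t tile_index = (y / TILE_H) * tiles_per_row + x_bytes / TILE_W;

	/* Within a tile, rows of TILE_W bytes are stored consecutively. */
	return tile_index * TILE_SIZE + (y % TILE_H) * TILE_W + x_bytes % TILE_W;
}
```

    A sub-region update would then write each touched byte range at
    base + x_tile_offset(x, y, stride) through the WC mapping.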
    
    To ensure cache coherency before this mapping is used, the GTT domain
    has been reused here. Moving the object to the GTT domain provides the
    required cache flush if the object is in the CPU domain, as well as
    synchronization against concurrent rendering. Although access through an
    uncached mmap should automatically invalidate the cache lines, this may not
    be true for non-temporal write instructions, and not all pages of the
    object may be updated through this mapping at any given point in time.
    Having a call to get_pages in the set_to_gtt_domain function, as added in
    the earlier patch 'drm/i915: Broaden application of set-domain(GTT)',
    guarantees the clflush, so no cachelines will hold data for the object
    before it is accessed through this map.
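
    From the client's side, this means moving the object to the GTT domain
    before touching the WC mapping. A minimal sketch of how such a request
    might be filled in follows. The struct is a local mirror whose layout is
    assumed to match the set-domain request in i915_drm.h, 0x40 is assumed to
    be the GTT domain bit, and prepare_wc_access is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the set-domain request; layout assumed from i915_drm.h. */
struct set_domain_request {
	uint32_t handle;
	uint32_t read_domains;
	uint32_t write_domain;
};

#define EXAMPLE_DOMAIN_GTT 0x40u /* assumed value of the GTT domain bit */

/*
 * Hypothetical helper: build the request a client would submit before
 * accessing the object through a WC mapping. Reads always use the GTT
 * domain; the write domain is set only if the client intends to write.
 */
static struct set_domain_request prepare_wc_access(uint32_t handle, int will_write)
{
	struct set_domain_request req = {
		.handle = handle,
		.read_domains = EXAMPLE_DOMAIN_GTT,
		.write_domain = will_write ? EXAMPLE_DOMAIN_GTT : 0,
	};
	return req;
}
```

    The filled request would be passed to the set-domain ioctl on the DRM
    file descriptor; error handling is omitted here.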
    
    The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been
    extended with a new flags field (defaulting to 0 for existing users). In
    order for userspace to detect the extended ioctl, a new parameter
    I915_PARAM_MMAP_VERSION has been added for versioning the ioctl interface.
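
    A client might gate use of the new flags field on the reported version
    roughly as sketched below. The struct is a local mirror whose layout is
    assumed from i915_drm.h, the flag value 0x1 and the "version >= 1"
    threshold are assumptions, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the extended mmap request; layout assumed from i915_drm.h. */
struct wc_mmap_request {
	uint32_t handle;
	uint32_t pad;
	uint64_t offset;
	uint64_t size;
	uint64_t addr_ptr; /* filled in by the kernel on success */
	uint64_t flags;    /* the new field; 0 keeps the old behaviour */
};

#define EXAMPLE_MMAP_WC 0x1ull /* assumed value of the write-combining flag */

/* Assumption: the extended ioctl exists once the reported version is >= 1. */
static int wc_mmap_supported(int mmap_version)
{
	return mmap_version >= 1;
}

/*
 * Hypothetical helper: build a request for the whole object, asking for a
 * WC mapping only when the kernel reported support for the flags field.
 */
static struct wc_mmap_request
make_mmap_request(uint32_t handle, uint64_t size, int mmap_version)
{
	struct wc_mmap_request req;

	memset(&req, 0, sizeof(req)); /* unused fields, including pad, stay zero */
	req.handle = handle;
	req.size = size;
	if (wc_mmap_supported(mmap_version))
		req.flags = EXAMPLE_MMAP_WC;
	return req;
}
```

    Here mmap_version stands for the value read back for
    I915_PARAM_MMAP_VERSION; on an older kernel the request falls back to
    flags = 0, i.e. the regular CPU mmapping.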
    
    v2: Fix error handling, invalid flag detection, renaming (ickle)
    
    v3: Rebase to latest drm-intel-nightly codebase
    
    The new mmapping is exercised by igt/gem_mmap_wc,
    igt/gem_concurrent_blit and igt/gem_gtt_speed.
    
    Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a
    Signed-off-by: Akash Goel <akash.goel@intel.com>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>