    drm/i915: Async GPU relocation processing · 7dd4f672
    Chris Wilson authored
    If the user requires patching of their batch or auxiliary buffers, we
    currently make the alterations on the CPU. If the buffers are active on
    the GPU at the time, we wait under the struct_mutex for them to finish
    executing before we rewrite the contents. This happens, for example,
    when shared relocation trees are used between different contexts with
    separate address spaces (so the buffers have different addresses in
    each): the 3D state must then be adjusted between execution on each
    context. However, we do not need to use the CPU to do the relocation
    patching; we can instead queue commands to the GPU to perform it, using
    fences to serialise the operation with current and future activity, so
    that the operation on the GPU appears just as atomic as performing it
    immediately. Performing the relocation rewrites on the GPU is not free:
    in terms of pure throughput, the number of relocations/s is roughly
    halved - but, more importantly, so is the time spent under the
    struct_mutex.
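
    The idea of replacing CPU writes with queued GPU stores can be sketched
    in plain C. This is an illustrative user-space model only, not the i915
    code: a tiny interpreter stands in for the GPU command streamer (the
    real patch emits MI_STORE_DWORD_IMM packets), and all names below are
    hypothetical.

    ```c
    /* Sketch: instead of the CPU poking relocated addresses into the batch
     * directly, we emit one "store" command per relocation into a command
     * buffer and let an executor (the GPU, here a loop) apply them. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    struct reloc { uint64_t offset; uint64_t target_addr; };

    enum { CMD_END = 0, CMD_STORE_QWORD = 1 };

    /* Build a command stream: one store per relocation entry. */
    static size_t emit_relocs(uint64_t *cmds, const struct reloc *r, size_t n)
    {
        size_t len = 0;
        for (size_t i = 0; i < n; i++) {
            cmds[len++] = CMD_STORE_QWORD;
            cmds[len++] = r[i].offset;      /* where to write in the batch */
            cmds[len++] = r[i].target_addr; /* value: buffer's new address */
        }
        cmds[len++] = CMD_END;
        return len;
    }

    /* Stand-in for the GPU executing the stream against the batch buffer;
     * in the kernel this runs asynchronously, ordered by fences. */
    static void execute(const uint64_t *cmds, uint8_t *batch)
    {
        while (cmds[0] == CMD_STORE_QWORD) {
            uint64_t val = cmds[2];
            memcpy(batch + cmds[1], &val, sizeof(val));
            cmds += 3;
        }
    }

    int main(void)
    {
        uint8_t batch[64] = {0};
        struct reloc relocs[] = {
            { .offset = 8,  .target_addr = 0x100000 },
            { .offset = 24, .target_addr = 0x200000 },
        };
        uint64_t cmds[16];
        emit_relocs(cmds, relocs, 2);
        execute(cmds, batch);

        uint64_t a, b;
        memcpy(&a, batch + 8, sizeof(a));
        memcpy(&b, batch + 24, sizeof(b));
        printf("%llx %llx\n", (unsigned long long)a, (unsigned long long)b);
        return 0;
    }
    ```

    The point of the indirection is that the stores need not happen at
    execbuffer time: because they are ordered by fences behind any prior
    user of the buffers, the rewrite is deferred off the CPU and out from
    under the struct_mutex.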
    
    v2: Break out the request/batch allocation for clearer error flow.
    v3: A few asserts to ensure rq ordering is maintained
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
i915_gem_execbuffer.c 68.6 KB