• Changbin Du's avatar
    drm/i915/gvt: Optimize ring siwtch 2x faster by removing unnecessary POSTING_READ · f846c8de
    Changbin Du authored
    There are lots of POSTING_READ alongside each mmio write Op. While
    actually this is not necessary. It just bring too much latency since
    PCIe read Op is very slow which is of non-posted transaction.
    
    For PCIe device, the mem transaction for strong ordering rules are:
      o PCIe mmio write sequence is FIFO. Posted request cannot
        pass previous posted request.
      o PCIe mmio read will not go ahead of previous write.
    
    Intel graphics doesn't support RO, so we can apply above rules. In
    our case, we only need one POSTING_READ at last. This can remove
    half of mmio read Op and then the average ring switch performance
    is nearly doubled.
             Before       After
    cycles  ~970000      ~550000
    Signed-off-by: default avatarChangbin Du <changbin.du@intel.com>
    Signed-off-by: default avatarZhenyu Wang <zhenyuw@linux.intel.com>
    f846c8de
render.c 10.7 KB