drivers/gpu/drm/i915/gvt/render.c · f846c8de64ced9965e04cc9ae1922036175e395b · nexedi / linux

drm/i915/gvt: Optimize ring siwtch 2x faster by removing unnecessary POSTING_READ · f846c8de

Changbin Du authored Jun 23, 2017

There are lots of POSTING_READ alongside each mmio write Op. While
actually this is not necessary. It just bring too much latency since
PCIe read Op is very slow which is of non-posted transaction.

For PCIe device, the mem transaction for strong ordering rules are:
  o PCIe mmio write sequence is FIFO. Posted request cannot
    pass previous posted request.
  o PCIe mmio read will not go ahead of previous write.

Intel graphics doesn't support RO, so we can apply above rules. In
our case, we only need one POSTING_READ at last. This can remove
half of mmio read Op and then the average ring switch performance
is nearly doubled.
         Before       After
cycles  ~970000      ~550000
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

f846c8de

render.c 10.7 KB

Replace render.c