drm/i915/gvt: Optimize ring siwtch 2x faster by removing unnecessary POSTING_READ
There are lots of POSTING_READ alongside each mmio write Op. While actually this is not necessary. It just bring too much latency since PCIe read Op is very slow which is of non-posted transaction. For PCIe device, the mem transaction for strong ordering rules are: o PCIe mmio write sequence is FIFO. Posted request cannot pass previous posted request. o PCIe mmio read will not go ahead of previous write. Intel graphics doesn't support RO, so we can apply above rules. In our case, we only need one POSTING_READ at last. This can remove half of mmio read Op and then the average ring switch performance is nearly doubled. Before After cycles ~970000 ~550000 Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Showing
Please register or sign in to comment