    iommufd: Add a flag to skip clearing of IOPTE dirty · 60984813
    Joao Martins authored
    VFIO has an operation that unmaps an IOVA range and returns a bitmap with
    the dirty data. In reality the operation doesn't query the IO page tables
    to check whether each PTE was dirty. Instead it marks everything that was
    mapped as dirty, and does so in one syscall.
    
    In IOMMUFD the equivalent is done in two operations: querying with
    GET_DIRTY_IOVA followed by UNMAP_IOVA. However, this incurs two TLB
    flushes: IOMMU implementations require invalidating their IOTLB after
    clearing dirty bits, and the UNMAP needs another invalidation.
    To allow dirty bits to be queried faster, add a flag
    (IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR) that requests that the dirty bits
    be read from the PTE but not cleared, under the expectation that the
    next operation is the unmap. An alternative is to unmap and just
    perpetually mark as dirty, as that's the same behaviour as today. So
    functional equivalence can be provided with unmap alone, and if real dirty
    info is required it will amortize the cost while querying.
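
    The flush accounting described above can be sketched with a small
    userspace toy model. This is not the kernel implementation and does not
    call the iommufd uAPI: every name below (toy_domain, toy_get_dirty,
    TOY_GET_DIRTY_NO_CLEAR, and so on) is hypothetical, and the model assumes
    one IOTLB invalidation whenever dirty bits are cleared plus one per unmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NPAGES 8
/* Hypothetical stand-in for IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR. */
#define TOY_GET_DIRTY_NO_CLEAR (1u << 0)

/* Toy IO page table: one mapped bit and one dirty bit per page. */
struct toy_domain {
	bool mapped[TOY_NPAGES];
	bool dirty[TOY_NPAGES];
	unsigned int iotlb_flushes; /* number of simulated IOTLB invalidations */
};

static void toy_map(struct toy_domain *d, size_t first, size_t count)
{
	assert(first + count <= TOY_NPAGES);
	for (size_t i = 0; i < count; i++)
		d->mapped[first + i] = true;
}

/* A device write marks the page dirty, but only if it is still mapped. */
static void toy_dma_write(struct toy_domain *d, size_t page)
{
	assert(page < TOY_NPAGES);
	if (d->mapped[page])
		d->dirty[page] = true;
}

/*
 * Return a bitmap of dirty pages in [first, first + count); bit i covers
 * page first + i. Without TOY_GET_DIRTY_NO_CLEAR the dirty bits are also
 * cleared, which costs one simulated IOTLB flush.
 */
static uint32_t toy_get_dirty(struct toy_domain *d, size_t first,
			      size_t count, unsigned int flags)
{
	uint32_t bitmap = 0;
	bool cleared = false;

	assert(first + count <= TOY_NPAGES);
	for (size_t i = 0; i < count; i++) {
		size_t page = first + i;

		if (!d->mapped[page] || !d->dirty[page])
			continue;
		bitmap |= (uint32_t)1 << i;
		if (!(flags & TOY_GET_DIRTY_NO_CLEAR)) {
			d->dirty[page] = false;
			cleared = true;
		}
	}
	if (cleared)
		d->iotlb_flushes++;
	return bitmap;
}

/* Unmapping drops the translations and always costs one simulated flush. */
static void toy_unmap(struct toy_domain *d, size_t first, size_t count)
{
	assert(first + count <= TOY_NPAGES);
	for (size_t i = 0; i < count; i++) {
		d->mapped[first + i] = false;
		d->dirty[first + i] = false;
	}
	d->iotlb_flushes++;
}
```

    In this model, a query followed by an unmap reports the same bitmap
    either way, but the default query pays for two flushes while a query
    with the no-clear flag pays only for the one done by the unmap.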
    
    There's still a race against DMA: in theory the unmap of the IOVA
    (when the guest invalidates the IOTLB via the emulated IOMMU) could race
    against the VF performing DMA on the same IOVA. As discussed in [0], we
    accept resolving this race by throwing away the DMA. It doesn't matter
    whether it hit physical DRAM or not: the VM can't tell if we threw it
    away because the DMA was blocked or because we failed to copy the DRAM.
    
    [0] https://lore.kernel.org/linux-iommu/20220502185239.GR8364@nvidia.com/
    
    Link: https://lore.kernel.org/r/20231024135109.73787-10-joao.m.martins@oracle.com
    Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Reviewed-by: Kevin Tian <kevin.tian@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>