    iommu: Remove iommu_sva_ops::mm_exit() · edcc40d2
    Jean-Philippe Brucker authored
    After binding a device to an mm, device drivers currently need to
    register a mm_exit handler. This function is called when the mm exits,
    to gracefully stop DMA targeting the address space and flush page faults
    to the IOMMU.
    
    This is deemed too complex for the MMU release() notifier, which may be
    triggered by any mmput() invocation, from about 120 callsites [1]. The
    upcoming SVA module has an example of such complexity: the I/O Page
    Fault handler would need to call mmput_async() instead of mmput() after
    handling an IOPF, to avoid triggering the release() notifier which would
    in turn drain the IOPF queue and lock up.
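    
    The lock-up described above can be illustrated with a small userspace
    model. This is only a sketch under stated assumptions: struct toy_mm and
    every toy_*() function are hypothetical stand-ins, not kernel APIs, and
    the "deadlocked" flag marks the point where a real queue drain would wait
    forever for the handler that called it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an mm bound to a device. */
struct toy_mm {
	int users;            /* models the mm user count */
	bool in_iopf_handler; /* an I/O page fault for this mm is in flight */
	bool released;        /* the release() notifier has run */
	bool deadlocked;      /* a real drain would never return here */
};

/*
 * Models a release() notifier that forwards to a driver draining the
 * IOPF queue. A drain waits for all in-flight fault handlers, so it
 * cannot complete when it is invoked from inside one of them.
 */
static void toy_release(struct toy_mm *mm)
{
	if (mm->in_iopf_handler)
		mm->deadlocked = true;
	mm->released = true;
}

/* Models mmput(): dropping the last user triggers release(). */
static void toy_mmput(struct toy_mm *mm)
{
	assert(mm->users > 0);
	if (--mm->users == 0)
		toy_release(mm);
}

/*
 * Models the IOPF handler: take a reference, handle the fault, drop the
 * reference with a plain mmput(). If the owning process drops its own
 * reference in the meantime, the handler's mmput() is the last one.
 */
static void toy_handle_iopf(struct toy_mm *mm, bool owner_exits_meanwhile)
{
	mm->users++;                /* handler pins the mm */
	mm->in_iopf_handler = true;

	if (owner_exits_meanwhile)
		toy_mmput(mm);      /* process exit: count drops 2 -> 1 */

	/* ... the fault itself would be resolved here ... */

	toy_mmput(mm);              /* may be the final put */
	mm->in_iopf_handler = false;
}
```

    In this model, calling toy_handle_iopf() with owner_exits_meanwhile set
    makes the handler's own put the last one, so release() runs inside the
    handler. This is the case that would have required mmput_async().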
    
    Another concern is the DMA stop function taking too long, up to several
    minutes [2]. For some mmput() callers this may disturb other users. For
    example, if the OOM killer picks the mm bound to a device as its victim,
    that mm's memory is locked, and release() takes too long, the OOM killer
    might choose additional innocent victims to kill.
    
    To simplify the MMU release notifier, don't forward the notification to
    device drivers. Since they don't stop DMA on mm exit anymore, the PASID
    lifetime is extended:
    
    (1) The device driver calls bind(). A PASID is allocated.
    
      Here any DMA fault is handled by mm, and on error we don't print
      anything to dmesg. Userspace can easily trigger errors by issuing DMA
      on unmapped buffers.
    
    (2) exit_mmap(), for example the process took a SIGKILL. This step
        doesn't happen during normal operations. Remove the pgd from the
        PASID table, since the page tables are about to be freed. Invalidate
        the IOTLBs.
    
      Here the device may still perform DMA on the address space. Incoming
      transactions are aborted but faults aren't printed out. ATS
      Translation Requests return Successful Translation Completions with
      R=W=0. PRI Page Requests return with Invalid Request.
    
    (3) The device driver stops DMA, possibly following release of a fd, and
        calls unbind(). PASID table is cleared, IOTLB invalidated if
        necessary. The page fault queues are drained, and the PASID is
        freed.
    
      If DMA for that PASID is still running here, something went seriously
      wrong and errors should be reported.
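    
    The extended PASID lifetime in steps (1) to (3) can be summarised as a
    small state machine. The sketch below is a userspace model only: the
    enum values, struct pasid_model and the model_*() helpers are
    hypothetical names chosen for illustration and do not appear in the
    kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states, one per phase described in the commit message. */
enum pasid_state {
	PASID_FREE,      /* before (1) or after (3): no PASID allocated */
	PASID_BOUND,     /* after (1): pgd installed, faults go to the mm */
	PASID_MM_EXITED, /* after (2): pgd removed, PASID still allocated */
};

/* How an incoming DMA transaction is treated in each state. */
enum dma_result {
	DMA_HANDLED,        /* translated, or faulted into the mm */
	DMA_ABORT_QUIET,    /* aborted, nothing printed */
	DMA_ABORT_REPORTED, /* something went wrong: report an error */
};

struct pasid_model {
	enum pasid_state state;
	bool pgd_installed;
};

/* (1) bind(): allocate the PASID and install the pgd. */
static bool model_bind(struct pasid_model *p)
{
	if (p->state != PASID_FREE)
		return false;
	p->state = PASID_BOUND;
	p->pgd_installed = true;
	return true;
}

/* (2) exit_mmap(): remove the pgd, but keep the PASID allocated. */
static void model_mm_exit(struct pasid_model *p)
{
	if (p->state != PASID_BOUND)
		return;
	p->pgd_installed = false;
	p->state = PASID_MM_EXITED;
}

/* (3) unbind(): clear the entry and free the PASID. */
static bool model_unbind(struct pasid_model *p)
{
	if (p->state == PASID_FREE)
		return false;
	p->pgd_installed = false;
	p->state = PASID_FREE;
	return true;
}

/* Classify a DMA transaction according to the current state. */
static enum dma_result model_dma(const struct pasid_model *p)
{
	switch (p->state) {
	case PASID_BOUND:
		return DMA_HANDLED;
	case PASID_MM_EXITED:
		return DMA_ABORT_QUIET;
	case PASID_FREE:
	default:
		return DMA_ABORT_REPORTED;
	}
}
```

    Step (2) is optional in this model, as in the text: calling
    model_unbind() directly from the bound state is the normal path, and
    model_mm_exit() only matters when the process dies first.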
    
    For now remove iommu_sva_ops entirely. We might need to re-introduce
    them at some point, for example to notify device drivers of unhandled
    IOPF.
    
    [1] https://lore.kernel.org/linux-iommu/20200306174239.GM31668@ziepe.ca/
    [2] https://lore.kernel.org/linux-iommu/4d68da96-0ad5-b412-5987-2f7a6aa796c3@amd.com/
    
    Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
    Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
    Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
    Link: https://lore.kernel.org/r/20200423125329.782066-3-jean-philippe@linaro.org
    Signed-off-by: Joerg Roedel <jroedel@suse.de>