    powerpc/mm_iommu: Fix potential deadlock · eb9d7a62
    Alexey Kardashevskiy authored
    Currently mm_iommu_do_alloc() is called in 2 cases:
    - VFIO_IOMMU_SPAPR_REGISTER_MEMORY ioctl() for normal memory:
    	this locks &mem_list_mutex and then locks mm::mmap_sem
    	several times when adjusting locked_vm or pinning pages;
    - vfio_pci_nvgpu_regops::mmap() for GPU memory:
    	this is called with mm::mmap_sem held already and it locks
    	&mem_list_mutex.
    
    So one can craft a userspace program that issues the ioctl() and the
    mmap() concurrently from two threads and cause a deadlock, which
    lockdep warns about (below).
    
    We have not hit this yet because QEMU constructs the machine in a
    single thread.
    
    This moves the overlap check next to where the new entry is added and
    reduces the amount of time spent with &mem_list_mutex held.
    
    This moves locked_vm adjustment from under &mem_list_mutex.
    
    This relies on mm_iommu_adjust_locked_vm() doing nothing when entries==0.
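    The resulting lock ordering can be sketched with a minimal userspace
    model (a hypothetical illustration, not kernel code: pthread mutexes
    stand in for mm::mmap_sem and mem_list_mutex, and the comments name
    the kernel steps each critical section represents). After the fix,
    both paths acquire mmap_sem before, never while holding,
    mem_list_mutex, so no cycle remains:

    ```c
    #include <pthread.h>
    #include <stdio.h>

    /* Stand-ins for mm->mmap_sem and mem_list_mutex (illustration only). */
    static pthread_mutex_t mmap_sem = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t mem_list_mutex = PTHREAD_MUTEX_INITIALIZER;

    /* GPU-memory mmap() path: entered with mmap_sem already held, then
     * takes mem_list_mutex.  Ordering: mmap_sem -> mem_list_mutex. */
    static void *mmap_path(void *arg)
    {
    	pthread_mutex_lock(&mmap_sem);
    	pthread_mutex_lock(&mem_list_mutex);
    	/* ... add the device-memory entry to the list ... */
    	pthread_mutex_unlock(&mem_list_mutex);
    	pthread_mutex_unlock(&mmap_sem);
    	return NULL;
    }

    /* Register-memory ioctl() path after the fix: locked_vm is adjusted
     * (under mmap_sem) and pages are pinned BEFORE mem_list_mutex is
     * taken, so mem_list_mutex is never held while mmap_sem is acquired;
     * the ordering matches the mmap path. */
    static void *ioctl_path(void *arg)
    {
    	pthread_mutex_lock(&mmap_sem);
    	/* ... mm_iommu_adjust_locked_vm(), pin pages ... */
    	pthread_mutex_unlock(&mmap_sem);

    	pthread_mutex_lock(&mem_list_mutex);
    	/* ... overlap check next to the list_add() ... */
    	pthread_mutex_unlock(&mem_list_mutex);
    	return NULL;
    }

    int main(void)
    {
    	pthread_t a, b;

    	pthread_create(&a, NULL, mmap_path, NULL);
    	pthread_create(&b, NULL, ioctl_path, NULL);
    	pthread_join(a, NULL);
    	pthread_join(b, NULL);
    	puts("no deadlock: lock ordering is consistent");
    	return 0;
    }
    ```

    Before the fix, the ioctl path took the two locks in the opposite
    order (mem_list_mutex, then mmap_sem inside it), which is exactly the
    AB/BA cycle lockdep reports below.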
    
    This is one of the lockdep warnings:
    
    ======================================================
    WARNING: possible circular locking dependency detected
    5.1.0-rc2-le_nv2_aikATfstn1-p1 #363 Not tainted
    ------------------------------------------------------
    qemu-system-ppc/8038 is trying to acquire lock:
    000000002ec6c453 (mem_list_mutex){+.+.}, at: mm_iommu_do_alloc+0x70/0x490
    
    but task is already holding lock:
    00000000fd7da97f (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0xf0/0x160
    
    which lock already depends on the new lock.
    
    the existing dependency chain (in reverse order) is:
    
    -> #1 (&mm->mmap_sem){++++}:
           lock_acquire+0xf8/0x260
           down_write+0x44/0xa0
           mm_iommu_adjust_locked_vm.part.1+0x4c/0x190
           mm_iommu_do_alloc+0x310/0x490
           tce_iommu_ioctl.part.9+0xb84/0x1150 [vfio_iommu_spapr_tce]
           vfio_fops_unl_ioctl+0x94/0x430 [vfio]
           do_vfs_ioctl+0xe4/0x930
           ksys_ioctl+0xc4/0x110
           sys_ioctl+0x28/0x80
           system_call+0x5c/0x70
    
    -> #0 (mem_list_mutex){+.+.}:
           __lock_acquire+0x1484/0x1900
           lock_acquire+0xf8/0x260
           __mutex_lock+0x88/0xa70
           mm_iommu_do_alloc+0x70/0x490
           vfio_pci_nvgpu_mmap+0xc0/0x130 [vfio_pci]
           vfio_pci_mmap+0x198/0x2a0 [vfio_pci]
           vfio_device_fops_mmap+0x44/0x70 [vfio]
           mmap_region+0x5d4/0x770
           do_mmap+0x42c/0x650
           vm_mmap_pgoff+0x124/0x160
           ksys_mmap_pgoff+0xdc/0x2f0
           sys_mmap+0x40/0x80
           system_call+0x5c/0x70
    
    other info that might help us debug this:
    
     Possible unsafe locking scenario:
    
           CPU0                    CPU1
           ----                    ----
      lock(&mm->mmap_sem);
                                   lock(mem_list_mutex);
                                   lock(&mm->mmap_sem);
      lock(mem_list_mutex);
    
     *** DEADLOCK ***
    
    1 lock held by qemu-system-ppc/8038:
     #0: 00000000fd7da97f (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0xf0/0x160
    
    Fixes: c10c21ef ("powerpc/vfio/iommu/kvm: Do not pin device memory", 2018-12-19)
    Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>