    vfio/pds: Fix possible sleep while in atomic context · ae2667cd
    Brett Creeley authored
    The driver can sleep while in atomic context, which results in the
    following call trace when CONFIG_DEBUG_ATOMIC_SLEEP=y is set:
    
    BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283
    in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2817, name: bash
    preempt_count: 1, expected: 0
    RCU nest depth: 0, expected: 0
    Call Trace:
     <TASK>
     dump_stack_lvl+0x36/0x50
     __might_resched+0x123/0x170
     mutex_lock+0x1e/0x50
     pds_vfio_put_lm_file+0x1e/0xa0 [pds_vfio_pci]
     pds_vfio_put_save_file+0x19/0x30 [pds_vfio_pci]
     pds_vfio_state_mutex_unlock+0x2e/0x80 [pds_vfio_pci]
     pci_reset_function+0x4b/0x70
     reset_store+0x5b/0xa0
     kernfs_fop_write_iter+0x137/0x1d0
     vfs_write+0x2de/0x410
     ksys_write+0x5d/0xd0
     do_syscall_64+0x3b/0x90
     entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    
    This can happen if pds_vfio_put_restore_file() and/or
    pds_vfio_put_save_file() take mutex_lock(&lm_file->lock)
    while spin_lock(&pds_vfio->reset_lock) is held, which can
    happen while calling pds_vfio_state_mutex_unlock().
    
    Fix this by changing the reset_lock to reset_mutex so there are no such
    concerns. Also, make sure to destroy the reset_mutex in the driver-specific
    VFIO device release function.
    
    This also fixes a spinlock bad magic BUG that was caused
    by not calling spin_lock_init() on the reset_lock. Since the lock is
    being changed to a mutex, make sure to call mutex_init() on it.
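
    A rough userspace analogy of the resulting lock nesting, using pthread
    mutexes in place of the kernel's struct mutex. The struct and function
    names below (pds_dev, lm_file, dev_init, put_lm_file,
    handle_deferred_reset, dev_release) are hypothetical stand-ins and not
    the driver's code. The sketch only illustrates that the outer lock is
    initialized, may be held while the inner sleeping lock is taken, and is
    destroyed on release:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the migration file and its sleeping lock. */
struct lm_file {
    pthread_mutex_t lock;
    int disabled;
};

/* Hypothetical stand-in for the device state. */
struct pds_dev {
    pthread_mutex_t reset_mutex; /* outer lock: a mutex, not a spinlock */
    struct lm_file save_file;
    int deferred_reset;
};

/* Initialize both locks; an uninitialized lock is the "bad magic" case. */
static int dev_init(struct pds_dev *d)
{
    d->deferred_reset = 0;
    d->save_file.disabled = 0;
    if (pthread_mutex_init(&d->reset_mutex, NULL) != 0)
        return -1;
    if (pthread_mutex_init(&d->save_file.lock, NULL) != 0) {
        pthread_mutex_destroy(&d->reset_mutex);
        return -1;
    }
    return 0;
}

/* Takes the inner sleeping lock, as the put_*_file helpers do. */
static void put_lm_file(struct lm_file *f)
{
    pthread_mutex_lock(&f->lock);
    f->disabled = 1;
    pthread_mutex_unlock(&f->lock);
}

/* Outer lock is a mutex, so taking the inner mutex under it is allowed. */
static void handle_deferred_reset(struct pds_dev *d)
{
    pthread_mutex_lock(&d->reset_mutex);
    if (d->deferred_reset) {
        d->deferred_reset = 0;
        put_lm_file(&d->save_file);
    }
    pthread_mutex_unlock(&d->reset_mutex);
}

/* Destroy both locks in the release path. */
static void dev_release(struct pds_dev *d)
{
    assert(d->deferred_reset == 0);
    pthread_mutex_destroy(&d->save_file.lock);
    pthread_mutex_destroy(&d->reset_mutex);
}
```

    A caller would run dev_init(), set deferred_reset, call
    handle_deferred_reset(), and finish with dev_release().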
    
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Closes: https://lore.kernel.org/kvm/1f9bc27b-3de9-4891-9687-ba2820c1b390@moroto.mountain/
    Fixes: bb500dbe ("vfio/pds: Add VFIO live migration support")
    Signed-off-by: Brett Creeley <brett.creeley@amd.com>
    Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
    Link: https://lore.kernel.org/r/20231122192532.25791-3-brett.creeley@amd.com
    Signed-off-by: Alex Williamson <alex.williamson@redhat.com>