• Dongsheng Yang's avatar
    rbd: cancel lock_dwork if the wait is interrupted · 25e6be21
    Dongsheng Yang authored
    There is a warning message in my test with below steps:
    
      # rbd bench --io-type write --io-size 4K --io-threads 1 --io-pattern rand test &
      # sleep 5
      # pkill -9 rbd
      # rbd map test &
      # sleep 5
      # pkill rbd
    
    The reason is that the rbd_add_acquire_lock() is interruptable,
    that means, when we kill the waiting on ->acquire_wait, the lock_dwork
    could be still running.
    
    1. do_rbd_add()					2. lock_dwork
    rbd_add_acquire_lock()
      - queue_delayed_work()
    						lock_dwork queued
        - wait_for_completion_killable_timeout()  <-- kill happen
    rbd_dev_image_unlock()	<-- UNLOCKED now, nothing to do.
    rbd_dev_device_release()
    rbd_dev_image_release()
      - ...
    						lock successed here
         - cancel_delayed_work_sync(&rbd_dev->lock_dwork)
    
    Then when we reach the rbd_dev_free(), WARN_ON is triggered because
    lock_state is not RBD_LOCK_STATE_UNLOCKED.
    
    To fix it, this commit make sure the lock_dwork was finished before
    calling rbd_dev_image_unlock().
    
    On the other hand, this would not happend in do_rbd_remove(), because
    after rbd mapped, lock_dwork will only be queued for IO request, and
    request will continue unless lock_dwork finished. when we call
    rbd_dev_image_unlock() in do_rbd_remove(), all requests are done.
    That means, lock_state should not be locked again after
    rbd_dev_image_unlock().
    
    [ Cancel lock_dwork in rbd_add_acquire_lock(), only if the wait is
      interrupted. ]
    
    Fixes: 637cd060 ("rbd: new exclusive lock wait/wake code")
    Signed-off-by: default avatarDongsheng Yang <dongsheng.yang@easystack.cn>
    Reviewed-by: default avatarIlya Dryomov <idryomov@gmail.com>
    Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
    25e6be21
rbd.c 184 KB