    rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings · 2237ceb7
    Ilya Dryomov authored
    Every time a watch is reestablished after getting lost, we need to
    update the cookie which involves quiescing exclusive lock.  For this,
    we transition from RBD_LOCK_STATE_LOCKED to RBD_LOCK_STATE_QUIESCING
    roughly for the duration of rbd_reacquire_lock() call.  If the mapping
    is exclusive and I/O happens to arrive in this time window, it's failed
    with EROFS (later translated to EIO) based on the wrong assumption in
    rbd_img_exclusive_lock() -- "lock got released?" check there stopped
    making sense with commit a2b1da09 ("rbd: lock should be quiesced on
    reacquire").
    
    To make it worse, any such I/O is added to the acquiring list before
    EROFS is returned and this sets up for violating rbd_lock_del_request()
    precondition that the request is either on the running list or not on
    any list at all -- see commit ded080c8 ("rbd: don't move requests
    to the running list on errors").  rbd_lock_del_request() ends up
    processing these requests as if they were on the running list which
    screws up quiescing_wait completion counter and ultimately leads to
    
        rbd_assert(!completion_done(&rbd_dev->quiescing_wait));
    
    being triggered on the next watch error.
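
The failure mode described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: `model_dev`, `model_req`, `submit_old()`, `submit_new()` and `del_precondition_holds()` are hypothetical names invented for illustration, and the states are simplified stand-ins for the RBD_LOCK_STATE_* values. It shows an exclusive mapping failing I/O with EROFS while the lock is merely quiescing, leaving the failed request on the acquiring list, and contrasts that with simply letting the request wait.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-ins for the driver's lock states (illustrative only). */
enum lock_state { STATE_UNLOCKED, STATE_LOCKED, STATE_QUIESCING };

/* Which list a request sits on; hypothetical bookkeeping for this model. */
enum req_list { ON_NO_LIST, ON_ACQUIRING_LIST, ON_RUNNING_LIST };

struct model_dev {
	enum lock_state state;
	bool exclusive;		/* mapping was created as exclusive */
};

struct model_req {
	enum req_list list;
};

/*
 * Behavior described as buggy: returns 1 if I/O may proceed, 0 if the
 * request has to wait for the lock, or -EROFS for exclusive mappings
 * whenever the state is not "locked".
 */
static int submit_old(const struct model_dev *dev, struct model_req *req)
{
	assert(req->list == ON_NO_LIST);	/* each request is submitted once */

	if (dev->state == STATE_LOCKED) {
		req->list = ON_RUNNING_LIST;
		return 1;
	}

	req->list = ON_ACQUIRING_LIST;	/* queued before the check below */
	if (dev->exclusive)
		return -EROFS;		/* wrong if the lock is only quiescing */
	return 0;
}

/* Sketch of the intended behavior: no "lock got released?" assumption. */
static int submit_new(const struct model_dev *dev, struct model_req *req)
{
	assert(req->list == ON_NO_LIST);

	if (dev->state == STATE_LOCKED) {
		req->list = ON_RUNNING_LIST;
		return 1;
	}

	req->list = ON_ACQUIRING_LIST;	/* wait until the lock is usable again */
	return 0;
}

/* Precondition from the message: on the running list or on no list at all. */
static bool del_precondition_holds(const struct model_req *req)
{
	return req->list != ON_ACQUIRING_LIST;
}
```

In this model, a request submitted through `submit_old()` during quiescing comes back with `-EROFS` while still on the acquiring list, so tearing it down on the error path violates the precondition. With `submit_new()` the same request just waits, and no error-path teardown happens while it is on the acquiring list.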
    
    Cc: stable@vger.kernel.org # 06ef84c4e9c4: rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
    Cc: stable@vger.kernel.org
    Fixes: 637cd060 ("rbd: new exclusive lock wait/wake code")
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
rbd.c 187 KB