    cxlflash: Fix to prevent EEH recovery failure · 8e782623
    Matthew R. Ochs authored
    The process_sense() routine can perform a read capacity, which
    can take some time to complete. If an EEH occurs while waiting
    on the read capacity, the EEH handler waits to obtain the
    context's mutex in order to put the context in an error state.
    The EEH handler waits until the context is free, but this wait
    can last forever (deadlock) if the scsi_execute() that performs
    the read capacity times out and calls into the reset callback.
    When that occurs, the reset callback sees that the device is
    already being reset and waits for the reset to complete. This
    leaves the two threads each waiting on the other.
    
    To address this issue, make the context unavailable to new,
    non-system-owned threads and release the context's mutex while
    calling into process_sense(). After returning from
    process_sense(), the context mutex is reacquired and the context
    is made available again. The context can be safely moved to the
    error state if needed during the unavailable window, as no other
    threads will hold its reference.
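
    The locking pattern described above can be sketched in userspace
    with pthreads. This is an illustration only, not the driver code:
    struct ctx, its unavail and err_recovery fields, get_ctx(),
    put_ctx(), eeh_mark_error(), verify_with_slow_op() and
    slow_op_with_eeh() are all hypothetical names, and a pthread
    mutex stands in for the kernel mutex.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's context (names are assumptions). */
struct ctx {
    pthread_mutex_t mutex;
    bool unavail;       /* set: new ordinary threads must not take the context */
    bool err_recovery;  /* set by the simulated EEH handler */
};

/* Ordinary lookup path: returns 0 with the mutex held, -1 if unavailable. */
static int get_ctx(struct ctx *c)
{
    pthread_mutex_lock(&c->mutex);
    if (c->unavail) {
        pthread_mutex_unlock(&c->mutex);
        return -1;
    }
    return 0;
}

static void put_ctx(struct ctx *c)
{
    pthread_mutex_unlock(&c->mutex);
}

/* Simulated EEH handler: needs the mutex to move the context to an error state. */
static void eeh_mark_error(struct ctx *c)
{
    pthread_mutex_lock(&c->mutex);
    c->err_recovery = true;
    pthread_mutex_unlock(&c->mutex);
}

/*
 * Caller holds c->mutex on entry and again on return. The mutex is dropped
 * around slow_op (standing in for process_sense()), so an error handler can
 * take it, while unavail keeps new ordinary threads out during the window.
 */
static int verify_with_slow_op(struct ctx *c, int (*slow_op)(struct ctx *))
{
    int rc;

    c->unavail = true;
    pthread_mutex_unlock(&c->mutex);

    rc = slow_op(c);

    pthread_mutex_lock(&c->mutex);
    c->unavail = false;
    return rc;
}

/*
 * Demo slow operation: an "EEH" arrives while it runs. It returns 0 if the
 * context was correctly refused to an ordinary lookup during the window.
 */
static int slow_op_with_eeh(struct ctx *c)
{
    eeh_mark_error(c);          /* would block forever if the mutex were held */
    if (get_ctx(c) == 0) {
        put_ctx(c);
        return 1;
    }
    return 0;
}
```

    A caller would take the context with get_ctx(), invoke
    verify_with_slow_op(&c, slow_op_with_eeh), check err_recovery
    afterwards, and finish with put_ctx().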
    Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
    Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
    Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
    Reviewed-by: Daniel Axtens <dja@axtens.net>
    Reviewed-by: Tomas Henzl <thenzl@redhat.com>
    Signed-off-by: James Bottomley <JBottomley@Odin.com>
superpipe.c 58 KB