    scsi: cxlflash: Prevent deadlock when adapter probe fails · bb61b843
    Vaibhav Jain authored
    Presently, when an error is encountered during probe of the cxlflash
    adapter, a deadlock is seen with a CPU thread stuck inside
    cxlflash_remove(). Below is the trace of the deadlock as logged by
    khungtaskd:
    
    cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16
    INFO: task kworker/80:1:890 blocked for more than 120 seconds.
           Not tainted 5.0.0-rc4-capi2-kexec+ #2
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    kworker/80:1    D    0   890      2 0x00000808
    Workqueue: events work_for_cpu_fn
    
    Call Trace:
     0x4d72136320 (unreliable)
     __switch_to+0x2cc/0x460
     __schedule+0x2bc/0xac0
     schedule+0x40/0xb0
     cxlflash_remove+0xec/0x640 [cxlflash]
     cxlflash_probe+0x370/0x8f0 [cxlflash]
     local_pci_probe+0x6c/0x140
     work_for_cpu_fn+0x38/0x60
     process_one_work+0x260/0x530
     worker_thread+0x280/0x5d0
     kthread+0x1a8/0x1b0
     ret_from_kernel_thread+0x5c/0x80
    INFO: task systemd-udevd:5160 blocked for more than 120 seconds.
    
    The deadlock occurs because cxlflash_probe() calls cxlflash_remove()
    without first setting 'cxlflash_cfg->state' to STATE_PROBED, so the probe
    thread ends up waiting on 'cxlflash_cfg->reset_waitq'. Since the device
    was never successfully probed, 'cxlflash_cfg->state' never changes from
    STATE_PROBING and the wait never completes.
    
    Fix this deadlock by setting 'cxlflash_cfg->state' to STATE_PROBED on
    the cxlflash_probe() error path, just before calling cxlflash_remove().
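
The reasoning above can be illustrated with a small user-space model. This is only a sketch, not the driver code: the `MODEL_` state names, `struct cfg_model`, and both helper functions are hypothetical stand-ins, and the wait on 'cxlflash_cfg->reset_waitq' is reduced to a boolean "would block" check on the one condition the commit message describes.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the two driver states named in the commit
 * message. The real driver may define additional states.
 */
enum model_state {
	MODEL_STATE_PROBING,
	MODEL_STATE_PROBED,
};

/* Hypothetical, reduced stand-in for the driver's per-adapter config. */
struct cfg_model {
	enum model_state state;
};

/*
 * Models the wait in the remove path: per the commit message, remove
 * keeps waiting for as long as the state is still PROBING.
 */
static bool remove_would_block(const struct cfg_model *cfg)
{
	return cfg->state == MODEL_STATE_PROBING;
}

/*
 * Models the probe error path. Nothing else changes the state once the
 * probe has failed, so if it is left at PROBING the wait in remove can
 * never be satisfied.
 */
static bool probe_error_path_blocks(bool apply_fix)
{
	struct cfg_model cfg = { .state = MODEL_STATE_PROBING };

	if (apply_fix)
		cfg.state = MODEL_STATE_PROBED; /* state set before remove */

	return remove_would_block(&cfg);
}
```

In this model, `probe_error_path_blocks(false)` corresponds to the behaviour before the change and `probe_error_path_blocks(true)` to the behaviour after it.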
    
    Cc: stable@vger.kernel.org
    Fixes: c21e0bbf ("cxlflash: Base support for IBM CXL Flash Adapter")
    Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
main.c 107 KB