• Dan Williams's avatar
    libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock · ca6bf264
    Dan Williams authored
    A multithreaded namespace creation/destruction stress test currently
    deadlocks with the following lockup signature:
    
        INFO: task ndctl:2924 blocked for more than 122 seconds.
              Tainted: G           OE     5.2.0-rc4+ #3382
        "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        ndctl           D    0  2924   1176 0x00000000
        Call Trace:
         ? __schedule+0x27e/0x780
         schedule+0x30/0xb0
         wait_nvdimm_bus_probe_idle+0x8a/0xd0 [libnvdimm]
         ? finish_wait+0x80/0x80
         uuid_store+0xe6/0x2e0 [libnvdimm]
         kernfs_fop_write+0xf0/0x1a0
         vfs_write+0xb7/0x1b0
         ksys_write+0x5c/0xd0
         do_syscall_64+0x60/0x240
    
         INFO: task ndctl:2923 blocked for more than 122 seconds.
               Tainted: G           OE     5.2.0-rc4+ #3382
         "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
         ndctl           D    0  2923   1175 0x00000000
         Call Trace:
          ? __schedule+0x27e/0x780
          ? __mutex_lock+0x489/0x910
          schedule+0x30/0xb0
          schedule_preempt_disabled+0x11/0x20
          __mutex_lock+0x48e/0x910
          ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
          ? __lock_acquire+0x23f/0x1710
          ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
          nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
          __dax_pmem_probe+0x5e/0x210 [dax_pmem_core]
          ? nvdimm_bus_probe+0x1d0/0x2c0 [libnvdimm]
          dax_pmem_probe+0xc/0x20 [dax_pmem]
          nvdimm_bus_probe+0x90/0x2c0 [libnvdimm]
          really_probe+0xef/0x390
          driver_probe_device+0xb4/0x100
    
    In this sequence an 'nd_dax' device is being probed and trying to take
    the lock on its backing namespace to validate that the 'nd_dax' device
    indeed has exclusive access to the backing namespace. Meanwhile, another
    thread is trying to update the uuid property of that same backing
    namespace. So one thread is in the probe path trying to acquire the
    lock, and the other thread has acquired the lock and tries to flush the
    probe path.
    
    Fix this deadlock by not holding the namespace device_lock over the
    wait_nvdimm_bus_probe_idle() synchronization step. In turn this requires
    the device_lock to be held on entry to wait_nvdimm_bus_probe_idle() and
    subsequently dropped internally to wait_nvdimm_bus_probe_idle().
    
    Cc: <stable@vger.kernel.org>
    Fixes: bf9bccc1 ("libnvdimm: pmem label sets and namespace instantiation")
    Cc: Vishal Verma <vishal.l.verma@intel.com>
    Tested-by: default avatarJane Chu <jane.chu@oracle.com>
    Link: https://lore.kernel.org/r/156341210094.292348.2384694131126767789.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: default avatarDan Williams <dan.j.williams@intel.com>
    ca6bf264
bus.c 29.9 KB