Commit 0cb42c02 authored by Jason Gunthorpe

RDMA/core: Fix bogus WARN_ON during ib_unregister_device_queued()

ib_unregister_device_queued() can only be used by drivers using the new
dealloc_device callback flow, and it has a safety WARN_ON to ensure
drivers are using it properly.

However, if unregister and register are raced, there is a special
destruction path that maintains the uniform error handling semantic of
'caller does ib_dealloc_device() on failure'. This requires disabling the
dealloc_driver callback, which triggers the WARN_ON.

Instead of using NULL to disable the callback, use a special function
pointer so the WARN_ON does not trigger.
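
As a minimal stand-alone sketch of the sentinel-callback idea this fix relies
on (hypothetical names and a userspace harness, not the kernel code): an empty
function serves purely as a marker value, so a later check can tell "callback
temporarily disabled" apart from "callback never provided" instead of
overloading NULL for both.

#include <stdio.h>

struct dev_ops {
	void (*dealloc_driver)(void *dev);
};

/* Sentinel: never meant to do real work, only to be compared against. */
static void prevent_dealloc(void *dev)
{
}

/* Call through the pointer only when it is a real driver callback. */
static void maybe_dealloc(struct dev_ops *ops, void *dev)
{
	if (ops->dealloc_driver && ops->dealloc_driver != prevent_dealloc)
		ops->dealloc_driver(dev);
	else
		printf("dealloc skipped (disabled or not provided)\n");
}

static void driver_dealloc(void *dev)
{
	printf("driver dealloc ran\n");
}

int main(void)
{
	struct dev_ops ops = { .dealloc_driver = driver_dealloc };

	maybe_dealloc(&ops, NULL);             /* real callback runs */

	ops.dealloc_driver = prevent_dealloc;  /* temporarily disable it */
	maybe_dealloc(&ops, NULL);             /* skipped, no bogus warning */
	return 0;
}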

Fixes: d0899892 ("RDMA/device: Provide APIs from the core code to help unregistration")
Link: https://lore.kernel.org/r/0-v1-a36d512e0a99+762-syz_dealloc_driver_jgg@nvidia.com
Reported-by: syzbot+4088ed905e4ae2b0e13b@syzkaller.appspotmail.com
Suggested-by: Hillf Danton <hdanton@sina.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
parent 65936bf2
@@ -1339,6 +1339,10 @@ static int enable_device_and_get(struct ib_device *device)
 	return ret;
 }
 
+static void prevent_dealloc_device(struct ib_device *ib_dev)
+{
+}
+
 /**
  * ib_register_device - Register an IB device with IB core
  * @device: Device to register
@@ -1409,11 +1413,11 @@ int ib_register_device(struct ib_device *device, const char *name)
 		 * possibility for a parallel unregistration along with this
 		 * error flow. Since we have a refcount here we know any
 		 * parallel flow is stopped in disable_device and will see the
-		 * NULL pointers, causing the responsibility to
+		 * special dealloc_driver pointer, causing the responsibility to
 		 * ib_dealloc_device() to revert back to this thread.
 		 */
 		dealloc_fn = device->ops.dealloc_driver;
-		device->ops.dealloc_driver = NULL;
+		device->ops.dealloc_driver = prevent_dealloc_device;
 		ib_device_put(device);
 		__ib_unregister_device(device);
 		device->ops.dealloc_driver = dealloc_fn;
@@ -1462,7 +1466,8 @@ static void __ib_unregister_device(struct ib_device *ib_dev)
 	 * Drivers using the new flow may not call ib_dealloc_device except
 	 * in error unwind prior to registration success.
 	 */
-	if (ib_dev->ops.dealloc_driver) {
+	if (ib_dev->ops.dealloc_driver &&
+	    ib_dev->ops.dealloc_driver != prevent_dealloc_device) {
 		WARN_ON(kref_read(&ib_dev->dev.kobj.kref) <= 1);
 		ib_dealloc_device(ib_dev);
 	}