    netdevsim: Block until all devices are released · 6aff7cbf
    Ido Schimmel authored
    Like other buses, devices on the netdevsim bus have a release callback
    that is invoked when the reference count of the device drops to zero.
    However, unlike other buses such as PCI, the release callback is not
    necessarily built into the kernel, as netdevsim can be built as a
    module.
    
    The above is problematic as nothing prevents the module from being
    unloaded before the release callback has been invoked, which can happen
    asynchronously. One such example can be found in commit a3806872
    ("devlink: take device reference for devlink object") where devlink
    calls put_device() from an RCU callback.
    
    The issue is not theoretical: the reproducer in [1] reliably crashes
    the kernel. The conclusion of the discussion in [1] was that the issue
    should be solved in netdevsim, which is what this patch does.
    
    Add a reference count that is incremented when a device is added to
    the bus and decremented when a device is released. Signal a completion
    when the reference count drops to zero, and wait for that completion
    when unloading the module so that the module is not unloaded before
    all the devices have been released. The reference count is initialized
    to one, representing the bus itself, so that the completion can only
    be signaled while the module is being unloaded.
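
The scheme above can be illustrated with a minimal userspace sketch. It is
not the code in bus.c: pthreads and C11 atomics stand in for the kernel's
refcount and completion primitives, and every name below (bus_devs,
devs_released, bus_init, bus_dev_add, bus_dev_release, bus_exit,
late_release) is hypothetical and chosen only for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

/* Userspace stand-in for a kernel completion (assumption, not kernel API). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static struct completion devs_released = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.cond = PTHREAD_COND_INITIALIZER,
	.done = false,
};

/* Counts live devices plus one reference for the bus itself. */
static atomic_int bus_devs;

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Module init: start at one so that releases alone never reach zero. */
static void bus_init(void)
{
	atomic_store(&bus_devs, 1);
}

/* Called when a device is added to the bus. */
static void bus_dev_add(void)
{
	atomic_fetch_add(&bus_devs, 1);
}

/* Release callback: may run asynchronously, after the device was deleted. */
static void bus_dev_release(void)
{
	/* atomic_fetch_sub() returns the previous value. */
	if (atomic_fetch_sub(&bus_devs, 1) == 1)
		complete(&devs_released);
}

/* Module exit: drop the initial reference, then block if devices remain. */
static void bus_exit(void)
{
	if (atomic_fetch_sub(&bus_devs, 1) != 1)
		wait_for_completion(&devs_released);
}

/* Simulates a release that arrives late, e.g. from a deferred callback. */
static void *late_release(void *arg)
{
	(void)arg;
	usleep(10000);
	bus_dev_release();
	return NULL;
}
```

In this sketch, calling bus_init(), then bus_dev_add(), then starting
late_release() in a thread before calling bus_exit() makes bus_exit() block
until the delayed release has run. If every device was already released
before bus_exit(), the final decrement happens in bus_exit() itself and no
wait is needed.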
    
    With this patch, the reproducer in [1] no longer crashes the kernel.
    
    [1] https://lore.kernel.org/netdev/20230619125015.1541143-2-idosch@nvidia.com/
    
    Fixes: a3806872 ("devlink: take device reference for devlink object")
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Link: https://lore.kernel.org/r/20231026083343.890689-1-idosch@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
bus.c 8.8 KB