    thunderbolt: Take domain lock in switch sysfs attribute callbacks · 09f11b6c
    Mika Westerberg authored
    switch_lock was introduced because it allowed serialization of device
    authorization requests from userspace without the need to take the big
    domain lock (tb->lock). This was fine because device authorization with
    ICM is just one command that is sent to the firmware. Now that we start
    to handle all tunneling in the driver, switch_lock is not enough because
    we need to walk over the topology to establish paths.
    
    For this reason drop switch_lock from the driver completely in favour of
    the big domain lock.
    
    There is one complication, though. If userspace is waiting for the lock
    in tb_switch_set_authorized(), it keeps device_del() from removing
    the sysfs attribute, because device_del() waits for active users to
    release the attribute first. This leads to the following splat:
    
        INFO: task kworker/u8:3:73 blocked for more than 61 seconds.
              Tainted: G        W         5.1.0-rc1+ #244
        "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        kworker/u8:3    D12976    73      2 0x80000000
        Workqueue: thunderbolt0 tb_handle_hotplug [thunderbolt]
        Call Trace:
         ? __schedule+0x2e5/0x740
         ? _raw_spin_lock_irqsave+0x12/0x40
         ? prepare_to_wait_event+0xc5/0x160
         schedule+0x2d/0x80
         __kernfs_remove.part.17+0x183/0x1f0
         ? finish_wait+0x80/0x80
         kernfs_remove_by_name_ns+0x4a/0x90
         remove_files.isra.1+0x2b/0x60
         sysfs_remove_group+0x38/0x80
         sysfs_remove_groups+0x24/0x40
         device_remove_attrs+0x3d/0x70
         device_del+0x14c/0x360
         device_unregister+0x15/0x50
         tb_switch_remove+0x9e/0x1d0 [thunderbolt]
         tb_handle_hotplug+0x119/0x5a0 [thunderbolt]
         ? process_one_work+0x1b7/0x420
         process_one_work+0x1b7/0x420
         worker_thread+0x37/0x380
         ? _raw_spin_unlock_irqrestore+0xf/0x30
         ? process_one_work+0x420/0x420
         kthread+0x118/0x130
         ? kthread_create_on_node+0x60/0x60
         ret_from_fork+0x35/0x40
    
    We deal with this by following what the network stack did for some of
    its attributes and use mutex_trylock() with restart_syscall(). This makes
    userspace release the attribute, allowing sysfs attribute removal to
    progress before the write is restarted, and the restarted write
    eventually fails once the attribute is removed.
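
The trylock-and-restart idea described above can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not the
driver code: pthread_mutex_trylock() stands in for the kernel's
mutex_trylock(), the SKETCH_RESTART value stands in for whatever
restart_syscall() returns, and every sketch_* name is hypothetical.

```c
#include <pthread.h>

/* Hypothetical sentinel standing in for the kernel's restart return code. */
#define SKETCH_RESTART (-1000)

/* Hypothetical stand-in for the domain; lock plays the role of tb->lock. */
struct sketch_domain {
	pthread_mutex_t lock;
};

/* Hypothetical stand-in for a switch that belongs to a domain. */
struct sketch_switch {
	struct sketch_domain *tb;
	unsigned int authorized;
};

/*
 * Sketch of an attribute store path. Instead of sleeping on the domain
 * lock (which would keep the attribute "active" and stall its removal),
 * try the lock once and ask the caller to restart if it is contended.
 */
static int sketch_set_authorized(struct sketch_switch *sw, unsigned int val)
{
	/* Non-zero means the lock is busy: back off instead of blocking. */
	if (pthread_mutex_trylock(&sw->tb->lock) != 0)
		return SKETCH_RESTART;

	/* Lock held: safe to touch state guarded by the domain lock. */
	sw->authorized = val;

	pthread_mutex_unlock(&sw->tb->lock);
	return 0;
}
```

A caller that sees SKETCH_RESTART is expected to leave the callback and
issue the request again. In the kernel case described above, that gives
attribute removal a window to complete, after which the restarted write
fails because the attribute no longer exists.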
    Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
tb.h 15.5 KB