• Guenter Roeck's avatar
    hwmon: (acpi_power_meter) Fix lockdep splat · badcd454
    Guenter Roeck authored
    Damien Le Moal reports a lockdep splat with the acpi_power_meter,
    observed with Linux v5.5 and later.
    
    ======================================================
    WARNING: possible circular locking dependency detected
    5.6.0-rc2+ #629 Not tainted
    ------------------------------------------------------
    python/1397 is trying to acquire lock:
    ffff888619080070 (&resource->lock){+.+.}, at: show_power+0x3c/0xa0 [acpi_power_meter]
    
                   but task is already holding lock:
    ffff88881643f188 (kn->count#119){++++}, at: kernfs_seq_start+0x6a/0x160
    
                   which lock already depends on the new lock.
    
                   the existing dependency chain (in reverse order) is:
    
                   -> #1 (kn->count#119){++++}:
           __kernfs_remove+0x626/0x7e0
           kernfs_remove_by_name_ns+0x41/0x80
           remove_attrs+0xcb/0x3c0 [acpi_power_meter]
           acpi_power_meter_notify+0x1f7/0x310 [acpi_power_meter]
           acpi_ev_notify_dispatch+0x198/0x1f3
           acpi_os_execute_deferred+0x4d/0x70
           process_one_work+0x7c8/0x1340
           worker_thread+0x94/0xc70
           kthread+0x2ed/0x3f0
           ret_from_fork+0x24/0x30
    
                   -> #0 (&resource->lock){+.+.}:
           __lock_acquire+0x20be/0x49b0
           lock_acquire+0x127/0x340
           __mutex_lock+0x15b/0x1350
           show_power+0x3c/0xa0 [acpi_power_meter]
           dev_attr_show+0x3f/0x80
           sysfs_kf_seq_show+0x216/0x410
           seq_read+0x407/0xf90
           vfs_read+0x152/0x2c0
           ksys_read+0xf3/0x1d0
           do_syscall_64+0x95/0x1010
           entry_SYSCALL_64_after_hwframe+0x49/0xbe
    
                   other info that might help us debug this:
    
     Possible unsafe locking scenario:
    
           CPU0                    CPU1
           ----                    ----
      lock(kn->count#119);
                                   lock(&resource->lock);
                                   lock(kn->count#119);
      lock(&resource->lock);
    
                    *** DEADLOCK ***
    4 locks held by python/1397:
     #0: ffff8890242d64e0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x9b/0xb0
     #1: ffff889040be74e0 (&p->lock){+.+.}, at: seq_read+0x6b/0xf90
     #2: ffff8890448eb880 (&of->mutex){+.+.}, at: kernfs_seq_start+0x47/0x160
     #3: ffff88881643f188 (kn->count#119){++++}, at: kernfs_seq_start+0x6a/0x160
    
                   stack backtrace:
    CPU: 10 PID: 1397 Comm: python Not tainted 5.6.0-rc2+ #629
    Hardware name: Supermicro Super Server/X11DPL-i, BIOS 3.1 05/21/2019
    Call Trace:
     dump_stack+0x97/0xe0
     check_noncircular+0x32e/0x3e0
     ? print_circular_bug.isra.0+0x1e0/0x1e0
     ? unwind_next_frame+0xb9a/0x1890
     ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
     ? graph_lock+0x79/0x170
     ? __lockdep_reset_lock+0x3c0/0x3c0
     ? mark_lock+0xbc/0x1150
     __lock_acquire+0x20be/0x49b0
     ? mark_held_locks+0xe0/0xe0
     ? stack_trace_save+0x91/0xc0
     lock_acquire+0x127/0x340
     ? show_power+0x3c/0xa0 [acpi_power_meter]
     ? device_remove_bin_file+0x10/0x10
     ? device_remove_bin_file+0x10/0x10
     __mutex_lock+0x15b/0x1350
     ? show_power+0x3c/0xa0 [acpi_power_meter]
     ? show_power+0x3c/0xa0 [acpi_power_meter]
     ? mutex_lock_io_nested+0x11f0/0x11f0
     ? lock_downgrade+0x6a0/0x6a0
     ? kernfs_seq_start+0x47/0x160
     ? lock_acquire+0x127/0x340
     ? kernfs_seq_start+0x6a/0x160
     ? device_remove_bin_file+0x10/0x10
     ? show_power+0x3c/0xa0 [acpi_power_meter]
     show_power+0x3c/0xa0 [acpi_power_meter]
     dev_attr_show+0x3f/0x80
     ? memset+0x20/0x40
     sysfs_kf_seq_show+0x216/0x410
     seq_read+0x407/0xf90
     ? security_file_permission+0x16f/0x2c0
     vfs_read+0x152/0x2c0
    
    Problem is that reading an attribute takes the kernfs lock in the kernfs
    code, then resource->lock in the driver. During an ACPI notification, the
    opposite happens: The resource lock is taken first, followed by the kernfs
    lock when sysfs attributes are removed and re-created. Presumably this is
    now seen due to some locking related changes in kernfs after v5.4, but it
    was likely always a problem.
    
    Fix the problem by not blindly acquiring the lock in the notification
    function. It is only needed to protect the various update functions.
    However, those update functions are called anyway when sysfs attributes
    are read. This means that we can just stop calling those functions from
    the notifier, and the resource lock in the notifier function is no longer
    needed.
    
    That leaves two situations:
    
    First, METER_NOTIFY_CONFIG removes and re-allocates capability strings.
    While it did so under the resource lock, _displaying_ those strings was not
    protected, creating a race condition. To solve this problem, selectively
    protect both removal/creation and reporting of capability attributes with
    the resource lock.
    
    Second, removing and re-creating the attribute files is no longer protected
    by the resource lock. That doesn't matter since access to each individual
    attribute is protected by the kernfs lock. Userspace may get messed up if
    attributes disappear and reappear under its nose, but that is not different
    than today, and there is nothing we can do about it without major driver
    restructuring.
    
    Last but not least, when removing the driver, remove attribute functions
    first, then release capability strings. This avoids yet another race
    condition.
    Reported-by: default avatarDamien Le Moal <Damien.LeMoal@wdc.com>
    Cc: Damien Le Moal <Damien.LeMoal@wdc.com>
    Cc: stable@vger.kernel.org # v5.5+
    Tested-by: default avatarDamien Le Moal <damien.lemoal@wdc.com>
    Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
    badcd454
acpi_power_meter.c 23.2 KB