• Vlastimil Babka's avatar
    mm, slub: move slub_debug static key enabling outside slab_mutex · afe0c26d
    Vlastimil Babka authored
    Paul E.  McKenney reported [1] that commit 1f0723a4 ("mm, slub: enable
    slub_debug static key when creating cache with explicit debug flags")
    results in the lockdep complaint:
    
     ======================================================
     WARNING: possible circular locking dependency detected
     5.12.0+ #15 Not tainted
     ------------------------------------------------------
     rcu_torture_sta/109 is trying to acquire lock:
     ffffffff96063cd0 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0x9/0x20
    
     but task is already holding lock:
     ffffffff96173c28 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_create_usercopy+0x2d/0x250
    
     which lock already depends on the new lock.
    
     the existing dependency chain (in reverse order) is:
    
     -> #1 (slab_mutex){+.+.}-{3:3}:
            lock_acquire+0xb9/0x3a0
            __mutex_lock+0x8d/0x920
            slub_cpu_dead+0x15/0xf0
            cpuhp_invoke_callback+0x17a/0x7c0
            cpuhp_invoke_callback_range+0x3b/0x80
            _cpu_down+0xdf/0x2a0
            cpu_down+0x2c/0x50
            device_offline+0x82/0xb0
            remove_cpu+0x1a/0x30
            torture_offline+0x80/0x140
            torture_onoff+0x147/0x260
            kthread+0x10a/0x140
            ret_from_fork+0x22/0x30
    
     -> #0 (cpu_hotplug_lock){++++}-{0:0}:
            check_prev_add+0x8f/0xbf0
            __lock_acquire+0x13f0/0x1d80
            lock_acquire+0xb9/0x3a0
            cpus_read_lock+0x21/0xa0
            static_key_enable+0x9/0x20
            __kmem_cache_create+0x38d/0x430
            kmem_cache_create_usercopy+0x146/0x250
            kmem_cache_create+0xd/0x10
            rcu_torture_stats+0x79/0x280
            kthread+0x10a/0x140
            ret_from_fork+0x22/0x30
    
     other info that might help us debug this:
    
      Possible unsafe locking scenario:
    
            CPU0                    CPU1
            ----                    ----
       lock(slab_mutex);
                                    lock(cpu_hotplug_lock);
                                    lock(slab_mutex);
       lock(cpu_hotplug_lock);
    
      *** DEADLOCK ***
    
     1 lock held by rcu_torture_sta/109:
      #0: ffffffff96173c28 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_create_usercopy+0x2d/0x250
    
     stack backtrace:
     CPU: 3 PID: 109 Comm: rcu_torture_sta Not tainted 5.12.0+ #15
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
     Call Trace:
      dump_stack+0x6d/0x89
      check_noncircular+0xfe/0x110
      ? lock_is_held_type+0x98/0x110
      check_prev_add+0x8f/0xbf0
      __lock_acquire+0x13f0/0x1d80
      lock_acquire+0xb9/0x3a0
      ? static_key_enable+0x9/0x20
      ? mark_held_locks+0x49/0x70
      cpus_read_lock+0x21/0xa0
      ? static_key_enable+0x9/0x20
      static_key_enable+0x9/0x20
      __kmem_cache_create+0x38d/0x430
      kmem_cache_create_usercopy+0x146/0x250
      ? rcu_torture_stats_print+0xd0/0xd0
      kmem_cache_create+0xd/0x10
      rcu_torture_stats+0x79/0x280
      ? rcu_torture_stats_print+0xd0/0xd0
      kthread+0x10a/0x140
      ? kthread_park+0x80/0x80
      ret_from_fork+0x22/0x30
    
    This is because there's one order of locking from the hotplug callbacks:
    
    lock(cpu_hotplug_lock); // from hotplug machinery itself
    lock(slab_mutex); // in e.g. slab_mem_going_offline_callback()
    
    And commit 1f0723a4 made the reverse sequence possible:
    lock(slab_mutex); // in kmem_cache_create_usercopy()
    lock(cpu_hotplug_lock); // kmem_cache_open() -> static_key_enable()
    
    The simplest fix is to move static_key_enable() to a place before slab_mutex is
    taken. That means kmem_cache_create_usercopy() in mm/slab_common.c which is not
    ideal for SLUB-specific code, but the #ifdef CONFIG_SLUB_DEBUG makes it
    at least self-contained and obvious.
    
    [1] https://lore.kernel.org/lkml/20210502171827.GA3670492@paulmck-ThinkPad-P17-Gen-1/
    
    Link: https://lkml.kernel.org/r/20210504120019.26791-1-vbabka@suse.cz
    Fixes: 1f0723a4 ("mm, slub: enable slub_debug static key when creating cache with explicit debug flags")
    Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
    Reported-by: default avatarPaul E. McKenney <paulmck@kernel.org>
    Tested-by: default avatarPaul E. McKenney <paulmck@kernel.org>
    Acked-by: default avatarDavid Rientjes <rientjes@google.com>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    afe0c26d
slub.c 143 KB