    IB/mlx4: Fix lockdep splat for the iboe lock · dba3ad2a
    Jack Morgenstein authored
    Chuck Lever reported the following stack trace:
    
        =================================
        [ INFO: inconsistent lock state ]
        3.16.0-rc2-00024-g2e78883 #17 Tainted: G            E
        ---------------------------------
        inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
        swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
        (&(&iboe->lock)->rlock){+.?...}, at: [<ffffffffa065f68b>] mlx4_ib_addr_event+0xdb/0x1a0 [mlx4_ib]
        {SOFTIRQ-ON-W} state was registered at:
         [<ffffffff810b3110>] mark_irqflags+0x110/0x170
         [<ffffffff810b4806>] __lock_acquire+0x2c6/0x5b0
         [<ffffffff810b4bd9>] lock_acquire+0xe9/0x120
         [<ffffffff815f7f6e>] _raw_spin_lock+0x3e/0x80
         [<ffffffffa0661084>] mlx4_ib_scan_netdevs+0x34/0x260 [mlx4_ib]
         [<ffffffffa06612db>] mlx4_ib_netdev_event+0x2b/0x40 [mlx4_ib]
         [<ffffffff81522219>] register_netdevice_notifier+0x99/0x1e0
         [<ffffffffa06626e3>] mlx4_ib_add+0x743/0xbc0 [mlx4_ib]
         [<ffffffffa05ec168>] mlx4_add_device+0x48/0xa0 [mlx4_core]
         [<ffffffffa05ec2c3>] mlx4_register_interface+0x73/0xb0 [mlx4_core]
         [<ffffffffa05c505e>] cm_req_handler+0x13e/0x460 [ib_cm]
         [<ffffffff810002e2>] do_one_initcall+0x112/0x1c0
         [<ffffffff810e8264>] do_init_module+0x34/0x190
         [<ffffffff810ea62f>] load_module+0x5cf/0x740
         [<ffffffff810ea939>] SyS_init_module+0x99/0xd0
         [<ffffffff815f8fd2>] system_call_fastpath+0x16/0x1b
        irq event stamp: 336142
        hardirqs last  enabled at (336142): [<ffffffff810612f5>] __local_bh_enable_ip+0xb5/0xc0
        hardirqs last disabled at (336141): [<ffffffff81061296>] __local_bh_enable_ip+0x56/0xc0
        softirqs last  enabled at (336004): [<ffffffff8106123a>] _local_bh_enable+0x4a/0x50
        softirqs last disabled at (336005): [<ffffffff810617a4>] irq_exit+0x44/0xd0
    
        other info that might help us debug this:
        Possible unsafe locking scenario:
    
              CPU0
              ----
         lock(&(&iboe->lock)->rlock);
         <Interrupt>
           lock(&(&iboe->lock)->rlock);
    
        *** DEADLOCK ***
    
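    The problem is that the iboe spin lock is taken both in process context
    and in soft-irq context (in a netdev notifier handler). The process-context
    path takes it with plain spin_lock(), which leaves softirqs enabled, so a
    softirq that fires on the same CPU while the lock is held would spin on
    the lock forever.
    
    The fix is to use spin_lock_bh()/spin_unlock_bh() instead of
    spin_lock()/spin_unlock() on the iboe lock.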
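
The scenario lockdep warns about can be illustrated with a small userspace
model. This is only a sketch and not the patch itself: it models a single CPU
with a softirq-disable counter, and every toy_* name is hypothetical. Only the
behaviour it imitates is taken from the kernel, namely that spin_lock_bh()
masks softirqs on the local CPU and spin_unlock_bh() lets pending ones run.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model. NOT kernel code; all toy_* names are hypothetical. */
struct toy_cpu {
	bool lock_held;        /* stands in for iboe->lock held on this CPU */
	int  bh_disable_depth; /* > 0 means softirqs are masked */
	bool softirq_pending;  /* raised but not yet run */
	bool deadlocked;       /* handler found the lock held by its own CPU */
	int  handler_runs;     /* times the handler completed */
};

/* Softirq-context path: needs the same lock as the process-context path. */
static void toy_softirq_handler(struct toy_cpu *cpu)
{
	if (cpu->lock_held) {
		/* A real spinlock would spin here forever. */
		cpu->deadlocked = true;
		return;
	}
	cpu->lock_held = true;
	cpu->handler_runs++;
	cpu->lock_held = false;
}

/* Run a pending softirq only if softirqs are not masked. */
static void toy_run_pending(struct toy_cpu *cpu)
{
	if (cpu->softirq_pending && cpu->bh_disable_depth == 0) {
		cpu->softirq_pending = false;
		toy_softirq_handler(cpu);
	}
}

static void toy_raise_softirq(struct toy_cpu *cpu)
{
	cpu->softirq_pending = true;
	toy_run_pending(cpu);
}

/* Plain variant: softirqs stay enabled while the lock is held. */
static void toy_spin_lock(struct toy_cpu *cpu)   { cpu->lock_held = true; }
static void toy_spin_unlock(struct toy_cpu *cpu) { cpu->lock_held = false; }

/* _bh variant: mask softirqs first, run pending ones after unlocking. */
static void toy_spin_lock_bh(struct toy_cpu *cpu)
{
	cpu->bh_disable_depth++;
	cpu->lock_held = true;
}

static void toy_spin_unlock_bh(struct toy_cpu *cpu)
{
	assert(cpu->bh_disable_depth > 0);
	cpu->lock_held = false;
	cpu->bh_disable_depth--;
	toy_run_pending(cpu);
}

/* Process-context path; a softirq is raised while the lock is held. */
static struct toy_cpu toy_scan(bool use_bh)
{
	struct toy_cpu cpu = {0};

	if (use_bh)
		toy_spin_lock_bh(&cpu);
	else
		toy_spin_lock(&cpu);

	toy_raise_softirq(&cpu);

	if (use_bh)
		toy_spin_unlock_bh(&cpu);
	else
		toy_spin_unlock(&cpu);

	return cpu;
}
```

In this model toy_scan(false) ends with deadlocked set, because the handler
runs while the lock is still held. toy_scan(true) defers the handler until
toy_spin_unlock_bh(), so it completes once and never sees the lock held.
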
    Reported-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
    Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: Roland Dreier <roland@purestorage.com>