• Julian Anastasov's avatar
    ipvs: fix rtnl_lock lockups caused by start_sync_thread · 5c64576a
    Julian Anastasov authored
    syzkaller reports for wrong rtnl_lock usage in sync code [1] and [2]
    
    We have 2 problems in start_sync_thread if error path is
    taken, eg. on memory allocation error or failure to configure
    sockets for mcast group or addr/port binding:
    
    1. recursive locking: holding rtnl_lock while calling sock_release
    which in turn calls again rtnl_lock in ip_mc_drop_socket to leave
    the mcast group, as noticed by Florian Westphal. Additionally,
    sock_release can not be called while holding sync_mutex (ABBA
    deadlock).
    
    2. task hung: holding rtnl_lock while calling kthread_stop to
    stop the running kthreads. As the kthreads do the same to leave
    the mcast group (sock_release -> ip_mc_drop_socket -> rtnl_lock)
    they hang.
    
    Fix the problems by calling rtnl_unlock early in the error path,
    now sock_release is called after unlocking both mutexes.
    
    Problem 3 (task hung reported by syzkaller [2]) is variant of
    problem 2: use _trylock to prevent one user to call rtnl_lock and
    then while waiting for sync_mutex to block kthreads that execute
    sock_release when they are stopped by stop_sync_thread.
    
    [1]
    IPVS: stopping backup sync thread 4500 ...
    WARNING: possible recursive locking detected
    4.16.0-rc7+ #3 Not tainted
    --------------------------------------------
    syzkaller688027/4497 is trying to acquire lock:
      (rtnl_mutex){+.+.}, at: [<00000000bb14d7fb>] rtnl_lock+0x17/0x20
    net/core/rtnetlink.c:74
    
    but task is already holding lock:
    IPVS: stopping backup sync thread 4495 ...
      (rtnl_mutex){+.+.}, at: [<00000000bb14d7fb>] rtnl_lock+0x17/0x20
    net/core/rtnetlink.c:74
    
    other info that might help us debug this:
      Possible unsafe locking scenario:
    
            CPU0
            ----
       lock(rtnl_mutex);
       lock(rtnl_mutex);
    
      *** DEADLOCK ***
    
      May be due to missing lock nesting notation
    
    2 locks held by syzkaller688027/4497:
      #0:  (rtnl_mutex){+.+.}, at: [<00000000bb14d7fb>] rtnl_lock+0x17/0x20
    net/core/rtnetlink.c:74
      #1:  (ipvs->sync_mutex){+.+.}, at: [<00000000703f78e3>]
    do_ip_vs_set_ctl+0x10f8/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2388
    
    stack backtrace:
    CPU: 1 PID: 4497 Comm: syzkaller688027 Not tainted 4.16.0-rc7+ #3
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    Call Trace:
      __dump_stack lib/dump_stack.c:17 [inline]
      dump_stack+0x194/0x24d lib/dump_stack.c:53
      print_deadlock_bug kernel/locking/lockdep.c:1761 [inline]
      check_deadlock kernel/locking/lockdep.c:1805 [inline]
      validate_chain kernel/locking/lockdep.c:2401 [inline]
      __lock_acquire+0xe8f/0x3e00 kernel/locking/lockdep.c:3431
      lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
      __mutex_lock_common kernel/locking/mutex.c:756 [inline]
      __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
      mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
      rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
      ip_mc_drop_socket+0x88/0x230 net/ipv4/igmp.c:2643
      inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:413
      sock_release+0x8d/0x1e0 net/socket.c:595
      start_sync_thread+0x2213/0x2b70 net/netfilter/ipvs/ip_vs_sync.c:1924
      do_ip_vs_set_ctl+0x1139/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2389
      nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
      nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
      ip_setsockopt+0x97/0xa0 net/ipv4/ip_sockglue.c:1261
      udp_setsockopt+0x45/0x80 net/ipv4/udp.c:2406
      sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2975
      SYSC_setsockopt net/socket.c:1849 [inline]
      SyS_setsockopt+0x189/0x360 net/socket.c:1828
      do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
      entry_SYSCALL_64_after_hwframe+0x42/0xb7
    RIP: 0033:0x446a69
    RSP: 002b:00007fa1c3a64da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000446a69
    RDX: 000000000000048b RSI: 0000000000000000 RDI: 0000000000000003
    RBP: 00000000006e29fc R08: 0000000000000018 R09: 0000000000000000
    R10: 00000000200000c0 R11: 0000000000000246 R12: 00000000006e29f8
    R13: 00676e697279656b R14: 00007fa1c3a659c0 R15: 00000000006e2b60
    
    [2]
    IPVS: sync thread started: state = BACKUP, mcast_ifn = syz_tun, syncid = 4,
    id = 0
    IPVS: stopping backup sync thread 25415 ...
    INFO: task syz-executor7:25421 blocked for more than 120 seconds.
           Not tainted 4.16.0-rc6+ #284
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    syz-executor7   D23688 25421   4408 0x00000004
    Call Trace:
      context_switch kernel/sched/core.c:2862 [inline]
      __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440
      schedule+0xf5/0x430 kernel/sched/core.c:3499
      schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777
      do_wait_for_common kernel/sched/completion.c:86 [inline]
      __wait_for_common kernel/sched/completion.c:107 [inline]
      wait_for_common kernel/sched/completion.c:118 [inline]
      wait_for_completion+0x415/0x770 kernel/sched/completion.c:139
      kthread_stop+0x14a/0x7a0 kernel/kthread.c:530
      stop_sync_thread+0x3d9/0x740 net/netfilter/ipvs/ip_vs_sync.c:1996
      do_ip_vs_set_ctl+0x2b1/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2394
      nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
      nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
      ip_setsockopt+0x97/0xa0 net/ipv4/ip_sockglue.c:1253
      sctp_setsockopt+0x2ca/0x63e0 net/sctp/socket.c:4154
      sock_common_setsockopt+0x95/0xd0 net/core/sock.c:3039
      SYSC_setsockopt net/socket.c:1850 [inline]
      SyS_setsockopt+0x189/0x360 net/socket.c:1829
      do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
      entry_SYSCALL_64_after_hwframe+0x42/0xb7
    RIP: 0033:0x454889
    RSP: 002b:00007fc927626c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
    RAX: ffffffffffffffda RBX: 00007fc9276276d4 RCX: 0000000000454889
    RDX: 000000000000048c RSI: 0000000000000000 RDI: 0000000000000017
    RBP: 000000000072bf58 R08: 0000000000000018 R09: 0000000000000000
    R10: 0000000020000000 R11: 0000000000000246 R12: 00000000ffffffff
    R13: 000000000000051c R14: 00000000006f9b40 R15: 0000000000000001
    
    Showing all locks held in the system:
    2 locks held by khungtaskd/868:
      #0:  (rcu_read_lock){....}, at: [<00000000a1a8f002>]
    check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
      #0:  (rcu_read_lock){....}, at: [<00000000a1a8f002>] watchdog+0x1c5/0xd60
    kernel/hung_task.c:249
      #1:  (tasklist_lock){.+.+}, at: [<0000000037c2f8f9>]
    debug_show_all_locks+0xd3/0x3d0 kernel/locking/lockdep.c:4470
    1 lock held by rsyslogd/4247:
      #0:  (&f->f_pos_lock){+.+.}, at: [<000000000d8d6983>]
    __fdget_pos+0x12b/0x190 fs/file.c:765
    2 locks held by getty/4338:
      #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
    ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
      #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
    n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
    2 locks held by getty/4339:
      #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
    ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
      #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
    n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
    2 locks held by getty/4340:
      #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
    ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
      #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
    n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
    2 locks held by getty/4341:
      #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
    ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
      #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
    n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
    2 locks held by getty/4342:
      #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
    ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
      #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
    n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
    2 locks held by getty/4343:
      #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
    ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
      #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
    n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
    2 locks held by getty/4344:
      #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
    ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
      #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
    n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
    3 locks held by kworker/0:5/6494:
      #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
    [<00000000a062b18e>] work_static include/linux/workqueue.h:198 [inline]
      #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
    [<00000000a062b18e>] set_work_data kernel/workqueue.c:619 [inline]
      #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
    [<00000000a062b18e>] set_work_pool_and_clear_pending kernel/workqueue.c:646
    [inline]
      #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
    [<00000000a062b18e>] process_one_work+0xb12/0x1bb0 kernel/workqueue.c:2084
      #1:  ((addr_chk_work).work){+.+.}, at: [<00000000278427d5>]
    process_one_work+0xb89/0x1bb0 kernel/workqueue.c:2088
      #2:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
    net/core/rtnetlink.c:74
    1 lock held by syz-executor7/25421:
      #0:  (ipvs->sync_mutex){+.+.}, at: [<00000000d414a689>]
    do_ip_vs_set_ctl+0x277/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2393
    2 locks held by syz-executor7/25427:
      #0:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
    net/core/rtnetlink.c:74
      #1:  (ipvs->sync_mutex){+.+.}, at: [<00000000e6d48489>]
    do_ip_vs_set_ctl+0x10f8/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2388
    1 lock held by syz-executor7/25435:
      #0:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
    net/core/rtnetlink.c:74
    1 lock held by ipvs-b:2:0/25415:
      #0:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
    net/core/rtnetlink.c:74
    
    Reported-and-tested-by: syzbot+a46d6abf9d56b1365a72@syzkaller.appspotmail.com
    Reported-and-tested-by: syzbot+5fe074c01b2032ce9618@syzkaller.appspotmail.com
    Fixes: e0b26cc9 ("ipvs: call rtnl_lock early")
    Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
    Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
    Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
    5c64576a
ip_vs_ctl.c 102 KB