1. 16 May, 2018 1 commit
    • Julian Anastasov's avatar
      ipvs: fix rtnl_lock lockups caused by start_sync_thread · 83797a77
      Julian Anastasov authored
      commit 5c64576a upstream.
      
      syzkaller reports for wrong rtnl_lock usage in sync code [1] and [2]
      
      We have 2 problems in start_sync_thread if error path is
      taken, eg. on memory allocation error or failure to configure
      sockets for mcast group or addr/port binding:
      
      1. recursive locking: holding rtnl_lock while calling sock_release
      which in turn calls again rtnl_lock in ip_mc_drop_socket to leave
      the mcast group, as noticed by Florian Westphal. Additionally,
      sock_release can not be called while holding sync_mutex (ABBA
      deadlock).
      
      2. task hung: holding rtnl_lock while calling kthread_stop to
      stop the running kthreads. As the kthreads do the same to leave
      the mcast group (sock_release -> ip_mc_drop_socket -> rtnl_lock)
      they hang.
      
      Fix the problems by calling rtnl_unlock early in the error path,
      now sock_release is called after unlocking both mutexes.
      
      Problem 3 (task hung reported by syzkaller [2]) is variant of
      problem 2: use _trylock to prevent one user to call rtnl_lock and
      then while waiting for sync_mutex to block kthreads that execute
      sock_release when they are stopped by stop_sync_thread.
      
      [1]
      IPVS: stopping backup sync thread 4500 ...
      WARNING: possible recursive locking detected
      4.16.0-rc7+ #3 Not tainted
      --------------------------------------------
      syzkaller688027/4497 is trying to acquire lock:
        (rtnl_mutex){+.+.}, at: [<00000000bb14d7fb>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      
      but task is already holding lock:
      IPVS: stopping backup sync thread 4495 ...
        (rtnl_mutex){+.+.}, at: [<00000000bb14d7fb>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      
      other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(rtnl_mutex);
         lock(rtnl_mutex);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
      2 locks held by syzkaller688027/4497:
        #0:  (rtnl_mutex){+.+.}, at: [<00000000bb14d7fb>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
        #1:  (ipvs->sync_mutex){+.+.}, at: [<00000000703f78e3>]
      do_ip_vs_set_ctl+0x10f8/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2388
      
      stack backtrace:
      CPU: 1 PID: 4497 Comm: syzkaller688027 Not tainted 4.16.0-rc7+ #3
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x194/0x24d lib/dump_stack.c:53
        print_deadlock_bug kernel/locking/lockdep.c:1761 [inline]
        check_deadlock kernel/locking/lockdep.c:1805 [inline]
        validate_chain kernel/locking/lockdep.c:2401 [inline]
        __lock_acquire+0xe8f/0x3e00 kernel/locking/lockdep.c:3431
        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
        rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
        ip_mc_drop_socket+0x88/0x230 net/ipv4/igmp.c:2643
        inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:413
        sock_release+0x8d/0x1e0 net/socket.c:595
        start_sync_thread+0x2213/0x2b70 net/netfilter/ipvs/ip_vs_sync.c:1924
        do_ip_vs_set_ctl+0x1139/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2389
        nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
        nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
        ip_setsockopt+0x97/0xa0 net/ipv4/ip_sockglue.c:1261
        udp_setsockopt+0x45/0x80 net/ipv4/udp.c:2406
        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2975
        SYSC_setsockopt net/socket.c:1849 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1828
        do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x42/0xb7
      RIP: 0033:0x446a69
      RSP: 002b:00007fa1c3a64da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000446a69
      RDX: 000000000000048b RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 00000000006e29fc R08: 0000000000000018 R09: 0000000000000000
      R10: 00000000200000c0 R11: 0000000000000246 R12: 00000000006e29f8
      R13: 00676e697279656b R14: 00007fa1c3a659c0 R15: 00000000006e2b60
      
      [2]
      IPVS: sync thread started: state = BACKUP, mcast_ifn = syz_tun, syncid = 4,
      id = 0
      IPVS: stopping backup sync thread 25415 ...
      INFO: task syz-executor7:25421 blocked for more than 120 seconds.
             Not tainted 4.16.0-rc6+ #284
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor7   D23688 25421   4408 0x00000004
      Call Trace:
        context_switch kernel/sched/core.c:2862 [inline]
        __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440
        schedule+0xf5/0x430 kernel/sched/core.c:3499
        schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777
        do_wait_for_common kernel/sched/completion.c:86 [inline]
        __wait_for_common kernel/sched/completion.c:107 [inline]
        wait_for_common kernel/sched/completion.c:118 [inline]
        wait_for_completion+0x415/0x770 kernel/sched/completion.c:139
        kthread_stop+0x14a/0x7a0 kernel/kthread.c:530
        stop_sync_thread+0x3d9/0x740 net/netfilter/ipvs/ip_vs_sync.c:1996
        do_ip_vs_set_ctl+0x2b1/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2394
        nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
        nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
        ip_setsockopt+0x97/0xa0 net/ipv4/ip_sockglue.c:1253
        sctp_setsockopt+0x2ca/0x63e0 net/sctp/socket.c:4154
        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:3039
        SYSC_setsockopt net/socket.c:1850 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1829
        do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x42/0xb7
      RIP: 0033:0x454889
      RSP: 002b:00007fc927626c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 00007fc9276276d4 RCX: 0000000000454889
      RDX: 000000000000048c RSI: 0000000000000000 RDI: 0000000000000017
      RBP: 000000000072bf58 R08: 0000000000000018 R09: 0000000000000000
      R10: 0000000020000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 000000000000051c R14: 00000000006f9b40 R15: 0000000000000001
      
      Showing all locks held in the system:
      2 locks held by khungtaskd/868:
        #0:  (rcu_read_lock){....}, at: [<00000000a1a8f002>]
      check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
        #0:  (rcu_read_lock){....}, at: [<00000000a1a8f002>] watchdog+0x1c5/0xd60
      kernel/hung_task.c:249
        #1:  (tasklist_lock){.+.+}, at: [<0000000037c2f8f9>]
      debug_show_all_locks+0xd3/0x3d0 kernel/locking/lockdep.c:4470
      1 lock held by rsyslogd/4247:
        #0:  (&f->f_pos_lock){+.+.}, at: [<000000000d8d6983>]
      __fdget_pos+0x12b/0x190 fs/file.c:765
      2 locks held by getty/4338:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4339:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4340:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4341:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4342:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4343:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4344:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      3 locks held by kworker/0:5/6494:
        #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
      [<00000000a062b18e>] work_static include/linux/workqueue.h:198 [inline]
        #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
      [<00000000a062b18e>] set_work_data kernel/workqueue.c:619 [inline]
        #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
      [<00000000a062b18e>] set_work_pool_and_clear_pending kernel/workqueue.c:646
      [inline]
        #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
      [<00000000a062b18e>] process_one_work+0xb12/0x1bb0 kernel/workqueue.c:2084
        #1:  ((addr_chk_work).work){+.+.}, at: [<00000000278427d5>]
      process_one_work+0xb89/0x1bb0 kernel/workqueue.c:2088
        #2:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      1 lock held by syz-executor7/25421:
        #0:  (ipvs->sync_mutex){+.+.}, at: [<00000000d414a689>]
      do_ip_vs_set_ctl+0x277/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2393
      2 locks held by syz-executor7/25427:
        #0:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
        #1:  (ipvs->sync_mutex){+.+.}, at: [<00000000e6d48489>]
      do_ip_vs_set_ctl+0x10f8/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2388
      1 lock held by syz-executor7/25435:
        #0:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      1 lock held by ipvs-b:2:0/25415:
        #0:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      
      Reported-and-tested-by: syzbot+a46d6abf9d56b1365a72@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+5fe074c01b2032ce9618@syzkaller.appspotmail.com
      Fixes: e0b26cc9 ("ipvs: call rtnl_lock early")
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Cc: Zubin Mithra <zsm@chromium.org>
      Cc: Guenter Roeck <groeck@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      83797a77
  2. 09 May, 2018 33 commits
  3. 01 May, 2018 6 commits