1. 18 Nov, 2017 7 commits
    • Guillaume Nault's avatar
      l2tp: check ps->sock before running pppol2tp_session_ioctl() · 9075216b
      Guillaume Nault authored
      
      [ Upstream commit 5903f594 ]
      
      When pppol2tp_session_ioctl() is called by pppol2tp_tunnel_ioctl(),
      the session may be unconnected. That is, it was created by
      pppol2tp_session_create() and hasn't been connected with
      pppol2tp_connect(). In this case, ps->sock is NULL, so we need to check
      for this case in order to avoid dereferencing a NULL pointer.
      
      Fixes: 309795f4 ("l2tp: Add netlink control API for L2TP")
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9075216b
    • Eric Dumazet's avatar
      tcp: fix tcp_mtu_probe() vs highest_sack · e12c42c5
      Eric Dumazet authored
      
      [ Upstream commit 2b7cda9c ]
      
      Based on SNMP values provided by Roman, Yuchung made the observation
      that some crashes in tcp_sacktag_walk() might be caused by MTU probing.
      
      Looking at tcp_mtu_probe(), I found that when a new skb was placed
      in front of the write queue, we were not updating tcp highest sack.
      
      If one skb is freed because all its content was copied to the new skb
      (for MTU probing), then tp->highest_sack could point to a now freed skb.
      
      Bad things would then happen, including infinite loops.
      
      This patch renames tcp_highest_sack_combine() and uses it
      from tcp_mtu_probe() to fix the bug.
      
      Note that I also removed one test against tp->sacked_out,
      since we want to replace tp->highest_sack regardless of whatever
      condition, since keeping a stale pointer to freed skb is a recipe
      for disaster.
      
      Fixes: a47e5a98 ("[TCP]: Convert highest_sack to sk_buff to allow direct access")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAlexei Starovoitov <alexei.starovoitov@gmail.com>
      Reported-by: default avatarRoman Gushchin <guro@fb.com>
      Reported-by: default avatarOleksandr Natalenko <oleksandr@natalenko.name>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e12c42c5
    • Eric Dumazet's avatar
      net: call cgroup_sk_alloc() earlier in sk_clone_lock() · cb5880e6
      Eric Dumazet authored
      
      [ Upstream commit c0576e39 ]
      
      If for some reason, the newly allocated child need to be freed,
      we will call cgroup_put() (via sk_free_unlock_clone()) while the
      corresponding cgroup_get() was not yet done, and we will free memory
      too soon.
      
      Fixes: d979a39d ("cgroup: duplicate cgroup reference when cloning sockets")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cb5880e6
    • Jason A. Donenfeld's avatar
      netlink: do not set cb_running if dump's start() errs · 4cd69ad5
      Jason A. Donenfeld authored
      
      [ Upstream commit 41c87425 ]
      
      It turns out that multiple places can call netlink_dump(), which means
      it's still possible to dereference partially initialized values in
      dump() that were the result of a faulty returned start().
      
      This fixes the issue by calling start() _before_ setting cb_running to
      true, so that there's no chance at all of hitting the dump() function
      through any indirect paths.
      
      It also moves the call to start() to be when the mutex is held. This has
      the nice side effect of serializing invocations to start(), which is
      likely desirable anyway. It also prevents any possible other races that
      might come out of this logic.
      
      In testing this with several different pieces of tricky code to trigger
      these issues, this commit fixes all avenues that I'm aware of.
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Reviewed-by: default avatarJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4cd69ad5
    • Eric Dumazet's avatar
      ipv6: addrconf: increment ifp refcount before ipv6_del_addr() · d87890d9
      Eric Dumazet authored
      
      [ Upstream commit e669b869 ]
      
      In the (unlikely) event fixup_permanent_addr() returns a failure,
      addrconf_permanent_addr() calls ipv6_del_addr() without the
      mandatory call to in6_ifa_hold(), leading to a refcount error,
      spotted by syzkaller :
      
      WARNING: CPU: 1 PID: 3142 at lib/refcount.c:227 refcount_dec+0x4c/0x50
      lib/refcount.c:227
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 1 PID: 3142 Comm: ip Not tainted 4.14.0-rc4-next-20171009+ #33
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:16 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:52
       panic+0x1e4/0x41c kernel/panic.c:181
       __warn+0x1c4/0x1e0 kernel/panic.c:544
       report_bug+0x211/0x2d0 lib/bug.c:183
       fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
       do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
       do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
       do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
       do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
       invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
      RIP: 0010:refcount_dec+0x4c/0x50 lib/refcount.c:227
      RSP: 0018:ffff8801ca49e680 EFLAGS: 00010286
      RAX: 000000000000002c RBX: ffff8801d07cfcdc RCX: 0000000000000000
      RDX: 000000000000002c RSI: 1ffff10039493c90 RDI: ffffed0039493cc4
      RBP: ffff8801ca49e688 R08: ffff8801ca49dd70 R09: 0000000000000000
      R10: ffff8801ca49df58 R11: 0000000000000000 R12: 1ffff10039493cd9
      R13: ffff8801ca49e6e8 R14: ffff8801ca49e7e8 R15: ffff8801d07cfcdc
       __in6_ifa_put include/net/addrconf.h:369 [inline]
       ipv6_del_addr+0x42b/0xb60 net/ipv6/addrconf.c:1208
       addrconf_permanent_addr net/ipv6/addrconf.c:3327 [inline]
       addrconf_notify+0x1c66/0x2190 net/ipv6/addrconf.c:3393
       notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394 [inline]
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x32/0x60 net/core/dev.c:1697
       call_netdevice_notifiers net/core/dev.c:1715 [inline]
       __dev_notify_flags+0x15d/0x430 net/core/dev.c:6843
       dev_change_flags+0xf5/0x140 net/core/dev.c:6879
       do_setlink+0xa1b/0x38e0 net/core/rtnetlink.c:2113
       rtnl_newlink+0xf0d/0x1a40 net/core/rtnetlink.c:2661
       rtnetlink_rcv_msg+0x733/0x1090 net/core/rtnetlink.c:4301
       netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2408
       rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4313
       netlink_unicast_kernel net/netlink/af_netlink.c:1273 [inline]
       netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1299
       netlink_sendmsg+0xa4a/0xe70 net/netlink/af_netlink.c:1862
       sock_sendmsg_nosec net/socket.c:633 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:643
       ___sys_sendmsg+0x75b/0x8a0 net/socket.c:2049
       __sys_sendmsg+0xe5/0x210 net/socket.c:2083
       SYSC_sendmsg net/socket.c:2094 [inline]
       SyS_sendmsg+0x2d/0x50 net/socket.c:2090
       entry_SYSCALL_64_fastpath+0x1f/0xbe
      RIP: 0033:0x7fa9174d3320
      RSP: 002b:00007ffe302ae9e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007ffe302b2ae0 RCX: 00007fa9174d3320
      RDX: 0000000000000000 RSI: 00007ffe302aea20 RDI: 0000000000000016
      RBP: 0000000000000082 R08: 0000000000000000 R09: 000000000000000f
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe302b32a0
      R13: 0000000000000000 R14: 00007ffe302b2ab8 R15: 00007ffe302b32b8
      
      Fixes: f1705ec1 ("net: ipv6: Make address flushing on ifdown optional")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Acked-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d87890d9
    • Craig Gallek's avatar
      tun/tap: sanitize TUNSETSNDBUF input · 5b9d2019
      Craig Gallek authored
      
      [ Upstream commit 93161922 ]
      
      Syzkaller found several variants of the lockup below by setting negative
      values with the TUNSETSNDBUF ioctl.  This patch adds a sanity check
      to both the tun and tap versions of this ioctl.
      
        watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [repro:2389]
        Modules linked in:
        irq event stamp: 329692056
        hardirqs last  enabled at (329692055): [<ffffffff824b8381>] _raw_spin_unlock_irqrestore+0x31/0x75
        hardirqs last disabled at (329692056): [<ffffffff824b9e58>] apic_timer_interrupt+0x98/0xb0
        softirqs last  enabled at (35659740): [<ffffffff824bc958>] __do_softirq+0x328/0x48c
        softirqs last disabled at (35659731): [<ffffffff811c796c>] irq_exit+0xbc/0xd0
        CPU: 0 PID: 2389 Comm: repro Not tainted 4.14.0-rc7 #23
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        task: ffff880009452140 task.stack: ffff880006a20000
        RIP: 0010:_raw_spin_lock_irqsave+0x11/0x80
        RSP: 0018:ffff880006a27c50 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
        RAX: ffff880009ac68d0 RBX: ffff880006a27ce0 RCX: 0000000000000000
        RDX: 0000000000000001 RSI: ffff880006a27ce0 RDI: ffff880009ac6900
        RBP: ffff880006a27c60 R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000000001 R11: 000000000063ff00 R12: ffff880009ac6900
        R13: ffff880006a27cf8 R14: 0000000000000001 R15: ffff880006a27cf8
        FS:  00007f4be4838700(0000) GS:ffff88000cc00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000020101000 CR3: 0000000009616000 CR4: 00000000000006f0
        Call Trace:
         prepare_to_wait+0x26/0xc0
         sock_alloc_send_pskb+0x14e/0x270
         ? remove_wait_queue+0x60/0x60
         tun_get_user+0x2cc/0x19d0
         ? __tun_get+0x60/0x1b0
         tun_chr_write_iter+0x57/0x86
         __vfs_write+0x156/0x1e0
         vfs_write+0xf7/0x230
         SyS_write+0x57/0xd0
         entry_SYSCALL_64_fastpath+0x1f/0xbe
        RIP: 0033:0x7f4be4356df9
        RSP: 002b:00007ffc18101c08 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
        RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4be4356df9
        RDX: 0000000000000046 RSI: 0000000020101000 RDI: 0000000000000005
        RBP: 00007ffc18101c40 R08: 0000000000000001 R09: 0000000000000001
        R10: 0000000000000001 R11: 0000000000000293 R12: 0000559c75f64780
        R13: 00007ffc18101d30 R14: 0000000000000000 R15: 0000000000000000
      
      Fixes: 33dccbb0 ("tun: Limit amount of queued packets per device")
      Fixes: 20d29d7a ("net: macvtap driver")
      Signed-off-by: default avatarCraig Gallek <kraig@google.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b9d2019
    • Alexey Kodanev's avatar
      gso: fix payload length when gso_size is zero · 97ba8f88
      Alexey Kodanev authored
      
      [ Upstream commit 3d0241d5 ]
      
      When gso_size reset to zero for the tail segment in skb_segment(), later
      in ipv6_gso_segment(), __skb_udp_tunnel_segment() and gre_gso_segment()
      we will get incorrect results (payload length, pcsum) for that segment.
      inet_gso_segment() already has a check for gso_size before calculating
      payload.
      
      The issue was found with LTP vxlan & gre tests over ixgbe NIC.
      
      Fixes: 07b26c94 ("gso: Support partial splitting at the frag_list pointer")
      Signed-off-by: default avatarAlexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      97ba8f88
  2. 15 Nov, 2017 33 commits