1. 21 Jul, 2017 21 commits
    • Sowmini Varadhan's avatar
      rds: tcp: use sock_create_lite() to create the accept socket · 181dda46
      Sowmini Varadhan authored
      commit 0933a578 upstream.
      
      There are two problems with calling sock_create_kern() from
      rds_tcp_accept_one()
      1. it sets up a new_sock->sk that is wasteful, because this ->sk
         is going to get replaced by inet_accept() in the subsequent ->accept()
      2. The new_sock->sk is a leaked reference in sock_graft() which
         expects to find a null parent->sk
      
      Avoid these problems by calling sock_create_lite().
      Signed-off-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      181dda46
    • Nikolay Aleksandrov's avatar
      vrf: fix bug_on triggered by rx when destroying a vrf · e6577f1e
      Nikolay Aleksandrov authored
      commit f630c38e upstream.
      
      When destroying a VRF device we cleanup the slaves in its ndo_uninit()
      function, but that causes packets to be switched (skb->dev == vrf being
      destroyed) even though we're pass the point where the VRF should be
      receiving any packets while it is being dismantled. This causes a BUG_ON
      to trigger if we have raw sockets (trace below).
      The reason is that the inetdev of the VRF has been destroyed but we're
      still sending packets up the stack with it, so let's free the slaves in
      the dellink callback as David Ahern suggested.
      
      Note that this fix doesn't prevent packets from going up when the VRF
      device is admin down.
      
      [   35.631371] ------------[ cut here ]------------
      [   35.631603] kernel BUG at net/ipv4/fib_frontend.c:285!
      [   35.631854] invalid opcode: 0000 [#1] SMP
      [   35.631977] Modules linked in:
      [   35.632081] CPU: 2 PID: 22 Comm: ksoftirqd/2 Not tainted 4.12.0-rc7+ #45
      [   35.632247] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [   35.632477] task: ffff88005ad68000 task.stack: ffff88005ad64000
      [   35.632632] RIP: 0010:fib_compute_spec_dst+0xfc/0x1ee
      [   35.632769] RSP: 0018:ffff88005ad67978 EFLAGS: 00010202
      [   35.632910] RAX: 0000000000000001 RBX: ffff880059a7f200 RCX: 0000000000000000
      [   35.633084] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff82274af0
      [   35.633256] RBP: ffff88005ad679f8 R08: 000000000001ef70 R09: 0000000000000046
      [   35.633430] R10: ffff88005ad679f8 R11: ffff880037731cb0 R12: 0000000000000001
      [   35.633603] R13: ffff8800599e3000 R14: 0000000000000000 R15: ffff8800599cb852
      [   35.634114] FS:  0000000000000000(0000) GS:ffff88005d900000(0000) knlGS:0000000000000000
      [   35.634306] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   35.634456] CR2: 00007f3563227095 CR3: 000000000201d000 CR4: 00000000000406e0
      [   35.634632] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   35.634865] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   35.635055] Call Trace:
      [   35.635271]  ? __lock_acquire+0xf0d/0x1117
      [   35.635522]  ipv4_pktinfo_prepare+0x82/0x151
      [   35.635831]  raw_rcv_skb+0x17/0x3c
      [   35.636062]  raw_rcv+0xe5/0xf7
      [   35.636287]  raw_local_deliver+0x169/0x1d9
      [   35.636534]  ip_local_deliver_finish+0x87/0x1c4
      [   35.636820]  ip_local_deliver+0x63/0x7f
      [   35.637058]  ip_rcv_finish+0x340/0x3a1
      [   35.637295]  ip_rcv+0x314/0x34a
      [   35.637525]  __netif_receive_skb_core+0x49f/0x7c5
      [   35.637780]  ? lock_acquire+0x13f/0x1d7
      [   35.638018]  ? lock_acquire+0x15e/0x1d7
      [   35.638259]  __netif_receive_skb+0x1e/0x94
      [   35.638502]  ? __netif_receive_skb+0x1e/0x94
      [   35.638748]  netif_receive_skb_internal+0x74/0x300
      [   35.639002]  ? dev_gro_receive+0x2ed/0x411
      [   35.639246]  ? lock_is_held_type+0xc4/0xd2
      [   35.639491]  napi_gro_receive+0x105/0x1a0
      [   35.639736]  receive_buf+0xc32/0xc74
      [   35.639965]  ? detach_buf+0x67/0x153
      [   35.640201]  ? virtqueue_get_buf_ctx+0x120/0x176
      [   35.640453]  virtnet_poll+0x128/0x1c5
      [   35.640690]  net_rx_action+0x103/0x343
      [   35.640932]  __do_softirq+0x1c7/0x4b7
      [   35.641171]  run_ksoftirqd+0x23/0x5c
      [   35.641403]  smpboot_thread_fn+0x24f/0x26d
      [   35.641646]  ? sort_range+0x22/0x22
      [   35.641878]  kthread+0x129/0x131
      [   35.642104]  ? __list_add+0x31/0x31
      [   35.642335]  ? __list_add+0x31/0x31
      [   35.642568]  ret_from_fork+0x2a/0x40
      [   35.642804] Code: 05 bd 87 a3 00 01 e8 1f ef 98 ff 4d 85 f6 48 c7 c7 f0 4a 27 82 41 0f 94 c4 31 c9 31 d2 41 0f b6 f4 e8 04 71 a1 ff 45 84 e4 74 02 <0f> 0b 0f b7 93 c4 00 00 00 4d 8b a5 80 05 00 00 48 03 93 d0 00
      [   35.644342] RIP: fib_compute_spec_dst+0xfc/0x1ee RSP: ffff88005ad67978
      
      Fixes: 193125db ("net: Introduce VRF device driver")
      Reported-by: default avatarChris Cormier <chriscormier@cumulusnetworks.com>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e6577f1e
    • David Ahern's avatar
      net: ipv6: Compare lwstate in detecting duplicate nexthops · 0bc26d1c
      David Ahern authored
      commit f06b7549 upstream.
      
      Lennert reported a failure to add different mpls encaps in a multipath
      route:
      
        $ ip -6 route add 1234::/16 \
              nexthop encap mpls 10 via fe80::1 dev ens3 \
              nexthop encap mpls 20 via fe80::1 dev ens3
        RTNETLINK answers: File exists
      
      The problem is that the duplicate nexthop detection does not compare
      lwtunnel configuration. Add it.
      
      Fixes: 19e42e45 ("ipv6: support for fib route lwtunnel encap attributes")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Reported-by: default avatarJoão Taveira Araújo <joao.taveira@gmail.com>
      Reported-by: default avatarLennert Buytenhek <buytenh@wantstofly.org>
      Acked-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Tested-by: default avatarLennert Buytenhek <buytenh@wantstofly.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0bc26d1c
    • Alban Browaeys's avatar
      net: core: Fix slab-out-of-bounds in netdev_stats_to_stats64 · 05e165e9
      Alban Browaeys authored
      commit 9af9959e upstream.
      
      commit 9256645a ("net/core: relax BUILD_BUG_ON in
      netdev_stats_to_stats64") made an attempt to read beyond
      the size of the source a possibility.
      
      Fix to only copy src size to dest. As dest might be bigger than src.
      
       ==================================================================
       BUG: KASAN: slab-out-of-bounds in netdev_stats_to_stats64+0xe/0x30 at addr ffff8801be248b20
       Read of size 192 by task VBoxNetAdpCtl/6734
       CPU: 1 PID: 6734 Comm: VBoxNetAdpCtl Tainted: G           O    4.11.4prahal+intel+ #118
       Hardware name: LENOVO 20CDCTO1WW/20CDCTO1WW, BIOS GQET52WW (1.32 ) 05/04/2017
       Call Trace:
        dump_stack+0x63/0x86
        kasan_object_err+0x1c/0x70
        kasan_report+0x270/0x520
        ? netdev_stats_to_stats64+0xe/0x30
        ? sched_clock_cpu+0x1b/0x190
        ? __module_address+0x3e/0x3b0
        ? unwind_next_frame+0x1ea/0xb00
        check_memory_region+0x13c/0x1a0
        memcpy+0x23/0x50
        netdev_stats_to_stats64+0xe/0x30
        dev_get_stats+0x1b9/0x230
        rtnl_fill_stats+0x44/0xc00
        ? nla_put+0xc6/0x130
        rtnl_fill_ifinfo+0xe9e/0x3700
        ? rtnl_fill_vfinfo+0xde0/0xde0
        ? sched_clock+0x9/0x10
        ? sched_clock+0x9/0x10
        ? sched_clock_local+0x120/0x130
        ? __module_address+0x3e/0x3b0
        ? unwind_next_frame+0x1ea/0xb00
        ? sched_clock+0x9/0x10
        ? sched_clock+0x9/0x10
        ? sched_clock_cpu+0x1b/0x190
        ? VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
        ? depot_save_stack+0x1d8/0x4a0
        ? depot_save_stack+0x34f/0x4a0
        ? depot_save_stack+0x34f/0x4a0
        ? save_stack+0xb1/0xd0
        ? save_stack_trace+0x16/0x20
        ? save_stack+0x46/0xd0
        ? kasan_slab_alloc+0x12/0x20
        ? __kmalloc_node_track_caller+0x10d/0x350
        ? __kmalloc_reserve.isra.36+0x2c/0xc0
        ? __alloc_skb+0xd0/0x560
        ? rtmsg_ifinfo_build_skb+0x61/0x120
        ? rtmsg_ifinfo.part.25+0x16/0xb0
        ? rtmsg_ifinfo+0x47/0x70
        ? register_netdev+0x15/0x30
        ? vboxNetAdpOsCreate+0xc0/0x1c0 [vboxnetadp]
        ? vboxNetAdpCreate+0x210/0x400 [vboxnetadp]
        ? VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
        ? do_vfs_ioctl+0x17f/0xff0
        ? SyS_ioctl+0x74/0x80
        ? do_syscall_64+0x182/0x390
        ? __alloc_skb+0xd0/0x560
        ? __alloc_skb+0xd0/0x560
        ? save_stack_trace+0x16/0x20
        ? init_object+0x64/0xa0
        ? ___slab_alloc+0x1ae/0x5c0
        ? ___slab_alloc+0x1ae/0x5c0
        ? __alloc_skb+0xd0/0x560
        ? sched_clock+0x9/0x10
        ? kasan_unpoison_shadow+0x35/0x50
        ? kasan_kmalloc+0xad/0xe0
        ? __kmalloc_node_track_caller+0x246/0x350
        ? __alloc_skb+0xd0/0x560
        ? kasan_unpoison_shadow+0x35/0x50
        ? memset+0x31/0x40
        ? __alloc_skb+0x31f/0x560
        ? napi_consume_skb+0x320/0x320
        ? br_get_link_af_size_filtered+0xb7/0x120 [bridge]
        ? if_nlmsg_size+0x440/0x630
        rtmsg_ifinfo_build_skb+0x83/0x120
        rtmsg_ifinfo.part.25+0x16/0xb0
        rtmsg_ifinfo+0x47/0x70
        register_netdevice+0xa2b/0xe50
        ? __kmalloc+0x171/0x2d0
        ? netdev_change_features+0x80/0x80
        register_netdev+0x15/0x30
        vboxNetAdpOsCreate+0xc0/0x1c0 [vboxnetadp]
        vboxNetAdpCreate+0x210/0x400 [vboxnetadp]
        ? vboxNetAdpComposeMACAddress+0x1d0/0x1d0 [vboxnetadp]
        ? kasan_check_write+0x14/0x20
        VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
        ? VBoxNetAdpLinuxOpen+0x20/0x20 [vboxnetadp]
        ? lock_acquire+0x11c/0x270
        ? __audit_syscall_entry+0x2fb/0x660
        do_vfs_ioctl+0x17f/0xff0
        ? __audit_syscall_entry+0x2fb/0x660
        ? ioctl_preallocate+0x1d0/0x1d0
        ? __audit_syscall_entry+0x2fb/0x660
        ? kmem_cache_free+0xb2/0x250
        ? syscall_trace_enter+0x537/0xd00
        ? exit_to_usermode_loop+0x100/0x100
        SyS_ioctl+0x74/0x80
        ? do_sys_open+0x350/0x350
        ? do_vfs_ioctl+0xff0/0xff0
        do_syscall_64+0x182/0x390
        entry_SYSCALL64_slow_path+0x25/0x25
       RIP: 0033:0x7f7e39a1ae07
       RSP: 002b:00007ffc6f04c6d8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
       RAX: ffffffffffffffda RBX: 00007ffc6f04c730 RCX: 00007f7e39a1ae07
       RDX: 00007ffc6f04c730 RSI: 00000000c0207601 RDI: 0000000000000007
       RBP: 00007ffc6f04c700 R08: 00007ffc6f04c780 R09: 0000000000000008
       R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000007
       R13: 00000000c0207601 R14: 00007ffc6f04c730 R15: 0000000000000012
       Object at ffff8801be248008, in cache kmalloc-4096 size: 4096
       Allocated:
       PID = 6734
        save_stack_trace+0x16/0x20
        save_stack+0x46/0xd0
        kasan_kmalloc+0xad/0xe0
        __kmalloc+0x171/0x2d0
        alloc_netdev_mqs+0x8a7/0xbe0
        vboxNetAdpOsCreate+0x65/0x1c0 [vboxnetadp]
        vboxNetAdpCreate+0x210/0x400 [vboxnetadp]
        VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
        do_vfs_ioctl+0x17f/0xff0
        SyS_ioctl+0x74/0x80
        do_syscall_64+0x182/0x390
        return_from_SYSCALL_64+0x0/0x6a
       Freed:
       PID = 5600
        save_stack_trace+0x16/0x20
        save_stack+0x46/0xd0
        kasan_slab_free+0x73/0xc0
        kfree+0xe4/0x220
        kvfree+0x25/0x30
        single_release+0x74/0xb0
        __fput+0x265/0x6b0
        ____fput+0x9/0x10
        task_work_run+0xd5/0x150
        exit_to_usermode_loop+0xe2/0x100
        do_syscall_64+0x26c/0x390
        return_from_SYSCALL_64+0x0/0x6a
       Memory state around the buggy address:
        ffff8801be248a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ffff8801be248b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       >ffff8801be248b80: 00 00 00 00 00 00 00 00 00 00 00 07 fc fc fc fc
                                                           ^
        ffff8801be248c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
        ffff8801be248c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ==================================================================
      Signed-off-by: default avatarAlban Browaeys <alban.browaeys@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05e165e9
    • Jiri Benc's avatar
      vxlan: fix hlist corruption · beabc603
      Jiri Benc authored
      
      [ Upstream commit 69e76661 ]
      
      It's not a good idea to add the same hlist_node to two different hash lists.
      This leads to various hard to debug memory corruptions.
      
      Fixes: b1be00a6 ("vxlan: support both IPv4 and IPv6 sockets in a single vxlan device")
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      beabc603
    • Sabrina Dubroca's avatar
      ipv6: dad: don't remove dynamic addresses if link is down · d2c95120
      Sabrina Dubroca authored
      commit ec8add2a upstream.
      
      Currently, when the link for $DEV is down, this command succeeds but the
      address is removed immediately by DAD (1):
      
          ip addr add 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800
      
      In the same situation, this will succeed and not remove the address (2):
      
          ip addr add 1111::12/64 dev $DEV
          ip addr change 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800
      
      The comment in addrconf_dad_begin() when !IF_READY makes it look like
      this is the intended behavior, but doesn't explain why:
      
           * If the device is not ready:
           * - keep it tentative if it is a permanent address.
           * - otherwise, kill it.
      
      We clearly cannot prevent userspace from doing (2), but we can make (1)
      work consistently with (2).
      
      addrconf_dad_stop() is only called in two cases: if DAD failed, or to
      skip DAD when the link is down. In that second case, the fix is to avoid
      deleting the address, like we already do for permanent addresses.
      
      Fixes: 3c21edbd ("[IPV6]: Defer IPv6 device initialization until the link becomes ready.")
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d2c95120
    • Gal Pressman's avatar
      net/mlx5e: Fix TX carrier errors report in get stats ndo · 74356430
      Gal Pressman authored
      commit 8ff93de7 upstream.
      
      Symbol error during carrier counter from PPCNT was mistakenly reported as
      TX carrier errors in get_stats ndo, although it's an RX counter.
      
      Fixes: 269e6b3a ("net/mlx5e: Report additional error statistics in get stats ndo")
      Signed-off-by: default avatarGal Pressman <galp@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      74356430
    • Derek Chickles's avatar
      liquidio: fix bug in soft reset failure detection · a80a70a4
      Derek Chickles authored
      commit 05a6b4ca upstream.
      
      The code that detects a failed soft reset of Octeon is comparing the wrong
      value against the reset value of the Octeon SLI_SCRATCH_1 register,
      resulting in an inability to detect a soft reset failure.  Fix it by using
      the correct value in the comparison, which is any non-zero value.
      
      Fixes: f21fb3ed ("Add support of Cavium Liquidio ethernet adapters")
      Fixes: c0eab5b3 ("liquidio: CN23XX firmware download")
      Signed-off-by: default avatarDerek Chickles <derek.chickles@cavium.com>
      Signed-off-by: default avatarSatanand Burla <satananda.burla@cavium.com>
      Signed-off-by: default avatarRaghu Vatsavayi <raghu.vatsavayi@cavium.com>
      Signed-off-by: default avatarFelix Manlunas <felix.manlunas@cavium.com>
      Reviewed-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a80a70a4
    • Mohamad Haj Yahia's avatar
      net/mlx5: Cancel delayed recovery work when unloading the driver · e20204dc
      Mohamad Haj Yahia authored
      commit 2a0165a0 upstream.
      
      Draining the health workqueue will ignore future health works including
      the one that report hardware failure and thus we can't enter error state
      Instead cancel the recovery flow and make sure only recovery flow won't
      be scheduled.
      
      Fixes: 5e44fca5 ('net/mlx5: Only cancel recovery work when cleaning up device')
      Signed-off-by: default avatarMohamad Haj Yahia <mohamad@mellanox.com>
      Signed-off-by: default avatarMoshe Shemesh <moshe@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e20204dc
    • Michal Kubeček's avatar
      net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish() · 06732807
      Michal Kubeček authored
      commit e44699d2 upstream.
      
      Recently I started seeing warnings about pages with refcount -1. The
      problem was traced to packets being reused after their head was merged into
      a GRO packet by skb_gro_receive(). While bisecting the issue pointed to
      commit c21b48cc ("net: adjust skb->truesize in ___pskb_trim()") and
      I have never seen it on a kernel with it reverted, I believe the real
      problem appeared earlier when the option to merge head frag in GRO was
      implemented.
      
      Handling NAPI_GRO_FREE_STOLEN_HEAD state was only added to GRO_MERGED_FREE
      branch of napi_skb_finish() so that if the driver uses napi_gro_frags()
      and head is merged (which in my case happens after the skb_condense()
      call added by the commit mentioned above), the skb is reused including the
      head that has been merged. As a result, we release the page reference
      twice and eventually end up with negative page refcount.
      
      To fix the problem, handle NAPI_GRO_FREE_STOLEN_HEAD in napi_frags_finish()
      the same way it's done in napi_skb_finish().
      
      Fixes: d7e8883c ("net: make GRO aware of skb->head_frag")
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      06732807
    • Daniel Borkmann's avatar
      bpf: prevent leaking pointer via xadd on unpriviledged · cd5de9cb
      Daniel Borkmann authored
      commit 6bdf6abc upstream.
      
      Leaking kernel addresses on unpriviledged is generally disallowed,
      for example, verifier rejects the following:
      
        0: (b7) r0 = 0
        1: (18) r2 = 0xffff897e82304400
        3: (7b) *(u64 *)(r1 +48) = r2
        R2 leaks addr into ctx
      
      Doing pointer arithmetic on them is also forbidden, so that they
      don't turn into unknown value and then get leaked out. However,
      there's xadd as a special case, where we don't check the src reg
      for being a pointer register, e.g. the following will pass:
      
        0: (b7) r0 = 0
        1: (7b) *(u64 *)(r1 +48) = r0
        2: (18) r2 = 0xffff897e82304400 ; map
        4: (db) lock *(u64 *)(r1 +48) += r2
        5: (95) exit
      
      We could store the pointer into skb->cb, loose the type context,
      and then read it out from there again to leak it eventually out
      of a map value. Or more easily in a different variant, too:
      
         0: (bf) r6 = r1
         1: (7a) *(u64 *)(r10 -8) = 0
         2: (bf) r2 = r10
         3: (07) r2 += -8
         4: (18) r1 = 0x0
         6: (85) call bpf_map_lookup_elem#1
         7: (15) if r0 == 0x0 goto pc+3
         R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R6=ctx R10=fp
         8: (b7) r3 = 0
         9: (7b) *(u64 *)(r0 +0) = r3
        10: (db) lock *(u64 *)(r0 +0) += r6
        11: (b7) r0 = 0
        12: (95) exit
      
        from 7 to 11: R0=inv,min_value=0,max_value=0 R6=ctx R10=fp
        11: (b7) r0 = 0
        12: (95) exit
      
      Prevent this by checking xadd src reg for pointer types. Also
      add a couple of test cases related to this.
      
      Fixes: 1be7f75d ("bpf: enable non-root eBPF programs")
      Fixes: 17a52670 ("bpf: verifier (add verifier core)")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Acked-by: default avatarEdward Cree <ecree@solarflare.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cd5de9cb
    • Dan Carpenter's avatar
      rocker: move dereference before free · bee80705
      Dan Carpenter authored
      commit acb4b7df upstream.
      
      My static checker complains that ofdpa_neigh_del() can sometimes free
      "found".   It just makes sense to use it first before deleting it.
      
      Fixes: ecf244f7 ("rocker: fix maybe-uninitialized warning")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bee80705
    • Eduardo Valentin's avatar
      bridge: mdb: fix leak on complete_info ptr on fail path · e5e5c0ec
      Eduardo Valentin authored
      commit 1bfb1596 upstream.
      
      We currently get the following kmemleak report:
      unreferenced object 0xffff8800039d9820 (size 32):
        comm "softirq", pid 0, jiffies 4295212383 (age 792.416s)
        hex dump (first 32 bytes):
          00 0c e0 03 00 88 ff ff ff 02 00 00 00 00 00 00  ................
          00 00 00 01 ff 11 00 02 86 dd 00 00 ff ff ff ff  ................
        backtrace:
          [<ffffffff8152b4aa>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff811d8ec8>] kmem_cache_alloc_trace+0xb8/0x1c0
          [<ffffffffa0389683>] __br_mdb_notify+0x2a3/0x300 [bridge]
          [<ffffffffa038a0ce>] br_mdb_notify+0x6e/0x70 [bridge]
          [<ffffffffa0386479>] br_multicast_add_group+0x109/0x150 [bridge]
          [<ffffffffa0386518>] br_ip6_multicast_add_group+0x58/0x60 [bridge]
          [<ffffffffa0387fb5>] br_multicast_rcv+0x1d5/0xdb0 [bridge]
          [<ffffffffa037d7cf>] br_handle_frame_finish+0xcf/0x510 [bridge]
          [<ffffffffa03a236b>] br_nf_hook_thresh.part.27+0xb/0x10 [br_netfilter]
          [<ffffffffa03a3738>] br_nf_hook_thresh+0x48/0xb0 [br_netfilter]
          [<ffffffffa03a3fb9>] br_nf_pre_routing_finish_ipv6+0x109/0x1d0 [br_netfilter]
          [<ffffffffa03a4400>] br_nf_pre_routing_ipv6+0xd0/0x14c [br_netfilter]
          [<ffffffffa03a3c27>] br_nf_pre_routing+0x197/0x3d0 [br_netfilter]
          [<ffffffff814a2952>] nf_iterate+0x52/0x60
          [<ffffffff814a29bc>] nf_hook_slow+0x5c/0xb0
          [<ffffffffa037ddf4>] br_handle_frame+0x1a4/0x2c0 [bridge]
      
      This happens when switchdev_port_obj_add() fails. This patch
      frees complete_info object in the fail path.
      Reviewed-by: default avatarVallish Vaidyeshwara <vallish@amazon.com>
      Signed-off-by: default avatarEduardo Valentin <eduval@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e5e5c0ec
    • Eric Dumazet's avatar
      net: prevent sign extension in dev_get_stats() · 3f04c32b
      Eric Dumazet authored
      commit 6f64ec74 upstream.
      
      Similar to the fix provided by Dominik Heidler in commit
      9b3dc0a1 ("l2tp: cast l2tp traffic counter to unsigned")
      we need to take care of 32bit kernels in dev_get_stats().
      
      When using atomic_long_read(), we add a 'long' to u64 and
      might misinterpret high order bit, unless we cast to unsigned.
      
      Fixes: caf586e5 ("net: add a core netdev->rx_dropped counter")
      Fixes: 015f0688 ("net: net: add a core netdev->tx_dropped counter")
      Fixes: 6e7333d3 ("net: add rx_nohandler stat counter")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f04c32b
    • WANG Cong's avatar
      tcp: reset sk_rx_dst in tcp_disconnect() · ef138400
      WANG Cong authored
      commit d747a7a5 upstream.
      
      We have to reset the sk->sk_rx_dst when we disconnect a TCP
      connection, because otherwise when we re-connect it this
      dst reference is simply overridden in tcp_finish_connect().
      
      This fixes a dst leak which leads to a loopback dev refcnt
      leak. It is a long-standing bug, Kevin reported a very similar
      (if not same) bug before. Thanks to Andrei for providing such
      a reliable reproducer which greatly narrows down the problem.
      
      Fixes: 41063e9d ("ipv4: Early TCP socket demux.")
      Reported-by: default avatarAndrei Vagin <avagin@gmail.com>
      Reported-by: default avatarKevin Xu <kaiwen.xu@hulu.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef138400
    • Richard Cochran's avatar
      net: dp83640: Avoid NULL pointer dereference. · cf81b4ab
      Richard Cochran authored
      commit db9d8b29 upstream.
      
      The function, skb_complete_tx_timestamp(), used to allow passing in a
      NULL pointer for the time stamps, but that was changed in commit
      62bccb8c ("net-timestamp: Make the
      clone operation stand-alone from phy timestamping"), and the existing
      call sites, all of which are in the dp83640 driver, were fixed up.
      
      Even though the kernel-doc was subsequently updated in commit
      7a76a021 ("net-timestamp: Update
      skb_complete_tx_timestamp comment"), still a bug fix from Manfred
      Rudigier came into the driver using the old semantics.  Probably
      Manfred derived that patch from an older kernel version.
      
      This fix should be applied to the stable trees as well.
      
      Fixes: 81e8f2e9 ("net: dp83640: Fix tx timestamp overflow handling.")
      Signed-off-by: default avatarRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf81b4ab
    • WANG Cong's avatar
      ipv6: avoid unregistering inet6_dev for loopback · 0526ff30
      WANG Cong authored
      commit 60abc0be upstream.
      
      The per netns loopback_dev->ip6_ptr is unregistered and set to
      NULL when its mtu is set to smaller than IPV6_MIN_MTU, this
      leads to that we could set rt->rt6i_idev NULL after a
      rt6_uncached_list_flush_dev() and then crash after another
      call.
      
      In this case we should just bring its inet6_dev down, rather
      than unregistering it, at least prior to commit 176c39af
      ("netns: fix addrconf_ifdown kernel panic") we always
      override the case for loopback.
      
      Thanks a lot to Andrey for finding a reliable reproducer.
      
      Fixes: 176c39af ("netns: fix addrconf_ifdown kernel panic")
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: default avatarDavid Ahern <dsahern@gmail.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0526ff30
    • Zach Brown's avatar
      net/phy: micrel: configure intterupts after autoneg workaround · 3f7e07c3
      Zach Brown authored
      commit b866203d upstream.
      
      The commit ("net/phy: micrel: Add workaround for bad autoneg") fixes an
      autoneg failure case by resetting the hardware. This turns off
      intterupts. Things will work themselves out if the phy polls, as it will
      figure out it's state during a poll. However if the phy uses only
      intterupts, the phy will stall, since interrupts are off. This patch
      fixes the issue by calling config_intr after resetting the phy.
      
      Fixes: d2fd719b ("net/phy: micrel: Add workaround for bad autoneg ")
      Signed-off-by: default avatarZach Brown <zach.brown@ni.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f7e07c3
    • Gao Feng's avatar
      net: sched: Fix one possible panic when no destroy callback · dc491cdd
      Gao Feng authored
      commit c1a4872e upstream.
      
      When qdisc fail to init, qdisc_create would invoke the destroy callback
      to cleanup. But there is no check if the callback exists really. So it
      would cause the panic if there is no real destroy callback like the qdisc
      codel, fq, and so on.
      
      Take codel as an example following:
      When a malicious user constructs one invalid netlink msg, it would cause
      codel_init->codel_change->nla_parse_nested failed.
      Then kernel would invoke the destroy callback directly but qdisc codel
      doesn't define one. It causes one panic as a result.
      
      Now add one the check for destroy to avoid the possible panic.
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Signed-off-by: default avatarGao Feng <gfree.wind@vip.163.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc491cdd
    • Eric Dumazet's avatar
      net_sched: fix error recovery at qdisc creation · 13550ffc
      Eric Dumazet authored
      commit 87b60cfa upstream.
      
      Dmitry reported uses after free in qdisc code [1]
      
      The problem here is that ops->init() can return an error.
      
      qdisc_create_dflt() then call ops->destroy(),
      while qdisc_create() does _not_ call it.
      
      Four qdisc chose to call their own ops->destroy(), assuming their caller
      would not.
      
      This patch makes sure qdisc_create() calls ops->destroy()
      and fixes the four qdisc to avoid double free.
      
      [1]
      BUG: KASAN: use-after-free in mq_destroy+0x242/0x290 net/sched/sch_mq.c:33 at addr ffff8801d415d440
      Read of size 8 by task syz-executor2/5030
      CPU: 0 PID: 5030 Comm: syz-executor2 Not tainted 4.3.5-smp-DEV #119
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
       0000000000000046 ffff8801b435b870 ffffffff81bbbed4 ffff8801db000400
       ffff8801d415d440 ffff8801d415dc40 ffff8801c4988510 ffff8801b435b898
       ffffffff816682b1 ffff8801b435b928 ffff8801d415d440 ffff8801c49880c0
      Call Trace:
       [<ffffffff81bbbed4>] __dump_stack lib/dump_stack.c:15 [inline]
       [<ffffffff81bbbed4>] dump_stack+0x6c/0x98 lib/dump_stack.c:51
       [<ffffffff816682b1>] kasan_object_err+0x21/0x70 mm/kasan/report.c:158
       [<ffffffff81668524>] print_address_description mm/kasan/report.c:196 [inline]
       [<ffffffff81668524>] kasan_report_error+0x1b4/0x4b0 mm/kasan/report.c:285
       [<ffffffff81668953>] kasan_report mm/kasan/report.c:305 [inline]
       [<ffffffff81668953>] __asan_report_load8_noabort+0x43/0x50 mm/kasan/report.c:326
       [<ffffffff82527b02>] mq_destroy+0x242/0x290 net/sched/sch_mq.c:33
       [<ffffffff82524bdd>] qdisc_destroy+0x12d/0x290 net/sched/sch_generic.c:953
       [<ffffffff82524e30>] qdisc_create_dflt+0xf0/0x120 net/sched/sch_generic.c:848
       [<ffffffff8252550d>] attach_default_qdiscs net/sched/sch_generic.c:1029 [inline]
       [<ffffffff8252550d>] dev_activate+0x6ad/0x880 net/sched/sch_generic.c:1064
       [<ffffffff824b1db1>] __dev_open+0x221/0x320 net/core/dev.c:1403
       [<ffffffff824b24ce>] __dev_change_flags+0x15e/0x3e0 net/core/dev.c:6858
       [<ffffffff824b27de>] dev_change_flags+0x8e/0x140 net/core/dev.c:6926
       [<ffffffff824f5bf6>] dev_ifsioc+0x446/0x890 net/core/dev_ioctl.c:260
       [<ffffffff824f61fa>] dev_ioctl+0x1ba/0xb80 net/core/dev_ioctl.c:546
       [<ffffffff82430509>] sock_do_ioctl+0x99/0xb0 net/socket.c:879
       [<ffffffff82430d30>] sock_ioctl+0x2a0/0x390 net/socket.c:958
       [<ffffffff816f3b68>] vfs_ioctl fs/ioctl.c:44 [inline]
       [<ffffffff816f3b68>] do_vfs_ioctl+0x8a8/0xe50 fs/ioctl.c:611
       [<ffffffff816f41a4>] SYSC_ioctl fs/ioctl.c:626 [inline]
       [<ffffffff816f41a4>] SyS_ioctl+0x94/0xc0 fs/ioctl.c:617
       [<ffffffff8123e357>] entry_SYSCALL_64_fastpath+0x12/0x17
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      13550ffc
    • Vineeth Remanan Pillai's avatar
      xen-netfront: Rework the fix for Rx stall during OOM and network stress · 21f79ae4
      Vineeth Remanan Pillai authored
      commit 538d9291 upstream.
      
      The commit 90c311b0 ("xen-netfront: Fix Rx stall during network
      stress and OOM") caused the refill timer to be triggerred almost on
      all invocations of xennet_alloc_rx_buffers for certain workloads.
      This reworks the fix by reverting to the old behaviour and taking into
      consideration the skb allocation failure. Refill timer is now triggered
      on insufficient requests or skb allocation failure.
      Signed-off-by: default avatarVineeth Remanan Pillai <vineethp@amazon.com>
      Fixes: 90c311b0 (xen-netfront: Fix Rx stall during network stress and OOM)
      Reported-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      21f79ae4
  2. 15 Jul, 2017 19 commits