1. 02 Jul, 2019 19 commits
    • David S. Miller's avatar
      Merge branch 'bridge-stale-ptrs' · f2f17175
      David S. Miller authored
      Nikolay Aleksandrov says:
      
      ====================
      net: bridge: fix possible stale skb pointers
      
      In the bridge driver we have a couple of places which call pskb_may_pull
      but we've cached skb pointers before that and use them after which can
      lead to out-of-bounds/stale pointer use. I've had these in my "to fix"
      list for some time and now we got a report (patch 01) so here they are.
      Patches 02-04 are fixes based on code inspection. Also patch 01 was
      tested by Martin Weinelt, Martin if you don't mind please add your
      tested-by tag to it by replying with Tested-by: name <email>.
      I've also briefly tested the set by trying to exercise those code paths.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f2f17175
    • Nikolay Aleksandrov's avatar
      net: bridge: stp: don't cache eth dest pointer before skb pull · 2446a68a
      Nikolay Aleksandrov authored
      Don't cache eth dest pointer before calling pskb_may_pull.
      
      Fixes: cf0f02d0 ("[BRIDGE]: use llc for receiving STP packets")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2446a68a
    • Nikolay Aleksandrov's avatar
      net: bridge: don't cache ether dest pointer on input · 3d26eb8a
      Nikolay Aleksandrov authored
      We would cache ether dst pointer on input in br_handle_frame_finish but
      after the neigh suppress code that could lead to a stale pointer since
      both ipv4 and ipv6 suppress code do pskb_may_pull. This means we have to
      always reload it after the suppress code so there's no point in having
      it cached just retrieve it directly.
      
      Fixes: 057658cb ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports")
      Fixes: ed842fae ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3d26eb8a
    • Nikolay Aleksandrov's avatar
      net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query · 3b26a5d0
      Nikolay Aleksandrov authored
      We get a pointer to the ipv6 hdr in br_ip6_multicast_query but we may
      call pskb_may_pull afterwards and end up using a stale pointer.
      So use the header directly, it's just 1 place where it's needed.
      
      Fixes: 08b202b6 ("bridge br_multicast: IPv6 MLD support.")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: default avatarMartin Weinelt <martin@linuxlounge.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3b26a5d0
    • Nikolay Aleksandrov's avatar
      net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling · e57f6185
      Nikolay Aleksandrov authored
      We take a pointer to grec prior to calling pskb_may_pull and use it
      afterwards to get nsrcs so record nsrcs before the pull when handling
      igmp3 and we get a pointer to nsrcs and call pskb_may_pull when handling
      mld2 which again could lead to reading 2 bytes out-of-bounds.
      
       ==================================================================
       BUG: KASAN: use-after-free in br_multicast_rcv+0x480c/0x4ad0 [bridge]
       Read of size 2 at addr ffff8880421302b4 by task ksoftirqd/1/16
      
       CPU: 1 PID: 16 Comm: ksoftirqd/1 Tainted: G           OE     5.2.0-rc6+ #1
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
       Call Trace:
        dump_stack+0x71/0xab
        print_address_description+0x6a/0x280
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        __kasan_report+0x152/0x1aa
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        ? br_multicast_rcv+0x480c/0x4ad0 [bridge]
        kasan_report+0xe/0x20
        br_multicast_rcv+0x480c/0x4ad0 [bridge]
        ? br_multicast_disable_port+0x150/0x150 [bridge]
        ? ktime_get_with_offset+0xb4/0x150
        ? __kasan_kmalloc.constprop.6+0xa6/0xf0
        ? __netif_receive_skb+0x1b0/0x1b0
        ? br_fdb_update+0x10e/0x6e0 [bridge]
        ? br_handle_frame_finish+0x3c6/0x11d0 [bridge]
        br_handle_frame_finish+0x3c6/0x11d0 [bridge]
        ? br_pass_frame_up+0x3a0/0x3a0 [bridge]
        ? virtnet_probe+0x1c80/0x1c80 [virtio_net]
        br_handle_frame+0x731/0xd90 [bridge]
        ? select_idle_sibling+0x25/0x7d0
        ? br_handle_frame_finish+0x11d0/0x11d0 [bridge]
        __netif_receive_skb_core+0xced/0x2d70
        ? virtqueue_get_buf_ctx+0x230/0x1130 [virtio_ring]
        ? do_xdp_generic+0x20/0x20
        ? virtqueue_napi_complete+0x39/0x70 [virtio_net]
        ? virtnet_poll+0x94d/0xc78 [virtio_net]
        ? receive_buf+0x5120/0x5120 [virtio_net]
        ? __netif_receive_skb_one_core+0x97/0x1d0
        __netif_receive_skb_one_core+0x97/0x1d0
        ? __netif_receive_skb_core+0x2d70/0x2d70
        ? _raw_write_trylock+0x100/0x100
        ? __queue_work+0x41e/0xbe0
        process_backlog+0x19c/0x650
        ? _raw_read_lock_irq+0x40/0x40
        net_rx_action+0x71e/0xbc0
        ? __switch_to_asm+0x40/0x70
        ? napi_complete_done+0x360/0x360
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
        ? __schedule+0x85e/0x14d0
        __do_softirq+0x1db/0x5f9
        ? takeover_tasklets+0x5f0/0x5f0
        run_ksoftirqd+0x26/0x40
        smpboot_thread_fn+0x443/0x680
        ? sort_range+0x20/0x20
        ? schedule+0x94/0x210
        ? __kthread_parkme+0x78/0xf0
        ? sort_range+0x20/0x20
        kthread+0x2ae/0x3a0
        ? kthread_create_worker_on_cpu+0xc0/0xc0
        ret_from_fork+0x35/0x40
      
       The buggy address belongs to the page:
       page:ffffea0001084c00 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
       flags: 0xffffc000000000()
       raw: 00ffffc000000000 ffffea0000cfca08 ffffea0001098608 0000000000000000
       raw: 0000000000000000 0000000000000003 00000000ffffff7f 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
       ffff888042130180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff888042130200: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       > ffff888042130280: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                           ^
       ffff888042130300: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff888042130380: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ==================================================================
       Disabling lock debugging due to kernel taint
      
      Fixes: bc8c20ac ("bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave")
      Reported-by: default avatarMartin Weinelt <martin@linuxlounge.net>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: default avatarMartin Weinelt <martin@linuxlounge.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e57f6185
    • Hayes Wang's avatar
      r8152: fix the setting of detecting the linking change for runtime suspend · 13e04fbf
      Hayes Wang authored
      1. Rename r8153b_queue_wake() to r8153_queue_wake().
      
      2. Correct the setting. The enable bit should be 0xd38c bit 0. Besides,
         the 0xd38a bit 0 and 0xd398 bit 8 have to be cleared for both enabled
         and disabled.
      Signed-off-by: default avatarHayes Wang <hayeswang@realtek.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      13e04fbf
    • Jakub Kicinski's avatar
      net/tls: make sure offload also gets the keys wiped · acd3e96d
      Jakub Kicinski authored
      Commit 86029d10 ("tls: zero the crypto information from tls_context
      before freeing") added memzero_explicit() calls to clear the key material
      before freeing struct tls_context, but it missed tls_device.c has its
      own way of freeing this structure. Replace the missing free.
      
      Fixes: 86029d10 ("tls: zero the crypto information from tls_context before freeing")
      Signed-off-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: default avatarDirk van der Merwe <dirk.vandermerwe@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      acd3e96d
    • Jakub Kicinski's avatar
      net/tls: reject offload of TLS 1.3 · 618bac45
      Jakub Kicinski authored
      Neither drivers nor the tls offload code currently supports TLS
      version 1.3. Check the TLS version when installing connection
      state. TLS 1.3 will just fallback to the kernel crypto for now.
      
      Fixes: 130b392c ("net: tls: Add tls 1.3 support")
      Signed-off-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: default avatarDirk van der Merwe <dirk.vandermerwe@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      618bac45
    • David S. Miller's avatar
      Merge branch 'idr-fix-overflow-cases-on-32-bit-CPU' · 8a534f8f
      David S. Miller authored
      Cong Wang says:
      
      ====================
      idr: fix overflow cases on 32-bit CPU
      
      idr_get_next_ul() is problematic by design, it can't handle
      the following overflow case well on 32-bit CPU:
      
      u32 id = UINT_MAX;
      idr_alloc_u32(&id);
      while (idr_get_next_ul(&id) != NULL)
       id++;
      
      when 'id' overflows and becomes 0 after UINT_MAX, the loop
      goes infinite.
      
      Fix this by eliminating external users of idr_get_next_ul()
      and migrating them to idr_for_each_entry_continue_ul(). And
      add an additional parameter for these iteration macros to detect
      overflow properly.
      
      Please merge this through networking tree, as all the users
      are in networking subsystem.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8a534f8f
    • Davide Caratti's avatar
    • Cong Wang's avatar
      idr: introduce idr_for_each_entry_continue_ul() · d39d7149
      Cong Wang authored
      Similarly, other callers of idr_get_next_ul() suffer the same
      overflow bug as they don't handle it properly either.
      
      Introduce idr_for_each_entry_continue_ul() to help these callers
      iterate from a given ID.
      
      cls_flower needs more care here because it still has overflow when
      does arg->cookie++, we have to fold its nested loops into one
      and remove the arg->cookie++.
      
      Fixes: 01683a14 ("net: sched: refactor flower walk to iterate over idr")
      Fixes: 12d6066c ("net/mlx5: Add flow counters idr")
      Reported-by: default avatarLi Shuang <shuali@redhat.com>
      Cc: Davide Caratti <dcaratti@redhat.com>
      Cc: Vlad Buslov <vladbu@mellanox.com>
      Cc: Chris Mi <chrism@mellanox.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Tested-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d39d7149
    • Cong Wang's avatar
      idr: fix overflow case for idr_for_each_entry_ul() · e33d2b74
      Cong Wang authored
      idr_for_each_entry_ul() is buggy as it can't handle overflow
      case correctly. When we have an ID == UINT_MAX, it becomes an
      infinite loop. This happens when running on 32-bit CPU where
      unsigned long has the same size with unsigned int.
      
      There is no better way to fix this than casting it to a larger
      integer, but we can't just 64 bit integer on 32 bit CPU. Instead
      we could just use an additional integer to help us to detect this
      overflow case, that is, adding a new parameter to this macro.
      Fortunately tc action is its only user right now.
      
      Fixes: 65a206c0 ("net/sched: Change act_api and act_xxx modules to use IDR")
      Reported-by: default avatarLi Shuang <shuali@redhat.com>
      Tested-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Chris Mi <chrism@mellanox.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e33d2b74
    • David S. Miller's avatar
      Merge branch 'vsock-virtio-fixes' · eb1f5c02
      David S. Miller authored
      Stefano Garzarella says:
      
      ====================
      vsock/virtio: several fixes in the .probe() and .remove()
      
      During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
      before registering the driver", Stefan pointed out some possible issues
      in the .probe() and .remove() callbacks of the virtio-vsock driver.
      
      This series tries to solve these issues:
      - Patch 1 adds RCU critical sections to avoid use-after-free of
        'the_virtio_vsock' pointer.
      - Patch 2 stops workers before to call vdev->config->reset(vdev) to
        be sure that no one is accessing the device.
      - Patch 3 moves the works flush at the end of the .remove() to avoid
        use-after-free of 'vsock' object.
      
      v2:
      - Patch 1: use RCU to protect 'the_virtio_vsock' pointer
      - Patch 2: no changes
      - Patch 3: flush works only at the end of .remove()
      - Removed patch 4 because virtqueue_detach_unused_buf() returns all the buffers
        allocated.
      
      v1: https://patchwork.kernel.org/cover/10964733/
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      eb1f5c02
    • Stefano Garzarella's avatar
      vsock/virtio: fix flush of works during the .remove() · 0d20e56e
      Stefano Garzarella authored
      This patch moves the flush of works after vdev->config->del_vqs(vdev),
      because we need to be sure that no workers run before to free the
      'vsock' object.
      
      Since we stopped the workers using the [tx|rx|event]_run flags,
      we are sure no one is accessing the device while we are calling
      vdev->config->reset(vdev), so we can safely move the workers' flush.
      
      Before the vdev->config->del_vqs(vdev), workers can be scheduled
      by VQ callbacks, so we must flush them after del_vqs(), to avoid
      use-after-free of 'vsock' object.
      Suggested-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarStefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0d20e56e
    • Stefano Garzarella's avatar
      vsock/virtio: stop workers during the .remove() · 17dd1367
      Stefano Garzarella authored
      Before to call vdev->config->reset(vdev) we need to be sure that
      no one is accessing the device, for this reason, we add new variables
      in the struct virtio_vsock to stop the workers during the .remove().
      
      This patch also add few comments before vdev->config->reset(vdev)
      and vdev->config->del_vqs(vdev).
      Suggested-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Suggested-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarStefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      17dd1367
    • Stefano Garzarella's avatar
      vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock · 9c7a5582
      Stefano Garzarella authored
      Some callbacks used by the upper layers can run while we are in the
      .remove(). A potential use-after-free can happen, because we free
      the_virtio_vsock without knowing if the callbacks are over or not.
      
      To solve this issue we move the assignment of the_virtio_vsock at the
      end of .probe(), when we finished all the initialization, and at the
      beginning of .remove(), before to release resources.
      For the same reason, we do the same also for the vdev->priv.
      
      We use RCU to be sure that all callbacks that use the_virtio_vsock
      ended before freeing it. This is not required for callbacks that
      use vdev->priv, because after the vdev->config->del_vqs() we are sure
      that they are ended and will no longer be invoked.
      
      We also take the mutex during the .remove() to avoid that .probe() can
      run while we are resetting the device.
      Signed-off-by: default avatarStefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9c7a5582
    • Taehee Yoo's avatar
      vxlan: do not destroy fdb if register_netdevice() is failed · 7c31e54a
      Taehee Yoo authored
      __vxlan_dev_create() destroys FDB using specific pointer which indicates
      a fdb when error occurs.
      But that pointer should not be used when register_netdevice() fails because
      register_netdevice() internally destroys fdb when error occurs.
      
      This patch makes vxlan_fdb_create() to do not link fdb entry to vxlan dev
      internally.
      Instead, a new function vxlan_fdb_insert() is added to link fdb to vxlan
      dev.
      
      vxlan_fdb_insert() is called after calling register_netdevice().
      This routine can avoid situation that ->ndo_uninit() destroys fdb entry
      in error path of register_netdevice().
      Hence, error path of __vxlan_dev_create() routine can have an opportunity
      to destroy default fdb entry by hand.
      
      Test command
          ip link add bonding_masters type vxlan id 0 group 239.1.1.1 \
      	    dev enp0s9 dstport 4789
      
      Splat looks like:
      [  213.392816] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [  213.401257] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [  213.402178] CPU: 0 PID: 1414 Comm: ip Not tainted 5.2.0-rc5+ #256
      [  213.402178] RIP: 0010:vxlan_fdb_destroy+0x120/0x220 [vxlan]
      [  213.402178] Code: df 48 8b 2b 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 06 01 00 00 4c 8b 63 08 48 b8 00 00 00 00 00 fc d
      [  213.402178] RSP: 0018:ffff88810cb9f0a0 EFLAGS: 00010202
      [  213.402178] RAX: dffffc0000000000 RBX: ffff888101d4a8c8 RCX: 0000000000000000
      [  213.402178] RDX: 1bd5a00000000040 RSI: ffff888101d4a8c8 RDI: ffff888101d4a8d0
      [  213.402178] RBP: 0000000000000000 R08: fffffbfff22b72d9 R09: 0000000000000000
      [  213.402178] R10: 00000000ffffffef R11: 0000000000000000 R12: dead000000000200
      [  213.402178] R13: ffff88810cb9f1f8 R14: ffff88810efccda0 R15: ffff88810efccda0
      [  213.402178] FS:  00007f7f6621a0c0(0000) GS:ffff88811b000000(0000) knlGS:0000000000000000
      [  213.402178] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  213.402178] CR2: 000055746f0807d0 CR3: 00000001123e0000 CR4: 00000000001006f0
      [  213.402178] Call Trace:
      [  213.402178]  __vxlan_dev_create+0x3a9/0x7d0 [vxlan]
      [  213.402178]  ? vxlan_changelink+0x740/0x740 [vxlan]
      [  213.402178]  ? rcu_read_unlock+0x60/0x60 [vxlan]
      [  213.402178]  ? __kasan_kmalloc.constprop.3+0xa0/0xd0
      [  213.402178]  vxlan_newlink+0x8d/0xc0 [vxlan]
      [  213.402178]  ? __vxlan_dev_create+0x7d0/0x7d0 [vxlan]
      [  213.554119]  ? __netlink_ns_capable+0xc3/0xf0
      [  213.554119]  __rtnl_newlink+0xb75/0x1180
      [  213.554119]  ? rtnl_link_unregister+0x230/0x230
      [ ... ]
      
      Fixes: 0241b836 ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
      Suggested-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Acked-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7c31e54a
    • Marcelo Ricardo Leitner's avatar
      sctp: fix error handling on stream scheduler initialization · 4d141581
      Marcelo Ricardo Leitner authored
      It allocates the extended area for outbound streams only on sendmsg
      calls, if they are not yet allocated.  When using the priority
      stream scheduler, this initialization may imply into a subsequent
      allocation, which may fail.  In this case, it was aborting the stream
      scheduler initialization but leaving the ->ext pointer (allocated) in
      there, thus in a partially initialized state.  On a subsequent call to
      sendmsg, it would notice the ->ext pointer in there, and trip on
      uninitialized stuff when trying to schedule the data chunk.
      
      The fix is undo the ->ext initialization if the stream scheduler
      initialization fails and avoid the partially initialized state.
      
      Although syzkaller bisected this to commit 4ff40b86 ("sctp: set
      chunk transport correctly when it's a new asoc"), this bug was actually
      introduced on the commit I marked below.
      
      Reported-by: syzbot+c1a380d42b190ad1e559@syzkaller.appspotmail.com
      Fixes: 5bbbbe32 ("sctp: introduce stream scheduler foundations")
      Tested-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4d141581
    • Cong Wang's avatar
      netrom: fix a memory leak in nr_rx_frame() · c8c8218e
      Cong Wang authored
      When the skb is associated with a new sock, just assigning
      it to skb->sk is not sufficient, we have to set its destructor
      to free the sock properly too.
      
      Reported-by: syzbot+d6636a36d3c34bd88938@syzkaller.appspotmail.com
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c8c8218e
  2. 01 Jul, 2019 5 commits
  3. 30 Jun, 2019 6 commits
  4. 29 Jun, 2019 9 commits
    • Baruch Siach's avatar
      net: dsa: mv88e6xxx: wait after reset deactivation · 7b75e49d
      Baruch Siach authored
      Add a 1ms delay after reset deactivation. Otherwise the chip returns
      bogus ID value. This is observed with 88E6390 (Peridot) chip.
      Signed-off-by: default avatarBaruch Siach <baruch@tkos.co.il>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7b75e49d
    • Guilherme G. Piccoli's avatar
      bnx2x: Prevent ptp_task to be rescheduled indefinitely · 3c91f25c
      Guilherme G. Piccoli authored
      Currently bnx2x ptp worker tries to read a register with timestamp
      information in case of TX packet timestamping and in case it fails,
      the routine reschedules itself indefinitely. This was reported as a
      kworker always at 100% of CPU usage, which was narrowed down to be
      bnx2x ptp_task.
      
      By following the ioctl handler, we could narrow down the problem to
      an NTP tool (chrony) requesting HW timestamping from bnx2x NIC with
      RX filter zeroed; this isn't reproducible for example with ptp4l
      (from linuxptp) since this tool requests a supported RX filter.
      It seems NIC FW timestamp mechanism cannot work well with
      RX_FILTER_NONE - driver's PTP filter init routine skips a register
      write to the adapter if there's not a supported filter request.
      
      This patch addresses the problem of bnx2x ptp thread's everlasting
      reschedule by retrying the register read 10 times; between the read
      attempts the thread sleeps for an increasing amount of time starting
      in 1ms to give FW some time to perform the timestamping. If it still
      fails after all retries, we bail out in order to prevent an unbound
      resource consumption from bnx2x.
      
      The patch also adds an ethtool statistic for accounting the skipped
      TX timestamp packets and it reduces the priority of timestamping
      error messages to prevent log flooding. The code was tested using
      both linuxptp and chrony.
      Reported-and-tested-by: default avatarPrzemyslaw Hausman <przemyslaw.hausman@canonical.com>
      Suggested-by: default avatarSudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: default avatarGuilherme G. Piccoli <gpiccoli@canonical.com>
      Acked-by: default avatarSudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3c91f25c
    • Eric Dumazet's avatar
      igmp: fix memory leak in igmpv3_del_delrec() · e5b1c6c6
      Eric Dumazet authored
      im->tomb and/or im->sources might not be NULL, but we
      currently overwrite their values blindly.
      
      Using swap() will make sure the following call to kfree_pmc(pmc)
      will properly free the psf structures.
      
      Tested with the C repro provided by syzbot, which basically does :
      
       socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
       setsockopt(3, SOL_IP, IP_ADD_MEMBERSHIP, "\340\0\0\2\177\0\0\1\0\0\0\0", 12) = 0
       ioctl(3, SIOCSIFFLAGS, {ifr_name="lo", ifr_flags=0}) = 0
       setsockopt(3, SOL_IP, IP_MSFILTER, "\340\0\0\2\177\0\0\1\1\0\0\0\1\0\0\0\377\377\377\377", 20) = 0
       ioctl(3, SIOCSIFFLAGS, {ifr_name="lo", ifr_flags=IFF_UP}) = 0
       exit_group(0)                    = ?
      
      BUG: memory leak
      unreferenced object 0xffff88811450f140 (size 64):
        comm "softirq", pid 0, jiffies 4294942448 (age 32.070s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00  ................
          00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000c7bad083>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
          [<00000000c7bad083>] slab_post_alloc_hook mm/slab.h:439 [inline]
          [<00000000c7bad083>] slab_alloc mm/slab.c:3326 [inline]
          [<00000000c7bad083>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
          [<000000009acc4151>] kmalloc include/linux/slab.h:547 [inline]
          [<000000009acc4151>] kzalloc include/linux/slab.h:742 [inline]
          [<000000009acc4151>] ip_mc_add1_src net/ipv4/igmp.c:1976 [inline]
          [<000000009acc4151>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2100
          [<000000004ac14566>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2484
          [<0000000052d8f995>] do_ip_setsockopt.isra.0+0x1795/0x1930 net/ipv4/ip_sockglue.c:959
          [<000000004ee1e21f>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1248
          [<0000000066cdfe74>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2618
          [<000000009383a786>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3126
          [<00000000d8ac0c94>] __sys_setsockopt+0x98/0x120 net/socket.c:2072
          [<000000001b1e9666>] __do_sys_setsockopt net/socket.c:2083 [inline]
          [<000000001b1e9666>] __se_sys_setsockopt net/socket.c:2080 [inline]
          [<000000001b1e9666>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2080
          [<00000000420d395e>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
          [<000000007fd83a4b>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 24803f38 ("igmp: do not remove igmp souce list info when set link down")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Hangbin Liu <liuhangbin@gmail.com>
      Reported-by: syzbot+6ca1abd0db68b5173a4f@syzkaller.appspotmail.com
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e5b1c6c6
    • David S. Miller's avatar
      Merge branch 'Sub-ns-increment-fixes-in-Macb-PTP' · c09fedd6
      David S. Miller authored
      Harini Katakam says:
      
      ====================
      Sub ns increment fixes in Macb PTP
      
      The subns increment register fields are not captured correctly in the
      driver. Fix the same and also increase the subns incr resolution.
      
      Sub ns resolution was increased to 24 bits in r1p06f2 version. To my
      knowledge, this PTP driver, with its current BD time stamp
      implementation, is only useful to that version or above. So, I have
      increased the resolution unconditionally. Please let me know if there
      is any IP versions incompatible with this - there is no register to
      obtain this information from.
      
      Changes from RFC:
      None
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c09fedd6
    • Harini Katakam's avatar
      net: macb: Fix SUBNS increment and increase resolution · 7ad342bc
      Harini Katakam authored
      The subns increment register has 24 bits as follows:
      RegBit[15:0] = Subns[23:8]; RegBit[31:24] = Subns[7:0]
      
      Fix the same in the driver and increase sub ns resolution to the
      best capable, 24 bits. This should be the case on all GEM versions
      that this PTP driver supports.
      Signed-off-by: default avatarHarini Katakam <harini.katakam@xilinx.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7ad342bc
    • Harini Katakam's avatar
      net: macb: Add separate definition for PPM fraction · a8ee4dc1
      Harini Katakam authored
      The scaled ppm parameter passed to _adjfine() contains a 16 bit
      fraction. This just happens to be the same as SUBNSINCR_SIZE now.
      Hence define this separately.
      Signed-off-by: default avatarHarini Katakam <harini.katakam@xilinx.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a8ee4dc1
    • Jiunn Chang's avatar
      packet: Fix undefined behavior in bit shift · 79293f49
      Jiunn Chang authored
      Shifting signed 32-bit value by 31 bits is undefined.  Changing most
      significant bit to unsigned.
      
      Changes included in v2:
        - use subsystem specific subject lines
        - CC required mailing lists
      Signed-off-by: default avatarJiunn Chang <c0d1n61at3@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      79293f49
    • Florian Westphal's avatar
      net: make skb_dst_force return true when dst is refcounted · b60a7738
      Florian Westphal authored
      netfilter did not expect that skb_dst_force() can cause skb to lose its
      dst entry.
      
      I got a bug report with a skb->dst NULL dereference in netfilter
      output path.  The backtrace contains nf_reinject(), so the dst might have
      been cleared when skb got queued to userspace.
      
      Other users were fixed via
      if (skb_dst(skb)) {
      	skb_dst_force(skb);
      	if (!skb_dst(skb))
      		goto handle_err;
      }
      
      But I think its preferable to make the 'dst might be cleared' part
      of the function explicit.
      
      In netfilter case, skb with a null dst is expected when queueing in
      prerouting hook, so drop skb for the other hooks.
      
      v2:
       v1 of this patch returned true in case skb had no dst entry.
       Eric said:
         Say if we have two skb_dst_force() calls for some reason
         on the same skb, only the first one will return false.
      
       This now returns false even when skb had no dst, as per Erics
       suggestion, so callers might need to check skb_dst() first before
       skb_dst_force().
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b60a7738
    • Xin Long's avatar
      sctp: not bind the socket in sctp_connect · 9b6c0887
      Xin Long authored
      Now when sctp_connect() is called with a wrong sa_family, it binds
      to a port but doesn't set bp->port, then sctp_get_af_specific will
      return NULL and sctp_connect() returns -EINVAL.
      
      Then if sctp_bind() is called to bind to another port, the last
      port it has bound will leak due to bp->port is NULL by then.
      
      sctp_connect() doesn't need to bind ports, as later __sctp_connect
      will do it if bp->port is NULL. So remove it from sctp_connect().
      While at it, remove the unnecessary sockaddr.sa_family len check
      as it's already done in sctp_inet_connect.
      
      Fixes: 644fbdea ("sctp: fix the issue that flags are ignored when using kernel_connect")
      Reported-by: syzbot+079bf326b38072f849d9@syzkaller.appspotmail.com
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9b6c0887
  5. 28 Jun, 2019 1 commit