1. 15 Dec, 2018 22 commits
  2. 14 Dec, 2018 18 commits
    • David S. Miller's avatar
      Merge branch 'net-prefer-listeners-bound-to-an-address' · b9948e11
      David S. Miller authored
      Peter Oskolkov says:
      
      ====================
      net: prefer listeners bound to an address
      
      A relatively common use case is to have several IPs configured
      on a host, and have different listeners for each of them. We would
      like to add a "catch all" listener on addr_any, to match incoming
      connections not served by any of the listeners bound to a specific
      address.
      
      However, port-only lookups can match addr_any sockets when sockets
      listening on specific addresses are present if so_reuseport flag
      is set. This patchset eliminates lookups into port-only hashtable,
      as lookups by (addr,port) tuple are easily available.
      
      In a future patchset I plan to explore whether it is possible
      to remove port-only hashtables completely: additional refactoring
      will be required, as some non-lookup code uses the hashtables.
      ====================
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b9948e11
    • Peter Oskolkov's avatar
      selftests: net: test that listening sockets match on address properly · 6254e5c6
      Peter Oskolkov authored
      This patch adds a selftest that verifies that a socket listening
      on a specific address is chosen in preference over sockets
      that listen on any address. The test covers UDP/UDP6/TCP/TCP6.
      
      It is based on, and similar to, reuseport_dualstack.c selftest.
      Signed-off-by: default avatarPeter Oskolkov <posk@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6254e5c6
    • Peter Oskolkov's avatar
      net: tcp6: prefer listeners bound to an address · 0ee58dad
      Peter Oskolkov authored
      A relatively common use case is to have several IPs configured
      on a host, and have different listeners for each of them. We would
      like to add a "catch all" listener on addr_any, to match incoming
      connections not served by any of the listeners bound to a specific
      address.
      
      However, port-only lookups can match addr_any sockets when sockets
      listening on specific addresses are present if so_reuseport flag
      is set. This patch eliminates lookups into port-only hashtable,
      as lookups by (addr,port) tuple are easily available.
      
      In addition, compute_score() is tweaked to _not_ match
      addr_any sockets to specific addresses, as hash collisions
      could result in the unwanted behavior described above.
      
      Tested: the patch compiles; full test in the last patch in this
      patchset. Existing reuseport_* selftests also pass.
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarPeter Oskolkov <posk@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0ee58dad
    • Peter Oskolkov's avatar
      net: tcp: prefer listeners bound to an address · d9fbc7f6
      Peter Oskolkov authored
      A relatively common use case is to have several IPs configured
      on a host, and have different listeners for each of them. We would
      like to add a "catch all" listener on addr_any, to match incoming
      connections not served by any of the listeners bound to a specific
      address.
      
      However, port-only lookups can match addr_any sockets when sockets
      listening on specific addresses are present if so_reuseport flag
      is set. This patch eliminates lookups into port-only hashtable,
      as lookups by (addr,port) tuple are easily available.
      
      In addition, compute_score() is tweaked to _not_ match
      addr_any sockets to specific addresses, as hash collisions
      could result in the unwanted behavior described above.
      
      Tested: the patch compiles; full test in the last patch in this
      patchset. Existing reuseport_* selftests also pass.
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarPeter Oskolkov <posk@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d9fbc7f6
    • Peter Oskolkov's avatar
      net: udp6: prefer listeners bound to an address · 23b0269e
      Peter Oskolkov authored
      A relatively common use case is to have several IPs configured
      on a host, and have different listeners for each of them. We would
      like to add a "catch all" listener on addr_any, to match incoming
      connections not served by any of the listeners bound to a specific
      address.
      
      However, port-only lookups can match addr_any sockets when sockets
      listening on specific addresses are present if so_reuseport flag
      is set. This patch eliminates lookups into port-only hashtable,
      as lookups by (addr,port) tuple are easily available.
      
      In addition, compute_score() is tweaked to _not_ match
      addr_any sockets to specific addresses, as hash collisions
      could result in the unwanted behavior described above.
      
      Tested: the patch compiles; full test in the last patch in this
      patchset. Existing reuseport_* selftests also pass.
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarPeter Oskolkov <posk@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      23b0269e
    • Peter Oskolkov's avatar
      net: udp: prefer listeners bound to an address · 4cdeeee9
      Peter Oskolkov authored
      A relatively common use case is to have several IPs configured
      on a host, and have different listeners for each of them. We would
      like to add a "catch all" listener on addr_any, to match incoming
      connections not served by any of the listeners bound to a specific
      address.
      
      However, port-only lookups can match addr_any sockets when sockets
      listening on specific addresses are present if so_reuseport flag
      is set. This patch eliminates lookups into port-only hashtable,
      as lookups by (addr,port) tuple are easily available.
      
      In addition, compute_score() is tweaked to _not_ match
      addr_any sockets to specific addresses, as hash collisions
      could result in the unwanted behavior described above.
      
      Tested: the patch compiles; full test in the last patch in this
      patchset. Existing reuseport_* selftests also pass.
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarPeter Oskolkov <posk@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4cdeeee9
    • yupeng's avatar
      add snmp counters document · 8e2ea53a
      yupeng authored
      Add explainations for some general IP counters, SACK and DSACK related
      counters
      Signed-off-by: default avataryupeng <yupeng0921@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8e2ea53a
    • David S. Miller's avatar
      Merge branch 'neighbor-More-gc_list-changes' · 384aee46
      David S. Miller authored
      David Ahern says:
      
      ====================
      neighbor: More gc_list changes
      
      More gc_list changes and cleanups.
      
      The first 2 patches are bug fixes from the first gc_list change.
      Specifically, fix the locking order to be consistent - table lock
      followed by neighbor lock, and then entries in the FAILED state
      should always be candidates for forced_gc without waiting for any
      time span (return to the eviction logic prior to the separate gc_list).
      
      Patch 3 removes 2 now unnecessary arguments to neigh_del.
      
      Patch 4 moves a helper from a header file to core code in preparation
      for Patch 5 which removes NTF_EXT_LEARNED entries from the gc_list.
      These entries are already exempt from forced_gc; patch 5 removes them
      from consideration and makes them on par with PERMANENT entries given
      that they are also managed by userspace.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      384aee46
    • David Ahern's avatar
      neighbor: Remove externally learned entries from gc_list · e997f8a2
      David Ahern authored
      Externally learned entries are similar to PERMANENT entries in the
      sense they are managed by userspace and can not be garbage collected.
      As such remove them from the gc_list, remove the flags check from
      neigh_forced_gc and skip threshold checks in neigh_alloc. As with
      PERMANENT entries, this allows unlimited number of NTF_EXT_LEARNED
      entries.
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e997f8a2
    • David Ahern's avatar
      neighbor: Move neigh_update_ext_learned to core file · 526f1b58
      David Ahern authored
      neigh_update_ext_learned has one caller in neighbour.c so does not need
      to be defined in the header. Move it and in the process remove the
      intialization of ndm_flags and just set it based on the flags check.
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      526f1b58
    • David Ahern's avatar
      neighbor: Remove state and flags arguments to neigh_del · 7e6f182b
      David Ahern authored
      neigh_del now only has 1 caller, and the state and flags arguments
      are both 0. Remove them and simplify neigh_del.
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7e6f182b
    • David Ahern's avatar
      neighbor: Fix state check in neigh_forced_gc · 758a7f0b
      David Ahern authored
      PERMANENT entries are not on the gc_list so the state check is now
      redundant. Also, the move to not purge entries until after 5 seconds
      should not apply to FAILED entries; those can be removed immediately
      to make way for newer ones. This restores the previous logic prior to
      the gc_list.
      
      Fixes: 58956317 ("neighbor: Improve garbage collection")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      758a7f0b
    • David Ahern's avatar
      neighbor: Fix locking order for gc_list changes · 9c29a2f5
      David Ahern authored
      Lock checker noted an inverted lock order between neigh_change_state
      (neighbor lock then table lock) and neigh_periodic_work (table lock and
      then neighbor lock) resulting in:
      
      [  121.057652] ======================================================
      [  121.058740] WARNING: possible circular locking dependency detected
      [  121.059861] 4.20.0-rc6+ #43 Not tainted
      [  121.060546] ------------------------------------------------------
      [  121.061630] kworker/0:2/65 is trying to acquire lock:
      [  121.062519] (____ptrval____) (&n->lock){++--}, at: neigh_periodic_work+0x237/0x324
      [  121.063894]
      [  121.063894] but task is already holding lock:
      [  121.064920] (____ptrval____) (&tbl->lock){+.-.}, at: neigh_periodic_work+0x194/0x324
      [  121.066274]
      [  121.066274] which lock already depends on the new lock.
      [  121.066274]
      [  121.067693]
      [  121.067693] the existing dependency chain (in reverse order) is:
      ...
      
      Fix by renaming neigh_change_state to neigh_update_gc_list, changing
      it to only manage whether an entry should be on the gc_list and taking
      locks in the same order as neigh_periodic_work. Invoke at the end of
      neigh_update only if diff between old or new states has the PERMANENT
      flag set.
      
      Fixes: 8cc196d6 ("neighbor: gc_list changes should be protected by table lock")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9c29a2f5
    • Cong Wang's avatar
      net_sched: fold tcf_block_cb_call() into tc_setup_cb_call() · aeb3fecd
      Cong Wang authored
      After commit 69bd4840 ("net/sched: Remove egdev mechanism"),
      tc_setup_cb_call() is nearly identical to tcf_block_cb_call(),
      so we can just fold tcf_block_cb_call() into tc_setup_cb_call()
      and remove its unused parameter 'exts'.
      
      Fixes: 69bd4840 ("net/sched: Remove egdev mechanism")
      Cc: Oz Shlomo <ozsh@mellanox.com>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Acked-by: default avatarOz Shlomo <ozsh@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aeb3fecd
    • Wen Yang's avatar
      net/ibmvnic: Remove tests of member address · 390de194
      Wen Yang authored
      The driver was checking for non-NULL address.
      - adapter->napi[i]
      
      This is pointless as these will be always non-NULL, since the
      'dapter->napi' is allocated in init_napi().
      It is safe to get rid of useless checks for addresses to fix the
      coccinelle warning:
      >>drivers/net/ethernet/ibm/ibmvnic.c: test of a variable/field address
      Since such statements always return true, they are redundant.
      Signed-off-by: default avatarWen Yang <wen.yang99@zte.com.cn>
      CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      CC: Michael Ellerman <mpe@ellerman.id.au>
      CC: Thomas Falcon <tlfalcon@linux.ibm.com>
      CC: John Allen <jallen@linux.ibm.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: linuxppc-dev@lists.ozlabs.org
      CC: netdev@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      390de194
    • Prashant Bhole's avatar
      tun: replace get_cpu_ptr with this_cpu_ptr when bh disabled · 6342ca64
      Prashant Bhole authored
      tun_xdp_one() runs with local bh disabled. So there is no need to
      disable preemption by calling get_cpu_ptr while updating stats. This
      patch replaces the use of get_cpu_ptr() with this_cpu_ptr() as a
      micro-optimization. Also removes related put_cpu_ptr call.
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6342ca64
    • Aviv Heller's avatar
      net/mlx5: Handle LAG FW commands failure gracefully · 95824666
      Aviv Heller authored
      When create_lag or destroy_lag FW commands fail, display an appropriate
      error message, and try to recover, if possible.
      Signed-off-by: default avatarAviv Heller <avivh@mellanox.com>
      Reviewed-by: default avatarRoi Dayan <roid@mellanox.com>
      Reviewed-by: default avatarYevgeny Kliteynik <kliteyn@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      95824666
    • Aviv Heller's avatar
      net/mlx5: Make RoCE and SR-IOV LAG modes explicit · 7c34ec19
      Aviv Heller authored
      With the introduction of SR-IOV LAG, checking whether LAG is active
      is no longer good enough, since RoCE and SR-IOV LAG each entails
      different behavior by both the core and infiniband drivers.
      
      This patch introduces facilities to discern LAG type, in addition to
      mlx5_lag_is_active(). These are implemented in such a way as to allow
      more complex mode combinations in the future.
      Signed-off-by: default avatarAviv Heller <avivh@mellanox.com>
      Reviewed-by: default avatarRoi Dayan <roid@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      7c34ec19