1. 03 Aug, 2020 5 commits
    • Leon Romanovsky's avatar
      net/mlx5: Delete extra dump stack that gives nothing · 6c4e9bcf
      Leon Romanovsky authored
      The WARN_*() macros are intended to catch impossible situations
      from the SW point of view. They gave a little in case HW<->SW interface
      is out-of-sync.
      
      Such out-of-sync scenario can be due to SW errors that are not part
      of this flow or because some HW errors, where dump stack won't help
      either.
      
      This specific WARN_ON() is useless because mlx5_core code is prepared
      to handle such situations and will unfold everything correctly while
      providing enough information to the users to understand why FS is not
      working.
      
      WARNING: CPU: 0 PID: 3222 at drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:825 connect_fts_in_prio.isra.20+0x1dd/0x260 linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:825
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 3222 Comm: syz-executor861 Not tainted 5.5.0-rc6+ #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      Call Trace:
       __dump_stack linux/lib/dump_stack.c:77 [inline]
       dump_stack+0x94/0xce linux/lib/dump_stack.c:118
       panic+0x234/0x56f linux/kernel/panic.c:221
       __warn+0x1cc/0x1e1 linux/kernel/panic.c:582
       report_bug+0x200/0x310 linux/lib/bug.c:195
       fixup_bug.part.11+0x32/0x80 linux/arch/x86/kernel/traps.c:174
       fixup_bug linux/arch/x86/kernel/traps.c:273 [inline]
       do_error_trap+0xd3/0x100 linux/arch/x86/kernel/traps.c:267
       do_invalid_op+0x31/0x40 linux/arch/x86/kernel/traps.c:286
       invalid_op+0x1e/0x30 linux/arch/x86/entry/entry_64.S:1027
      RIP: 0010:connect_fts_in_prio.isra.20+0x1dd/0x260
      linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:825
      Code: 00 00 48 c7 c2 60 8c 31 84 48 c7 c6 00 81 31 84 48 8b 38 e8 3c a8
      cb ff 41 83 fd 01 8b 04 24 0f 8e 29 ff ff ff e8 83 7b bc fe <0f> 0b 8b
      04 24 e9 1a ff ff ff 89 04 24 e8 c1 20 e0 fe 8b 04 24 eb
      RSP: 0018:ffffc90004bb7858 EFLAGS: 00010293
      RAX: ffff88805de98e80 RBX: 0000000000000c96 RCX: ffffffff827a853d
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: fffff52000976efa
      RBP: 0000000000000007 R08: ffffed100da060e3 R09: ffffed100da060e3
      R10: 0000000000000001 R11: ffffed100da060e2 R12: dffffc0000000000
      R13: 0000000000000002 R14: ffff8880683a1a10 R15: ffffed100d07bc1c
       connect_prev_fts linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:844 [inline]
       connect_flow_table linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:975 [inline]
       __mlx5_create_flow_table+0x8f8/0x1710 linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:1064
       mlx5_create_flow_table linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:1094 [inline]
       mlx5_create_auto_grouped_flow_table+0xe1/0x210 linux/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c:1136
       _get_prio linux/drivers/infiniband/hw/mlx5/main.c:3286 [inline]
       get_flow_table+0x2ea/0x760 linux/drivers/infiniband/hw/mlx5/main.c:3376
       mlx5_ib_create_flow+0x331/0x11c0 linux/drivers/infiniband/hw/mlx5/main.c:3896
       ib_uverbs_ex_create_flow+0x13e8/0x1b40 linux/drivers/infiniband/core/uverbs_cmd.c:3311
       ib_uverbs_write+0xaa5/0xdf0 linux/drivers/infiniband/core/uverbs_main.c:769
       __vfs_write+0x7c/0x100 linux/fs/read_write.c:494
       vfs_write+0x168/0x4a0 linux/fs/read_write.c:558
       ksys_write+0xc8/0x200 linux/fs/read_write.c:611
       do_syscall_64+0x9c/0x390 linux/arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45a059
      Code: 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89
      f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
      f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fcc17564c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 00007fcc17564ca0 RCX: 000000000045a059
      RDX: 0000000000000030 RSI: 00000000200003c0 RDI: 0000000000000005
      RBP: 0000000000000007 R08: 0000000000000002 R09: 0000000000003131
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e636c
      R13: 0000000000000000 R14: 00000000006e6360 R15: 00007ffdcbdaf6a0
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Kernel Offset: disabled
      Rebooting in 1 seconds..
      
      Fixes: f90edfd2 ("net/mlx5_core: Connect flow tables")
      Reviewed-by: default avatarMaor Gottlieb <maorg@mellanox.com>
      Reviewed-by: default avatarMark Bloch <markb@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      6c4e9bcf
    • Jakub Kicinski's avatar
      net/mlx5: convert to new udp_tunnel infrastructure · 18a2b7f9
      Jakub Kicinski authored
      Allocate nic_info dynamically - n_entries is not constant.
      
      Attach the tunnel offload info only to the uplink representor.
      We expect the "main" netdev to be unregistered in switchdev
      mode, and there to be only one uplink representor.
      
      Drop the udp_tunnel_drop_rx_info() call, it was not there until
      commit b3c2ed21 ("net/mlx5e: Fix VXLAN configuration restore after function reload")
      so the device doesn't need it, and core should handle reloads and
      reset just fine.
      
      v2:
       - don't drop the ndos on reprs, and register info on uplink repr.
      v4:
       - Move netdev tunnel structure handling to en_main.c
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      18a2b7f9
    • Jakub Kicinski's avatar
      udp_tunnel: add the ability to hard-code IANA VXLAN · 966e5059
      Jakub Kicinski authored
      mlx5 has the IANA VXLAN port (4789) hard coded by the device,
      instead of being added dynamically when tunnels are created.
      
      To support this add a workaround flag to struct udp_tunnel_nic_info.
      Skipping updates for the port is fairly trivial, dumping the hard
      coded port via ethtool requires some code duplication. The port
      is not a part of any real table, we dump it in a special table
      which has no tunnel types supported and only one entry.
      
      This is the last known workaround / hack needed to convert
      all drivers to the new infra.
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      966e5059
    • Alex Vesker's avatar
      net/mlx5: DR, Change push vlan action sequence · b2064909
      Alex Vesker authored
      The DR TX state machine supports the following order:
      modify header, push vlan and encapsulation.
      Instead fs_dr would pass:
      push vlan, modify header and encapsulation.
      
      The above caused the rule creation to fail on invalid action
      sequence provided error.
      
      Fixes: 6a48faee ("net/mlx5: Add direct rule fs_cmd implementation")
      Signed-off-by: default avatarAlex Vesker <valex@mellanox.com>
      Reviewed-by: default avatarMaor Gottlieb <maorg@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      b2064909
    • Parav Pandit's avatar
      net/mlx5e: Enable users to change VF/PF representors carrier state · 45d252ca
      Parav Pandit authored
      Currently PF and VF representor netdevice carrier is always controlled
      by controlling the representor netdevice device state as up/down.
      
      Representor netdevice state change undergoes one or more txq/rxq
      destroy/create commands to firmware, skb and its rx buffer allocation,
      health reporters creation and more.
      
      Due to this limitation users do not have the ability to just change
      the carrier of the non uplink representors without modifying the
      device state.
      
      In one use case when the eswitch physical port carrier is down/up,
      user needs to update the VF link state to same as physical port
      carrier.
      
      Example of updating VF representor carrier state:
      $ ip link set enp0s8f0npf0vf0 carrier off
      $ ip link set enp0s8f0npf0vf0 carrier on
      
      This enhancement results into VF link state change which is
      represented by the VF representor netdevice carrier.
      
      This enables users to modify the representor carrier without modifying
      the representor netdevice state.
      
      A simple test is run using [1] to calculate the time difference between
      updating carrier vs updating device state (to update just the carrier)
      with one VF to simulate 255 VFs.
      
      Time taken to update the carrier using device up/down:
      $ time ./calculate.sh dev enp0s8f0npf0vf0
      real    0m30.913s
      user    0m0.200s
      sys     0m11.168s
      
      Time taken to update just the carrier using carrier iproute2 command:
      $ time ./calculate.sh carrier enp0s8f0npf0vf0
      real    0m2.142s
      user    0m0.160s
      sys     0m2.021s
      
      Test shows that its better to use carrier on/off user interface to notify
      link up/down event to VF compare to device up/down interface, because
      carrier user interface delivers the same event 15 times faster.
      
      [1] https://github.com/paravmellanox/myscripts/blob/master/calculate_carrier_time.shSigned-off-by: default avatarParav Pandit <parav@mellanox.com>
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Reviewed-by: default avatarRoi Dayan <roid@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      45d252ca
  2. 02 Aug, 2020 1 commit
  3. 01 Aug, 2020 15 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · ac3a0c84
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Encap offset calculation is incorrect in esp6, from Sabrina Dubroca.
      
       2) Better parameter validation in pfkey_dump(), from Mark Salyzyn.
      
       3) Fix several clang issues on powerpc in selftests, from Tanner Love.
      
       4) cmsghdr_from_user_compat_to_kern() uses the wrong length, from Al
          Viro.
      
       5) Out of bounds access in mlx5e driver, from Raed Salem.
      
       6) Fix transfer buffer memleak in lan78xx, from Johan Havold.
      
       7) RCU fixups in rhashtable, from Herbert Xu.
      
       8) Fix ipv6 nexthop refcnt leak, from Xiyu Yang.
      
       9) vxlan FDB dump must be done under RCU, from Ido Schimmel.
      
      10) Fix use after free in mlxsw, from Ido Schimmel.
      
      11) Fix map leak in HASH_OF_MAPS bpf code, from Andrii Nakryiko.
      
      12) Fix bug in mac80211 Tx ack status reporting, from Vasanthakumar
          Thiagarajan.
      
      13) Fix memory leaks in IPV6_ADDRFORM code, from Cong Wang.
      
      14) Fix bpf program reference count leaks in mlx5 during
          mlx5e_alloc_rq(), from Xin Xiong.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits)
        vxlan: fix memleak of fdb
        rds: Prevent kernel-infoleak in rds_notify_queue_get()
        net/sched: The error lable position is corrected in ct_init_module
        net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq
        net/mlx5e: E-Switch, Specify flow_source for rule with no in_port
        net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring
        net/mlx5e: CT: Support restore ipv6 tunnel
        net: gemini: Fix missing clk_disable_unprepare() in error path of gemini_ethernet_port_probe()
        ionic: unlock queue mutex in error path
        atm: fix atm_dev refcnt leaks in atmtcp_remove_persistent
        net: ethernet: mtk_eth_soc: fix MTU warnings
        net: nixge: fix potential memory leak in nixge_probe()
        devlink: ignore -EOPNOTSUPP errors on dumpit
        rxrpc: Fix race between recvmsg and sendmsg on immediate call failure
        MAINTAINERS: Replace Thor Thayer as Altera Triple Speed Ethernet maintainer
        selftests/bpf: fix netdevsim trap_flow_action_cookie read
        ipv6: fix memory leaks on IPV6_ADDRFORM path
        net/bpfilter: Initialize pos in __bpfilter_process_sockopt
        igb: reinit_locked() should be called with rtnl_lock
        e1000e: continue to init PHY even when failed to disable ULP
        ...
      ac3a0c84
    • Linus Torvalds's avatar
      Merge tag 'for-linus-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 0ae3495b
      Linus Torvalds authored
      Pull thread fix from Christian Brauner:
       "A simple spelling fix for dequeue_synchronous_signal()"
      
      * tag 'for-linus-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        signal: fix typo in dequeue_synchronous_signal()
      0ae3495b
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-fixes-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux · bf121a0b
      Linus Torvalds authored
      Pull perf tooling fixes from Arnaldo Carvalho de Melo:
      
       - Fix libtraceevent build with binutils 2.35
      
       - Fix memory leak in process_dynamic_array_len in libtraceevent
      
       - Fix 'perf test 68' zstd compression for s390
      
       - Fix record failure when mixed with ARM SPE event
      
      * tag 'perf-tools-fixes-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        libtraceevent: Fix build with binutils 2.35
        perf tools: Fix record failure when mixed with ARM SPE event
        perf tests: Fix test 68 zstd compression for s390
        tools lib traceevent: Fix memory leak in process_dynamic_array_len
      bf121a0b
    • Florian Westphal's avatar
      mptcp: fix syncookie build error on UP · 7126bd5c
      Florian Westphal authored
      kernel test robot says:
      net/mptcp/syncookies.c: In function 'mptcp_join_cookie_init':
      include/linux/kernel.h:47:38: warning: division by zero [-Wdiv-by-zero]
       #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
      
      I forgot that spinock_t size is 0 on UP, so ARRAY_SIZE cannot be used.
      
      Fixes: 9466a1cc ("mptcp: enable JOIN requests even if cookies are in use")
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7126bd5c
    • Taehee Yoo's avatar
      vxlan: fix memleak of fdb · fda2ec62
      Taehee Yoo authored
      When vxlan interface is deleted, all fdbs are deleted by vxlan_flush().
      vxlan_flush() flushes fdbs but it doesn't delete fdb, which contains
      all-zeros-mac because it is deleted by vxlan_uninit().
      But vxlan_uninit() deletes only the fdb, which contains both all-zeros-mac
      and default vni.
      So, the fdb, which contains both all-zeros-mac and non-default vni
      will not be deleted.
      
      Test commands:
          ip link add vxlan0 type vxlan dstport 4789 external
          ip link set vxlan0 up
          bridge fdb add to 00:00:00:00:00:00 dst 172.0.0.1 dev vxlan0 via lo \
      	    src_vni 10000 self permanent
          ip link del vxlan0
      
      kmemleak reports as follows:
      unreferenced object 0xffff9486b25ced88 (size 96):
        comm "bridge", pid 2151, jiffies 4294701712 (age 35506.901s)
        hex dump (first 32 bytes):
          02 00 00 00 ac 00 00 01 40 00 09 b1 86 94 ff ff  ........@.......
          46 02 00 00 00 00 00 00 a7 03 00 00 12 b5 6a 6b  F.............jk
        backtrace:
          [<00000000c10cf651>] vxlan_fdb_append.part.51+0x3c/0xf0 [vxlan]
          [<000000006b31a8d9>] vxlan_fdb_create+0x184/0x1a0 [vxlan]
          [<0000000049399045>] vxlan_fdb_update+0x12f/0x220 [vxlan]
          [<0000000090b1ef00>] vxlan_fdb_add+0x12a/0x1b0 [vxlan]
          [<0000000056633c2c>] rtnl_fdb_add+0x187/0x270
          [<00000000dd5dfb6b>] rtnetlink_rcv_msg+0x264/0x490
          [<00000000fc44dd54>] netlink_rcv_skb+0x4a/0x110
          [<00000000dff433e7>] netlink_unicast+0x18e/0x250
          [<00000000b87fb421>] netlink_sendmsg+0x2e9/0x400
          [<000000002ed55153>] ____sys_sendmsg+0x237/0x260
          [<00000000faa51c66>] ___sys_sendmsg+0x88/0xd0
          [<000000006c3982f1>] __sys_sendmsg+0x4e/0x80
          [<00000000a8f875d2>] do_syscall_64+0x56/0xe0
          [<000000003610eefa>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      unreferenced object 0xffff9486b1c40080 (size 128):
        comm "bridge", pid 2157, jiffies 4294701754 (age 35506.866s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 f8 dc 42 b2 86 94 ff ff  ..........B.....
          6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
        backtrace:
          [<00000000a2981b60>] vxlan_fdb_create+0x67/0x1a0 [vxlan]
          [<0000000049399045>] vxlan_fdb_update+0x12f/0x220 [vxlan]
          [<0000000090b1ef00>] vxlan_fdb_add+0x12a/0x1b0 [vxlan]
          [<0000000056633c2c>] rtnl_fdb_add+0x187/0x270
          [<00000000dd5dfb6b>] rtnetlink_rcv_msg+0x264/0x490
          [<00000000fc44dd54>] netlink_rcv_skb+0x4a/0x110
          [<00000000dff433e7>] netlink_unicast+0x18e/0x250
          [<00000000b87fb421>] netlink_sendmsg+0x2e9/0x400
          [<000000002ed55153>] ____sys_sendmsg+0x237/0x260
          [<00000000faa51c66>] ___sys_sendmsg+0x88/0xd0
          [<000000006c3982f1>] __sys_sendmsg+0x4e/0x80
          [<00000000a8f875d2>] do_syscall_64+0x56/0xe0
          [<000000003610eefa>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 3ad7a4b1 ("vxlan: support fdb and learning in COLLECT_METADATA mode")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Acked-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fda2ec62
    • Brian Vazquez's avatar
      fib: fix another fib_rules_ops indirect call wrapper problem · 8b66a6fd
      Brian Vazquez authored
      It turns out that on commit 41d707b7 ("fib: fix fib_rules_ops
      indirect calls wrappers") I forgot to include the case when
      CONFIG_IP_MULTIPLE_TABLES is not set.
      
      Fixes: 41d707b7 ("fib: fix fib_rules_ops indirect calls wrappers")
      Reported-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: default avatarBrian Vazquez <brianvv@google.com>
      Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8b66a6fd
    • Eric Dumazet's avatar
      tcp: fix build fong CONFIG_MPTCP=n · 0e8642cf
      Eric Dumazet authored
      Fixes these errors:
      
      net/ipv4/syncookies.c: In function 'tcp_get_cookie_sock':
      net/ipv4/syncookies.c:216:19: error: 'struct tcp_request_sock' has no
      member named 'drop_req'
        216 |   if (tcp_rsk(req)->drop_req) {
            |                   ^~
      net/ipv4/syncookies.c: In function 'cookie_tcp_reqsk_alloc':
      net/ipv4/syncookies.c:289:27: warning: unused variable 'treq'
      [-Wunused-variable]
        289 |  struct tcp_request_sock *treq;
            |                           ^~~~
      make[3]: *** [scripts/Makefile.build:280: net/ipv4/syncookies.o] Error 1
      make[3]: *** Waiting for unfinished jobs....
      
      Fixes: 9466a1cc ("mptcp: enable JOIN requests even if cookies are in use")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Acked-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0e8642cf
    • Linus Torvalds's avatar
      Merge tag 'pinctrl-v5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · d52daa86
      Linus Torvalds authored
      Pull pin control fix from Linus Walleij:
       "A single last minute pin control fix to the Qualcomm driver fixing
        missing dual edge PCH interrupts"
      
      * tag 'pinctrl-v5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: qcom: Handle broken/missing PDC dual edge IRQs on sc7180
      d52daa86
    • David S. Miller's avatar
      Merge tag 'mac80211-next-for-davem-2020-07-31' of... · 6f3de75c
      David S. Miller authored
      Merge tag 'mac80211-next-for-davem-2020-07-31' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
      
      Johannes Berg says:
      
      ====================
      We have a number of changes
       * code cleanups and fixups as usual
       * AQL & internal TXQ improvements from Felix
       * some mesh 802.1X support bits
       * some injection improvements from Mathy of KRACK
         fame, so we'll see what this results in ;-)
       * some more initial S1G supports bits, this time
         (some of?) the userspace APIs
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6f3de75c
    • Roopa Prabhu's avatar
      rtnetlink: add support for protodown reason · 829eb208
      Roopa Prabhu authored
      netdev protodown is a mechanism that allows protocols to
      hold an interface down. It was initially introduced in
      the kernel to hold links down by a multihoming protocol.
      There was also an attempt to introduce protodown
      reason at the time but was rejected. protodown and protodown reason
      is supported by almost every switching and routing platform.
      It was ok for a while to live without a protodown reason.
      But, its become more critical now given more than
      one protocol may need to keep a link down on a system
      at the same time. eg: vrrp peer node, port security,
      multihoming protocol. Its common for Network operators and
      protocol developers to look for such a reason on a networking
      box (Its also known as errDisable by most networking operators)
      
      This patch adds support for link protodown reason
      attribute. There are two ways to maintain protodown
      reasons.
      (a) enumerate every possible reason code in kernel
          - A protocol developer has to make a request and
            have that appear in a certain kernel version
      (b) provide the bits in the kernel, and allow user-space
      (sysadmin or NOS distributions) to manage the bit-to-reasonname
      map.
      	- This makes extending reason codes easier (kind of like
            the iproute2 table to vrf-name map /etc/iproute2/rt_tables.d/)
      
      This patch takes approach (b).
      
      a few things about the patch:
      - It treats the protodown reason bits as counter to indicate
      active protodown users
      - Since protodown attribute is already an exposed UAPI,
      the reason is not enforced on a protodown set. Its a no-op
      if not used.
      the patch follows the below algorithm:
        - presence of reason bits set indicates protodown
          is in use
        - user can set protodown and protodown reason in a
          single or multiple setlink operations
        - setlink operation to clear protodown, will return -EBUSY
          if there are active protodown reason bits
        - reason is not included in link dumps if not used
      
      example with patched iproute2:
      $cat /etc/iproute2/protodown_reasons.d/r.conf
      0 mlag
      1 evpn
      2 vrrp
      3 psecurity
      
      $ip link set dev vxlan0 protodown on protodown_reason vrrp on
      $ip link set dev vxlan0 protodown_reason mlag on
      $ip link show
      14: vxlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
      DEFAULT group default qlen 1000
          link/ether f6:06:be:17:91:e7 brd ff:ff:ff:ff:ff:ff protodown on <mlag,vrrp>
      
      $ip link set dev vxlan0 protodown_reason mlag off
      $ip link set dev vxlan0 protodown off protodown_reason vrrp off
      Signed-off-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      829eb208
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 69138b34
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2020-07-31
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 5 non-merge commits during the last 21 day(s) which contain
      a total of 5 files changed, 126 insertions(+), 18 deletions(-).
      
      The main changes are:
      
      1) Fix a map element leak in HASH_OF_MAPS map type, from Andrii Nakryiko.
      
      2) Fix a NULL pointer dereference in __btf_resolve_helper_id() when no
         btf_vmlinux is available, from Peilin Ye.
      
      3) Init pos variable in __bpfilter_process_sockopt(), from Christoph Hellwig.
      
      4) Fix a cgroup sockopt verifier test by specifying expected attach type,
         from Jean-Philippe Brucker.
      
      Note that when net gets merged into net-next later on, there is a small
      merge conflict in kernel/bpf/btf.c between commit 5b801dfb ("bpf: Fix
      NULL pointer dereference in __btf_resolve_helper_id()") from the bpf tree
      and commit 138b9a05 ("bpf: Remove btf_id helpers resolving") from the
      net-next tree.
      
      Resolve as follows: remove the old hunk with the __btf_resolve_helper_id()
      function. Change the btf_resolve_helper_id() so it actually tests for a
      NULL btf_vmlinux and bails out:
      
      int btf_resolve_helper_id(struct bpf_verifier_log *log,
                                const struct bpf_func_proto *fn, int arg)
      {
              int id;
      
              if (fn->arg_type[arg] != ARG_PTR_TO_BTF_ID || !btf_vmlinux)
                      return -EINVAL;
              id = fn->btf_id[arg];
              if (!id || id > btf_vmlinux->nr_types)
                      return -EINVAL;
              return id;
      }
      
      Let me know if you run into any others issues (CC'ing Jiri Olsa so he's in
      the loop with regards to merge conflict resolution).
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      69138b34
    • Jason Wang's avatar
      tun: add missing rcu annotation in tun_set_ebpf() · 8f3f330d
      Jason Wang authored
      We expecte prog_p to be protected by rcu, so adding the rcu annotation
      to fix the following sparse warning:
      
      drivers/net/tun.c:3003:36: warning: incorrect type in argument 2 (different address spaces)
      drivers/net/tun.c:3003:36:    expected struct tun_prog [noderef] __rcu **prog_p
      drivers/net/tun.c:3003:36:    got struct tun_prog **prog_p
      drivers/net/tun.c:3292:42: warning: incorrect type in argument 2 (different address spaces)
      drivers/net/tun.c:3292:42:    expected struct tun_prog **prog_p
      drivers/net/tun.c:3292:42:    got struct tun_prog [noderef] __rcu **
      drivers/net/tun.c:3296:42: warning: incorrect type in argument 2 (different address spaces)
      drivers/net/tun.c:3296:42:    expected struct tun_prog **prog_p
      drivers/net/tun.c:3296:42:    got struct tun_prog [noderef] __rcu **
      Reported-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8f3f330d
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 8d46215a
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2020-07-31
      
      1) Fix policy matching with mark and mask on userspace interfaces.
         From Xin Long.
      
      2) Several fixes for the new ESP in TCP encapsulation.
         From Sabrina Dubroca.
      
      3) Fix crash when the hold queue is used. The assumption that
         xdst->path and dst->child are not a NULL pointer only if dst->xfrm
         is not a NULL pointer is true with the exception of using the
         hold queue. Fix this by checking for hold queue usage before
         dereferencing xdst->path or dst->child.
      
      4) Validate pfkey_dump parameter before sending them.
         From Mark Salyzyn.
      
      5) Fix the location of the transport header with ESP in UDPv6
         encapsulation. From Sabrina Dubroca.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8d46215a
    • David S. Miller's avatar
      Merge tag 'mlx5-fixes-2020-07-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · e535d87d
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2020-07-30
      
      This small patchset introduces some fixes to mlx5 driver.
      
      Please pull and let me know if there is any problem.
      
      For -stable v4.18:
       ('net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq')
      
      For -stable v5.7:
       ('net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring')
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e535d87d
    • Yousuk Seung's avatar
      tcp: add earliest departure time to SCM_TIMESTAMPING_OPT_STATS · 48040793
      Yousuk Seung authored
      This change adds TCP_NLA_EDT to SCM_TIMESTAMPING_OPT_STATS that reports
      the earliest departure time(EDT) of the timestamped skb. By tracking EDT
      values of the skb from different timestamps, we can observe when and how
      much the value changed. This allows to measure the precise delay
      injected on the sender host e.g. by a bpf-base throttler.
      Signed-off-by: default avatarYousuk Seung <ysseung@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      48040793
  4. 31 Jul, 2020 19 commits