1. 01 Jun, 2022 7 commits
    • net: ping6: Fix ping -6 with interface name · e6652a8e
      Aya Levin authored
      When passing interface parameter to ping -6:
      $ ping -6 ::11:141:84:9 -I eth2
      Results in:
      PING ::11:141:84:10(::11:141:84:10) from ::11:141:84:9 eth2: 56 data bytes
      ping: sendmsg: Invalid argument
      ping: sendmsg: Invalid argument
      
      Initialize the fl6's outgoing interface (OIF) before triggering
      ip6_datagram_send_ctl. Don't wipe fl6 after ip6_datagram_send_ctl() as
      changes in fl6 that may happen in the function are overwritten explicitly.
      Update comment accordingly.
      
      Fixes: 13651224 ("net: ping6: support setting basic SOL_IPV6 options via cmsg")
      Signed-off-by: Aya Levin <ayal@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20220531084544.15126-1-tariqt@nvidia.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • macsec: fix UAF bug for real_dev · 196a888c
      Ziyang Xuan authored
      Creating a new macsec device does not take a reference on real_dev, so
      nothing ensures that real_dev is freed after the macsec device. That can
      trigger a use-after-free (UAF) on real_dev, as follows:
      
      ==================================================================
      BUG: KASAN: use-after-free in macsec_get_iflink+0x5f/0x70 drivers/net/macsec.c:3662
      Call Trace:
       ...
       macsec_get_iflink+0x5f/0x70 drivers/net/macsec.c:3662
       dev_get_iflink+0x73/0xe0 net/core/dev.c:637
       default_operstate net/core/link_watch.c:42 [inline]
       rfc2863_policy+0x233/0x2d0 net/core/link_watch.c:54
       linkwatch_do_dev+0x2a/0x150 net/core/link_watch.c:161
      
      Allocated by task 22209:
       ...
       alloc_netdev_mqs+0x98/0x1100 net/core/dev.c:10549
       rtnl_create_link+0x9d7/0xc00 net/core/rtnetlink.c:3235
       veth_newlink+0x20e/0xa90 drivers/net/veth.c:1748
      
      Freed by task 8:
       ...
       kfree+0xd6/0x4d0 mm/slub.c:4552
       kvfree+0x42/0x50 mm/util.c:615
       device_release+0x9f/0x240 drivers/base/core.c:2229
       kobject_cleanup lib/kobject.c:673 [inline]
       kobject_release lib/kobject.c:704 [inline]
       kref_put include/linux/kref.h:65 [inline]
       kobject_put+0x1c8/0x540 lib/kobject.c:721
       netdev_run_todo+0x72e/0x10b0 net/core/dev.c:10327
      
      After commit faab39f6 ("net: allow out-of-order netdev unregistration")
      and commit e5f80fcf ("ipv6: give an IPv6 dev to blackhole_netdev"), we
      can add dev_hold_track() in macsec_dev_init() and dev_put_track() in
      macsec_free_netdev() to fix the problem.
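
The hold/put pairing described above can be modelled in userspace roughly as follows. This is a minimal sketch, not the macsec driver code: struct dev, struct upper, and the dev_hold()/dev_put() helpers here are hypothetical stand-ins for net_device, the macsec device, and dev_hold_track()/dev_put_track(), and the lower device is only flagged as freed (rather than released) so its lifetime can be inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted lower device (real_dev). */
struct dev {
    int refcnt;
    int freed;    /* set instead of calling free(), so callers can inspect it */
    int ifindex;
};

static void dev_hold(struct dev *d)
{
    assert(!d->freed);
    d->refcnt++;
}

static void dev_put(struct dev *d)
{
    assert(d->refcnt > 0);
    if (--d->refcnt == 0)
        d->freed = 1;    /* last reference gone: the device is released */
}

/* Hypothetical stand-in for the stacked (macsec-like) device. */
struct upper {
    struct dev *real_dev;
};

/* Analogue of the init hook: pin real_dev for the upper device's lifetime. */
static void upper_init(struct upper *u, struct dev *real_dev)
{
    u->real_dev = real_dev;
    dev_hold(real_dev);
}

/* Analogue of the free hook: drop the pin taken in upper_init(). */
static void upper_free(struct upper *u)
{
    dev_put(u->real_dev);
    u->real_dev = NULL;
}

/* Analogue of a get_iflink-style accessor that dereferences real_dev. */
static int upper_get_iflink(const struct upper *u)
{
    assert(!u->real_dev->freed);    /* would be the use-after-free */
    return u->real_dev->ifindex;
}
```

With the hold in place, the creator of the lower device can drop its own reference first and upper_get_iflink() still sees a live object; without it, the same sequence would trip the assertion.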
      
      Fixes: 2bce1ebe ("macsec: fix refcnt leak in module exit routine")
      Reported-by: syzbot+d0e94b65ac259c29ce7a@syzkaller.appspotmail.com
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Link: https://lore.kernel.org/r/20220531074500.1272846-1-william.xuanziyang@huawei.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • octeontx2-af: fix error code in is_valid_offset() · f3d671c7
      Dan Carpenter authored
      The is_valid_offset() function returns success/true if the call to
      validate_and_get_cpt_blkaddr() fails.
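
One common way this kind of bug arises is returning a negative error code from a function declared to return bool. The sketch below illustrates that pattern under stated assumptions: the commit text does not show the code, so validate_blkaddr(), the accepted block addresses, and the offset limit are all hypothetical.

```c
#include <stdbool.h>

#define SKETCH_EINVAL 22    /* hypothetical error number for this sketch */

/* Hypothetical validator: returns the block address (>= 0) or a negative error. */
static int validate_blkaddr(int blkaddr)
{
    return (blkaddr == 0 || blkaddr == 1) ? blkaddr : -SKETCH_EINVAL;
}

/* Buggy shape: the negative error code is converted to bool, i.e. true. */
static bool is_valid_offset_buggy(int blkaddr, unsigned int offset)
{
    int rc = validate_blkaddr(blkaddr);

    if (rc < 0)
        return rc;          /* any nonzero int becomes true */
    return offset < 0x1000; /* hypothetical range check */
}

/* Fixed shape: a failed validation is reported as "not valid". */
static bool is_valid_offset_fixed(int blkaddr, unsigned int offset)
{
    int rc = validate_blkaddr(blkaddr);

    if (rc < 0)
        return false;
    return offset < 0x1000; /* hypothetical range check */
}
```

The two variants differ only on the error path, which is why such a bug is easy to miss when only valid block addresses are exercised.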
      
      Fixes: ecad2ce8 ("octeontx2-af: cn10k: Add mailbox to configure reassembly timeout")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Link: https://lore.kernel.org/r/YpXDrTPb8qV01JSP@kili
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • bonding: guard ns_targets by CONFIG_IPV6 · c4caa500
      Hangbin Liu authored
      Guard ns_targets in struct bond_params by CONFIG_IPV6, which saves
      256 bytes if IPv6 is not configured. Also add this protection for the
      functions bond_is_ip6_target_ok() and bond_get_targets_ip6().
      
      Remove the IS_ENABLED() check for bond_opts[], as it would leave
      BOND_OPT_NS_TARGETS uninitialized if CONFIG_IPV6 is not enabled. Add
      a dummy bond_option_ns_ip6_targets_set() for this situation.
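
The pattern described above (guard the data, keep the option table entry, supply a stub setter) can be sketched as follows. This is an illustration only, not the bonding code: SKETCH_IPV6 stands in for CONFIG_IPV6, all struct and function names are hypothetical, the stub's -1 return value is a placeholder, and the 16 x 16-byte array is an assumed layout that happens to match the 256 bytes mentioned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Set SKETCH_IPV6 to 1 to model an IPv6-enabled build (hypothetical macro). */
#ifndef SKETCH_IPV6
#define SKETCH_IPV6 0
#endif

#define MAX_NS_TARGETS 16

struct params {
    int arp_interval;
#if SKETCH_IPV6
    unsigned char ns_targets[MAX_NS_TARGETS][16];  /* 16 * 16 = 256 bytes */
#endif
};

enum { OPT_ARP_INTERVAL, OPT_NS_TARGETS, OPT_LAST };

struct option {
    const char *name;
    int (*set)(struct params *p, const void *val);
};

static int arp_interval_set(struct params *p, const void *val)
{
    p->arp_interval = *(const int *)val;
    return 0;
}

#if SKETCH_IPV6
static int ns_targets_set(struct params *p, const void *val)
{
    memcpy(p->ns_targets[0], val, 16);
    return 0;
}
#else
/* Dummy setter: keeps the table slot populated, but rejects the option. */
static int ns_targets_set(struct params *p, const void *val)
{
    (void)p;
    (void)val;
    return -1;
}
#endif

/* Every slot is filled unconditionally, so no entry is left uninitialized. */
static const struct option opts[OPT_LAST] = {
    [OPT_ARP_INTERVAL] = { "arp_interval", arp_interval_set },
    [OPT_NS_TARGETS]   = { "ns_targets", ns_targets_set },
};

static int option_set(struct params *p, int id, const void *val)
{
    assert(id >= 0 && id < OPT_LAST);
    assert(opts[id].set != NULL);
    return opts[id].set(p, val);
}
```

Keeping the table entry unconditional means generic code that walks the table never meets an empty slot; only the behaviour of the setter changes with the configuration.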
      
      Fixes: 4e24be01 ("bonding: add new parameter ns_targets")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Jonathan Toppins <jtoppins@redhat.com>
      Link: https://lore.kernel.org/r/20220531063727.224043-1-liuhangbin@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • tcp: tcp_rtx_synack() can be called from process context · 0a375c82
      Eric Dumazet authored
      Laurent reported the issue shown in [1].
      
      This bug triggers under the following conditions:
      
      0) Kernel built with CONFIG_DEBUG_PREEMPT=y
      
      1) A new passive FastOpen TCP socket is created.
         This FO socket waits for an ACK coming from client to be a complete
         ESTABLISHED one.
      2) A socket operation on this socket goes through the lock_sock() /
         release_sock() dance.
      3) While the socket is owned by the user in step 2),
         a retransmit of the SYN is received and stored in socket backlog.
      4) At release_sock() time, the socket backlog is processed while
         in process context.
      5) A SYNACK packet is cooked in response to the SYN retransmit.
      6) -> tcp_rtx_synack() is called in process context.
      
      Before the blamed commit, tcp_rtx_synack() was always called from a BH
      handler, namely a timer handler.
      
      Fix this by using TCP_INC_STATS() & NET_INC_STATS(),
      which do not assume the caller is in a non-preemptible context.
      
      [1]
      BUG: using __this_cpu_add() in preemptible [00000000] code: epollpep/2180
      caller is tcp_rtx_synack.part.0+0x36/0xc0
      CPU: 10 PID: 2180 Comm: epollpep Tainted: G           OE     5.16.0-0.bpo.4-amd64 #1  Debian 5.16.12-1~bpo11+1
      Hardware name: Supermicro SYS-5039MC-H8TRF/X11SCD-F, BIOS 1.7 11/23/2021
      Call Trace:
       <TASK>
       dump_stack_lvl+0x48/0x5e
       check_preemption_disabled+0xde/0xe0
       tcp_rtx_synack.part.0+0x36/0xc0
       tcp_rtx_synack+0x8d/0xa0
       ? kmem_cache_alloc+0x2e0/0x3e0
       ? apparmor_file_alloc_security+0x3b/0x1f0
       inet_rtx_syn_ack+0x16/0x30
       tcp_check_req+0x367/0x610
       tcp_rcv_state_process+0x91/0xf60
       ? get_nohz_timer_target+0x18/0x1a0
       ? lock_timer_base+0x61/0x80
       ? preempt_count_add+0x68/0xa0
       tcp_v4_do_rcv+0xbd/0x270
       __release_sock+0x6d/0xb0
       release_sock+0x2b/0x90
       sock_setsockopt+0x138/0x1140
       ? __sys_getsockname+0x7e/0xc0
       ? aa_sk_perm+0x3e/0x1a0
       __sys_setsockopt+0x198/0x1e0
       __x64_sys_setsockopt+0x21/0x30
       do_syscall_64+0x38/0xc0
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 168a8f58 ("tcp: TCP Fast Open Server - main code path")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Laurent Fasnacht <laurent.fasnacht@proton.ch>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Link: https://lore.kernel.org/r/20220530213713.601888-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · b3c0a9ef
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      1) Missing proper sanitization for nft_set_desc_concat_parse().
      
      2) Missing mutex in nf_tables pre_exit path.
      
      3) Possible double hook unregistration from clean_net path.
      
      4) Missing FLOWI_FLAG_ANYSRC flag in flowtable route lookup.
         Fix incorrect source and destination address in case of NAT.
         Patch from wenxu.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: flowtable: fix nft_flow_route source address for nat case
        netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag
        netfilter: nf_tables: double hook unregistration in netns path
        netfilter: nf_tables: hold mutex on netns pre_exit path
        netfilter: nf_tables: sanitize nft_set_desc_concat_parse()
      ====================
      
      Link: https://lore.kernel.org/r/20220531215839.84765-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: sched: add barrier to fix packet stuck problem for lockless qdisc · 2e8728c9
      Guoju Fang authored
      In qdisc_run_end(), the spin_unlock() only has store-release semantics,
      which guarantee that all earlier memory accesses are visible before it. But
      the subsequent test_bit() has no barrier semantics, so it may be reordered
      ahead of the spin_unlock(). This store-load reordering may cause a packet
      stuck problem.
      
      The concurrent operations can be described as follows:
               CPU 0                      |          CPU 1
         qdisc_run_end()                  |     qdisc_run_begin()
                .                         |           .
       ----> /* may be reordered here */  |           .
      |         .                         |           .
      |     spin_unlock()                 |         set_bit()
      |         .                         |         smp_mb__after_atomic()
       ---- test_bit()                    |         spin_trylock()
                .                         |          .
      
      Consider the following sequence of events:
          CPU 0 reorders test_bit() ahead and sees MISSED = 0
          CPU 1 calls set_bit()
          CPU 1 calls spin_trylock(), which fails
          CPU 0 executes spin_unlock()
      
      At the end of the sequence, CPU 0 calls spin_unlock() and does nothing
      more because it saw MISSED = 0. The skb on CPU 1 has been enqueued but no
      one takes it until the next CPU pushing to the qdisc (if ever ...)
      notices and dequeues it.
      
      This patch fixes this by adding one explicit barrier. As the spin_unlock()
      and test_bit() ordering is a store-load ordering, a full memory barrier
      smp_mb() is needed here.
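
The store-load ordering requirement can be illustrated in portable C11 atomics. This is a simplified userspace model under stated assumptions, not the kernel code: struct qdisc_sketch and its functions are hypothetical, an atomic_flag stands in for the spinlock, an atomic_bool stands in for the MISSED bit, and the reschedule is reduced to a counter (the sketch also clears MISSED in run_end() for brevity).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct qdisc_sketch {
    atomic_flag busy;    /* stands in for the qdisc spinlock */
    atomic_bool missed;  /* stands in for the MISSED state bit */
    int rescheduled;     /* counts how often run_end() requested a re-run */
};

static void sketch_init(struct qdisc_sketch *q)
{
    assert(q != NULL);
    atomic_flag_clear(&q->busy);
    atomic_init(&q->missed, false);
    q->rescheduled = 0;
}

/* "CPU 1" side: try the lock; on failure set MISSED, fence, and retry once. */
static bool run_begin(struct qdisc_sketch *q)
{
    if (!atomic_flag_test_and_set_explicit(&q->busy, memory_order_acquire))
        return true;

    atomic_store_explicit(&q->missed, true, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* role of smp_mb__after_atomic() */
    return !atomic_flag_test_and_set_explicit(&q->busy, memory_order_acquire);
}

/* "CPU 0" side: unlock, then check MISSED. */
static void run_end(struct qdisc_sketch *q)
{
    /* Release store: earlier accesses are ordered before the unlock ... */
    atomic_flag_clear_explicit(&q->busy, memory_order_release);

    /* ... but only a full fence keeps the later load from moving ahead of it. */
    atomic_thread_fence(memory_order_seq_cst);  /* role of the added smp_mb() */

    if (atomic_load_explicit(&q->missed, memory_order_relaxed)) {
        atomic_store_explicit(&q->missed, false, memory_order_relaxed);
        q->rescheduled++;
    }
}
```

With a full fence on both sides, at least one of the two CPUs observes the other's store: either run_end() sees MISSED set, or the retry in run_begin() sees the lock released. Dropping the fence in run_end() reopens the window in which neither happens.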
      
      Fixes: a90c57f2 ("net: sched: fix packet stuck problem for lockless qdisc")
      Signed-off-by: Guoju Fang <gjfang@linux.alibaba.com>
      Link: https://lore.kernel.org/r/20220528101628.120193-1-gjfang@linux.alibaba.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 31 May, 2022 10 commits
  3. 29 May, 2022 3 commits
    • Merge branch 'sfc-fixes' · 90343f57
      David S. Miller authored
      Íñigo Huguet says:
      
      ====================
      sfc: fix some efx_separate_tx_channels errors
      
      Trying to load the sfc driver with modparam efx_separate_tx_channels=1
      resulted in errors during initialization and not being able to use the
      NIC. These patches fix a few bugs and make it work again.
      
      v2:
      * added Martin's patch in place of an earlier one of mine. Mine solved
      some of the initialization errors, but Martin's solves them in all
      possible cases.
      * removed whitespace cleanup, as requested by Jakub
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: fix wrong tx channel offset with efx_separate_tx_channels · c308dfd1
      Íñigo Huguet authored
      tx_channel_offset is calculated in efx_allocate_msix_channels, but it is
      also calculated again in efx_set_channels because it was originally done
      there, and the old calculation was not removed from efx_set_channels when
      efx_allocate_msix_channels was introduced.
      
      Moreover, the old calculation is wrong when using
      efx_separate_tx_channels because now we can have XDP channels after the
      TX channels, so n_channels - n_tx_channels doesn't point to the first TX
      channel.
      
      Remove the old calculation from efx_set_channels, and add the
      initialization of this variable if MSI or legacy interrupts are used,
      next to the initialization of the rest of the related variables, where
      it was missing.
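
The arithmetic problem with the old calculation can be shown with a small worked example. This is a sketch under an assumed channel ordering of RX channels first, then TX channels, then XDP channels; struct chan_layout and both helper functions are hypothetical and do not reproduce the driver's code.

```c
#include <assert.h>

/* Assumed ordering with separate TX channels: [ RX | TX | XDP ]. */
struct chan_layout {
    unsigned int n_rx;
    unsigned int n_tx;
    unsigned int n_xdp;
};

static unsigned int total_channels(const struct chan_layout *l)
{
    return l->n_rx + l->n_tx + l->n_xdp;
}

/* Old calculation: only right if the TX channels are the last ones. */
static unsigned int tx_offset_old(const struct chan_layout *l)
{
    assert(l->n_tx <= total_channels(l));
    return total_channels(l) - l->n_tx;
}

/* Index of the first TX channel under the assumed ordering. */
static unsigned int tx_offset_first_tx(const struct chan_layout *l)
{
    return l->n_rx;
}
```

For example, with 4 RX, 4 TX, and 2 XDP channels the old formula yields 6 while the first TX channel sits at index 4; with no XDP channels the two agree, which is why the old formula used to be sufficient.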
      
      Fixes: 3990a8ff ("sfc: allocate channels for XDP tx queues")
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: fix considering that all channels have TX queues · 2e102b53
      Martin Habets authored
      Normally, all channels have RX and TX queues, but this is not true if
      modparam efx_separate_tx_channels=1 is used. In that case, some
      channels only have RX queues and others only TX queues (or more
      precisely, they have them allocated, but not initialized).
      
      Fix efx_channel_has_tx_queues to return the correct value for this case
      too.
      
      Messages shown at probe time before the fix:
       sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0
       ------------[ cut here ]------------
       netdevice: ens6f0np0: failed to initialise TXQ -1
       WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc]
       [...] stripped
       RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc]
       [...] stripped
       Call Trace:
        efx_init_tx_queue+0xaa/0xf0 [sfc]
        efx_start_channels+0x49/0x120 [sfc]
        efx_start_all+0x1f8/0x430 [sfc]
        efx_net_open+0x5a/0xe0 [sfc]
        __dev_open+0xd0/0x190
        __dev_change_flags+0x1b3/0x220
        dev_change_flags+0x21/0x60
       [...] stripped
      
      Messages shown at remove time before the fix:
       sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues
       sfc 0000:03:00.0 ens6f0np0: failed to flush queues
      
      Fixes: 8700aff0 ("sfc: fix channel allocation with brute force")
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
      Tested-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 28 May, 2022 13 commits
  5. 27 May, 2022 7 commits