  1. 30 Dec, 2022 3 commits
    • Merge branch 'tcp-bhash2-fixes' · 0798311c
      David S. Miller authored
      Kuniyuki Iwashima says:
      
      ===================
      tcp: Fix bhash2 and TIME_WAIT regression.
      
      We forgot to add twsk to bhash2.  Therefore TIME_WAIT sockets cannot
      prevent bind() to the same local address and port.
      
      Changes:
        v1:
          * Patch 1:
            * Add tw_bind2_node in inet_timewait_sock instead of
              moving sk_bind2_node from struct sock to struct
              sock_common.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Add selftest for bind() and TIME_WAIT. · 2c042e8e
      Kuniyuki Iwashima authored
      bhash2 split the bind() validation logic into wildcard and non-wildcard
      cases.  Let's add a test to catch future regressions.
      
      Before the previous patch:
      
        # ./bind_timewait
        TAP version 13
        1..2
        # Starting 2 tests from 3 test cases.
        #  RUN           bind_timewait.localhost.1 ...
        # bind_timewait.c:87:1:Expected ret (0) == -1 (-1)
        # 1: Test terminated by assertion
        #          FAIL  bind_timewait.localhost.1
        not ok 1 bind_timewait.localhost.1
        #  RUN           bind_timewait.addrany.1 ...
        #            OK  bind_timewait.addrany.1
        ok 2 bind_timewait.addrany.1
        # FAILED: 1 / 2 tests passed.
        # Totals: pass:1 fail:1 xfail:0 xpass:0 skip:0 error:0
      
      After:
      
        # ./bind_timewait
        TAP version 13
        1..2
        # Starting 2 tests from 3 test cases.
        #  RUN           bind_timewait.localhost.1 ...
        #            OK  bind_timewait.localhost.1
        ok 1 bind_timewait.localhost.1
        #  RUN           bind_timewait.addrany.1 ...
        #            OK  bind_timewait.addrany.1
        ok 2 bind_timewait.addrany.1
        # PASSED: 2 / 2 tests passed.
        # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Acked-by: Joanne Koong <joannelkoong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Add TIME_WAIT sockets in bhash2. · 936a192f
      Kuniyuki Iwashima authored
      Jiri Slaby reported a regression in bind() with a simple repro. [0]
      
      The repro creates a TIME_WAIT socket and tries to bind() a new socket
      with the same local address and port.  Before commit 28044fc1 ("net:
      Add a bhash2 table hashed by port and address"), the bind() failed with
      -EADDRINUSE, but now it succeeds.
      
      The cited commit should have put TIME_WAIT sockets into bhash2; otherwise,
      inet_bhash2_conflict() misses TIME_WAIT sockets when validating bind()
      requests if the address is not a wildcard one.
      
      The straightforward option is to move sk_bind2_node from struct sock to
      struct sock_common so that twsk can be added to bhash2, as implemented in
      the RFC. [1]  However, changing the binary layout of struct sock could hurt
      performance by moving hot fields onto different cachelines.
      
      To avoid that, we add another TIME_WAIT list in inet_bind2_bucket and check
      it while validating bind().
      
      [0]: https://lore.kernel.org/netdev/6b971a4e-c7d8-411e-1f92-fda29b5b2fb9@kernel.org/
      [1]: https://lore.kernel.org/netdev/20221221151258.25748-2-kuniyu@amazon.com/
      
      Fixes: 28044fc1 ("net: Add a bhash2 table hashed by port and address")
      Reported-by: Jiri Slaby <jirislaby@kernel.org>
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Acked-by: Joanne Koong <joannelkoong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
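
The idea behind the bhash2 fix above can be sketched in userspace. This is a simplified model, not the kernel code: every name here (model_sock, model_bind2_bucket, the timewait list, and the helper functions) is hypothetical, and options such as SO_REUSEADDR are ignored. It assumes one bucket per local address and port, with a separate list for TIME_WAIT sockets that the conflict check also consults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace model; all names are hypothetical, not kernel API. */
enum model_state { MODEL_ESTABLISHED, MODEL_TIME_WAIT };

struct model_sock {
    enum model_state state;
    struct model_sock *next;      /* link within one of the bucket lists */
};

/* One bucket per (local address, port) pair, in the spirit of bhash2. */
struct model_bind2_bucket {
    uint32_t addr;
    uint16_t port;
    struct model_sock *owners;    /* full sockets bound to addr:port */
    struct model_sock *timewait;  /* TIME_WAIT sockets bound to addr:port */
};

/* Put a socket on the list that matches its state. */
void model_bucket_add(struct model_bind2_bucket *tb, struct model_sock *sk)
{
    struct model_sock **head =
        (sk->state == MODEL_TIME_WAIT) ? &tb->timewait : &tb->owners;

    sk->next = *head;
    *head = sk;
}

/* Approximates the regression: TIME_WAIT sockets are invisible. */
bool model_conflict_owners_only(const struct model_bind2_bucket *tb)
{
    return tb->owners != NULL;
}

/* Approximates the fix: the TIME_WAIT list is checked as well. */
bool model_conflict(const struct model_bind2_bucket *tb)
{
    return tb->owners != NULL || tb->timewait != NULL;
}

/* Small self-check: a lone TIME_WAIT socket must block a new bind(). */
void model_demo(void)
{
    struct model_bind2_bucket tb = { .addr = 0x7f000001u, .port = 8080 };
    struct model_sock tw = { .state = MODEL_TIME_WAIT };

    model_bucket_add(&tb, &tw);
    assert(!model_conflict_owners_only(&tb));
    assert(model_conflict(&tb));
}
```

In this model the bucket carries the extra list, so the socket structure itself needs no new shared field, which is the trade-off the commit message describes.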
  2. 28 Dec, 2022 22 commits
  3. 26 Dec, 2022 5 commits
    • net: ethernet: marvell: octeontx2: Fix uninitialized variable warning · d3805695
      Anuradha Weeraman authored
      Fix for uninitialized variable warning.
      
      Addresses-Coverity: ("Uninitialized scalar variable")
      Signed-off-by: Anuradha Weeraman <anuradha@debian.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfc: Fix potential resource leaks · df49908f
      Miaoqian Lin authored
      nfc_get_device() takes a reference on the device; add the missing
      nfc_put_device() calls to release it when it is no longer needed.
      Also fix the style warning by using the error code EOPNOTSUPP instead
      of ENOTSUPP.
      
      Fixes: 5ce3f32b ("NFC: netlink: SE API implementation")
      Fixes: 29e76924 ("nfc: netlink: Add capability to reply to vendor_cmd with data")
      Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
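
The get/put pairing that the nfc fix above restores can be illustrated with a small userspace sketch. It is not the NFC netlink code: model_dev, model_get_device(), model_put_device() and model_vendor_cmd() are hypothetical stand-ins, and the refcount is a plain integer. The point is only that every exit path after a successful get, including the error path that returns -EOPNOTSUPP, goes through the put.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted device. */
struct model_dev {
    int refcount;
    int has_vendor_cmds;
};

/* Take a reference; returns NULL when there is no device. */
struct model_dev *model_get_device(struct model_dev *dev)
{
    if (dev)
        dev->refcount++;
    return dev;
}

/* Drop a reference taken by model_get_device(). */
void model_put_device(struct model_dev *dev)
{
    dev->refcount--;
}

/* Every path after a successful get leaves through put_dev. */
int model_vendor_cmd(struct model_dev *registry_entry)
{
    struct model_dev *dev = model_get_device(registry_entry);
    int rc = 0;

    if (!dev)
        return -ENODEV;          /* nothing was taken, nothing to put */

    if (!dev->has_vendor_cmds) {
        rc = -EOPNOTSUPP;        /* the error path must not skip the put */
        goto put_dev;
    }

    /* ... the actual command handling would go here ... */

put_dev:
    model_put_device(dev);
    return rc;
}

/* Small self-check: the refcount is balanced on the error path. */
void model_refcount_demo(void)
{
    struct model_dev dev = { .refcount = 1, .has_vendor_cmds = 0 };

    assert(model_vendor_cmd(&dev) == -EOPNOTSUPP);
    assert(dev.refcount == 1);
}
```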
    • net: dsa: mv88e6xxx: depend on PTP conditionally · 30e72553
      Johnny S. Lee authored
      PTP hardware timestamping related objects are not linked when PTP
      support for MV88E6xxx (NET_DSA_MV88E6XXX_PTP) is disabled, therefore
      NET_DSA_MV88E6XXX should not depend on PTP_1588_CLOCK_OPTIONAL
      regardless of NET_DSA_MV88E6XXX_PTP.
      
      Instead, condition more strictly on how NET_DSA_MV88E6XXX_PTP's
      dependencies are met, making sure that it cannot be enabled when
      NET_DSA_MV88E6XXX=y and PTP_1588_CLOCK=m.
      
      In other words, this commit allows NET_DSA_MV88E6XXX to be built-in
      while PTP_1588_CLOCK is a module, as long as NET_DSA_MV88E6XXX_PTP is
      prevented from being enabled.
      
      Fixes: e5f31552 ("ethernet: fix PTP_1588_CLOCK dependencies")
      Signed-off-by: Johnny S. Lee <foss@jsl.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qlcnic: prevent ->dcb use-after-free on qlcnic_dcb_enable() failure · 13a7c896
      Daniil Tatianin authored
      adapter->dcb would get silently freed inside qlcnic_dcb_enable() in
      case qlcnic_dcb_attach() would return an error, which always happens
      under OOM conditions. This would lead to use-after-free because both
      of the existing callers invoke qlcnic_dcb_get_info() on the obtained
      pointer, which is potentially freed at that point.
      
      Propagate errors from qlcnic_dcb_enable(), and instead free the dcb
      pointer at callsite using qlcnic_dcb_free(). This also removes the now
      unused qlcnic_clear_dcb_ops() helper, which was a simple wrapper around
      kfree() also causing memory leaks for partially initialized dcb.
      
      Found by Linux Verification Center (linuxtesting.org) with the SVACE
      static analysis tool.
      
      Fixes: 3c44bba1 ("qlcnic: Disable DCB operations from SR-IOV VFs")
      Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
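
The ownership change described in the qlcnic fix above can be sketched as follows. This is a userspace model under stated assumptions, not the driver code: model_adapter, model_dcb and all model_* functions are hypothetical, and the fail_attach flag merely simulates an attach failure such as OOM. The enable step reports the error and leaves the pointer alone; the caller frees it and never touches it afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the adapter and its dcb state. */
struct model_dcb { int attached; };
struct model_adapter { struct model_dcb *dcb; };

/* Attach step; fail_attach simulates a failure such as OOM. */
int model_dcb_attach(struct model_dcb *dcb, int fail_attach)
{
    if (fail_attach)
        return -ENOMEM;
    dcb->attached = 1;
    return 0;
}

/* Reports the error instead of freeing dcb behind the caller's back. */
int model_dcb_enable(struct model_dcb *dcb, int fail_attach)
{
    return dcb ? model_dcb_attach(dcb, fail_attach) : 0;
}

void model_dcb_free(struct model_dcb *dcb)
{
    free(dcb);
}

/* Caller owns dcb: it frees on failure and only then gives up. */
int model_probe(struct model_adapter *adapter, int fail_attach)
{
    int err;

    adapter->dcb = calloc(1, sizeof(*adapter->dcb));
    if (!adapter->dcb)
        return -ENOMEM;

    err = model_dcb_enable(adapter->dcb, fail_attach);
    if (err) {
        model_dcb_free(adapter->dcb);
        adapter->dcb = NULL;     /* no dangling pointer is left behind */
        return err;
    }

    /* Safe to dereference: enable succeeded, so dcb is still valid. */
    return adapter->dcb->attached ? 0 : -EINVAL;
}

/* Small self-check: a failed enable leaves no stale pointer. */
void model_probe_demo(void)
{
    struct model_adapter adapter = { NULL };

    assert(model_probe(&adapter, 1) == -ENOMEM);
    assert(adapter.dcb == NULL);
}
```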
    • net: sched: fix memory leak in tcindex_set_parms · 399ab7fe
      Hawkins Jiawei authored
      Syzkaller reports a memory leak as follows:
      ====================================
      BUG: memory leak
      unreferenced object 0xffff88810c287f00 (size 256):
        comm "syz-executor105", pid 3600, jiffies 4294943292 (age 12.990s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff814cf9f0>] kmalloc_trace+0x20/0x90 mm/slab_common.c:1046
          [<ffffffff839c9e07>] kmalloc include/linux/slab.h:576 [inline]
          [<ffffffff839c9e07>] kmalloc_array include/linux/slab.h:627 [inline]
          [<ffffffff839c9e07>] kcalloc include/linux/slab.h:659 [inline]
          [<ffffffff839c9e07>] tcf_exts_init include/net/pkt_cls.h:250 [inline]
          [<ffffffff839c9e07>] tcindex_set_parms+0xa7/0xbe0 net/sched/cls_tcindex.c:342
          [<ffffffff839caa1f>] tcindex_change+0xdf/0x120 net/sched/cls_tcindex.c:553
          [<ffffffff8394db62>] tc_new_tfilter+0x4f2/0x1100 net/sched/cls_api.c:2147
          [<ffffffff8389e91c>] rtnetlink_rcv_msg+0x4dc/0x5d0 net/core/rtnetlink.c:6082
          [<ffffffff839eba67>] netlink_rcv_skb+0x87/0x1d0 net/netlink/af_netlink.c:2540
          [<ffffffff839eab87>] netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
          [<ffffffff839eab87>] netlink_unicast+0x397/0x4c0 net/netlink/af_netlink.c:1345
          [<ffffffff839eb046>] netlink_sendmsg+0x396/0x710 net/netlink/af_netlink.c:1921
          [<ffffffff8383e796>] sock_sendmsg_nosec net/socket.c:714 [inline]
          [<ffffffff8383e796>] sock_sendmsg+0x56/0x80 net/socket.c:734
          [<ffffffff8383eb08>] ____sys_sendmsg+0x178/0x410 net/socket.c:2482
          [<ffffffff83843678>] ___sys_sendmsg+0xa8/0x110 net/socket.c:2536
          [<ffffffff838439c5>] __sys_sendmmsg+0x105/0x330 net/socket.c:2622
          [<ffffffff83843c14>] __do_sys_sendmmsg net/socket.c:2651 [inline]
          [<ffffffff83843c14>] __se_sys_sendmmsg net/socket.c:2648 [inline]
          [<ffffffff83843c14>] __x64_sys_sendmmsg+0x24/0x30 net/socket.c:2648
          [<ffffffff84605fd5>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
          [<ffffffff84605fd5>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
          [<ffffffff84800087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
      ====================================
      
      The kernel uses tcindex_change() to change the properties of an
      existing filter.
      
      The problem is that, during such a change, if `old_r` is retrieved
      from `p->perfect`, the kernel uses tcindex_alloc_perfect_hash() to
      allocate new filter results and then uses tcindex_filter_result_init()
      to clear the old filter result without destroying its tcf_exts
      structure, which triggers the above memory leak.
      
      To be more specific, there are only two sources for `old_r`,
      according to tcindex_lookup(): `old_r` is retrieved either from
      `p->perfect` or from `p->h`.
      
        * If `old_r` is retrieved from `p->perfect`, the kernel uses
          tcindex_alloc_perfect_hash() to allocate new filter results.
          Then `r` is assigned `cp->perfect + handle`, which is newly
          allocated. So the condition `old_r && old_r != r` is true in
          this situation, and the kernel uses tcindex_filter_result_init()
          to clear the old filter result without destroying its tcf_exts
          structure.
      
        * If `old_r` is retrieved from `p->h`, then `p->perfect` is NULL
          according to tcindex_lookup(). Since `cp->h` is copied directly
          from `p->h` and `p->perfect` is NULL, `r` is assigned
          `tcindex_lookup(cp, handle)`, whose value should be the same as
          `old_r`. So the condition `old_r && old_r != r` is false in this
          situation, and the kernel skips the tcindex_filter_result_init()
          call that would clear the old filter result.
      
      So only when `old_r` is retrieved from `p->perfect` does the kernel use
      tcindex_filter_result_init() to clear the old filter result, which
      triggers the above memory leak.
      
      Considering that there already exists a tc_filter_wq workqueue
      to destroy the old tcindex_data by tcindex_partial_destroy_work()
      at the end of tcindex_set_parms(), this patch solves
      this memory leak bug by removing this old filter result
      clearing part and delegating it to the tc_filter_wq workqueue.
      
      Note that this patch doesn't introduce any other issues. If
      `old_r` is retrieved from `p->perfect`, this patch just
      delegates the clearing of the old filter result to the
      tc_filter_wq workqueue; if `old_r` is retrieved from `p->h`,
      the kernel doesn't reach the code that clears the old filter
      result, so removing it has no effect.
      
      [Thanks to the suggestion from Jakub Kicinski, Cong Wang, Paolo Abeni
      and Dmitry Vyukov]
      
      Fixes: b9a24bb7 ("net_sched: properly handle failure case of tcf_exts_init()")
      Link: https://lore.kernel.org/all/0000000000001de5c505ebc9ec59@google.com/
      Reported-by: syzbot+232ebdbd36706c965ebf@syzkaller.appspotmail.com
      Tested-by: syzbot+232ebdbd36706c965ebf@syzkaller.appspotmail.com
      Cc: Cong Wang <cong.wang@bytedance.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
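
The leak mechanism described in the tcindex fix above, re-initialising a result without destroying what it already holds, can be modelled with a counter instead of real allocations. This sketch is not the cls_tcindex code: model_result, the exts_id handle, the model_live_exts counter and all model_* functions are hypothetical. It assumes that "init" always attaches a fresh exts object and that the deferred destroy step releases whatever each old result currently holds.

```c
#include <assert.h>
#include <stddef.h>

/* Counter-based model; no real memory is allocated. Names are hypothetical. */
int model_live_exts;       /* exts objects currently outstanding */
int model_next_id = 1;     /* next handle to hand out */

struct model_result {
    int exts_id;           /* 0 means no exts object is attached */
};

/* Attach a fresh exts object, overwriting any previous handle. */
void model_result_init(struct model_result *r)
{
    r->exts_id = model_next_id++;
    model_live_exts++;
}

/* Release the exts object the result currently holds, if any. */
void model_result_destroy(struct model_result *r)
{
    if (r->exts_id) {
        r->exts_id = 0;
        model_live_exts--;
    }
}

/* Leaky pattern: "clearing" by re-init loses the previous handle. */
void model_clear_without_destroy(struct model_result *old_r)
{
    model_result_init(old_r);
}

/* Deferred cleanup: destroys whatever each old result holds right now. */
void model_destroy_table(struct model_result *table, size_t n)
{
    for (size_t i = 0; i < n; i++)
        model_result_destroy(&table[i]);
}

/* Self-check: skipping the early clear leaves nothing outstanding. */
void model_leak_demo(void)
{
    struct model_result old;
    int before = model_live_exts;

    model_result_init(&old);
    model_destroy_table(&old, 1);   /* deferred destroy only */
    assert(model_live_exts == before);
}
```

Calling model_clear_without_destroy() before model_destroy_table() leaves one exts object outstanding in this model, which mirrors why the patch drops the early clear and relies on the deferred destroy alone.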
  4. 24 Dec, 2022 1 commit
    • Merge tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · be1236fc
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 7 non-merge commits during the last 5 day(s) which contain
      a total of 11 files changed, 231 insertions(+), 3 deletions(-).
      
      The main changes are:
      
      1) Fix a splat in bpf_skb_generic_pop() under CHECKSUM_PARTIAL due to
         misuse of skb_postpull_rcsum(), from Jakub Kicinski with test case
         from Martin Lau.
      
      2) Fix BPF verifier's nullness propagation when registers are of
         type PTR_TO_BTF_ID, from Hao Sun.
      
      3) Fix bpftool build for JIT disassembler under statically built
         libllvm, from Anton Protopopov.
      
      4) Fix warnings reported by resolve_btfids when building vmlinux
         with CONFIG_SECURITY_NETWORK disabled, from Hou Tao.
      
      5) Minor fix up for BPF selftest gitignore, from Stanislav Fomichev.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 23 Dec, 2022 9 commits