1. 09 Dec, 2021 6 commits
    • net: mvpp2: fix XDP rx queues registering · a50e659b
      Louis Amas authored
      The registration of XDP queue information is incorrect because the
      RX queue id we use is invalid. When port->id == 0 it appears to work
      as expected, yet this is no longer the case when port->id != 0.
      
      The problem arose while using a recent kernel version on the
      MACCHIATOBin. This board has several ports:
       * eth0 and eth1 are 10Gbps interfaces; both ports have port->id == 0;
       * eth2 is a 1Gbps interface with port->id != 0.
      
      Code from xdp-tutorial (more specifically advanced03-AF_XDP) was used
      to test packet capture and injection on all these interfaces. The XDP
      kernel program was simplified to:
      
      	SEC("xdp_sock")
      	int xdp_sock_prog(struct xdp_md *ctx)
      	{
      		int index = ctx->rx_queue_index;
      
      		/* A set entry here means that the corresponding queue_id
      		 * has an active AF_XDP socket bound to it. */
      		if (bpf_map_lookup_elem(&xsks_map, &index))
      			return bpf_redirect_map(&xsks_map, index, 0);
      
      		return XDP_PASS;
      	}
      
      Starting the program using:
      
      	./af_xdp_user -d DEV
      
      Gives the following result:
      
       * eth0 : ok
       * eth1 : ok
       * eth2 : no capture, no injection
      
      Investigating the issue shows that XDP rx queues for eth2 are wrong:
      XDP expects their id to be in the range [0..3] but we found them to be
      in the range [32..35].
      
      Trying to force rx queue ids using:
      
      	./af_xdp_user -d eth2 -Q 32
      
      fails as expected (we shall not have more than 4 queues).
      
      When we register the XDP rx queue information (using
      xdp_rxq_info_reg() in function mvpp2_rxq_init()) we tell it to use
      rxq->id as the queue id. This value is computed as:
      
      	rxq->id = port->id * max_rxq_count + queue_id
      
      where max_rxq_count depends on the device version. In the MACCHIATOBin
      case, this value is 32, meaning that rx queues on eth2 are numbered
      from 32 to 35 - there are four of them.
      
      Clearly, this is not the per-port queue id that XDP is expecting:
      it wants a value in the range [0..3]. It should use queue_id directly,
      which is stored in rxq->logic_rxq -- so let's use that value instead.
      
      rxq->id is left untouched ; its value is indeed valid but it should
      not be used in this context.
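
      The difference between the two ids can be illustrated with a small
      user-space sketch. The struct and helper names below (rxq_model,
      rxq_model_init, xdp_queue_index) are hypothetical and are not the
      driver's code; only the formula for rxq->id and the role of
      rxq->logic_rxq come from the description above.

```c
#include <assert.h>

/* Hypothetical stand-in for the driver's rx queue state: only the two
 * fields discussed in the commit message are modelled here. */
struct rxq_model {
	int id;        /* global id: port_id * max_rxq_count + queue_id */
	int logic_rxq; /* per-port queue id, the value XDP expects */
};

/* Fill both ids the way the commit message describes them. */
static struct rxq_model rxq_model_init(int port_id, int max_rxq_count,
				       int queue_id)
{
	struct rxq_model rxq;

	assert(queue_id >= 0 && queue_id < max_rxq_count);
	rxq.id = port_id * max_rxq_count + queue_id;
	rxq.logic_rxq = queue_id;
	return rxq;
}

/* Queue index to hand to the XDP rx queue registration: the per-port
 * id, not the global one. */
static int xdp_queue_index(const struct rxq_model *rxq)
{
	return rxq->logic_rxq;
}
```

      With max_rxq_count = 32 and a port id of 1 (which would match the
      [32..35] range seen on eth2), queue 3 gets a global id of 35 while
      the index registered with XDP stays 3.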
      
      This is consistent with the remaining part of the code in
      mvpp2_rxq_init().
      
      With this change, packet capture is working as expected on all the
      MACCHIATOBin ports.
      
      Fixes: b27db227 ("mvpp2: use page_pool allocator")
      Signed-off-by: Louis Amas <louis.amas@eho.link>
      Signed-off-by: Emmanuel Deloget <emmanuel.deloget@eho.link>
      Reviewed-by: Marcin Wojtas <mw@semihalf.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Link: https://lore.kernel.org/r/20211207143423.916334-1-louis.amas@eho.link
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • vmxnet3: fix minimum vectors alloc issue · f71ef02f
      Ronak Doshi authored
      Commit 39f9895a ("vmxnet3: add support for 32 Tx/Rx queues")
      added support for 32 Tx/Rx queues. Within that patch, the value of
      VMXNET3_LINUX_MIN_MSIX_VECT was updated.
      
      However, there is a case (numvcpus = 2) that legitimately requires 3
      interrupts, which matches VMXNET3_LINUX_MIN_MSIX_VECT. The stack then
      treats this as a failure to allocate more vectors. This patch
      fixes this issue.
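
      The ambiguity can be modelled with two small predicates. This is a
      hypothetical sketch, not the driver's code: the function names are
      invented, the constant's value of 3 is inferred from the text above,
      and the "after" condition is an assumption about one way to tell a
      real shortfall from a request that was simply equal to the minimum.

```c
#include <assert.h>
#include <stdbool.h>

/* Value implied by the commit text: the two-vCPU case needs 3
 * interrupts, which "matches VMXNET3_LINUX_MIN_MSIX_VECT". */
#define MIN_MSIX_VECT 3

/* Behaviour as described before the fix: getting exactly the minimum
 * is always read as "could not allocate more vectors". */
static bool shortfall_before(int allocated)
{
	return allocated == MIN_MSIX_VECT;
}

/* Assumed fix: only report a shortfall when more than the minimum was
 * actually requested. */
static bool shortfall_after(int requested, int allocated)
{
	assert(requested >= MIN_MSIX_VECT && allocated <= requested);
	return allocated == MIN_MSIX_VECT && requested > MIN_MSIX_VECT;
}
```

      In this model, requesting 3 vectors and receiving 3 is flagged by
      shortfall_before() but not by shortfall_after(), while requesting 5
      and receiving 3 is flagged by both.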
      
      Fixes: 39f9895a ("vmxnet3: add support for 32 Tx/Rx queues")
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Link: https://lore.kernel.org/r/20211207081737.14000-1-doshir@vmware.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net, neigh: clear whole pneigh_entry at alloc time · e195e9b5
      Eric Dumazet authored
      Commit 2c611ad9 ("net, neigh: Extend neigh->flags to 32 bit
      to allow for extensions") enables a new KMSAN warning [1].
      
      I think the bug is actually older, because the following instruction
      only occurred if ndm->ndm_flags had NTF_PROXY set.
      
      	pn->flags = ndm->ndm_flags;
      
      Let's clear all pneigh_entry fields at alloc time.
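
      A user-space analogy of "clear at alloc time" is sketched below.
      The struct pneigh_model and the helper pneigh_model_alloc() are
      hypothetical simplifications, not the kernel's definitions; calloc()
      stands in for a zeroing allocator (in kernel code this would
      typically be kzalloc() rather than kmalloc()).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a proxy neighbour entry. */
struct pneigh_model {
	unsigned int flags;     /* only written on some code paths */
	unsigned char protocol;
	unsigned char key[16];
};

/* Allocate an entry with every field zeroed, so a field that is only
 * assigned conditionally can never be read back uninitialised. */
static struct pneigh_model *pneigh_model_alloc(const void *key,
					       size_t key_len)
{
	struct pneigh_model *pn;

	assert(key != NULL);
	if (key_len > sizeof(pn->key))
		return NULL;

	pn = calloc(1, sizeof(*pn)); /* zeroing allocation */
	if (!pn)
		return NULL;

	memcpy(pn->key, key, key_len);
	return pn;
}
```

      With a plain malloc() here, flags would hold indeterminate bytes
      until some later path assigned it, which is the kind of read that
      the report in [1] flags.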
      
      [1]
      BUG: KMSAN: uninit-value in pneigh_fill_info+0x986/0xb30 net/core/neighbour.c:2593
       pneigh_fill_info+0x986/0xb30 net/core/neighbour.c:2593
       pneigh_dump_table net/core/neighbour.c:2715 [inline]
       neigh_dump_info+0x1e3f/0x2c60 net/core/neighbour.c:2832
       netlink_dump+0xaca/0x16a0 net/netlink/af_netlink.c:2265
       __netlink_dump_start+0xd1c/0xee0 net/netlink/af_netlink.c:2370
       netlink_dump_start include/linux/netlink.h:254 [inline]
       rtnetlink_rcv_msg+0x181b/0x18c0 net/core/rtnetlink.c:5534
       netlink_rcv_skb+0x447/0x800 net/netlink/af_netlink.c:2491
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5589
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x1095/0x1360 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x16f3/0x1870 net/netlink/af_netlink.c:1916
       sock_sendmsg_nosec net/socket.c:704 [inline]
       sock_sendmsg net/socket.c:724 [inline]
       sock_write_iter+0x594/0x690 net/socket.c:1057
       call_write_iter include/linux/fs.h:2162 [inline]
       new_sync_write fs/read_write.c:503 [inline]
       vfs_write+0x1318/0x2030 fs/read_write.c:590
       ksys_write+0x28c/0x520 fs/read_write.c:643
       __do_sys_write fs/read_write.c:655 [inline]
       __se_sys_write fs/read_write.c:652 [inline]
       __x64_sys_write+0xdb/0x120 fs/read_write.c:652
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Uninit was created at:
       slab_post_alloc_hook mm/slab.h:524 [inline]
       slab_alloc_node mm/slub.c:3251 [inline]
       slab_alloc mm/slub.c:3259 [inline]
       __kmalloc+0xc3c/0x12d0 mm/slub.c:4437
       kmalloc include/linux/slab.h:595 [inline]
       pneigh_lookup+0x60f/0xd70 net/core/neighbour.c:766
       arp_req_set_public net/ipv4/arp.c:1016 [inline]
       arp_req_set+0x430/0x10a0 net/ipv4/arp.c:1032
       arp_ioctl+0x8d4/0xb60 net/ipv4/arp.c:1232
       inet_ioctl+0x4ef/0x820 net/ipv4/af_inet.c:947
       sock_do_ioctl net/socket.c:1118 [inline]
       sock_ioctl+0xa3f/0x13e0 net/socket.c:1235
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:874 [inline]
       __se_sys_ioctl+0x2df/0x4a0 fs/ioctl.c:860
       __x64_sys_ioctl+0xd8/0x110 fs/ioctl.c:860
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      CPU: 1 PID: 20001 Comm: syz-executor.0 Not tainted 5.16.0-rc3-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 62dd9318 ("[IPV6] NDISC: Set per-entry is_router flag in Proxy NA.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20211206165329.1049835-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · fd31cb0c
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      1) Fix bogus compiler warning in nfnetlink_queue, from Florian Westphal.
      
      2) Don't run conntrack on vrf with !dflt qdisc, from Nicolas Dichtel.
      
      3) Fix nft_pipapo bucket load in AVX2 lookup routine for six 8-bit
         groups, from Stefano Brivio.
      
      4) Break rule evaluation on malformed TCP options.
      
      5) Use socat instead of nc in selftests/netfilter/nft_zones_many.sh,
         also from Florian.
      
      6) Fix KCSAN data-race in conntrack timeout updates, from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
        netfilter: conntrack: annotate data-races around ct->timeout
        selftests: netfilter: switch zone stress to socat
        netfilter: nft_exthdr: break evaluation if setting TCP option fails
        selftests: netfilter: Add correctness test for mac,net set type
        nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groups
        vrf: don't run conntrack on vrf with !dflt qdisc
        netfilter: nfnetlink_queue: silence bogus compiler warning
      ====================
      
      Link: https://lore.kernel.org/r/20211209000847.102598-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · b5b6b6ba
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2021-12-08
      
      Yahui adds re-initialization of Flow Director for VF reset.
      
      Paul restores interrupts when enabling VFs.
      
      Dave re-adds bandwidth check for DCBNL and moves DSCP mode check
      earlier in the function.
      
      Jesse prevents reporting of dropped packets that occur during
      initialization and fixes reporting of statistics which could occur with
      frequent reads.
      
      Michal corrects setting of protocol type for UDP header and fixes lack
      of differentiation when adding filters for tunnels.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ice: safer stats processing
        ice: fix adding different tunnels
        ice: fix choosing UDP header type
        ice: ignore dropped packets during init
        ice: Fix problems with DSCP QoS implementation
        ice: rearm other interrupt cause register after enabling VFs
        ice: fix FDIR init missing when reset VF
      ====================
      
      Link: https://lore.kernel.org/r/20211208211144.2629867-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 6efcdadc
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      bpf 2021-12-08
      
      We've added 12 non-merge commits during the last 22 day(s) which contain
      a total of 29 files changed, 659 insertions(+), 80 deletions(-).
      
      The main changes are:
      
      1) Fix an off-by-two error in packet range markings and also add a batch of
         new tests for coverage of these corner cases, from Maxim Mikityanskiy.
      
      2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh.
      
      3) Fix two functional regressions and a build warning related to BTF kfunc
         for modules, from Kumar Kartikeya Dwivedi.
      
      4) Fix outdated code and docs regarding BPF's migrate_disable() use on non-
         PREEMPT_RT kernels, from Sebastian Andrzej Siewior.
      
      5) Add missing includes in order to be able to detangle cgroup vs bpf header
         dependencies, from Jakub Kicinski.
      
      6) Fix regression in BPF sockmap tests caused by missing detachment of progs
         from sockets when they are removed from the map, from John Fastabend.
      
      7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF
         dispatcher, from Björn Töpel.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        bpf: Add selftests to cover packet access corner cases
        bpf: Fix the off-by-two error in range markings
        treewide: Add missing includes masked by cgroup -> bpf dependency
        tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
        bpf: Fix bpf_check_mod_kfunc_call for built-in modules
        bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
        mips, bpf: Fix reference to non-existing Kconfig symbol
        bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
        Documentation/locking/locktypes: Update migrate_disable() bits.
        bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
        bpf, sockmap: Attach map progs to psock early for feature probes
        bpf, x86: Fix "no previous prototype" warning
      ====================
      
      Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 08 Dec, 2021 19 commits
  3. 07 Dec, 2021 15 commits