1. 15 Aug, 2024 1 commit
  2. 14 Aug, 2024 2 commits
  3. 13 Aug, 2024 3 commits
  4. 12 Aug, 2024 11 commits
  5. 11 Aug, 2024 2 commits
  6. 10 Aug, 2024 11 commits
    • Jakub Kicinski's avatar
      Merge branch 'mlx5-misc-fixes-2024-08-08' · eb755a95
      Jakub Kicinski authored
      Tariq Toukan says:
      
      ====================
      mlx5 misc fixes 2024-08-08
      
      This patchset provides misc bug fixes from the team to the mlx5 core and
      Eth drivers.
      ====================
      
      Link: https://patch.msgid.link/20240808144107.2095424-1-tariqt@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      eb755a95
    • Gal Pressman's avatar
      net/mlx5e: Fix queue stats access to non-existing channels splat · 0b4a4534
      Gal Pressman authored
      The queue stats API queries the queues according to the
      real_num_[tr]x_queues, in case the device is down and channels were not
      yet created, don't try to query their statistics.
      
      To trigger the panic, run this command before the interface is brought
      up:
      ./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml --dump qstats-get --json '{"ifindex": 4}'
      
      BUG: kernel NULL pointer dereference, address: 0000000000000c00
      PGD 0 P4D 0
      Oops: Oops: 0000 [#1] SMP PTI
      CPU: 3 UID: 0 PID: 977 Comm: python3 Not tainted 6.10.0+ #40
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      RIP: 0010:mlx5e_get_queue_stats_rx+0x3c/0xb0 [mlx5_core]
      Code: fc 55 48 63 ee 53 48 89 d3 e8 40 3d 70 e1 85 c0 74 58 4c 89 ef e8 d4 07 04 00 84 c0 75 41 49 8b 84 24 f8 39 00 00 48 8b 04 e8 <48> 8b 90 00 0c 00 00 48 03 90 40 0a 00 00 48 89 53 08 48 8b 90 08
      RSP: 0018:ffff888116be37d0 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff888116be3868 RCX: 0000000000000004
      RDX: ffff88810ada4000 RSI: 0000000000000000 RDI: ffff888109df09c0
      RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000004
      R10: ffff88813461901c R11: ffffffffffffffff R12: ffff888109df0000
      R13: ffff888109df09c0 R14: ffff888116be38d0 R15: 0000000000000000
      FS:  00007f4375d5c740(0000) GS:ffff88852c980000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000c00 CR3: 0000000106ada006 CR4: 0000000000370eb0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       ? __die+0x1f/0x60
       ? page_fault_oops+0x14e/0x3d0
       ? exc_page_fault+0x73/0x130
       ? asm_exc_page_fault+0x22/0x30
       ? mlx5e_get_queue_stats_rx+0x3c/0xb0 [mlx5_core]
       netdev_nl_stats_by_netdev+0x2a6/0x4c0
       ? __rmqueue_pcplist+0x351/0x6f0
       netdev_nl_qstats_get_dumpit+0xc4/0x1b0
       genl_dumpit+0x2d/0x80
       netlink_dump+0x199/0x410
       __netlink_dump_start+0x1aa/0x2c0
       genl_family_rcv_msg_dumpit+0x94/0xf0
       ? __pfx_genl_start+0x10/0x10
       ? __pfx_genl_dumpit+0x10/0x10
       ? __pfx_genl_done+0x10/0x10
       genl_rcv_msg+0x116/0x2b0
       ? __pfx_netdev_nl_qstats_get_dumpit+0x10/0x10
       ? __pfx_genl_rcv_msg+0x10/0x10
       netlink_rcv_skb+0x54/0x100
       genl_rcv+0x24/0x40
       netlink_unicast+0x21a/0x340
       netlink_sendmsg+0x1f4/0x440
       __sys_sendto+0x1b6/0x1c0
       ? do_sock_setsockopt+0xc3/0x180
       ? __sys_setsockopt+0x60/0xb0
       __x64_sys_sendto+0x20/0x30
       do_syscall_64+0x50/0x110
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x7f43757132b0
      Code: c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 1d 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 68 c3 0f 1f 80 00 00 00 00 41 54 48 83 ec 20
      RSP: 002b:00007ffd258da048 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00007ffd258da0f8 RCX: 00007f43757132b0
      RDX: 000000000000001c RSI: 00007f437464b850 RDI: 0000000000000003
      RBP: 00007f4375085de0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007f43751a6147
       </TASK>
      Modules linked in: netconsole xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core zram zsmalloc mlx5_core fuse [last unloaded: netconsole]
      CR2: 0000000000000c00
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:mlx5e_get_queue_stats_rx+0x3c/0xb0 [mlx5_core]
      Code: fc 55 48 63 ee 53 48 89 d3 e8 40 3d 70 e1 85 c0 74 58 4c 89 ef e8 d4 07 04 00 84 c0 75 41 49 8b 84 24 f8 39 00 00 48 8b 04 e8 <48> 8b 90 00 0c 00 00 48 03 90 40 0a 00 00 48 89 53 08 48 8b 90 08
      RSP: 0018:ffff888116be37d0 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff888116be3868 RCX: 0000000000000004
      RDX: ffff88810ada4000 RSI: 0000000000000000 RDI: ffff888109df09c0
      RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000004
      R10: ffff88813461901c R11: ffffffffffffffff R12: ffff888109df0000
      R13: ffff888109df09c0 R14: ffff888116be38d0 R15: 0000000000000000
      FS:  00007f4375d5c740(0000) GS:ffff88852c980000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000c00 CR3: 0000000106ada006 CR4: 0000000000370eb0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: 7b66ae53 ("net/mlx5e: Add per queue netdev-genl stats")
      Signed-off-by: default avatarGal Pressman <gal@nvidia.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Reviewed-by: default avatarJoe Damato <jdamato@fastly.com>
      Link: https://patch.msgid.link/20240808144107.2095424-6-tariqt@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      0b4a4534
    • Cosmin Ratiu's avatar
      net/mlx5e: Correctly report errors for ethtool rx flows · cbc796be
      Cosmin Ratiu authored
      Previously, an ethtool rx flow with no attrs would not be added to the
      NIC as it has no rules to configure the hw with, but it would be
      reported as successful to the caller (return code 0). This is confusing
      for the user as ethtool then reports "Added rule $num", but no rule was
      actually added.
      
      This change corrects that by instead reporting these wrong rules as
      -EINVAL.
      
      Fixes: b29c61da ("net/mlx5e: Ethtool steering flow validation refactoring")
      Signed-off-by: default avatarCosmin Ratiu <cratiu@nvidia.com>
      Reviewed-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      Reviewed-by: default avatarDragos Tatulea <dtatulea@nvidia.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Link: https://patch.msgid.link/20240808144107.2095424-5-tariqt@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      cbc796be
    • Dragos Tatulea's avatar
      net/mlx5e: Take state lock during tx timeout reporter · e6b5afd3
      Dragos Tatulea authored
      mlx5e_safe_reopen_channels() requires the state lock taken. The
      referenced changed in the Fixes tag removed the lock to fix another
      issue. This patch adds it back but at a later point (when calling
      mlx5e_safe_reopen_channels()) to avoid the deadlock referenced in the
      Fixes tag.
      
      Fixes: eab0da38 ("net/mlx5e: Fix possible deadlock on mlx5e_tx_timeout_work")
      Signed-off-by: default avatarDragos Tatulea <dtatulea@nvidia.com>
      Link: https://lore.kernel.org/all/ZplpKq8FKi3vwfxv@gmail.com/T/Reviewed-by: default avatarBreno Leitao <leitao@debian.org>
      Reviewed-by: default avatarMoshe Shemesh <moshe@nvidia.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Link: https://patch.msgid.link/20240808144107.2095424-4-tariqt@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e6b5afd3
    • Dragos Tatulea's avatar
      net/mlx5e: SHAMPO, Increase timeout to improve latency · ab6013a5
      Dragos Tatulea authored
      During latency tests (netperf TCP_RR) a 30% degradation of HW GRO vs SW
      GRO was observed. This is due to SHAMPO triggering timeout filler CQEs
      instead of delivering the CQE for the packet.
      
      Having a short timeout for SHAMPO doesn't bring any benefits as it is
      the driver that does the merging, not the hardware. On the contrary, it
      can have a negative impact: additional filler CQEs are generated due to
      the timeout. As there is no way to disable this timeout, this change
      sets it to the maximum value.
      
      Instead of using the packet_merge.timeout parameter which is also used
      for LRO, set the value directly when filling in the rest of the SHAMPO
      parameters in mlx5e_build_rq_param().
      
      Fixes: 99be5617 ("net/mlx5e: SHAMPO, Re-enable HW-GRO")
      Signed-off-by: default avatarDragos Tatulea <dtatulea@nvidia.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Link: https://patch.msgid.link/20240808144107.2095424-3-tariqt@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ab6013a5
    • Tariq Toukan's avatar
      net/mlx5: SD, Do not query MPIR register if no sd_group · c31fe2b5
      Tariq Toukan authored
      Unconditionally calling the MPIR query on BF separate mode yields the FW
      syndrome below [1]. Do not call it unless admin clearly specified the SD
      group, i.e. expressing the intention of using the multi-PF netdev
      feature.
      
      This fix covers cases not covered in
      commit fca3b479 ("net/mlx5: Do not query MPIR on embedded CPU function").
      
      [1]
      mlx5_cmd_out_err:808:(pid 8267): ACCESS_REG(0x805) op_mod(0x1) failed,
      status bad system state(0x4), syndrome (0x685f19), err(-5)
      
      Fixes: 678eb448 ("net/mlx5: SD, Implement basic query and instantiation")
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Reviewed-by: default avatarGal Pressman <gal@nvidia.com>
      Link: https://patch.msgid.link/20240808144107.2095424-2-tariqt@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c31fe2b5
    • Jakub Kicinski's avatar
      Merge branch 'don-t-take-hw-uso-path-when-packets-can-t-be-checksummed-by-device' · d2438c16
      Jakub Kicinski authored
      Jakub Sitnicki says:
      
      ====================
      Don't take HW USO path when packets can't be checksummed by device
      
      This series addresses a recent regression report from syzbot [1].
      
      After enabling UDP_SEGMENT for egress devices which don't support checksum
      offload [2], we need to tighten down the checks which let packets take the
      HW USO path.
      
      The fix consists of two parts:
      
      1. don't let devices offer USO without checksum offload, and
      2. force software USO fallback in presence of IPv6 extension headers.
      
      [1] https://lore.kernel.org/all/000000000000e1609a061d5330ce@google.com/
      [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=10154dbded6d6a2fecaebdfda206609de0f121a9
      
      v3: https://lore.kernel.org/r/20240807-udp-gso-egress-from-tunnel-v3-0-8828d93c5b45@cloudflare.com
      v2: https://lore.kernel.org/r/20240801-udp-gso-egress-from-tunnel-v2-0-9a2af2f15d8d@cloudflare.com
      v1: https://lore.kernel.org/r/20240725-udp-gso-egress-from-tunnel-v1-0-5e5530ead524@cloudflare.com
      ====================
      
      Link: https://patch.msgid.link/20240808-udp-gso-egress-from-tunnel-v4-0-f5c5b4149ab9@cloudflare.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      d2438c16
    • Jakub Sitnicki's avatar
      selftests/net: Add coverage for UDP GSO with IPv6 extension headers · 1d2c46c1
      Jakub Sitnicki authored
      After enabling UDP GSO for devices not offering checksum offload, we have
      hit a regression where a bad offload warning can be triggered when sending
      a datagram with IPv6 extension headers.
      
      Extend the UDP GSO IPv6 tests to cover this scenario.
      Reviewed-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarJakub Sitnicki <jakub@cloudflare.com>
      Link: https://patch.msgid.link/20240808-udp-gso-egress-from-tunnel-v4-3-f5c5b4149ab9@cloudflare.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      1d2c46c1
    • Jakub Sitnicki's avatar
      udp: Fall back to software USO if IPv6 extension headers are present · 30b03f2a
      Jakub Sitnicki authored
      In commit 10154dbd ("udp: Allow GSO transmit from devices with no
      checksum offload") we have intentionally allowed UDP GSO packets marked
      CHECKSUM_NONE to pass to the GSO stack, so that they can be segmented and
      checksummed by a software fallback when the egress device lacks these
      features.
      
      What was not taken into consideration is that a CHECKSUM_NONE skb can be
      handed over to the GSO stack also when the egress device advertises the
      tx-udp-segmentation / NETIF_F_GSO_UDP_L4 feature.
      
      This will happen when there are IPv6 extension headers present, which we
      check for in __ip6_append_data(). Syzbot has discovered this scenario,
      producing a warning as below:
      
        ip6tnl0: caps=(0x00000006401d7869, 0x00000006401d7869)
        WARNING: CPU: 0 PID: 5112 at net/core/dev.c:3293 skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
        Modules linked in:
        CPU: 0 PID: 5112 Comm: syz-executor391 Not tainted 6.10.0-rc7-syzkaller-01603-g80ab5445 #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
        RIP: 0010:skb_warn_bad_offload+0x166/0x1a0 net/core/dev.c:3291
        [...]
        Call Trace:
         <TASK>
         __skb_gso_segment+0x3be/0x4c0 net/core/gso.c:127
         skb_gso_segment include/net/gso.h:83 [inline]
         validate_xmit_skb+0x585/0x1120 net/core/dev.c:3661
         __dev_queue_xmit+0x17a4/0x3e90 net/core/dev.c:4415
         neigh_output include/net/neighbour.h:542 [inline]
         ip6_finish_output2+0xffa/0x1680 net/ipv6/ip6_output.c:137
         ip6_finish_output+0x41e/0x810 net/ipv6/ip6_output.c:222
         ip6_send_skb+0x112/0x230 net/ipv6/ip6_output.c:1958
         udp_v6_send_skb+0xbf5/0x1870 net/ipv6/udp.c:1292
         udpv6_sendmsg+0x23b3/0x3270 net/ipv6/udp.c:1588
         sock_sendmsg_nosec net/socket.c:730 [inline]
         __sock_sendmsg+0xef/0x270 net/socket.c:745
         ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
         ___sys_sendmsg net/socket.c:2639 [inline]
         __sys_sendmmsg+0x3b2/0x740 net/socket.c:2725
         __do_sys_sendmmsg net/socket.c:2754 [inline]
         __se_sys_sendmmsg net/socket.c:2751 [inline]
         __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2751
         do_syscall_x64 arch/x86/entry/common.c:52 [inline]
         do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
         entry_SYSCALL_64_after_hwframe+0x77/0x7f
         [...]
         </TASK>
      
      We are hitting the bad offload warning because when an egress device is
      capable of handling segmentation offload requested by
      skb_shinfo(skb)->gso_type, the chain of gso_segment callbacks won't produce
      any segment skbs and return NULL. See the skb_gso_ok() branch in
      {__udp,tcp,sctp}_gso_segment helpers.
      
      To fix it, force a fallback to software USO when processing a packet with
      IPv6 extension headers, since we don't know if these can checksummed by
      all devices which offer USO.
      
      Fixes: 10154dbd ("udp: Allow GSO transmit from devices with no checksum offload")
      Reported-by: syzbot+e15b7e15b8a751a91d9a@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/all/000000000000e1609a061d5330ce@google.com/Reviewed-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarJakub Sitnicki <jakub@cloudflare.com>
      Link: https://patch.msgid.link/20240808-udp-gso-egress-from-tunnel-v4-2-f5c5b4149ab9@cloudflare.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      30b03f2a
    • Jakub Sitnicki's avatar
      net: Make USO depend on CSUM offload · 2b2bc3ba
      Jakub Sitnicki authored
      UDP segmentation offload inherently depends on checksum offload. It should
      not be possible to disable checksum offload while leaving USO enabled.
      Enforce this dependency in code.
      
      There is a single tx-udp-segmentation feature flag to indicate support for
      both IPv4/6, hence the devices wishing to support USO must offer checksum
      offload for both IP versions.
      
      Fixes: 10154dbd ("udp: Allow GSO transmit from devices with no checksum offload")
      Suggested-by: default avatarWillem de Bruijn <willemdebruijn.kernel@gmail.com>
      Signed-off-by: default avatarJakub Sitnicki <jakub@cloudflare.com>
      Reviewed-by: default avatarWillem de Bruijn <willemb@google.com>
      Link: https://patch.msgid.link/20240808-udp-gso-egress-from-tunnel-v4-1-f5c5b4149ab9@cloudflare.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2b2bc3ba
    • Eric Dumazet's avatar
      gtp: pull network headers in gtp_dev_xmit() · 3a3be7ff
      Eric Dumazet authored
      syzbot/KMSAN reported use of uninit-value in get_dev_xmit() [1]
      
      We must make sure the IPv4 or Ipv6 header is pulled in skb->head
      before accessing fields in them.
      
      Use pskb_inet_may_pull() to fix this issue.
      
      [1]
      BUG: KMSAN: uninit-value in ipv6_pdp_find drivers/net/gtp.c:220 [inline]
       BUG: KMSAN: uninit-value in gtp_build_skb_ip6 drivers/net/gtp.c:1229 [inline]
       BUG: KMSAN: uninit-value in gtp_dev_xmit+0x1424/0x2540 drivers/net/gtp.c:1281
        ipv6_pdp_find drivers/net/gtp.c:220 [inline]
        gtp_build_skb_ip6 drivers/net/gtp.c:1229 [inline]
        gtp_dev_xmit+0x1424/0x2540 drivers/net/gtp.c:1281
        __netdev_start_xmit include/linux/netdevice.h:4913 [inline]
        netdev_start_xmit include/linux/netdevice.h:4922 [inline]
        xmit_one net/core/dev.c:3580 [inline]
        dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3596
        __dev_queue_xmit+0x358c/0x5610 net/core/dev.c:4423
        dev_queue_xmit include/linux/netdevice.h:3105 [inline]
        packet_xmit+0x9c/0x6c0 net/packet/af_packet.c:276
        packet_snd net/packet/af_packet.c:3145 [inline]
        packet_sendmsg+0x90e3/0xa3a0 net/packet/af_packet.c:3177
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg+0x30f/0x380 net/socket.c:745
        __sys_sendto+0x685/0x830 net/socket.c:2204
        __do_sys_sendto net/socket.c:2216 [inline]
        __se_sys_sendto net/socket.c:2212 [inline]
        __x64_sys_sendto+0x125/0x1d0 net/socket.c:2212
        x64_sys_call+0x3799/0x3c10 arch/x86/include/generated/asm/syscalls_64.h:45
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Uninit was created at:
        slab_post_alloc_hook mm/slub.c:3994 [inline]
        slab_alloc_node mm/slub.c:4037 [inline]
        kmem_cache_alloc_node_noprof+0x6bf/0xb80 mm/slub.c:4080
        kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:583
        __alloc_skb+0x363/0x7b0 net/core/skbuff.c:674
        alloc_skb include/linux/skbuff.h:1320 [inline]
        alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6526
        sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2815
        packet_alloc_skb net/packet/af_packet.c:2994 [inline]
        packet_snd net/packet/af_packet.c:3088 [inline]
        packet_sendmsg+0x749c/0xa3a0 net/packet/af_packet.c:3177
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg+0x30f/0x380 net/socket.c:745
        __sys_sendto+0x685/0x830 net/socket.c:2204
        __do_sys_sendto net/socket.c:2216 [inline]
        __se_sys_sendto net/socket.c:2212 [inline]
        __x64_sys_sendto+0x125/0x1d0 net/socket.c:2212
        x64_sys_call+0x3799/0x3c10 arch/x86/include/generated/asm/syscalls_64.h:45
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      CPU: 0 UID: 0 PID: 7115 Comm: syz.1.515 Not tainted 6.11.0-rc1-syzkaller-00043-g94ede2a3 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
      
      Fixes: 999cb275 ("gtp: add IPv6 support")
      Fixes: 459aa660 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Harald Welte <laforge@gnumonks.org>
      Reviewed-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Link: https://patch.msgid.link/20240808132455.3413916-1-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      3a3be7ff
  7. 09 Aug, 2024 5 commits
    • Foster Snowhill's avatar
      usbnet: ipheth: fix carrier detection in modes 1 and 4 · 67927a1b
      Foster Snowhill authored
      Apart from the standard "configurations", "interfaces" and "alternate
      interface settings" in USB, iOS devices also have a notion of
      "modes". In different modes, the device exposes a different set of
      available configurations.
      
      Depending on the iOS version, and depending on the current mode, the
      length and contents of the carrier state control message differs:
      
      * 1 byte (seen on iOS 4.2.1, 8.4):
          * 03: carrier off (mode 0)
          * 04: carrier on (mode 0)
      * 3 bytes (seen on iOS 10.3.4, 15.7.6):
          * 03 03 03: carrier off (mode 0)
          * 04 04 03: carrier on (mode 0)
      * 4 bytes (seen on iOS 16.5, 17.6):
          * 03 03 03 00: carrier off (mode 0)
          * 04 03 03 00: carrier off (mode 1)
          * 06 03 03 00: carrier off (mode 4)
          * 04 04 03 04: carrier on (mode 0 and 1)
          * 06 04 03 04: carrier on (mode 4)
      
      Before this change, the driver always used the first byte of the
      response to determine carrier state.
      
      From this larger sample, the first byte seems to indicate the number of
      available USB configurations in the current mode (with the exception of
      the default mode 0), and in some cases (namely mode 1 and 4) does not
      correlate with the carrier state.
      
      Previous logic erroneously counted `04 03 03 00` as "carrier on" and
      `06 04 03 04` as "carrier off" on iOS versions that support mode 1 and
      mode 4 respectively.
      
      Only modes 0, 1 and 4 expose the USB Ethernet interfaces necessary for
      the ipheth driver.
      
      Check the second byte of the control message where possible, and fall
      back to checking the first byte on older iOS versions.
      Signed-off-by: default avatarFoster Snowhill <forst@pen.gy>
      Tested-by: default avatarGeorgi Valkov <gvalkov@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      67927a1b
    • Foster Snowhill's avatar
      usbnet: ipheth: do not stop RX on failing RX callback · 74efed51
      Foster Snowhill authored
      RX callbacks can fail for multiple reasons:
      
      * Payload too short
      * Payload formatted incorrecly (e.g. bad NCM framing)
      * Lack of memory
      
      None of these should cause the driver to seize up.
      
      Make such failures non-critical and continue processing further
      incoming URBs.
      Signed-off-by: default avatarFoster Snowhill <forst@pen.gy>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      74efed51
    • Foster Snowhill's avatar
      usbnet: ipheth: drop RX URBs with no payload · 94d7eeb6
      Foster Snowhill authored
      On iPhone 15 Pro Max one can observe periodic URBs with no payload
      on the "bulk in" (RX) endpoint. These don't seem to do anything
      meaningful. Reproduced on iOS 17.5.1 and 17.6.
      
      This behaviour isn't observed on iPhone 11 on the same iOS version. The
      nature of these zero-length URBs is so far unknown.
      
      Drop RX URBs with no payload.
      Signed-off-by: default avatarFoster Snowhill <forst@pen.gy>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      94d7eeb6
    • Foster Snowhill's avatar
      usbnet: ipheth: remove extraneous rx URB length check · 655b46d7
      Foster Snowhill authored
      Rx URB length was already checked in ipheth_rcvbulk_callback_legacy()
      and ipheth_rcvbulk_callback_ncm(), depending on the current mode.
      The check in ipheth_rcvbulk_callback() was thus mostly a duplicate.
      
      The only place in ipheth_rcvbulk_callback() where we care about the URB
      length is for the initial control frame. These frames are always 4 bytes
      long. This has been checked as far back as iOS 4.2.1 on iPhone 3G.
      
      Remove the extraneous URB length check. For control frames, check for
      the specific 4-byte length instead.
      Signed-off-by: default avatarFoster Snowhill <forst@pen.gy>
      Tested-by: default avatarGeorgi Valkov <gvalkov@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      655b46d7
    • Oliver Neukum's avatar
      usbnet: ipheth: race between ipheth_close and error handling · e5876b08
      Oliver Neukum authored
      ipheth_sndbulk_callback() can submit carrier_work
      as a part of its error handling. That means that
      the driver must make sure that the work is cancelled
      after it has made sure that no more URB can terminate
      with an error condition.
      
      Hence the order of actions in ipheth_close() needs
      to be inverted.
      Signed-off-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarFoster Snowhill <forst@pen.gy>
      Tested-by: default avatarGeorgi Valkov <gvalkov@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e5876b08
  8. 08 Aug, 2024 5 commits
    • Linus Torvalds's avatar
      Merge tag 'net-6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · ee9a43b7
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bluetooth.
      
        Current release - regressions:
      
         - eth: bnxt_en: fix memory out-of-bounds in bnxt_fill_hw_rss_tbl() on
           older chips
      
        Current release - new code bugs:
      
         - ethtool: fix off-by-one error / kdoc contradicting the code for max
           RSS context IDs
      
         - Bluetooth: hci_qca:
            - QCA6390: fix support on non-DT platforms
            - QCA6390: don't call pwrseq_power_off() twice
            - fix a NULL-pointer derefence at shutdown
      
         - eth: ice: fix incorrect assigns of FEC counters
      
        Previous releases - regressions:
      
         - mptcp: fix handling endpoints with both 'signal' and 'subflow'
           flags set
      
         - virtio-net: fix changing ring count when vq IRQ coalescing not
           supported
      
         - eth: gve: fix use of netif_carrier_ok() during reconfig / reset
      
        Previous releases - always broken:
      
         - eth: idpf: fix bugs in queue re-allocation on reconfig / reset
      
         - ethtool: fix context creation with no parameters
      
        Misc:
      
         - linkwatch: use system_unbound_wq to ease RTNL contention"
      
      * tag 'net-6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (41 commits)
        net: dsa: microchip: disable EEE for KSZ8567/KSZ9567/KSZ9896/KSZ9897.
        ethtool: Fix context creation with no parameters
        net: ethtool: fix off-by-one error in max RSS context IDs
        net: pse-pd: tps23881: include missing bitfield.h header
        net: fec: Stop PPS on driver remove
        net: bcmgenet: Properly overlay PHY and MAC Wake-on-LAN capabilities
        l2tp: fix lockdep splat
        net: stmmac: dwmac4: fix PCS duplex mode decode
        idpf: fix UAFs when destroying the queues
        idpf: fix memleak in vport interrupt configuration
        idpf: fix memory leaks and crashes while performing a soft reset
        bnxt_en : Fix memory out-of-bounds in bnxt_fill_hw_rss_tbl()
        net: dsa: bcm_sf2: Fix a possible memory leak in bcm_sf2_mdio_register()
        net/smc: add the max value of fallback reason count
        Bluetooth: hci_sync: avoid dup filtering when passive scanning with adv monitor
        Bluetooth: l2cap: always unlock channel in l2cap_conless_channel()
        Bluetooth: hci_qca: fix a NULL-pointer derefence at shutdown
        Bluetooth: hci_qca: fix QCA6390 support on non-DT platforms
        Bluetooth: hci_qca: don't call pwrseq_power_off() twice for QCA6390
        ice: Fix incorrect assigns of FEC counts
        ...
      ee9a43b7
    • Linus Torvalds's avatar
      Merge tag 'trace-v6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 9466b6ae
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Have reading of event format files test if the metadata still exists.
      
         When a event is freed, a flag (EVENT_FILE_FL_FREED) in the metadata
         is set to state that it is to prevent any new references to it from
         happening while waiting for existing references to close. When the
         last reference closes, the metadata is freed. But the "format" was
         missing a check to this flag (along with some other files) that
         allowed new references to happen, and a use-after-free bug to occur.
      
       - Have the trace event meta data use the refcount infrastructure
         instead of relying on its own atomic counters.
      
       - Have tracefs inodes use alloc_inode_sb() for allocation instead of
         using kmem_cache_alloc() directly.
      
       - Have eventfs_create_dir() return an ERR_PTR instead of NULL as the
         callers expect a real object or an ERR_PTR.
      
       - Have release_ei() use call_srcu() and not call_rcu() as all the
         protection is on SRCU and not RCU.
      
       - Fix ftrace_graph_ret_addr() to use the task passed in and not
         current.
      
       - Fix overflow bug in get_free_elt() where the counter can overflow the
         integer and cause an infinite loop.
      
       - Remove unused function ring_buffer_nr_pages()
      
       - Have tracefs freeing use the inode RCU infrastructure instead of
         creating its own.
      
         When the kernel had randomize structure fields enabled, the rcu field
         of the tracefs_inode was overlapping the rcu field of the inode
         structure, and corrupting it. Instead, use the destroy_inode()
         callback to do the initial cleanup of the code, and then have
         free_inode() free it.
      
      * tag 'trace-v6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracefs: Use generic inode RCU for synchronizing freeing
        ring-buffer: Remove unused function ring_buffer_nr_pages()
        tracing: Fix overflow in get_free_elt()
        function_graph: Fix the ret_stack used by ftrace_graph_ret_addr()
        eventfs: Use SRCU for freeing eventfs_inodes
        eventfs: Don't return NULL in eventfs_create_dir()
        tracefs: Fix inode allocation
        tracing: Use refcount for trace_event_file reference counter
        tracing: Have format file honor EVENT_FILE_FL_FREED
      9466b6ae
    • Linus Torvalds's avatar
      Merge tag 'bcachefs-2024-08-08' of git://evilpiepirate.org/bcachefs · b3f5620f
      Linus Torvalds authored
      Pull bcachefs fixes from Kent Overstreet:
       "Assorted little stuff:
      
         - lockdep fixup for lockdep_set_notrack_class()
      
         - we can now remove a device when using erasure coding without
           deadlocking, though we still hit other issues
      
         - the 'allocator stuck' timeout is now configurable, and messages are
           ratelimited. The default timeout has been increased from 10 seconds
           to 30"
      
      * tag 'bcachefs-2024-08-08' of git://evilpiepirate.org/bcachefs:
        bcachefs: Use bch2_wait_on_allocator() in btree node alloc path
        bcachefs: Make allocator stuck timeout configurable, ratelimit messages
        bcachefs: Add missing path_traverse() to btree_iter_next_node()
        bcachefs: ec should not allocate from ro devs
        bcachefs: Improved allocator debugging for ec
        bcachefs: Add missing bch2_trans_begin() call
        bcachefs: Add a comment for bucket helper types
        bcachefs: Don't rely on implicit unsigned -> signed integer conversion
        lockdep: Fix lockdep_set_notrack_class() for CONFIG_LOCK_STAT
        bcachefs: Fix double free of ca->buckets_nouse
      b3f5620f
    • Linus Torvalds's avatar
      module: warn about excessively long module waits · cb5b81bc
      Linus Torvalds authored
      Russell King reported that the arm cbc(aes) crypto module hangs when
      loaded, and Herbert Xu bisected it to commit 9b9879fc ("modules:
      catch concurrent module loads, treat them as idempotent"), and noted:
      
       "So what's happening here is that the first modprobe tries to load a
        fallback CBC implementation, in doing so it triggers a load of the
        exact same module due to module aliases.
      
        IOW we're loading aes-arm-bs which provides cbc(aes). However, this
        needs a fallback of cbc(aes) to operate, which is made out of the
        generic cbc module + any implementation of aes, or ecb(aes). The
        latter happens to also be provided by aes-arm-cb so that's why it
        tries to load the same module again"
      
      So loading the aes-arm-bs module ends up wanting to recursively load
      itself, and the recursive load then ends up waiting for the original
      module load to complete.
      
      This is a regression, in that it used to be that we just tried to load
      the module multiple times, and then as we went on to install it the
      second time we would instead just error out because the module name
      already existed.
      
      That is actually also exactly what the original "catch concurrent loads"
      patch did in commit 9828ed3f ("module: error out early on concurrent
      load of the same module file"), but it turns out that it ends up being
      racy, in that erroring out before the module has been fully initialized
      will cause failures in dependent module loading.
      
      See commit ac2263b5 (which was the revert of that "error out early")
      commit for details about why erroring out before the module has been
      initialized is actually fundamentally racy.
      
      Now, for the actual recursive module load (as opposed to just
      concurrently loading the same module twice), the race is not an issue.
      
      At the same time it's hard for the kernel to see that this is recursion,
      because the module load is always done from a usermode helper, so the
      recursion is not some simple callchain within the kernel.
      
      End result: this is not the real fix, but this at least adds a warning
      for the situation (admittedly much too late for all the debugging pain
      that Russell and Herbert went through) and if we can come to a
      resolution on how to detect the recursion properly, this re-organizes
      the code to make that easier.
      
      Link: https://lore.kernel.org/all/ZrFHLqvFqhzykuYw@shell.armlinux.org.uk/Reported-by: default avatarRussell King <linux@armlinux.org.uk>
      Debugged-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cb5b81bc
    • Linus Torvalds's avatar
      Merge tag 'loongarch-fixes-6.11-1' of... · cf6d429e
      Linus Torvalds authored
      Merge tag 'loongarch-fixes-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
      
      Pull LoongArch fixes from Huacai Chen:
       "Enable general EFI poweroff method to make poweroff usable on
        hardwares which lack ACPI S5, use accessors to page table entries
        instead of direct dereference to avoid potential problems, and two
        trivial kvm cleanups"
      
      * tag 'loongarch-fixes-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
        LoongArch: KVM: Remove undefined a6 argument comment for kvm_hypercall()
        LoongArch: KVM: Remove unnecessary definition of KVM_PRIVATE_MEM_SLOTS
        LoongArch: Use accessors to page table entries instead of direct dereference
        LoongArch: Enable general EFI poweroff method
      cf6d429e