1. 16 Jun, 2023 24 commits
    • net/mlx5e: Drop XFRM state lock when modifying flow steering · c75b9425
      Leon Romanovsky authored
      An XFRM state that has been changed to XFRM_STATE_EXPIRED doesn't
      need to hold the lock while modifying flow steering rules to drop traffic.
      
      Such a state can only be deleted, so the mlx5e_ipsec_handle_tx_limit()
      work will be canceled anyway and won't run in parallel.
      
      Fixes: b2f7b01d ("net/mlx5e: Simulate missing IPsec TX limits hardware functionality")
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix ESN update kernel panic · fef06678
      Patrisious Haddad authored
      Previously during mlx5e_ipsec_handle_event the driver tried to execute
      an operation that could sleep, while holding a spinlock, which caused
      the kernel panic mentioned below.
      
      Move the function call that can sleep outside of the spinlock context.
      
       Call Trace:
       <TASK>
       dump_stack_lvl+0x49/0x6c
       __schedule_bug.cold+0x42/0x4e
       schedule_debug.constprop.0+0xe0/0x118
       __schedule+0x59/0x58a
       ? __mod_timer+0x2a1/0x3ef
       schedule+0x5e/0xd4
       schedule_timeout+0x99/0x164
       ? __pfx_process_timeout+0x10/0x10
       __wait_for_common+0x90/0x1da
       ? __pfx_schedule_timeout+0x10/0x10
       wait_func+0x34/0x142 [mlx5_core]
       mlx5_cmd_invoke+0x1f3/0x313 [mlx5_core]
       cmd_exec+0x1fe/0x325 [mlx5_core]
       mlx5_cmd_do+0x22/0x50 [mlx5_core]
       mlx5_cmd_exec+0x1c/0x40 [mlx5_core]
       mlx5_modify_ipsec_obj+0xb2/0x17f [mlx5_core]
       mlx5e_ipsec_update_esn_state+0x69/0xf0 [mlx5_core]
       ? wake_affine+0x62/0x1f8
       mlx5e_ipsec_handle_event+0xb1/0xc0 [mlx5_core]
       process_one_work+0x1e2/0x3e6
       ? __pfx_worker_thread+0x10/0x10
       worker_thread+0x54/0x3ad
       ? __pfx_worker_thread+0x10/0x10
       kthread+0xda/0x101
       ? __pfx_kthread+0x10/0x10
       ret_from_fork+0x29/0x37
       </TASK>
       BUG: workqueue leaked lock or atomic: kworker/u256:4/0x7fffffff/189754#012     last function: mlx5e_ipsec_handle_event [mlx5_core]
       CPU: 66 PID: 189754 Comm: kworker/u256:4 Kdump: loaded Tainted: G        W          6.2.0-2596.20230309201517_5.el8uek.rc1.x86_64 #2
       Hardware name: Oracle Corporation ORACLE SERVER X9-2/ASMMBX9-2, BIOS 61070300 08/17/2022
       Workqueue: mlx5e_ipsec: eth%d mlx5e_ipsec_handle_event [mlx5_core]
       Call Trace:
       <TASK>
       dump_stack_lvl+0x49/0x6c
       process_one_work.cold+0x2b/0x3c
       ? __pfx_worker_thread+0x10/0x10
       worker_thread+0x54/0x3ad
       ? __pfx_worker_thread+0x10/0x10
       kthread+0xda/0x101
       ? __pfx_kthread+0x10/0x10
       ret_from_fork+0x29/0x37
       </TASK>
       BUG: scheduling while atomic: kworker/u256:4/189754/0x00000000
      
      Fixes: cee137a6 ("net/mlx5e: Handle ESN update events")
      Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
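
The ESN fix above follows a common pattern: touch the shared state while holding the lock, then make the call that may sleep only after the lock is released. The following is a rough userspace analogy in Python, not the driver code. `EsnState`, `slow_device_command` and `handle_event` are hypothetical names, and `threading.Lock` only stands in for the kernel spinlock (sleeping under a Python lock is legal, which is not true of a spinlock).

```python
import threading
import time


class EsnState:
    """Toy stand-in for per-SA state guarded by a lock (hypothetical)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.esn_msb = 0
        self.overlap = False


def slow_device_command(esn_msb, overlap):
    """Stands in for a device command that may block while waiting."""
    time.sleep(0.01)
    return (esn_msb, overlap)


def handle_event(state):
    # Update and snapshot the shared fields while holding the lock.
    with state.lock:
        state.esn_msb += 1
        state.overlap = not state.overlap
        snapshot = (state.esn_msb, state.overlap)
    # The lock is released here; only now issue the call that may block.
    return slow_device_command(*snapshot)
```

Working from a snapshot means the blocking call never observes half-updated fields, at the cost of possibly acting on values that changed after the lock was dropped. Whether that is acceptable depends on the surrounding code.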
    • net/mlx5e: Don't delay release of hardware objects · cf5bb023
      Leon Romanovsky authored
      XFRM core provides two callbacks to release resources, one is .xdo_dev_policy_delete()
      and another is .xdo_dev_policy_free(). This separation allows delayed release so
      "ip xfrm policy free" commands won't starve. Unfortunately, mlx5 command interface
      can't run in .xdo_dev_policy_free() callbacks as the latter runs in ATOMIC context.
      
       BUG: scheduling while atomic: swapper/7/0/0x00000100
       Modules linked in: act_mirred act_tunnel_key cls_flower sch_ingress vxlan mlx5_vdpa vringh vhost_iotlb vdpa rpcrdma rdma_ucm ib_iser libiscsi ib_umad scsi_transport_iscsi rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay mlx5_core zram zsmalloc fuse
       CPU: 7 PID: 0 Comm: swapper/7 Not tainted 6.3.0+ #1
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       Call Trace:
        <IRQ>
        dump_stack_lvl+0x33/0x50
        __schedule_bug+0x4e/0x60
        __schedule+0x5d5/0x780
        ? __mod_timer+0x286/0x3d0
        schedule+0x50/0x90
        schedule_timeout+0x7c/0xf0
        ? __bpf_trace_tick_stop+0x10/0x10
        __wait_for_common+0x88/0x190
        ? usleep_range_state+0x90/0x90
        cmd_exec+0x42e/0xb40 [mlx5_core]
        mlx5_cmd_do+0x1e/0x40 [mlx5_core]
        mlx5_cmd_exec+0x18/0x30 [mlx5_core]
        mlx5_cmd_delete_fte+0xa8/0xd0 [mlx5_core]
        del_hw_fte+0x60/0x120 [mlx5_core]
        mlx5_del_flow_rules+0xec/0x270 [mlx5_core]
        ? default_send_IPI_single_phys+0x26/0x30
        mlx5e_accel_ipsec_fs_del_pol+0x1a/0x60 [mlx5_core]
        mlx5e_xfrm_free_policy+0x15/0x20 [mlx5_core]
        xfrm_policy_destroy+0x5a/0xb0
        xfrm4_dst_destroy+0x7b/0x100
        dst_destroy+0x37/0x120
        rcu_core+0x2d6/0x540
        __do_softirq+0xcd/0x273
        irq_exit_rcu+0x82/0xb0
        sysvec_apic_timer_interrupt+0x72/0x90
        </IRQ>
        <TASK>
        asm_sysvec_apic_timer_interrupt+0x16/0x20
       RIP: 0010:default_idle+0x13/0x20
       Code: c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 8b 05 7a 4d ee 00 85 c0 7e 07 0f 00 2d 2f 98 2e 00 fb f4 <fa> c3 66 66 2e 0f 1f 84 00 00 00 00 00 65 48 8b 04 25 40 b4 02 00
       RSP: 0018:ffff888100843ee0 EFLAGS: 00000242
       RAX: 0000000000000001 RBX: ffff888100812b00 RCX: 4000000000000000
       RDX: 0000000000000001 RSI: 0000000000000083 RDI: 000000000002d2ec
       RBP: 0000000000000007 R08: 00000021daeded59 R09: 0000000000000001
       R10: 0000000000000000 R11: 000000000000000f R12: 0000000000000000
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
        default_idle_call+0x30/0xb0
        do_idle+0x1c1/0x1d0
        cpu_startup_entry+0x19/0x20
        start_secondary+0xfe/0x120
        secondary_startup_64_no_verify+0xf3/0xfb
        </TASK>
       bad: scheduling from the idle thread!
      
      Fixes: a5b8ca94 ("net/mlx5e: Add XFRM policy offload logic")
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Free IRQ rmap and notifier on kernel shutdown · 314ded53
      Saeed Mahameed authored
      The kernel IRQ system needs the irq affinity notifier to be clear
      before attempting to free the irq, see WARN_ON log below.
      
      On a normal driver unload we don't have this issue since we do the
      complete cleanup of the irq resources.
      
      To fix this, put the important resources cleanup in a helper function
      and use it in both normal driver unload and shutdown flows.
      
      [ 4497.498434] ------------[ cut here ]------------
      [ 4497.498726] WARNING: CPU: 0 PID: 9 at kernel/irq/manage.c:2034 free_irq+0x295/0x340
      [ 4497.499193] Modules linked in:
      [ 4497.499386] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.4.0-rc4+ #10
      [ 4497.499876] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
      [ 4497.500518] Workqueue: events do_poweroff
      [ 4497.500849] RIP: 0010:free_irq+0x295/0x340
      [ 4497.501132] Code: 85 c0 0f 84 1d ff ff ff 48 89 ef ff d0 0f 1f 00 e9 10 ff ff ff 0f 0b e9 72 ff ff ff 49 8d 7f 28 ff d0 0f 1f 00 e9 df fd ff ff <0f> 0b 48 c7 80 c0 008
      [ 4497.502269] RSP: 0018:ffffc90000053da0 EFLAGS: 00010282
      [ 4497.502589] RAX: ffff888100949600 RBX: ffff88810330b948 RCX: 0000000000000000
      [ 4497.503035] RDX: ffff888100949600 RSI: ffff888100400490 RDI: 0000000000000023
      [ 4497.503472] RBP: ffff88810330c7e0 R08: ffff8881004005d0 R09: ffffffff8273a260
      [ 4497.503923] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881009ae000
      [ 4497.504359] R13: ffff8881009ae148 R14: 0000000000000000 R15: ffff888100949600
      [ 4497.504804] FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
      [ 4497.505302] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 4497.505671] CR2: 00007fce98806298 CR3: 000000000262e005 CR4: 0000000000370ef0
      [ 4497.506104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 4497.506540] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 4497.507002] Call Trace:
      [ 4497.507158]  <TASK>
      [ 4497.507299]  ? free_irq+0x295/0x340
      [ 4497.507522]  ? __warn+0x7c/0x130
      [ 4497.507740]  ? free_irq+0x295/0x340
      [ 4497.507963]  ? report_bug+0x171/0x1a0
      [ 4497.508197]  ? handle_bug+0x3c/0x70
      [ 4497.508417]  ? exc_invalid_op+0x17/0x70
      [ 4497.508662]  ? asm_exc_invalid_op+0x1a/0x20
      [ 4497.508926]  ? free_irq+0x295/0x340
      [ 4497.509146]  mlx5_irq_pool_free_irqs+0x48/0x90
      [ 4497.509421]  mlx5_irq_table_free_irqs+0x38/0x50
      [ 4497.509714]  mlx5_core_eq_free_irqs+0x27/0x40
      [ 4497.509984]  shutdown+0x7b/0x100
      [ 4497.510184]  pci_device_shutdown+0x30/0x60
      [ 4497.510440]  device_shutdown+0x14d/0x240
      [ 4497.510698]  kernel_power_off+0x30/0x70
      [ 4497.510938]  process_one_work+0x1e6/0x3e0
      [ 4497.511183]  worker_thread+0x49/0x3b0
      [ 4497.511407]  ? __pfx_worker_thread+0x10/0x10
      [ 4497.511679]  kthread+0xe0/0x110
      [ 4497.511879]  ? __pfx_kthread+0x10/0x10
      [ 4497.512114]  ret_from_fork+0x29/0x50
      [ 4497.512342]  </TASK>
      
      Fixes: 9c2d0801 ("net/mlx5: Free irqs only on shutdown callback")
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Reviewed-by: Shay Drory <shayd@nvidia.com>
    • net/mlx5: DR, Fix wrong action data allocation in decap action · ef4c5afc
      Yevgeny Kliteynik authored
      When a TUNNEL_L3_TO_L2 decap action was created, a pointer to a local
      variable was passed as its HW action data, resulting in an attempt to
      free an invalid address:
      
        BUG: KASAN: invalid-free in mlx5dr_action_destroy+0x318/0x410 [mlx5_core]
      
      Fixes: 4781df92 ("net/mlx5: DR, Move STEv0 modify header logic")
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Reviewed-by: Alex Vesker <valex@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: DR, Support SW created encap actions for FW table · 87cd0649
      Yevgeny Kliteynik authored
      In some cases, steering might need to use SW-created action in
      FW table, which results in wrong packet reformat being used:
      
        mlx5_core 0000:81:00.1: mlx5_cmd_check:756:(pid 1154):
            SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed,
            status bad resource(0x5), syndrome (0xf2ff71)
      
      This patch adds support for usage of SW-created packet reformat (encap)
      actions in FW tables, and adds clear error flow for attempt to use
      SW-created modify header on FW tables.
      
      Fixes: 6a48faee ("net/mlx5: Add direct rule fs_cmd implementation")
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: TC, Cleanup ct resources for nic flow · fb7be476
      Chris Mi authored
      The cited commit removes special handling of CT action. But it
      removes too much. Pre ct/ct_nat tables and some other resources
      are not destroyed due to the cited commit.
      
      Fix it by adding it back.
      
      Fixes: 08fe94ec ("net/mlx5e: TC, Remove special handling of CT action")
      Signed-off-by: Chris Mi <cmi@nvidia.com>
      Reviewed-by: Paul Blakey <paulb@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: TC, Add null pointer check for hardware miss support · b100573a
      Chris Mi authored
      The cited commits add hardware miss support to tc action. But if
      the rules can't be offloaded, the pointers are null and system
      will panic when accessing them.
      
      Fix it by checking null pointer.
      
      Fixes: 08fe94ec ("net/mlx5e: TC, Remove special handling of CT action")
      Fixes: 67027828 ("net/mlx5e: TC, Set CT miss to the specific ct action instance")
      Signed-off-by: Chris Mi <cmi@nvidia.com>
      Reviewed-by: Paul Blakey <paulb@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix driver load with single msix vector · 0ab999d4
      Eli Cohen authored
      When a PCI device has just one msix vector available, we want to share
      this vector between async and completion events. Current code fails to
      do that assuming it will always have at least one dedicated vector for
      completion events. Fix this by detecting when the pool contains just a
      single vector.
      
      Fixes: 3354822c ("net/mlx5: Use dynamic msix vectors allocation")
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Reviewed-by: Shay Drory <shayd@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: xsk: Set napi_id to support busy polling on XSK RQ · 62a522d3
      Maxim Mikityanskiy authored
      The cited commit missed setting napi_id on XSK RQs; it only affected
      regular RQs. Add the missing part to support socket busy polling on XSK
      RQs.
      
      Fixes: a2740f52 ("net/mlx5e: xsk: Set napi_id to support busy polling")
      Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: XDP, Allow growing tail for XDP multi buffer · 4e7401fc
      Maxim Mikityanskiy authored
      The cited commits missed passing frag_size to __xdp_rxq_info_reg, which
      is required by bpf_xdp_adjust_tail to support growing the tail pointer
      in fragmented packets. Pass the missing parameter when the current RQ
      mode allows XDP multi buffer.
      
      Fixes: ea5d49bd ("net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ")
      Fixes: 9cb9482e ("net/mlx5e: Use fragments of the same size in non-linear legacy RQ with XDP")
      Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
      Cc: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • Merge branch 'check-if-fips-mode-is-enabled-when-running-selftests' · d4e06728
      Jakub Kicinski authored
      Magali Lemes says:
      
      ====================
      Check if FIPS mode is enabled when running selftests
      
      Some test cases from net/tls, net/fcnal-test and net/vrf-xfrm-tests
      that rely on cryptographic functions to work and use non-compliant FIPS
      algorithms fail in FIPS mode.
      
      In order to allow these tests to pass in a wider set of kernels,
       - for net/tls, skip the test variants that use the ChaCha20-Poly1305
      and SM4 algorithms, when FIPS mode is enabled;
       - for net/fcnal-test, skip the MD5 tests, when FIPS mode is enabled;
       - for net/vrf-xfrm-tests, replace the algorithms that are not
      FIPS-compliant with compliant ones.
      
      v1: https://lore.kernel.org/netdev/20230607174302.19542-1-magali.lemes@canonical.com/
      v2: https://lore.kernel.org/netdev/20230609164324.497813-1-magali.lemes@canonical.com/
      v3: https://lore.kernel.org/netdev/20230612125107.73795-1-magali.lemes@canonical.com/
      ====================
      
      Link: https://lore.kernel.org/r/20230613123222.631897-1-magali.lemes@canonical.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: fcnal-test: check if FIPS mode is enabled · d7a2fc14
      Magali Lemes authored
      There are some MD5 tests which fail when the kernel is in FIPS mode,
      since MD5 is not FIPS compliant. Add a check and only run those tests
      if FIPS mode is not enabled.
      
      Fixes: f0bee1eb ("fcnal-test: Add TCP MD5 tests")
      Fixes: 5cad8bce ("fcnal-test: Add TCP MD5 tests for VRF")
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Magali Lemes <magali.lemes@canonical.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
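
For readers who want to see what a FIPS-mode gate of this kind looks like, here is a minimal sketch in Python. It is not the code from the selftests. It assumes only that the kernel exposes the FIPS state in `/proc/sys/crypto/fips_enabled` (`1` when enabled); `fips_enabled` and `select_tests` are hypothetical helper names.

```python
FIPS_PATH = "/proc/sys/crypto/fips_enabled"


def fips_enabled(path=FIPS_PATH):
    """Return True if the kernel reports FIPS mode, False otherwise."""
    try:
        with open(path) as f:
            return f.read().strip() == "1"
    except OSError:
        # The file is absent on kernels built without FIPS support;
        # this sketch treats that as "not in FIPS mode".
        return False


def select_tests(all_tests, non_fips_tests, path=FIPS_PATH):
    """Drop tests that rely on non-FIPS algorithms when FIPS mode is on."""
    if not fips_enabled(path):
        return list(all_tests)
    return [t for t in all_tests if t not in non_fips_tests]
```

Passing the path as a parameter keeps the check testable on machines where FIPS mode cannot be toggled.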
    • selftests: net: vrf-xfrm-tests: change authentication and encryption algos · cb43c60e
      Magali Lemes authored
      The vrf-xfrm-tests tests use the hmac(md5) and cbc(des3_ede)
      algorithms for performing authentication and encryption, respectively.
      This causes the tests to fail when fips=1 is set, since these algorithms
      are not allowed in FIPS mode. Therefore, switch from hmac(md5) and
      cbc(des3_ede) to hmac(sha1) and cbc(aes), which are FIPS compliant.
      
      Fixes: 3f251d74 ("selftests: Add tests for vrf and xfrms")
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Magali Lemes <magali.lemes@canonical.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: tls: check if FIPS mode is enabled · d113c395
      Magali Lemes authored
      TLS selftests use the ChaCha20-Poly1305 and SM4 algorithms, which are not
      FIPS compliant. When fips=1, this set of tests fails. Add a check and only
      run these tests if not in FIPS mode.
      
      Fixes: 4f336e88 ("selftests/tls: add CHACHA20-POLY1305 to tls selftests")
      Fixes: e506342a ("selftests/tls: add SM4 GCM/CCM to tls selftests")
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Magali Lemes <magali.lemes@canonical.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests/harness: allow tests to be skipped during setup · 372b304c
      Magali Lemes authored
      Before executing each test from a fixture, FIXTURE_SETUP is run once.
      When SKIP is used in FIXTURE_SETUP, the setup function returns early
      but the test still proceeds to run, unless another SKIP macro is used
      within the test definition, leading to some code repetition. Therefore,
      allow tests to be skipped directly from the setup function.

      Suggested-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Magali Lemes <magali.lemes@canonical.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
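
The behaviour described in the harness commit can be modelled with a toy runner. This is an illustrative Python sketch, not the harness itself (which is implemented as C macros); `Metadata` and `run_test` are hypothetical names. The point it shows is that a skip raised during setup now ends the test before the body runs.

```python
class Metadata:
    """Toy per-test state: whether the test passed and whether it was skipped."""

    def __init__(self):
        self.passed = True
        self.skip = False


def run_test(setup, body):
    """Run setup, then the test body unless setup skipped or failed."""
    md = Metadata()
    setup(md)
    # Let a skip (or a failure) raised in setup terminate the test early.
    if md.skip:
        return "skipped"
    if not md.passed:
        return "failed"
    body(md)
    return "passed" if md.passed else "failed"
```

With this shape, a fixture that detects an unsupported configuration in setup needs no second skip check inside each test body.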
    • Merge tag 'net-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 40f71e7c
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from wireless, and netfilter.
      
        Selftests excluded - we have 58 patches and diff of +442/-199, which
        isn't really small but perhaps with the exception of the WiFi locking
        change it's old(ish) bugs.
      
        We have no known problems with v6.4.
      
        The selftest changes are rather large as MPTCP folks try to apply
        Greg's guidance that selftest from torvalds/linux should be able to
        run against stable kernels.
      
        Last thing I should call out is the DCCP/UDP-lite deprecation notices.
        We are fairly sure those are dead, but if we're wrong reverting them
        back in won't be fun.
      
        Current release - regressions:
      
         - wifi:
            - cfg80211: fix double lock bug in reg_wdev_chan_valid()
            - iwlwifi: mvm: spin_lock_bh() to fix lockdep regression
      
        Current release - new code bugs:
      
         - handshake: remove fput() that causes use-after-free
      
        Previous releases - regressions:
      
         - sched: cls_u32: fix reference counter leak leading to overflow
      
         - sched: cls_api: fix lockup on flushing explicitly created chain
      
        Previous releases - always broken:
      
         - nf_tables: integrate pipapo into commit protocol
      
         - nf_tables: incorrect error path handling with NFT_MSG_NEWRULE, fix
           dangling pointer on failure
      
         - ping6: fix send to link-local addresses with VRF
      
         - sched: act_pedit: parse L3 header for L4 offset, the skb may not
           have the offset saved
      
         - sched: act_ct: fix promotion of offloaded unreplied tuple
      
         - sched: refuse to destroy an ingress and clsact Qdiscs if there are
           lockless change operations in flight
      
         - wifi: mac80211: fix handful of bugs in multi-link operation
      
         - ipvlan: fix bound dev checking for IPv6 l3s mode
      
         - eth: enetc: correct the indexes of highest and 2nd highest TCs
      
         - eth: ice: fix XDP memory leak when NIC is brought up and down
      
        Misc:
      
         - add deprecation notices for UDP-lite and DCCP
      
         - selftests: mptcp: skip tests not supported by old kernels
      
         - sctp: handle invalid error codes without calling BUG()"
      
      * tag 'net-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
        dccp: Print deprecation notice.
        udplite: Print deprecation notice.
        octeon_ep: Add missing check for ioremap
        selftests/ptp: Fix timestamp printf format for PTP_SYS_OFFSET
        net: ethernet: stmicro: stmmac: fix possible memory leak in __stmmac_open
        net: tipc: resize nlattr array to correct size
        sfc: fix XDP queues mode with legacy IRQ
        net: macsec: fix double free of percpu stats
        net: lapbether: only support ethernet devices
        MAINTAINERS: add reviewers for SMC Sockets
        s390/ism: Fix trying to free already-freed IRQ by repeated ism_dev_exit()
        net: dsa: felix: fix taprio guard band overflow at 10Mbps with jumbo frames
        net/sched: cls_api: Fix lockup on flushing explicitly created chain
        ice: Fix ice module unload
        net/handshake: remove fput() that causes use-after-free
        selftests: forwarding: hw_stats_l3: Set addrgenmode in a separate step
        net/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting
        net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs
        net/sched: act_ct: Fix promotion of offloaded unreplied tuple
        wifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regression
        ...
    • Merge tag 'loongarch-fixes-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson · 627d8586
      Linus Torvalds authored
      Pull LoongArch fixes from Huacai Chen:
       "Some trivial bug fixes for v6.4-rc7"
      
      * tag 'loongarch-fixes-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
        LoongArch: Fix debugfs_create_dir() error checking
        LoongArch: Avoid uninitialized alignment_mask
        LoongArch: Fix perf event id calculation
        LoongArch: Fix the write_fcsr() macro
        LoongArch: Let pmd_present() return true when splitting pmd
    • Merge tag 'for-6.4/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · 0e306952
      Linus Torvalds authored
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix DM thinp discard performance regression introduced during this
         merge window where DM core was splitting large discards every 128K
         (max_sectors_kb) rather than every 64M (discard_max_bytes).
      
       - Extend DM core LOCKFS fix, made during 6.4 merge, to also fix race
         between do_mount and dm's do_suspend (in addition to the earlier
         fix's do_mount race with dm's do_resume).
      
       - Fix DM thin metadata operations to first check if the thin-pool is in
         "fail_io" mode; otherwise UAF can occur.
      
       - Fix DM thinp's call to __blkdev_issue_discard to use GFP_NOIO rather
         than GFP_NOWAIT (__blkdev_issue_discard cannot handle NULL return
         from bio_alloc).
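
To give a sense of scale for the discard regression in the first item, the arithmetic can be sketched as below. This is a back-of-the-envelope Python illustration that assumes an aligned discard split exactly at the stated limit; the 1 GiB request size and the `split_count` helper are assumptions made for the example, not figures from the pull request.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def split_count(io_bytes, max_chunk_bytes):
    """Number of pieces an aligned I/O is cut into at a given size limit."""
    return -(-io_bytes // max_chunk_bytes)  # ceiling division


discard = 1 * GIB  # assumed example request size
print(split_count(discard, 128 * KIB))  # 8192 pieces at the 128K limit
print(split_count(discard, 64 * MIB))   # 16 pieces at the 64M limit
```

Under these assumptions the smaller limit produces 512 times as many pieces for the same discard.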
      
      * tag 'for-6.4/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm: use op specific max_sectors when splitting abnormal io
        dm thin: fix issue_discard to pass GFP_NOIO to __blkdev_issue_discard
        dm thin metadata: check fail_io before using data_sm
        dm: don't lock fs when the map is NULL during suspend or resume
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 93fd8eb0
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "This is an unusually large bunch of bug fixes for the later rc cycle,
        rxe and mlx5 both dumped a lot of things at once. rxe continues to fix
        itself, and mlx5 is fixing a bunch of "queue counters" related bugs.
      
        There is one highly notable bug fix regarding the qkey. This small
        security check was missed in the original 2005 implementation and it
        allows some significant issues.
      
        Summary:
      
         - Two rtrs bug fixes for error unwind bugs
      
         - Several rxe bug fixes:
            * Incorrect Rx packet validation
            * Using memory without a refcount
            * Syzkaller found use before initialization
            * Regression fix for missing locking with the tasklet conversion
              from this merge window
      
         - Have bnxt report the correct link properties to userspace, this was
           a regression in v6.3
      
         - Several mlx5 bug fixes:
            * Kernel crash triggerable by userspace for the RAW ethernet
              profile
            * Defend against steering refcounting issues created by userspace
            * Incorrect change of QP port affinity parameters in some LAG
              configurations
      
         - Fix mlx5 Q counters:
            * Do not over allocate Q counters to allow userspace to use the
              full port capacity
            * Kernel crash triggered by eswitch due to mis-use of Q counters
            * Incorrect mlx5_device for Q counters in some LAG configurations
      
         - Properly implement the IBA spec restricting privileged qkeys to
           root
      
         - Always an error when reading from a disassociated device's event
           queue
      
         - isert bug fixes:
            * Avoid a deadlock with the CM handler and CM ID destruction
            * Correct list corruption due to incorrect locking
            * Fix a use after free around connection tear down"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/rxe: Fix rxe_cq_post
        IB/isert: Fix incorrect release of isert connection
        IB/isert: Fix possible list corruption in CMA handler
        IB/isert: Fix dead lock in ib_isert
        RDMA/mlx5: Fix affinity assignment
        IB/uverbs: Fix to consider event queue closing also upon non-blocking mode
        RDMA/uverbs: Restrict usage of privileged QKEYs
        RDMA/cma: Always set static rate to 0 for RoCE
        RDMA/mlx5: Fix Q-counters query in LAG mode
        RDMA/mlx5: Remove vport Q-counters dependency on normal Q-counters
        RDMA/mlx5: Fix Q-counters per vport allocation
        RDMA/mlx5: Create an indirect flow table for steering anchor
        RDMA/mlx5: Initiate dropless RQ for RAW Ethernet functions
        RDMA/rxe: Fix the use-before-initialization error of resp_pkts
        RDMA/bnxt_re: Fix reporting active_{speed,width} attributes
        RDMA/rxe: Fix ref count error in check_rkey()
        RDMA/rxe: Fix packet length checks
        RDMA/rtrs: Fix rxe_dealloc_pd warning
        RDMA/rtrs: Fix the last iu->buf leak in err path
    • Merge tag 'spi-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · b7feaa49
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A few more driver specific fixes.
      
        The DesignWare fix is for an issue introduced by conversion to the
        chip select accessor functions and is pretty important but the other
        two are less severe"
      
      * tag 'spi-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: dw: Replace incorrect spi_get_chipselect with set
        spi: fsl-dspi: avoid SCK glitches with continuous transfers
        spi: cadence-quadspi: Add missing check for dma_set_mask
    • Merge tag 'regulator-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator · eee71c34
      Linus Torvalds authored
      Pull regulator fix from Mark Brown:
       "The set of regulators described for the Qualcomm PM8550 just seems to
        have been completely wrong and would likely not have worked at all if
        anything tried to actually configure anything except for enabling and
        disabling at runtime"
      
      * tag 'regulator-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: qcom-rpmh: Fix regulators for PM8550
    • Merge tag 'regmap-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · 231a1e31
      Linus Torvalds authored
      Pull regmap fix from Mark Brown:
       "Another fix for the maple tree cache, Takashi noticed that unlike
        other caches the maple tree cache didn't check for read only registers
        before trying to sync which would result in spurious syncs for read
        only registers where we don't have a default.
      
        This was due to the check being open coded in the caches, we now check
        in the shared 'does this register need sync' function so that is fixed
        for this and future caches"
      
      * tag 'regmap-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: regcache: Don't sync read-only registers
    • Merge tag 'media/v6.4-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · c926a55f
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "A fix for dvb-core to avoid a race condition during DVB board
        registration"
      
      * tag 'media/v6.4-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        Revert "media: dvb-core: Fix use-after-free on race condition at dvb_frontend"
      c926a55f
  2. 15 Jun, 2023 16 commits
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 62d87796
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Fix two regressions in ext4, one reported by syzkaller[1], and one
        reported by multiple users (and tracked by regzbot[2])"
      
      [1] https://syzkaller.appspot.com/bug?extid=4acc7d910e617b360859
      [2] https://linux-regtracking.leemhuis.info/regzbot/regression/ZIauBR7YiV3rVAHL@glitch/
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: drop the call to ext4_error() from ext4_get_group_info()
        Revert "ext4: remove unnecessary check in ext4_bg_num_gdb_nometa"
      62d87796
    • Linus Torvalds's avatar
      Merge tag '6.4-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 7a043feb
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
       "Eight, mostly small, smb3 client fixes:
      
         - important fix for deferred close oops (race with unmount) found
           with xfstest generic/098 to some servers
      
         - important reconnect fix
      
         - fix problem with max_credits mount option
      
         - two multichannel (interface related) fixes
      
         - one trivial removal of confusing comment
      
         - two small debugging improvements (to better spot crediting
           problems)"
      
      * tag '6.4-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: add a warning when the in-flight count goes negative
        cifs: fix lease break oops in xfstest generic/098
        cifs: fix max_credits implementation
        cifs: fix sockaddr comparison in iface_cmp
        smb/client: print "Unknown" instead of bogus link speed value
        cifs: print all credit counters in DebugData
        cifs: fix status checks in cifs_tree_connect
        smb: remove obsolete comment
      7a043feb
    • Jakub Kicinski's avatar
      Merge branch 'udplite-dccp-print-deprecation-notice' · 8f0e3703
      Jakub Kicinski authored
      Kuniyuki Iwashima says:
      
      ====================
      udplite/dccp: Print deprecation notice.
      
      UDP-Lite is assumed to have no users for 7 years, and DCCP is
      orphaned for 7 years too.
      
      Let's add deprecation notice and see if anyone responds to it.
      ====================
      
      Link: https://lore.kernel.org/r/20230614194705.90673-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8f0e3703
    • Kuniyuki Iwashima's avatar
      dccp: Print deprecation notice. · b144fcaf
      Kuniyuki Iwashima authored
      DCCP was marked as Orphan in the MAINTAINERS entry 2 years ago in commit
      054c4610 ("MAINTAINERS: dccp: move Gerrit Renker to CREDITS").  That
      commit says we had not heard from the maintainer for five years, so DCCP
      has not been well maintained for 7 years now.
      
      Recently DCCP only receives updates for bugs, and major distros disable it
      by default.
      
      Removing DCCP would allow for better organisation of TCP fields to reduce
      the number of cache lines hit in the fast path.
      
      Let's add a deprecation notice when a DCCP socket is created and schedule
      its removal for 2025.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b144fcaf
    • Kuniyuki Iwashima's avatar
      udplite: Print deprecation notice. · be28c14a
      Kuniyuki Iwashima authored
      Recently syzkaller reported a 7-year-old null-ptr-deref [0] that occurs
      when a UDP-Lite socket tries to allocate a buffer under memory pressure.
      
      Someone should have stumbled on the bug much earlier if UDP-Lite had been
      used in a real app.  Also, we do not always need a large UDP-Lite workload
      to hit the bug since UDP and UDP-Lite share the same memory accounting
      limit.
      
      Removing UDP-Lite would simplify UDP code removing a bunch of conditionals
      in fast path.
      
      Let's add a deprecation notice when a UDP-Lite socket is created and
      schedule its removal for 2025.
      
      Link: https://lore.kernel.org/netdev/20230523163305.66466-1-kuniyu@amazon.com/ [0]
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      be28c14a
    • Jiasheng Jiang's avatar
      octeon_ep: Add missing check for ioremap · 9a36e2d4
      Jiasheng Jiang authored
      Add a check for ioremap() and return an error if it fails, so that the
      mapping is only used after ioremap() has succeeded.
      
      Fixes: 862cd659 ("octeon_ep: Add driver framework and device initialization")
      Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
      Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
      Link: https://lore.kernel.org/r/20230615033400.2971-1-jiasheng@iscas.ac.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9a36e2d4
    • Alex Maftei's avatar
      selftests/ptp: Fix timestamp printf format for PTP_SYS_OFFSET · 76a4c8b8
      Alex Maftei authored
      Previously, timestamps were printed using "%lld.%u", which is incorrect
      for nanosecond values lower than 100,000,000: the nanoseconds are
      fractional digits, so leading zeros are meaningful.
      
      This patch changes the format strings to "%lld.%09u" in order to add
      leading zeros to the nanosecond value.
      
      Fixes: 568ebc59 ("ptp: add the PTP_SYS_OFFSET ioctl to the testptp program")
      Fixes: 4ec54f95 ("ptp: Fix compiler warnings in the testptp utility")
      Fixes: 6ab0e475 ("Documentation: fix misc. warnings")
      Signed-off-by: Alex Maftei <alex.maftei@amd.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Link: https://lore.kernel.org/r/20230615083404.57112-1-alex.maftei@amd.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      76a4c8b8
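The difference between the two format strings can be seen in a small user-space sketch. The helper names `format_ts` and `format_ts_unpadded` are hypothetical and are not taken from the testptp program; only the two format strings come from the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print seconds and nanoseconds as one decimal value.
 * The nanosecond field is a fraction of a second, so it is zero-padded to
 * nine digits. */
static int format_ts(char *buf, size_t len, long long sec, unsigned int nsec)
{
    assert(nsec < 1000000000u);
    return snprintf(buf, len, "%lld.%09u", sec, nsec);
}

/* The old, unpadded format, kept only to show the failure mode. */
static int format_ts_unpadded(char *buf, size_t len, long long sec,
                              unsigned int nsec)
{
    return snprintf(buf, len, "%lld.%u", sec, nsec);
}
```

With sec = 5 and nsec = 1000 (one microsecond), the padded format yields "5.000001000", while the unpadded one yields "5.1000", which reads as a tenth of a second.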
    • Christian Marangi's avatar
      net: ethernet: stmicro: stmmac: fix possible memory leak in __stmmac_open · 30134b7c
      Christian Marangi authored
      Fix a possible memory leak in __stmmac_open when stmmac_init_phy fails.
      It's also needed to free everything allocated by stmmac_setup_dma_desc
      and not just the dma_conf struct.
      
      Drop free_dma_desc_resources from __stmmac_open and correctly call
      free_dma_desc_resources on each user of __stmmac_open on error.
      Reported-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Fixes: ba39b344 ("net: ethernet: stmicro: stmmac: generate stmmac dma conf before open")
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Jose Abreu <Jose.Abreu@synopsys.com>
      Link: https://lore.kernel.org/r/20230614091714.15912-1-ansuelsmth@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      30134b7c
    • Lin Ma's avatar
      net: tipc: resize nlattr array to correct size · 44194cb1
      Lin Ma authored
      According to nla_parse_nested_deprecated(), tb[] is supposed to be the
      destination array with maxtype+1 elements. In the current
      tipc_nl_media_get() and __tipc_nl_media_set(), a larger array is used,
      which is unnecessary. This patch resizes them to the proper size.
      
      Fixes: 1e55417d ("tipc: add media set to new netlink api")
      Fixes: 46f15c67 ("tipc: add media get/dump to new netlink api")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Reviewed-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
      Link: https://lore.kernel.org/r/20230614120604.1196377-1-linma@zju.edu.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      44194cb1
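The maxtype+1 sizing rule can be illustrated with a simplified user-space stand-in. Everything prefixed `demo_`/`DEMO_` is hypothetical: this is not the kernel netlink API, only a sketch of why a table indexed by attribute type needs exactly maxtype+1 slots (indices 0 through maxtype).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute types, following the common convention that the
 * last enumerator is a sentinel and MAX is the sentinel minus one. */
enum {
    DEMO_ATTR_UNSPEC,
    DEMO_ATTR_NAME,
    DEMO_ATTR_PROP,
    __DEMO_ATTR_MAX
};
#define DEMO_ATTR_MAX (__DEMO_ATTR_MAX - 1)

struct demo_attr {
    unsigned short type;
    const void *data;
};

/* Simplified stand-in for a nested-attribute parser: fills tb[0..maxtype]
 * and skips attributes whose type is out of range. The caller must pass a
 * table with maxtype + 1 elements. */
static void demo_parse(const struct demo_attr **tb, int maxtype,
                       const struct demo_attr *attrs, size_t n)
{
    assert(maxtype >= 0);
    for (int i = 0; i <= maxtype; i++)
        tb[i] = NULL;
    for (size_t i = 0; i < n; i++)
        if (attrs[i].type <= maxtype)
            tb[attrs[i].type] = &attrs[i];
}

static int demo_has_name(const struct demo_attr *attrs, size_t n)
{
    /* Exactly maxtype + 1 slots: anything larger is wasted stack space. */
    const struct demo_attr *tb[DEMO_ATTR_MAX + 1];

    demo_parse(tb, DEMO_ATTR_MAX, attrs, n);
    return tb[DEMO_ATTR_NAME] != NULL;
}
```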
    • Mike Snitzer's avatar
      dm: use op specific max_sectors when splitting abnormal io · be04c14a
      Mike Snitzer authored
      Split abnormal IO in terms of the corresponding operation specific
      max_sectors (max_discard_sectors, max_secure_erase_sectors or
      max_write_zeroes_sectors).
      
      This fixes a significant dm-thinp discard performance regression that
      was introduced with commit e2dd8aca ("dm bio prison v1: improve
      concurrent IO performance"). Relative to discard: max_discard_sectors
      is used instead of max_sectors; which fixes excessive discard splitting
      (e.g. max_sectors=128K vs max_discard_sectors=64M).
      
      Tested by discarding a 1 Petabyte dm-thin device:
      lvcreate -V 1125899906842624B -T test/pool -n thin
      time blkdiscard /dev/test/thin
      
      Before this fix (splitting discards every 128K): ~116m
       After this fix (splitting discards every 64M) : 0m33.460s
      Reported-by: Zorro Lang <zlang@redhat.com>
      Fixes: 06961c48 ("dm: split discards further if target sets max_discard_granularity")
      Requires: 13f6facf ("dm: allow targets to require splitting WRITE_ZEROES and SECURE_ERASE")
      Fixes: e2dd8aca ("dm bio prison v1: improve concurrent IO performance")
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
      be04c14a
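The scale of the regression follows from simple arithmetic, sketched below. The types and function names are hypothetical and do not mirror the dm code; the limits are expressed in bytes and use the example figures from the commit message (128K generic limit, 64M discard limit).

```c
#include <assert.h>
#include <stdint.h>

enum demo_op { DEMO_OP_WRITE, DEMO_OP_DISCARD };

/* Hypothetical per-device limits, in bytes. */
struct demo_limits {
    uint64_t max_bytes;          /* generic limit, e.g. 128K */
    uint64_t max_discard_bytes;  /* discard-specific limit, e.g. 64M */
};

/* Pick the limit that matches the operation instead of always using the
 * generic one. */
static uint64_t demo_split_limit(const struct demo_limits *l, enum demo_op op)
{
    return op == DEMO_OP_DISCARD ? l->max_discard_bytes : l->max_bytes;
}

/* Number of pieces needed to cover total bytes, rounding up. */
static uint64_t demo_split_count(uint64_t total, uint64_t limit)
{
    assert(limit > 0);
    return total / limit + (total % limit != 0);
}
```

For the 1125899906842624-byte (2^50) device in the test above, splitting every 128K gives 2^33 pieces, while splitting every 64M gives 2^24 pieces, a factor of 512 fewer.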
    • Mike Snitzer's avatar
      dm thin: fix issue_discard to pass GFP_NOIO to __blkdev_issue_discard · 722d9082
      Mike Snitzer authored
      issue_discard() passes GFP_NOWAIT to __blkdev_issue_discard() despite
      its code assuming bio_alloc() always succeeds.
      
      Commit 3dba53a9 ("dm thin: use __blkdev_issue_discard for async
      discard support") clearly shows where things went bad:
      
      Before commit 3dba53a9, dm-thin.c's open-coded
      __blkdev_issue_discard_async() properly handled using GFP_NOWAIT.
      Unfortunately __blkdev_issue_discard() doesn't and it was missed
      during review.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
      722d9082
    • Li Lingfeng's avatar
      dm thin metadata: check fail_io before using data_sm · cb65b282
      Li Lingfeng authored
      Must check pmd->fail_io before using pmd->data_sm since
      pmd->data_sm may be destroyed by other processes.
      
             P1(kworker)                             P2(message)
      do_worker
       process_prepared
        process_prepared_discard_passdown_pt2
         dm_pool_dec_data_range
                                          pool_message
                                           commit
                                            dm_pool_commit_metadata
                                              ↓
                                             // commit failed
                                            metadata_operation_failed
                                             abort_transaction
                                              dm_pool_abort_metadata
                                               __open_or_format_metadata
                                                 ↓
                                                dm_sm_disk_open
                                                  ↓
                                                 // open failed
                                                 // pmd->data_sm is NULL
          dm_sm_dec_blocks
            ↓
           // try to access pmd->data_sm --> UAF
      
      As shown above, if dm_pool_commit_metadata() and
      dm_pool_abort_metadata() fail in pool_message process, kworker may
      trigger UAF.
      
      Fixes: be500ed7 ("dm space maps: improve performance with inc/dec on ranges of blocks")
      Cc: stable@vger.kernel.org
      Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
      cb65b282
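The guard described above can be sketched in user space as follows. The struct and function names are hypothetical, the error code is an assumption, and locking is left out; in the kernel the check would presumably be made while holding the metadata lock so that fail_io and data_sm are observed consistently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_space_map {
    unsigned long refs;
};

struct demo_pool_metadata {
    bool fail_io;                    /* set once the metadata was aborted */
    struct demo_space_map *data_sm;  /* may be gone once fail_io is set */
};

/* Decrement reference counts only if the metadata is still usable.
 * Returning -EINVAL on fail_io is an assumption made for this sketch. */
static int demo_dec_data_range(struct demo_pool_metadata *pmd, unsigned long n)
{
    assert(pmd != NULL);
    if (pmd->fail_io)
        return -EINVAL;      /* never touch data_sm after a failure */
    pmd->data_sm->refs -= n;
    return 0;
}
```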
    • Li Lingfeng's avatar
      dm: don't lock fs when the map is NULL during suspend or resume · 2760904d
      Li Lingfeng authored
      As described in commit 38d11da5 ("dm: don't lock fs when the map is
      NULL in process of resume"), a deadlock may be triggered between
      do_resume() and do_mount().
      
      This commit preserves the fix from commit 38d11da5 but moves it to
      where it also serves to fix a similar deadlock between do_suspend()
      and do_mount().  It does so, if the active map is NULL, by clearing
      DM_SUSPEND_LOCKFS_FLAG in dm_suspend() which is called by both
      do_suspend() and do_resume().
      
      Fixes: 38d11da5 ("dm: don't lock fs when the map is NULL in process of resume")
      Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
      2760904d
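The flag handling described above amounts to masking one bit when there is no active map. A minimal sketch, with hypothetical names and flag values that are not the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; the values are illustrative only. */
#define DEMO_SUSPEND_LOCKFS_FLAG  (1u << 0)
#define DEMO_SUSPEND_NOFLUSH_FLAG (1u << 1)

/* Return the flags a suspend should actually honour: without an active
 * map there is nothing to freeze, so the lockfs request is dropped. */
static unsigned int demo_effective_suspend_flags(const void *active_map,
                                                 unsigned int flags)
{
    assert((flags & ~(DEMO_SUSPEND_LOCKFS_FLAG | DEMO_SUSPEND_NOFLUSH_FLAG)) == 0);
    if (active_map == NULL)
        flags &= ~DEMO_SUSPEND_LOCKFS_FLAG;
    return flags;
}
```

Doing the masking in the one function that both callers go through is what lets a single check cover the suspend and the resume path.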
    • Íñigo Huguet's avatar
      sfc: fix XDP queues mode with legacy IRQ · e84a1e1e
      Íñigo Huguet authored
      In systems without MSI-X capabilities, xdp_txq_queues_mode is calculated
      in efx_allocate_msix_channels, but when enabling MSI-X fails, it was not
      changed to a proper default value. This was leading to the driver
      thinking that it has dedicated XDP queues, when it didn't.
      
      Fix it by setting xdp_txq_queues_mode to the correct value if the driver
      falls back to MSI or legacy IRQ mode. The correct value is
      EFX_XDP_TX_QUEUES_BORROWED because there are no XDP dedicated queues.
      
      The issue is easy to reproduce by booting the kernel with pci=nomsi,
      which produces a call trace. Setting only sfc's modparam
      interrupt_mode=2 does not trigger it. Call trace example:
       WARNING: CPU: 2 PID: 663 at drivers/net/ethernet/sfc/efx_channels.c:828 efx_set_xdp_channels+0x124/0x260 [sfc]
       [...skip...]
       Call Trace:
        <TASK>
        efx_set_channels+0x5c/0xc0 [sfc]
        efx_probe_nic+0x9b/0x15a [sfc]
        efx_probe_all+0x10/0x1a2 [sfc]
        efx_pci_probe_main+0x12/0x156 [sfc]
        efx_pci_probe_post_io+0x18/0x103 [sfc]
        efx_pci_probe.cold+0x154/0x257 [sfc]
        local_pci_probe+0x42/0x80
      
      Fixes: 6215b608 ("sfc: last resort fallback for lack of xdp tx queues")
      Reported-by: Yanghang Liu <yanghliu@redhat.com>
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e84a1e1e
    • Fedor Pchelkin's avatar
      net: macsec: fix double free of percpu stats · 0c0cf3db
      Fedor Pchelkin authored
      Inside macsec_add_dev() we free percpu macsec->secy.tx_sc.stats and
      macsec->stats on some of the memory allocation failure paths. However, the
      net_device is already registered by that point: in macsec_newlink(), just
      before calling macsec_add_dev(). This means that during unregister process
      its priv_destructor - macsec_free_netdev() - will be called and will free
      the stats again.
      
      Remove freeing percpu stats inside macsec_add_dev() because
      macsec_free_netdev() will correctly free the already allocated ones. The
      pointers to unallocated stats stay NULL, and free_percpu() treats that
      correctly.
      
      Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
      
      Fixes: 0a28bfd4 ("net/macsec: Add MACsec skb_metadata_dst Tx Data path support")
      Fixes: c09440f7 ("macsec: introduce IEEE 802.1AE driver")
      Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0c0cf3db
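The ownership rule behind this fix can be shown with a user-space analogue that uses calloc() and free() in place of the percpu allocator. The names are hypothetical; the point is only that the setup function does not free on failure, because a single destructor releases everything and tolerates NULL pointers.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_dev {
    long *tx_stats;
    long *stats;
};

/* Hypothetical setup routine. The caller must pass a zero-initialised
 * struct; fail_second simulates the second allocation failing. */
static int demo_add_dev(struct demo_dev *dev, int fail_second)
{
    assert(dev != NULL);
    dev->tx_stats = calloc(1, sizeof(*dev->tx_stats));
    if (!dev->tx_stats)
        return -ENOMEM;
    dev->stats = fail_second ? NULL : calloc(1, sizeof(*dev->stats));
    if (!dev->stats)
        return -ENOMEM;  /* no free here: the destructor owns cleanup */
    return 0;
}

/* Single owner of the cleanup, safe for partially initialised devices. */
static void demo_free_dev(struct demo_dev *dev)
{
    free(dev->tx_stats);  /* free(NULL) is a no-op */
    free(dev->stats);
    dev->tx_stats = NULL;
    dev->stats = NULL;
}
```

If demo_add_dev() also freed tx_stats on its error path, a later demo_free_dev() would free the same pointer a second time, which is the pattern the commit removes.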
    • Eric Dumazet's avatar
      net: lapbether: only support ethernet devices · 9eed321c
      Eric Dumazet authored
      It probably makes no sense to support arbitrary network devices
      for lapbether.
      
      syzbot reported:
      
      skbuff: skb_under_panic: text:ffff80008934c100 len:44 put:40 head:ffff0000d18dd200 data:ffff0000d18dd1ea tail:0x16 end:0x140 dev:bond1
      kernel BUG at net/core/skbuff.c:200 !
      Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 0 PID: 5643 Comm: dhcpcd Not tainted 6.4.0-rc5-syzkaller-g4641cff8e810 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
      pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      pc : skb_panic net/core/skbuff.c:196 [inline]
      pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:210
      lr : skb_panic net/core/skbuff.c:196 [inline]
      lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:210
      sp : ffff8000973b7260
      x29: ffff8000973b7270 x28: ffff8000973b7360 x27: dfff800000000000
      x26: ffff0000d85d8150 x25: 0000000000000016 x24: ffff0000d18dd1ea
      x23: ffff0000d18dd200 x22: 000000000000002c x21: 0000000000000140
      x20: 0000000000000028 x19: ffff80008934c100 x18: ffff8000973b68a0
      x17: 0000000000000000 x16: ffff80008a43bfbc x15: 0000000000000202
      x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000001
      x11: 0000000000000201 x10: 0000000000000000 x9 : f22f7eb937cced00
      x8 : f22f7eb937cced00 x7 : 0000000000000001 x6 : 0000000000000001
      x5 : ffff8000973b6b78 x4 : ffff80008df9ee80 x3 : ffff8000805974f4
      x2 : 0000000000000001 x1 : 0000000100000201 x0 : 0000000000000086
      Call trace:
      skb_panic net/core/skbuff.c:196 [inline]
      skb_under_panic+0x13c/0x140 net/core/skbuff.c:210
      skb_push+0xf0/0x108 net/core/skbuff.c:2409
      ip6gre_header+0xbc/0x738 net/ipv6/ip6_gre.c:1383
      dev_hard_header include/linux/netdevice.h:3137 [inline]
      lapbeth_data_transmit+0x1c4/0x298 drivers/net/wan/lapbether.c:257
      lapb_data_transmit+0x8c/0xb0 net/lapb/lapb_iface.c:447
      lapb_transmit_buffer+0x178/0x204 net/lapb/lapb_out.c:149
      lapb_send_control+0x220/0x320 net/lapb/lapb_subr.c:251
      lapb_establish_data_link+0x94/0xec
      lapb_device_event+0x348/0x4e0
      notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93
      raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461
      __dev_notify_flags+0x2bc/0x544
      dev_change_flags+0xd0/0x15c net/core/dev.c:8643
      devinet_ioctl+0x858/0x17e4 net/ipv4/devinet.c:1150
      inet_ioctl+0x2ac/0x4d8 net/ipv4/af_inet.c:979
      sock_do_ioctl+0x134/0x2dc net/socket.c:1201
      sock_ioctl+0x4ec/0x858 net/socket.c:1318
      vfs_ioctl fs/ioctl.c:51 [inline]
      __do_sys_ioctl fs/ioctl.c:870 [inline]
      __se_sys_ioctl fs/ioctl.c:856 [inline]
      __arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:856
      __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
      invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52
      el0_svc_common+0x138/0x244 arch/arm64/kernel/syscall.c:142
      do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:191
      el0_svc+0x4c/0x160 arch/arm64/kernel/entry-common.c:647
      el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:665
      el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591
      Code: aa1803e6 aa1903e7 a90023f5 947730f5 (d4210000)
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Martin Schiller <ms@dev.tdt.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9eed321c
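The numbers in the report can be read as a headroom underflow: after a push of 40 bytes (put:40), data ends up 0x16 = 22 bytes below head, so only 18 bytes of headroom were available beforehand. The sketch below models that check in user space with a hypothetical `demo_buf` type; it is not the kernel's sk_buff API, and it returns NULL where the kernel would BUG.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified packet buffer. */
struct demo_buf {
    unsigned char *head;  /* start of the allocation */
    unsigned char *data;  /* start of the current payload */
    size_t len;           /* payload length */
};

static size_t demo_headroom(const struct demo_buf *b)
{
    assert(b->data >= b->head);
    return (size_t)(b->data - b->head);
}

/* Prepend n bytes of header space. Returns the new data pointer, or NULL
 * if the buffer does not have enough headroom. */
static unsigned char *demo_push(struct demo_buf *b, size_t n)
{
    if (n > demo_headroom(b))
        return NULL;
    b->data -= n;
    b->len += n;
    return b->data;
}
```

A lower device whose header is larger than the headroom the upper layer reserved hits exactly this case, which is consistent with the trace showing ip6gre_header() being called on behalf of lapbether.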