- 26 Jul, 2023 26 commits
-
-
Chris Mi authored
The cited commit sets ft prio to fs_base_prio. But if ignore_flow_level it not supported, ft prio must be set based on tc filter prio. Otherwise, all the ft prio are the same on the same chain. It is invalid if ignore_flow_level is not supported. Fix it by setting ft prio based on tc filter prio and setting fs_base_prio to 0 for fdb. Fixes: 8e80e564 ("net/mlx5: fs_chains: Refactor to detach chains from tc usage") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Jianbo Liu authored
There are DEK objects cached in DEK pool after kTLS is used, and they are freed only in mlx5e_ktls_cleanup(). mlx5e_destroy_mdev_resources() is called in mlx5e_suspend() to free mdev resources, including protection domain (PD). However, PD is still referenced by the cached DEK objects in this case, because profile->cleanup() (and therefore mlx5e_ktls_cleanup()) is called after mlx5e_suspend() during devlink reload. So the following FW syndrome is generated: mlx5_cmd_out_err:803:(pid 12948): DEALLOC_PD(0x801) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0xef0c8a), err(-22) To avoid this syndrome, move DEK pool destruction to mlx5e_ktls_cleanup_tx(), which is called by profile->cleanup_tx(). And move pool creation to mlx5e_ktls_init_tx() for symmetry. Fixes: f741db1a ("net/mlx5e: kTLS, Improve connection rate by using fast update encryption key") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Vlad Buslov authored
As suggested during code review set the access rights for bridge 'fdb' debugfs file to root-only. Fixes: 791eb782 ("net/mlx5: Bridge, expose FDB state via debugfs") Reported-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20230619120515.5045132a@kernel.org/Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Dragos Tatulea authored
When the regular rq is reactivated after the XSK socket is closed it could be reading stale cqes which eventually corrupts the rq. This leads to no more traffic being received on the regular rq and a crash on the next close or deactivation of the rq. Kal Cuttler Conely reported this issue as a crash on the release path when the xdpsock sample program is stopped (killed) and restarted in sequence while traffic is running. This patch flushes all cqes when during the rq flush. The cqe flushing is done in the reset state of the rq. mlx5e_rq_to_ready code is moved into the flush function to allow for this. Fixes: 082a9edf ("net/mlx5e: xsk: Flush RQ on XSK activation to save memory") Reported-by: Kal Cutter Conley <kal.conley@dectris.com> Closes: https://lore.kernel.org/xdp-newbies/CAHApi-nUAs4TeFWUDV915CZJo07XVg2Vp63-no7UDfj6wur9nQ@mail.gmail.comSigned-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Dragos Tatulea authored
The below crash can be encountered when using xdpsock in rx mode for legacy rq: the buffer gets released in the XDP_REDIRECT path, and then once again in the driver. This fix sets the flag to avoid releasing on the driver side. XSK handling of buffers for legacy rq was relying on the caller to set the skip release flag. But the referenced fix started using fragment counts for pages instead of the skip flag. Crash log: general protection fault, probably for non-canonical address 0xffff8881217e3a: 0000 [#1] SMP CPU: 0 PID: 14 Comm: ksoftirqd/0 Not tainted 6.5.0-rc1+ #31 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:bpf_prog_03b13f331978c78c+0xf/0x28 Code: ... RSP: 0018:ffff88810082fc98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888138404901 RCX: c0ffffc900027cbc RDX: ffffffffa000b514 RSI: 00ffff8881217e32 RDI: ffff888138404901 RBP: ffff88810082fc98 R08: 0000000000091100 R09: 0000000000000006 R10: 0000000000000800 R11: 0000000000000800 R12: ffffc9000027a000 R13: ffff8881217e2dc0 R14: ffff8881217e2910 R15: ffff8881217e2f00 FS: 0000000000000000(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000564cb2e2cde0 CR3: 000000010e603004 CR4: 0000000000370eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? die_addr+0x32/0x80 ? exc_general_protection+0x192/0x390 ? asm_exc_general_protection+0x22/0x30 ? 0xffffffffa000b514 ? bpf_prog_03b13f331978c78c+0xf/0x28 mlx5e_xdp_handle+0x48/0x670 [mlx5_core] ? dev_gro_receive+0x3b5/0x6e0 mlx5e_xsk_skb_from_cqe_linear+0x6e/0x90 [mlx5_core] mlx5e_handle_rx_cqe+0x55/0x100 [mlx5_core] mlx5e_poll_rx_cq+0x87/0x6e0 [mlx5_core] mlx5e_napi_poll+0x45e/0x6b0 [mlx5_core] __napi_poll+0x25/0x1a0 net_rx_action+0x28a/0x300 __do_softirq+0xcd/0x279 ? sort_range+0x20/0x20 run_ksoftirqd+0x1a/0x20 smpboot_thread_fn+0xa2/0x130 kthread+0xc9/0xf0 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK> Modules linked in: mlx5_ib mlx5_core rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay zram zsmalloc fuse [last unloaded: mlx5_core] ---[ end trace 0000000000000000 ]--- Fixes: 7abd955a ("net/mlx5e: RX, Fix page_pool page fragment tracking for XDP") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Jianbo Liu authored
For IP tunnel encapsulation in ECMP (Equal-Cost Multipath) mode, as the flow is duplicated to the peer eswitch, the related neighbour information on the peer uplink representor is created as well. In the cited commit, eswitch devcom unpair is moved to uplink unload API, specifically the profile->cleanup_tx. If there is a encap rule offloaded in ECMP mode, when one eswitch does unpair (because of unloading the driver, for instance), and the peer rule from the peer eswitch is going to be deleted, the use-after-free error is triggered while accessing neigh info, as it is already cleaned up in uplink's profile->disable, which is before its profile->cleanup_tx. To fix this issue, move the neigh cleanup to profile's cleanup_tx callback, and after mlx5e_cleanup_uplink_rep_tx is called. The neigh init is moved to init_tx for symmeter. [ 2453.376299] BUG: KASAN: slab-use-after-free in mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core] [ 2453.379125] Read of size 4 at addr ffff888127af9008 by task modprobe/2496 [ 2453.381542] CPU: 7 PID: 2496 Comm: modprobe Tainted: G B 6.4.0-rc7+ #15 [ 2453.383386] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 2453.384335] Call Trace: [ 2453.384625] <TASK> [ 2453.384891] dump_stack_lvl+0x33/0x50 [ 2453.385285] print_report+0xc2/0x610 [ 2453.385667] ? __virt_addr_valid+0xb1/0x130 [ 2453.386091] ? mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core] [ 2453.386757] kasan_report+0xae/0xe0 [ 2453.387123] ? mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core] [ 2453.387798] mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core] [ 2453.388465] mlx5e_rep_encap_entry_detach+0xa6/0xe0 [mlx5_core] [ 2453.389111] mlx5e_encap_dealloc+0xa7/0x100 [mlx5_core] [ 2453.389706] mlx5e_tc_tun_encap_dests_unset+0x61/0xb0 [mlx5_core] [ 2453.390361] mlx5_free_flow_attr_actions+0x11e/0x340 [mlx5_core] [ 2453.391015] ? complete_all+0x43/0xd0 [ 2453.391398] ? free_flow_post_acts+0x38/0x120 [mlx5_core] [ 2453.392004] mlx5e_tc_del_fdb_flow+0x4ae/0x690 [mlx5_core] [ 2453.392618] mlx5e_tc_del_fdb_peers_flow+0x308/0x370 [mlx5_core] [ 2453.393276] mlx5e_tc_clean_fdb_peer_flows+0xf5/0x140 [mlx5_core] [ 2453.393925] mlx5_esw_offloads_unpair+0x86/0x540 [mlx5_core] [ 2453.394546] ? mlx5_esw_offloads_set_ns_peer.isra.0+0x180/0x180 [mlx5_core] [ 2453.395268] ? down_write+0xaa/0x100 [ 2453.395652] mlx5_esw_offloads_devcom_event+0x203/0x530 [mlx5_core] [ 2453.396317] mlx5_devcom_send_event+0xbb/0x190 [mlx5_core] [ 2453.396917] mlx5_esw_offloads_devcom_cleanup+0xb0/0xd0 [mlx5_core] [ 2453.397582] mlx5e_tc_esw_cleanup+0x42/0x120 [mlx5_core] [ 2453.398182] mlx5e_rep_tc_cleanup+0x15/0x30 [mlx5_core] [ 2453.398768] mlx5e_cleanup_rep_tx+0x6c/0x80 [mlx5_core] [ 2453.399367] mlx5e_detach_netdev+0xee/0x120 [mlx5_core] [ 2453.399957] mlx5e_netdev_change_profile+0x84/0x170 [mlx5_core] [ 2453.400598] mlx5e_vport_rep_unload+0xe0/0xf0 [mlx5_core] [ 2453.403781] mlx5_eswitch_unregister_vport_reps+0x15e/0x190 [mlx5_core] [ 2453.404479] ? mlx5_eswitch_register_vport_reps+0x200/0x200 [mlx5_core] [ 2453.405170] ? up_write+0x39/0x60 [ 2453.405529] ? kernfs_remove_by_name_ns+0xb7/0xe0 [ 2453.405985] auxiliary_bus_remove+0x2e/0x40 [ 2453.406405] device_release_driver_internal+0x243/0x2d0 [ 2453.406900] ? kobject_put+0x42/0x2d0 [ 2453.407284] bus_remove_device+0x128/0x1d0 [ 2453.407687] device_del+0x240/0x550 [ 2453.408053] ? waiting_for_supplier_show+0xe0/0xe0 [ 2453.408511] ? kobject_put+0xfa/0x2d0 [ 2453.408889] ? __kmem_cache_free+0x14d/0x280 [ 2453.409310] mlx5_rescan_drivers_locked.part.0+0xcd/0x2b0 [mlx5_core] [ 2453.409973] mlx5_unregister_device+0x40/0x50 [mlx5_core] [ 2453.410561] mlx5_uninit_one+0x3d/0x110 [mlx5_core] [ 2453.411111] remove_one+0x89/0x130 [mlx5_core] [ 2453.411628] pci_device_remove+0x59/0xf0 [ 2453.412026] device_release_driver_internal+0x243/0x2d0 [ 2453.412511] ? parse_option_str+0x14/0x90 [ 2453.412915] driver_detach+0x7b/0xf0 [ 2453.413289] bus_remove_driver+0xb5/0x160 [ 2453.413685] pci_unregister_driver+0x3f/0xf0 [ 2453.414104] mlx5_cleanup+0xc/0x20 [mlx5_core] Fixes: 2be5bd42 ("net/mlx5: Handle pairing of E-switch via uplink un/load APIs") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Amir Tzin authored
Moving to switchdev mode with ntuple offload on causes the kernel to crash since fs->arfs is freed during nic profile cleanup flow. Ntuple offload is not supported in switchdev mode and it is already unset by mlx5 fix feature ndo in switchdev mode. Verify fs->arfs is valid before disabling it. trace: [] RIP: 0010:_raw_spin_lock_bh+0x17/0x30 [] arfs_del_rules+0x44/0x1a0 [mlx5_core] [] mlx5e_arfs_disable+0xe/0x20 [mlx5_core] [] mlx5e_handle_feature+0x3d/0xb0 [mlx5_core] [] ? __rtnl_unlock+0x25/0x50 [] mlx5e_set_features+0xfe/0x160 [mlx5_core] [] __netdev_update_features+0x278/0xa50 [] ? netdev_run_todo+0x5e/0x2a0 [] netdev_update_features+0x22/0x70 [] ? _cond_resched+0x15/0x30 [] mlx5e_attach_netdev+0x12a/0x1e0 [mlx5_core] [] mlx5e_netdev_attach_profile+0xa1/0xc0 [mlx5_core] [] mlx5e_netdev_change_profile+0x77/0xe0 [mlx5_core] [] mlx5e_vport_rep_load+0x1ed/0x290 [mlx5_core] [] mlx5_esw_offloads_rep_load+0x88/0xd0 [mlx5_core] [] esw_offloads_load_rep.part.38+0x31/0x50 [mlx5_core] [] esw_offloads_enable+0x6c5/0x710 [mlx5_core] [] mlx5_eswitch_enable_locked+0x1bb/0x290 [mlx5_core] [] mlx5_devlink_eswitch_mode_set+0x14f/0x320 [mlx5_core] [] devlink_nl_cmd_eswitch_set_doit+0x94/0x120 [] genl_family_rcv_msg_doit.isra.17+0x113/0x150 [] genl_family_rcv_msg+0xb7/0x170 [] ? devlink_nl_cmd_port_split_doit+0x100/0x100 [] genl_rcv_msg+0x47/0xa0 [] ? genl_family_rcv_msg+0x170/0x170 [] netlink_rcv_skb+0x4c/0x130 [] genl_rcv+0x24/0x40 [] netlink_unicast+0x19a/0x230 [] netlink_sendmsg+0x204/0x3d0 [] sock_sendmsg+0x50/0x60 Fixes: 90b22b9b ("net/mlx5e: Disable Rx ntuple offload for uplink representor") Signed-off-by: Amir Tzin <amirtz@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Chris Mi authored
The cited commit holds encap tbl lock unconditionally when setting up dests. But it may cause the following deadlock: PID: 1063722 TASK: ffffa062ca5d0000 CPU: 13 COMMAND: "handler8" #0 [ffffb14de05b7368] __schedule at ffffffffa1d5aa91 #1 [ffffb14de05b7410] schedule at ffffffffa1d5afdb #2 [ffffb14de05b7430] schedule_preempt_disabled at ffffffffa1d5b528 #3 [ffffb14de05b7440] __mutex_lock at ffffffffa1d5d6cb #4 [ffffb14de05b74e8] mutex_lock_nested at ffffffffa1d5ddeb #5 [ffffb14de05b74f8] mlx5e_tc_tun_encap_dests_set at ffffffffc12f2096 [mlx5_core] #6 [ffffb14de05b7568] post_process_attr at ffffffffc12d9fc5 [mlx5_core] #7 [ffffb14de05b75a0] mlx5e_tc_add_fdb_flow at ffffffffc12de877 [mlx5_core] #8 [ffffb14de05b75f0] __mlx5e_add_fdb_flow at ffffffffc12e0eef [mlx5_core] #9 [ffffb14de05b7660] mlx5e_tc_add_flow at ffffffffc12e12f7 [mlx5_core] #10 [ffffb14de05b76b8] mlx5e_configure_flower at ffffffffc12e1686 [mlx5_core] #11 [ffffb14de05b7720] mlx5e_rep_indr_offload at ffffffffc12e3817 [mlx5_core] #12 [ffffb14de05b7730] mlx5e_rep_indr_setup_tc_cb at ffffffffc12e388a [mlx5_core] #13 [ffffb14de05b7740] tc_setup_cb_add at ffffffffa1ab2ba8 #14 [ffffb14de05b77a0] fl_hw_replace_filter at ffffffffc0bdec2f [cls_flower] #15 [ffffb14de05b7868] fl_change at ffffffffc0be6caa [cls_flower] #16 [ffffb14de05b7908] tc_new_tfilter at ffffffffa1ab71f0 [1031218.028143] wait_for_completion+0x24/0x30 [1031218.028589] mlx5e_update_route_decap_flows+0x9a/0x1e0 [mlx5_core] [1031218.029256] mlx5e_tc_fib_event_work+0x1ad/0x300 [mlx5_core] [1031218.029885] process_one_work+0x24e/0x510 Actually no need to hold encap tbl lock if there is no encap action. Fix it by checking if encap action exists or not before holding encap tbl lock. Fixes: 37c3b9fa ("net/mlx5e: Prevent encap offload when neigh update is running") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
Currently, whenever a user is setting migratable port fn attr, the driver is always turn migratable capability on. Fix it by honor the user input Fixes: e5b9642a ("net/mlx5: E-Switch, Implement devlink port function cmds to control migratable") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Yuanjun Gong authored
mlx5e_ipsec_remove_trailer() should return an error code if function pskb_trim() returns an unexpected value. Fixes: 2ac9cfe7 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path") Signed-off-by: Yuanjun Gong <ruc_gongyuanjun@163.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Zhengchao Shao authored
The memory pointed to by the priv->rx_res pointer is not freed in the error path of mlx5e_init_rep_rx, which can lead to a memory leak. Fix by freeing the memory in the error path, thereby making the error path identical to mlx5e_cleanup_rep_rx(). Fixes: af8bbf73 ("net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Zhengchao Shao authored
when mlx5_cmd_exec failed in mlx5dr_cmd_create_reformat_ctx, the memory pointed by 'in' is not released, which will cause memory leak. Move memory release after mlx5_cmd_exec. Fixes: 1d918647 ("net/mlx5: DR, Add direct rule command utilities") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Zhengchao Shao authored
In function macsec_fs_tx_create_crypto_table_groups(), when the ft->g memory is successfully allocated but the 'in' memory fails to be allocated, the memory pointed to by ft->g is released once. And in function macsec_fs_tx_create(), macsec_fs_tx_destroy() is called to release the memory pointed to by ft->g again. This will cause double free problem. Fixes: e467b283 ("net/mlx5e: Add MACsec TX steering rules") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Jakub Kicinski authored
Arkadiusz Kubalewski says: ==================== tools: ynl-gen: fix parse multi-attr enum attribute Fix the issues with parsing enums in ynl.py script. ==================== Link: https://lore.kernel.org/r/20230725101642.267248-1-arkadiusz.kubalewski@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arkadiusz Kubalewski authored
When attribute is enum type and marked as multi-attr, the netlink respond is not parsed, fails with stack trace: Traceback (most recent call last): File "/net-next/tools/net/ynl/./test.py", line 520, in <module> main() File "/net-next/tools/net/ynl/./test.py", line 488, in main dplls=dplls_get(282574471561216) File "/net-next/tools/net/ynl/./test.py", line 48, in dplls_get reply=act(args) File "/net-next/tools/net/ynl/./test.py", line 41, in act reply = ynl.dump(args.dump, attrs) File "/net-next/tools/net/ynl/lib/ynl.py", line 598, in dump return self._op(method, vals, dump=True) File "/net-next/tools/net/ynl/lib/ynl.py", line 584, in _op rsp_msg = self._decode(gm.raw_attrs, op.attr_set.name) File "/net-next/tools/net/ynl/lib/ynl.py", line 451, in _decode self._decode_enum(rsp, attr_spec) File "/net-next/tools/net/ynl/lib/ynl.py", line 408, in _decode_enum value = enum.entries_by_val[raw].name TypeError: unhashable type: 'list' error: 1 Redesign _decode_enum(..) to take a enum int value and translate it to either a bitmask or enum name as expected. Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20230725101642.267248-3-arkadiusz.kubalewski@intel.comReviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arkadiusz Kubalewski authored
Remove wrong index adjustment, which is leftover from adding support for sparse enums. enum.entries_by_val() function shall not subtract the start-value, as it is indexed with real enum value. Fixes: c311aaa7 ("tools: ynl: fix enum-as-flags in the generic CLI") Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20230725101642.267248-2-arkadiusz.kubalewski@intel.comReviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Muhammad Husaini Zulkifli authored
The Xeon validation group has been carrying out some loaded tests with various HW configurations, and they have seen some transmit queue time out happening during the test. This will cause the reset adapter function to be called by igc_tx_timeout(). Similar race conditions may arise when the interface is being brought down and up in igc_reinit_locked(), an interrupt being generated, and igc_clean_tx_irq() being called to complete the TX. When the igc_tx_timeout() function is invoked, this patch will turn off all TX ring HW queues during igc_down() process. TX ring HW queues will be activated again during the igc_configure_tx_ring() process when performing the igc_up() procedure later. This patch also moved existing igc_disable_tx_ring_hw() to avoid using forward declaration. Kernel trace: [ 7678.747813] ------------[ cut here ]------------ [ 7678.757914] NETDEV WATCHDOG: enp1s0 (igc): transmit queue 2 timed out [ 7678.770117] WARNING: CPU: 0 PID: 13 at net/sched/sch_generic.c:525 dev_watchdog+0x1ae/0x1f0 [ 7678.784459] Modules linked in: xt_conntrack nft_chain_nat xt_MASQUERADE xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay dm_mod emrcha(PO) emriio(PO) rktpm(PO) cegbuf_mod(PO) patch_update(PO) se(PO) sgx_tgts(PO) mktme(PO) keylocker(PO) svtdx(PO) svfs_pci_hotplug(PO) vtd_mod(PO) davemem(PO) svmabort(PO) svindexio(PO) usbx2(PO) ehci_sched(PO) svheartbeat(PO) ioapic(PO) sv8259(PO) svintr(PO) lt(PO) pcierootport(PO) enginefw_mod(PO) ata(PO) smbus(PO) spiflash_cdf(PO) arden(PO) dsa_iax(PO) oobmsm_punit(PO) cpm(PO) svkdb(PO) ebg_pch(PO) pch(PO) sviotargets(PO) svbdf(PO) svmem(PO) svbios(PO) dram(PO) svtsc(PO) targets(PO) superio(PO) svkernel(PO) cswitch(PO) mcf(PO) pentiumIII_mod(PO) fs_svfs(PO) mdevdefdb(PO) svfs_os_services(O) ixgbe mdio mdio_devres libphy emeraldrapids_svdefs(PO) regsupport(O) libnvdimm nls_cp437 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core snd_pcm snd_timer isst_if_mbox_pci [ 7678.784496] input_leds isst_if_mmio sg snd isst_if_common soundcore wmi button sad9(O) drm fuse backlight configfs efivarfs ip_tables x_tables vmd sdhci led_class rtl8150 r8152 hid_generic pegasus mmc_block usbhid mmc_core hid megaraid_sas ixgb igb i2c_algo_bit ice i40e hpsa scsi_transport_sas e1000e e1000 e100 ax88179_178a usbnet xhci_pci sd_mod xhci_hcd t10_pi crc32c_intel crc64_rocksoft igc crc64 crc_t10dif usbcore crct10dif_generic ptp crct10dif_common usb_common pps_core [ 7679.200403] RIP: 0010:dev_watchdog+0x1ae/0x1f0 [ 7679.210201] Code: 28 e9 53 ff ff ff 4c 89 e7 c6 05 06 42 b9 00 01 e8 17 d1 fb ff 44 89 e9 4c 89 e6 48 c7 c7 40 ad fb 81 48 89 c2 e8 52 62 82 ff <0f> 0b e9 72 ff ff ff 65 8b 05 80 7d 7c 7e 89 c0 48 0f a3 05 0a c1 [ 7679.245438] RSP: 0018:ffa00000001f7d90 EFLAGS: 00010282 [ 7679.256021] RAX: 0000000000000000 RBX: ff11000109938440 RCX: 0000000000000000 [ 7679.268710] RDX: ff11000361e26cd8 RSI: ff11000361e1b880 RDI: ff11000361e1b880 [ 7679.281314] RBP: ffa00000001f7da8 R08: ff1100035f8fffe8 R09: 0000000000027ffb [ 7679.293840] R10: 0000000000001f0a R11: ff1100035f840000 R12: ff11000109938000 [ 7679.306276] R13: 0000000000000002 R14: dead000000000122 R15: ffa00000001f7e18 [ 7679.318648] FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 [ 7679.332064] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7679.342757] CR2: 00007ffff7fca168 CR3: 000000013b08a006 CR4: 0000000000471ef8 [ 7679.354984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 7679.367207] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 7679.379370] PKRU: 55555554 [ 7679.386446] Call Trace: [ 7679.393152] <TASK> [ 7679.399363] ? __pfx_dev_watchdog+0x10/0x10 [ 7679.407870] call_timer_fn+0x31/0x110 [ 7679.415698] expire_timers+0xb2/0x120 [ 7679.423403] run_timer_softirq+0x179/0x1e0 [ 7679.431532] ? __schedule+0x2b1/0x820 [ 7679.439078] __do_softirq+0xd1/0x295 [ 7679.446426] ? __pfx_smpboot_thread_fn+0x10/0x10 [ 7679.454867] run_ksoftirqd+0x22/0x30 [ 7679.462058] smpboot_thread_fn+0xb7/0x160 [ 7679.469670] kthread+0xcd/0xf0 [ 7679.476097] ? __pfx_kthread+0x10/0x10 [ 7679.483211] ret_from_fork+0x29/0x50 [ 7679.490047] </TASK> [ 7679.495204] ---[ end trace 0000000000000000 ]--- [ 7679.503179] igc 0000:01:00.0 enp1s0: Register Dump [ 7679.511230] igc 0000:01:00.0 enp1s0: Register Name Value [ 7679.519892] igc 0000:01:00.0 enp1s0: CTRL 181c0641 [ 7679.528782] igc 0000:01:00.0 enp1s0: STATUS 40280683 [ 7679.537551] igc 0000:01:00.0 enp1s0: CTRL_EXT 10000040 [ 7679.546284] igc 0000:01:00.0 enp1s0: MDIC 180a3800 [ 7679.554942] igc 0000:01:00.0 enp1s0: ICR 00000081 [ 7679.563503] igc 0000:01:00.0 enp1s0: RCTL 04408022 [ 7679.571963] igc 0000:01:00.0 enp1s0: RDLEN[0-3] 00001000 00001000 00001000 00001000 [ 7679.583075] igc 0000:01:00.0 enp1s0: RDH[0-3] 00000068 000000b6 0000000f 00000031 [ 7679.594162] igc 0000:01:00.0 enp1s0: RDT[0-3] 00000066 000000b2 0000000e 00000030 [ 7679.605174] igc 0000:01:00.0 enp1s0: RXDCTL[0-3] 02040808 02040808 02040808 02040808 [ 7679.616196] igc 0000:01:00.0 enp1s0: RDBAL[0-3] 1bb7c000 1bb7f000 1bb82000 0ef33000 [ 7679.627242] igc 0000:01:00.0 enp1s0: RDBAH[0-3] 00000001 00000001 00000001 00000001 [ 7679.638256] igc 0000:01:00.0 enp1s0: TCTL a503f0fa [ 7679.646607] igc 0000:01:00.0 enp1s0: TDBAL[0-3] 2ba4a000 1bb6f000 1bb74000 1bb79000 [ 7679.657609] igc 0000:01:00.0 enp1s0: TDBAH[0-3] 00000001 00000001 00000001 00000001 [ 7679.668551] igc 0000:01:00.0 enp1s0: TDLEN[0-3] 00001000 00001000 00001000 00001000 [ 7679.679470] igc 0000:01:00.0 enp1s0: TDH[0-3] 000000a7 0000002d 000000bf 000000d9 [ 7679.690406] igc 0000:01:00.0 enp1s0: TDT[0-3] 000000a7 0000002d 000000bf 000000d9 [ 7679.701264] igc 0000:01:00.0 enp1s0: TXDCTL[0-3] 02100108 02100108 02100108 02100108 [ 7679.712123] igc 0000:01:00.0 enp1s0: Reset adapter [ 7683.085967] igc 0000:01:00.0 enp1s0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX [ 8086.945561] ------------[ cut here ]------------ Entering kdb (current=0xffffffff8220b200, pid 0) on processor 0 Oops: (null) due to oops @ 0xffffffff81573888 RIP: 0010:dql_completed+0x148/0x160 Code: c9 00 48 89 57 58 e9 46 ff ff ff 45 85 e4 41 0f 95 c4 41 39 db 0f 95 c1 41 84 cc 74 05 45 85 ed 78 0a 44 89 c1 e9 27 ff ff ff <0f> 0b 01 f6 44 89 c1 29 f1 0f 48 ca eb 8c cc cc cc cc cc cc cc cc RSP: 0018:ffa0000000003e00 EFLAGS: 00010287 RAX: 000000000000006c RBX: ffa0000003eb0f78 RCX: ff11000109938000 RDX: 0000000000000003 RSI: 0000000000000160 RDI: ff110001002e9480 RBP: ffa0000000003ed8 R08: ff110001002e93c0 R09: ffa0000000003d28 R10: 0000000000007cc0 R11: 0000000000007c54 R12: 00000000ffffffd9 R13: ff1100037039cb00 R14: 00000000ffffffd9 R15: ff1100037039c048 FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffff7fca168 CR3: 000000013b08a003 CR4: 0000000000471ef8 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? igc_poll+0x1a9/0x14d0 [igc] __napi_poll+0x2e/0x1b0 net_rx_action+0x126/0x250 __do_softirq+0xd1/0x295 irq_exit_rcu+0xc5/0xf0 common_interrupt+0x86/0xa0 </IRQ> <TASK> asm_common_interrupt+0x27/0x40 RIP: 0010:cpuidle_enter_state+0xd3/0x3e0 Code: 73 f1 ff ff 49 89 c6 8b 05 e2 ca a7 00 85 c0 0f 8f b3 02 00 00 31 ff e8 1b de 75 ff 80 7d d7 00 0f 85 cd 01 00 00 fb 45 85 ff <0f> 88 fd 00 00 00 49 63 cf 4c 2b 75 c8 48 8d 04 49 48 89 ca 48 8d RSP: 0018:ffffffff82203df0 EFLAGS: 00000202 RAX: ff11000361e2a200 RBX: 0000000000000002 RCX: 000000000000001f RDX: 0000000000000000 RSI: 000000003cf3cf3d RDI: 0000000000000000 RBP: ffffffff82203e28 R08: 0000075ae38471c8 R09: 0000000000000018 R10: 000000000000031a R11: ffffffff8238dca0 R12: ffd1ffffff200000 R13: ffffffff8238dca0 R14: 0000075ae38471c8 R15: 0000000000000002 cpuidle_enter+0x2e/0x50 call_cpuidle+0x23/0x40 do_idle+0x1be/0x220 cpu_startup_entry+0x20/0x30 rest_init+0xb5/0xc0 arch_call_rest_init+0xe/0x30 start_kernel+0x448/0x760 x86_64_start_kernel+0x109/0x150 secondary_startup_64_no_verify+0xe0/0xeb </TASK> more> [0]kdb> [0]kdb> [0]kdb> go Catastrophic error detected kdb_continue_catastrophic=0, type go a second time if you really want to continue [0]kdb> go Catastrophic error detected kdb_continue_catastrophic=0, attempting to continue [ 8086.955689] refcount_t: underflow; use-after-free. [ 8086.955697] WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0xc2/0x110 [ 8086.955706] Modules linked in: xt_conntrack nft_chain_nat xt_MASQUERADE xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay dm_mod emrcha(PO) emriio(PO) rktpm(PO) cegbuf_mod(PO) patch_update(PO) se(PO) sgx_tgts(PO) mktme(PO) keylocker(PO) svtdx(PO) svfs_pci_hotplug(PO) vtd_mod(PO) davemem(PO) svmabort(PO) svindexio(PO) usbx2(PO) ehci_sched(PO) svheartbeat(PO) ioapic(PO) sv8259(PO) svintr(PO) lt(PO) pcierootport(PO) enginefw_mod(PO) ata(PO) smbus(PO) spiflash_cdf(PO) arden(PO) dsa_iax(PO) oobmsm_punit(PO) cpm(PO) svkdb(PO) ebg_pch(PO) pch(PO) sviotargets(PO) svbdf(PO) svmem(PO) svbios(PO) dram(PO) svtsc(PO) targets(PO) superio(PO) svkernel(PO) cswitch(PO) mcf(PO) pentiumIII_mod(PO) fs_svfs(PO) mdevdefdb(PO) svfs_os_services(O) ixgbe mdio mdio_devres libphy emeraldrapids_svdefs(PO) regsupport(O) libnvdimm nls_cp437 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core snd_pcm snd_timer isst_if_mbox_pci [ 8086.955751] input_leds isst_if_mmio sg snd isst_if_common soundcore wmi button sad9(O) drm fuse backlight configfs efivarfs ip_tables x_tables vmd sdhci led_class rtl8150 r8152 hid_generic pegasus mmc_block usbhid mmc_core hid megaraid_sas ixgb igb i2c_algo_bit ice i40e hpsa scsi_transport_sas e1000e e1000 e100 ax88179_178a usbnet xhci_pci sd_mod xhci_hcd t10_pi crc32c_intel crc64_rocksoft igc crc64 crc_t10dif usbcore crct10dif_generic ptp crct10dif_common usb_common pps_core [ 8086.955784] RIP: 0010:refcount_warn_saturate+0xc2/0x110 [ 8086.955788] Code: 01 e8 82 e7 b4 ff 0f 0b 5d c3 cc cc cc cc 80 3d 68 c6 eb 00 00 75 81 48 c7 c7 a0 87 f6 81 c6 05 58 c6 eb 00 01 e8 5e e7 b4 ff <0f> 0b 5d c3 cc cc cc cc 80 3d 42 c6 eb 00 00 0f 85 59 ff ff ff 48 [ 8086.955790] RSP: 0018:ffa0000000003da0 EFLAGS: 00010286 [ 8086.955793] RAX: 0000000000000000 RBX: ff1100011da40ee0 RCX: ff11000361e1b888 [ 8086.955794] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ff11000361e1b880 [ 8086.955795] RBP: ffa0000000003da0 R08: 80000000ffff9f45 R09: ffa0000000003d28 [ 8086.955796] R10: ff1100035f840000 R11: 0000000000000028 R12: ff11000319ff8000 [ 8086.955797] R13: ff1100011bb79d60 R14: 00000000ffffffd6 R15: ff1100037039cb00 [ 8086.955798] FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 [ 8086.955800] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8086.955801] CR2: 00007ffff7fca168 CR3: 000000013b08a003 CR4: 0000000000471ef8 [ 8086.955803] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8086.955803] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 8086.955804] PKRU: 55555554 [ 8086.955805] Call Trace: [ 8086.955806] <IRQ> [ 8086.955808] tcp_wfree+0x112/0x130 [ 8086.955814] skb_release_head_state+0x24/0xa0 [ 8086.955818] napi_consume_skb+0x9c/0x160 [ 8086.955821] igc_poll+0x5d8/0x14d0 [igc] [ 8086.955835] __napi_poll+0x2e/0x1b0 [ 8086.955839] net_rx_action+0x126/0x250 [ 8086.955843] __do_softirq+0xd1/0x295 [ 8086.955846] irq_exit_rcu+0xc5/0xf0 [ 8086.955851] common_interrupt+0x86/0xa0 [ 8086.955857] </IRQ> [ 8086.955857] <TASK> [ 8086.955858] asm_common_interrupt+0x27/0x40 [ 8086.955862] RIP: 0010:cpuidle_enter_state+0xd3/0x3e0 [ 8086.955866] Code: 73 f1 ff ff 49 89 c6 8b 05 e2 ca a7 00 85 c0 0f 8f b3 02 00 00 31 ff e8 1b de 75 ff 80 7d d7 00 0f 85 cd 01 00 00 fb 45 85 ff <0f> 88 fd 00 00 00 49 63 cf 4c 2b 75 c8 48 8d 04 49 48 89 ca 48 8d [ 8086.955867] RSP: 0018:ffffffff82203df0 EFLAGS: 00000202 [ 8086.955869] RAX: ff11000361e2a200 RBX: 0000000000000002 RCX: 000000000000001f [ 8086.955870] RDX: 0000000000000000 RSI: 000000003cf3cf3d RDI: 0000000000000000 [ 8086.955871] RBP: ffffffff82203e28 R08: 0000075ae38471c8 R09: 0000000000000018 [ 8086.955872] R10: 000000000000031a R11: ffffffff8238dca0 R12: ffd1ffffff200000 [ 8086.955873] R13: ffffffff8238dca0 R14: 0000075ae38471c8 R15: 0000000000000002 [ 8086.955875] cpuidle_enter+0x2e/0x50 [ 8086.955880] call_cpuidle+0x23/0x40 [ 8086.955884] do_idle+0x1be/0x220 [ 8086.955887] cpu_startup_entry+0x20/0x30 [ 8086.955889] rest_init+0xb5/0xc0 [ 8086.955892] arch_call_rest_init+0xe/0x30 [ 8086.955895] start_kernel+0x448/0x760 [ 8086.955898] x86_64_start_kernel+0x109/0x150 [ 8086.955900] secondary_startup_64_no_verify+0xe0/0xeb [ 8086.955904] </TASK> [ 8086.955904] ---[ end trace 0000000000000000 ]--- [ 8086.955912] ------------[ cut here ]------------ [ 8086.955913] kernel BUG at lib/dynamic_queue_limits.c:27! [ 8086.955918] invalid opcode: 0000 [#1] SMP [ 8086.955922] RIP: 0010:dql_completed+0x148/0x160 [ 8086.955925] Code: c9 00 48 89 57 58 e9 46 ff ff ff 45 85 e4 41 0f 95 c4 41 39 db 0f 95 c1 41 84 cc 74 05 45 85 ed 78 0a 44 89 c1 e9 27 ff ff ff <0f> 0b 01 f6 44 89 c1 29 f1 0f 48 ca eb 8c cc cc cc cc cc cc cc cc [ 8086.955927] RSP: 0018:ffa0000000003e00 EFLAGS: 00010287 [ 8086.955928] RAX: 000000000000006c RBX: ffa0000003eb0f78 RCX: ff11000109938000 [ 8086.955929] RDX: 0000000000000003 RSI: 0000000000000160 RDI: ff110001002e9480 [ 8086.955930] RBP: ffa0000000003ed8 R08: ff110001002e93c0 R09: ffa0000000003d28 [ 8086.955931] R10: 0000000000007cc0 R11: 0000000000007c54 R12: 00000000ffffffd9 [ 8086.955932] R13: ff1100037039cb00 R14: 00000000ffffffd9 R15: ff1100037039c048 [ 8086.955933] FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 [ 8086.955934] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8086.955935] CR2: 00007ffff7fca168 CR3: 000000013b08a003 CR4: 0000000000471ef8 [ 8086.955936] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8086.955937] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 8086.955938] PKRU: 55555554 [ 8086.955939] Call Trace: [ 8086.955939] <IRQ> [ 8086.955940] ? igc_poll+0x1a9/0x14d0 [igc] [ 8086.955949] __napi_poll+0x2e/0x1b0 [ 8086.955952] net_rx_action+0x126/0x250 [ 8086.955956] __do_softirq+0xd1/0x295 [ 8086.955958] irq_exit_rcu+0xc5/0xf0 [ 8086.955961] common_interrupt+0x86/0xa0 [ 8086.955964] </IRQ> [ 8086.955965] <TASK> [ 8086.955965] asm_common_interrupt+0x27/0x40 [ 8086.955968] RIP: 0010:cpuidle_enter_state+0xd3/0x3e0 [ 8086.955971] Code: 73 f1 ff ff 49 89 c6 8b 05 e2 ca a7 00 85 c0 0f 8f b3 02 00 00 31 ff e8 1b de 75 ff 80 7d d7 00 0f 85 cd 01 00 00 fb 45 85 ff <0f> 88 fd 00 00 00 49 63 cf 4c 2b 75 c8 48 8d 04 49 48 89 ca 48 8d [ 8086.955972] RSP: 0018:ffffffff82203df0 EFLAGS: 00000202 [ 8086.955973] RAX: ff11000361e2a200 RBX: 0000000000000002 RCX: 000000000000001f [ 8086.955974] RDX: 0000000000000000 RSI: 000000003cf3cf3d RDI: 0000000000000000 [ 8086.955974] RBP: ffffffff82203e28 R08: 0000075ae38471c8 R09: 0000000000000018 [ 8086.955975] R10: 000000000000031a R11: ffffffff8238dca0 R12: ffd1ffffff200000 [ 8086.955976] R13: ffffffff8238dca0 R14: 0000075ae38471c8 R15: 0000000000000002 [ 8086.955978] cpuidle_enter+0x2e/0x50 [ 8086.955981] call_cpuidle+0x23/0x40 [ 8086.955984] do_idle+0x1be/0x220 [ 8086.955985] cpu_startup_entry+0x20/0x30 [ 8086.955987] rest_init+0xb5/0xc0 [ 8086.955990] arch_call_rest_init+0xe/0x30 [ 8086.955992] start_kernel+0x448/0x760 [ 8086.955994] x86_64_start_kernel+0x109/0x150 [ 8086.955996] secondary_startup_64_no_verify+0xe0/0xeb [ 8086.955998] </TASK> [ 8086.955999] Modules linked in: xt_conntrack nft_chain_nat xt_MASQUERADE xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay dm_mod emrcha(PO) emriio(PO) rktpm(PO) cegbuf_mod(PO) patch_update(PO) se(PO) sgx_tgts(PO) mktme(PO) keylocker(PO) svtdx(PO) svfs_pci_hotplug(PO) vtd_mod(PO) davemem(PO) svmabort(PO) svindexio(PO) usbx2(PO) ehci_sched(PO) svheartbeat(PO) ioapic(PO) sv8259(PO) svintr(PO) lt(PO) pcierootport(PO) enginefw_mod(PO) ata(PO) smbus(PO) spiflash_cdf(PO) arden(PO) dsa_iax(PO) oobmsm_punit(PO) cpm(PO) svkdb(PO) ebg_pch(PO) pch(PO) sviotargets(PO) svbdf(PO) svmem(PO) svbios(PO) dram(PO) svtsc(PO) targets(PO) superio(PO) svkernel(PO) cswitch(PO) mcf(PO) pentiumIII_mod(PO) fs_svfs(PO) mdevdefdb(PO) svfs_os_services(O) ixgbe mdio mdio_devres libphy emeraldrapids_svdefs(PO) regsupport(O) libnvdimm nls_cp437 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core snd_pcm snd_timer isst_if_mbox_pci [ 8086.956029] input_leds isst_if_mmio sg snd isst_if_common soundcore wmi button sad9(O) drm fuse backlight configfs efivarfs ip_tables x_tables vmd sdhci led_class rtl8150 r8152 hid_generic pegasus mmc_block usbhid mmc_core hid megaraid_sas ixgb igb i2c_algo_bit ice i40e hpsa scsi_transport_sas e1000e e1000 e100 ax88179_178a usbnet xhci_pci sd_mod xhci_hcd t10_pi crc32c_intel crc64_rocksoft igc crc64 crc_t10dif usbcore crct10dif_generic ptp crct10dif_common usb_common pps_core [16762.543675] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.593 msecs [16762.543678] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.595 msecs [16762.543673] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.495 msecs [16762.543679] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.599 msecs [16762.543678] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.598 msecs [16762.543690] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.605 msecs [16762.543684] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.599 msecs [16762.543693] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.613 msecs [16762.543784] ---[ end trace 0000000000000000 ]--- [16762.849099] RIP: 0010:dql_completed+0x148/0x160 PANIC: Fatal exception in interrupt Fixes: 9b275176 ("igc: Add ndo_tx_timeout support") Tested-by: Alejandra Victoria Alcaraz <alejandra.victoria.alcaraz@intel.com> Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christian Marangi authored
The qca8k switch doesn't support using 0 as VID and require a default VID to be always set. MDB add/del function doesn't currently handle this and are currently setting the default VID. Fix this by correctly handling this corner case and internally use the default VID for VID 0 case. Fixes: ba8f870d ("net: dsa: qca8k: add support for mdb_add/del") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christian Marangi authored
On deleting an MDB entry for a port, fdb_search_and_del is used. An FDB entry can't be modified so it needs to be deleted and readded again with the new portmap (and the port deleted as requested) We use the SEARCH operator to search the entry to edit by vid and mac address and then we check the aging if we actually found an entry. Currently the code suffer from a bug where the searched fdb entry is never read again with the found values (if found) resulting in the code always returning -EINVAL as aging was always 0. Fix this by correctly read the fdb entry after it was searched. Fixes: ba8f870d ("net: dsa: qca8k: add support for mdb_add/del") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christian Marangi authored
On inserting a mdb entry, fdb_search_and_insert is used to add a port to the qca8k target entry in the FDB db. A FDB entry can't be modified so it needs to be removed and insert again with the new values. To detect if an entry already exist, the SEARCH operation is used and we check the aging of the entry. If the entry is not 0, the entry exist and we proceed to delete it. Current code have 2 main problem: - The condition to check if the FDB entry exist is wrong and should be the opposite. - When a FDB entry doesn't exist, aging was never actually set to the STATIC value resulting in allocating an invalid entry. Fix both problem by adding aging support to the function, calling the function with STATIC as aging by default and finally by correct the condition to check if the entry actually exist. Fixes: ba8f870d ("net: dsa: qca8k: add support for mdb_add/del") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christian Marangi authored
The qca8xxx switch supports 2 way to write reg values, a slow way using mdio and a fast way by sending specially crafted mgmt packet to read/write reg. The fast way can support up to 32 bytes of data as eth packet are used to send/receive. This correctly works for almost the entire regmap of the switch but with the use of some kernel selftests for dsa drivers it was found a funny and interesting hw defect/limitation. For some specific reg, bulk write won't work and will result in writing only part of the requested regs resulting in half data written. This was especially hard to track and discover due to the total strangeness of the problem and also by the specific regs where this occurs. This occurs in the specific regs of the ATU table, where multiple entry needs to be written to compose the entire entry. It was discovered that with a bulk write of 12 bytes on QCA8K_REG_ATU_DATA0 only QCA8K_REG_ATU_DATA0 and QCA8K_REG_ATU_DATA2 were written, but QCA8K_REG_ATU_DATA1 was always zero. Tcpdump was used to make sure the specially crafted packet was correct and this was confirmed. The problem was hard to track as the lack of QCA8K_REG_ATU_DATA1 resulted in an entry somehow possible as the first bytes of the mac address are set in QCA8K_REG_ATU_DATA0 and the entry type is set in QCA8K_REG_ATU_DATA2. Funlly enough writing QCA8K_REG_ATU_DATA1 results in the same problem with QCA8K_REG_ATU_DATA2 empty and QCA8K_REG_ATU_DATA1 and QCA8K_REG_ATU_FUNC correctly written. A speculation on the problem might be that there are some kind of indirection internally when accessing these regs and they can't be accessed all together, due to the fact that it's really a table mapped somewhere in the switch SRAM. Even more funny is the fact that every other reg was tested with all kind of combination and they are not affected by this problem. Read operation was also tested and always worked so it's not affected by this problem. The problem is not present if we limit writing a single reg at times. To handle this hardware defect, enable use_single_write so that bulk api can correctly split the write in multiple different operation effectively reverting to a non-bulk write. Cc: Mark Brown <broonie@kernel.org> Fixes: c766e077 ("net: dsa: qca8k: convert to regmap read/write API") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Last year, the code that manages GSI channel transactions switched from using spinlock-protected linked lists to using indexes into the ring buffer used for a channel. Recently, Google reported seeing transaction reference count underflows occasionally during shutdown. Doug Anderson found a way to reproduce the issue reliably, and bisected the issue to the commit that eliminated the linked lists and the lock. The root cause was ultimately determined to be related to unused transactions being committed as part of the modem shutdown cleanup activity. Unused transactions are not normally expected (except in error cases). The modem uses some ranges of IPA-resident memory, and whenever it shuts down we zero those ranges. In ipa_filter_reset_table() a transaction is allocated to zero modem filter table entries. If hashing is not supported, hashed table memory should not be zeroed. But currently nothing prevents that, and the result is an unused transaction. Something similar occurs when we zero routing table entries for the modem. By preventing any attempt to clear hashed tables when hashing is not supported, the reference count underflow is avoided in this case. Note that there likely remains an issue with properly freeing unused transactions (if they occur due to errors). This patch addresses only the underflows that Google originally reported. Cc: <stable@vger.kernel.org> # 6.1.x Fixes: d338ae28 ("net: ipa: kill all other transaction lists") Tested-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20230724224055.1688854-1-elder@linaro.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Kuniyuki Iwashima says: ==================== net: Fix error/warning by -fstrict-flex-arrays=3. df8fc4e9 ("kbuild: Enable -fstrict-flex-arrays=3") started applying strict rules for standard string functions (strlen(), memcpy(), etc.) if CONFIG_FORTIFY_SOURCE=y. This series fixes two false positives caught by syzkaller. v2: https://lore.kernel.org/netdev/20230720004410.87588-1-kuniyu@amazon.com/ v1: https://lore.kernel.org/netdev/20230719185322.44255-1-kuniyu@amazon.com/ ==================== Link: https://lore.kernel.org/r/20230724213425.22920-1-kuniyu@amazon.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
syzkaller found a warning in packet_getname() [0], where we try to copy 16 bytes to sockaddr_ll.sll_addr[8]. Some devices (ip6gre, vti6, ip6tnl) have 16 bytes address expressed by struct in6_addr. Also, Infiniband has 32 bytes as MAX_ADDR_LEN. The write seems to overflow, but actually not since we use struct sockaddr_storage defined in __sys_getsockname() and its size is 128 (_K_SS_MAXSIZE) bytes. Thus, we have sufficient room after sll_addr[] as __data[]. To avoid the warning, let's add a flex array member union-ed with sll_addr. Another option would be to use strncpy() and limit the copied length to sizeof(sll_addr), but it will return the partial address and break an application that passes sockaddr_storage to getsockname(). [0]: memcpy: detected field-spanning write (size 16) of single field "sll->sll_addr" at net/packet/af_packet.c:3604 (size 8) WARNING: CPU: 0 PID: 255 at net/packet/af_packet.c:3604 packet_getname+0x25c/0x3a0 net/packet/af_packet.c:3604 Modules linked in: CPU: 0 PID: 255 Comm: syz-executor750 Not tainted 6.5.0-rc1-00330-g60cc1f7d #4 Hardware name: linux,dummy-virt (DT) pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : packet_getname+0x25c/0x3a0 net/packet/af_packet.c:3604 lr : packet_getname+0x25c/0x3a0 net/packet/af_packet.c:3604 sp : ffff800089887bc0 x29: ffff800089887bc0 x28: ffff000010f80f80 x27: 0000000000000003 x26: dfff800000000000 x25: ffff700011310f80 x24: ffff800087d55000 x23: dfff800000000000 x22: ffff800089887c2c x21: 0000000000000010 x20: ffff00000de08310 x19: ffff800089887c20 x18: ffff800086ab1630 x17: 20646c6569662065 x16: 6c676e697320666f x15: 0000000000000001 x14: 1fffe0000d56d7ca x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 3e60944c3da92b00 x8 : 3e60944c3da92b00 x7 : 0000000000000001 x6 : 0000000000000001 x5 : ffff8000898874f8 x4 : ffff800086ac99e0 x3 : ffff8000803f8808 x2 : 0000000000000001 x1 : 0000000100000000 x0 : 0000000000000000 Call trace: packet_getname+0x25c/0x3a0 net/packet/af_packet.c:3604 __sys_getsockname+0x168/0x24c net/socket.c:2042 __do_sys_getsockname net/socket.c:2057 [inline] __se_sys_getsockname net/socket.c:2054 [inline] __arm64_sys_getsockname+0x7c/0x94 net/socket.c:2054 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52 el0_svc_common+0x134/0x240 arch/arm64/kernel/syscall.c:139 do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:188 el0_svc+0x2c/0x7c arch/arm64/kernel/entry-common.c:647 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:665 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Fixes: df8fc4e9 ("kbuild: Enable -fstrict-flex-arrays=3") Reported-by: syzkaller <syzkaller@googlegroups.com> Suggested-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230724213425.22920-3-kuniyu@amazon.comReviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
syzkaller found a bug in unix_bind_bsd() [0]. We can reproduce it by bind()ing a socket on a path with length 108. 108 is the size of sun_addr of struct sockaddr_un and is the maximum valid length for the pathname socket. When calling bind(), we use struct sockaddr_storage as the actual buffer size, so terminating sun_addr[108] with null is legitimate as done in unix_mkname_bsd(). However, strlen(sunaddr) for such a case causes fortify_panic() if CONFIG_FORTIFY_SOURCE=y. __fortify_strlen() has no idea about the actual buffer size and see the string as unterminated. Let's use strnlen() to allow sun_addr to be unterminated at 107. [0]: detected buffer overflow in __fortify_strlen kernel BUG at lib/string_helpers.c:1031! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 255 Comm: syz-executor296 Not tainted 6.5.0-rc1-00330-g60cc1f7d #4 Hardware name: linux,dummy-virt (DT) pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : fortify_panic+0x1c/0x20 lib/string_helpers.c:1030 lr : fortify_panic+0x1c/0x20 lib/string_helpers.c:1030 sp : ffff800089817af0 x29: ffff800089817af0 x28: ffff800089817b40 x27: 1ffff00011302f68 x26: 000000000000006e x25: 0000000000000012 x24: ffff800087e60140 x23: dfff800000000000 x22: ffff800089817c20 x21: ffff800089817c8e x20: 000000000000006c x19: ffff00000c323900 x18: ffff800086ab1630 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000001 x14: 1ffff00011302eb8 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 64a26b65474d2a00 x8 : 64a26b65474d2a00 x7 : 0000000000000001 x6 : 0000000000000001 x5 : ffff800089817438 x4 : ffff800086ac99e0 x3 : ffff800080f19e8c x2 : 0000000000000001 x1 : 0000000100000000 x0 : 000000000000002c Call trace: fortify_panic+0x1c/0x20 lib/string_helpers.c:1030 _Z16__fortify_strlenPKcU25pass_dynamic_object_size1 include/linux/fortify-string.h:217 [inline] unix_bind_bsd net/unix/af_unix.c:1212 [inline] unix_bind+0xba8/0xc58 net/unix/af_unix.c:1326 __sys_bind+0x1ac/0x248 net/socket.c:1792 __do_sys_bind net/socket.c:1803 [inline] __se_sys_bind net/socket.c:1801 [inline] __arm64_sys_bind+0x7c/0x94 net/socket.c:1801 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52 el0_svc_common+0x134/0x240 arch/arm64/kernel/syscall.c:139 do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:188 el0_svc+0x2c/0x7c arch/arm64/kernel/entry-common.c:647 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:665 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Code: aa0003e1 d0000e80 91030000 97ffc91a (d4210000) Fixes: df8fc4e9 ("kbuild: Enable -fstrict-flex-arrays=3") Reported-by: syzkaller <syzkaller@googlegroups.com> Suggested-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230724213425.22920-2-kuniyu@amazon.comReviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Lin Ma authored
The previous commit 954d1fa1 ("macvlan: Add netlink attribute for broadcast cutoff") added one additional attribute named IFLA_MACVLAN_BC_CUTOFF to allow broadcast cutfoff. However, it forgot to describe the nla_policy at macvlan_policy (drivers/net/macvlan.c). Hence, this suppose NLA_S32 (4 bytes) integer can be faked as empty (0 bytes) by a malicious user, which could leads to OOB in heap just like CVE-2023-3773. To fix it, this commit just completes the nla_policy description for IFLA_MACVLAN_BC_CUTOFF. This enforces the length check and avoids the potential OOB read. Fixes: 954d1fa1 ("macvlan: Add netlink attribute for broadcast cutoff") Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230723080205.3715164-1-linma@zju.edu.cnSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 25 Jul, 2023 8 commits
-
-
Vincent Whitchurch authored
commit a3a57bf0 ("net: stmmac: work around sporadic tx issue on link-up") worked around a problem with TX sometimes not working after a link-up by avoiding a redundant write to MAC_CTRL_REG (aka GMAC_CONFIG), since the IP appeared to have problems with handling multiple writes to that register in some cases. That commit however only added the work around to dwmac_lib.c (apart from the common code in stmmac_main.c), but my systems with version 4.21a of the IP exhibit the same problem, so add the work around to dwmac4_lib.c too. Fixes: a3a57bf0 ("net: stmmac: work around sporadic tx issue on link-up") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230721-stmmac-tx-workaround-v1-1-9411cbd5ee07@axis.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Suman Ghosh authored
As of today, hash extraction support is enabled for all the silicons. Because of which we are facing initialization issues when the silicon does not support hash extraction. During creation of the hardware parsing table for IPv6 address, we need to consider if hash extraction is enabled then extract only 32 bit, otherwise 128 bit needs to be extracted. This patch fixes the issue and configures the hardware parser based on the availability of the feature. Fixes: a95ab935 ("octeontx2-af: Use hashed field in MCAM key") Signed-off-by: Suman Ghosh <sumang@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230721061222.2632521-1-sumang@marvell.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Hangbin Liu says: ==================== Fix up dev flags when add P2P down link When adding p2p interfaces to bond/team. The POINTOPOINT, NOARP flags are not inherit to up devices. Which will trigger IPv6 DAD. Since there is no ethernet MAC address for P2P devices. This will cause unexpected DAD failures. ==================== Link: https://lore.kernel.org/r/20230721040356.3591174-1-liuhangbin@gmail.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Hangbin Liu authored
When adding a point to point downlink to team device, we neglected to reset the team's flags, which were still using flags like BROADCAST and MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink interfaces, such as when adding a GRE device to team device. Fix this by remove multicast/broadcast flags and add p2p and noarp flags. After removing the none ethernet interface and adding an ethernet interface to team, we need to reset team interface flags. Unlike bonding interface, team do not need restore IFF_MASTER, IFF_SLAVE flags. Reported-by: Liang Li <liali@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438 Fixes: 1d76efe1 ("team: add support for non-ethernet devices") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Hangbin Liu authored
When adding a point to point downlink to the bond, we neglected to reset the bond's flags, which were still using flags like BROADCAST and MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink interfaces, such as when adding a GRE device to the bonding. To address this issue, let's reset the bond's flags for P2P interfaces. Before fix: 7: gre0@NONE: <POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UNKNOWN group default qlen 1000 link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr 167f:18:f188:: 8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/gre6 2006:70:10::1 brd 2006:70:10::2 inet6 fe80::200:ff:fe00:0/64 scope link valid_lft forever preferred_lft forever After fix: 7: gre0@NONE: <POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond2 state UNKNOWN group default qlen 1000 link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr c29e:557a:e9d9:: 8: bond0: <POINTOPOINT,NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/gre6 2006:70:10::1 peer 2006:70:10::2 inet6 fe80::1/64 scope link valid_lft forever preferred_lft forever Reported-by: Liang Li <liali@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438 Fixes: 872254dd ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queueJakub Kicinski authored
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-07-21 (i40e, iavf) This series contains updates to i40e and iavf drivers. Wang Ming corrects an error check on i40e. Jake unlocks crit_lock on allocation failure to prevent deadlock and stops re-enabling of interrupts when it's not intended for iavf. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: check for removal state before IAVF_FLAG_PF_COMMS_FAILED iavf: fix potential deadlock on allocation failure i40e: Fix an NULL vs IS_ERR() bug for debugfs_create_dir() ==================== Link: https://lore.kernel.org/r/20230721155812.1292752-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Merge tag 'linux-can-fixes-for-6.5-20230724' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2023-07-24 The first patch is by me and adds a missing set of CAN state to CAN_STATE_STOPPED on close in the gs_usb driver. The last patch is by Eric Dumazet and fixes a lockdep issue in the CAN raw protocol. * tag 'linux-can-fixes-for-6.5-20230724' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: raw: fix lockdep issue in raw_release() can: gs_usb: gs_can_close(): add missing set of CAN state to CAN_STATE_STOPPED ==================== Link: https://lore.kernel.org/r/20230724150141.766047-1-mkl@pengutronix.deSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jedrzej Jagielski authored
Fix ethtool FDIR logic to not use memory after its release. In the ice_ethtool_fdir.c file there are 2 spots where code can refer to pointers which may be missing. In the ice_cfg_fdir_xtrct_seq() function seg may be freed but even then may be still used by memcpy(&tun_seg[1], seg, sizeof(*seg)). In the ice_add_fdir_ethtool() function struct ice_fdir_fltr *input may first fail to be added via ice_fdir_update_list_entry() but then may be deleted by ice_fdir_update_list_entry. Terminate in both cases when the returned value of the previous operation is other than 0, free memory and don't use it anymore. Reported-by: Michal Schmidt <mschmidt@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2208423 Fixes: cac2a27c ("ice: Support IPv4 Flow Director filters") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230721155854.1292805-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 24 Jul, 2023 6 commits
-
-
Stewart Smith authored
For both IPv4 and IPv6 incoming TCP connections are tracked in a hash table with a hash over the source & destination addresses and ports. However, the IPv6 hash is insufficient and can lead to a high rate of collisions. The IPv6 hash used an XOR to fit everything into the 96 bits for the fast jenkins hash, meaning it is possible for an external entity to ensure the hash collides, thus falling back to a linear search in the bucket, which is slow. We take the approach of hash the full length of IPv6 address in __ipv6_addr_jhash() so that all users can benefit from a more secure version. While this may look like it adds overhead, the reality of modern CPUs means that this is unmeasurable in real world scenarios. In simulating with llvm-mca, the increase in cycles for the hashing code was ~16 cycles on Skylake (from a base of ~155), and an extra ~9 on Nehalem (base of ~173). In commit dd6d2910 ("netfilter: conntrack: switch to siphash") netfilter switched from a jenkins hash to a siphash, but even the faster hsiphash is a more significant overhead (~20-30%) in some preliminary testing. So, in this patch, we keep to the more conservative approach to ensure we don't add much overhead per SYN. In testing, this results in a consistently even spread across the connection buckets. In both testing and real-world scenarios, we have not found any measurable performance impact. Fixes: 08dcdbf6 ("ipv6: use a stronger hash for tcp") Signed-off-by: Stewart Smith <trawets@amazon.com> Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230721222410.17914-1-kuniyu@amazon.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Wei Fang authored
According to the implementation of XDP of FEC driver, the XDP path shares the transmit queues with the kernel network stack, so it is possible to lead to a tx timeout event when XDP uses the tx queue pretty much exclusively. And this event will cause the reset of the FEC hardware. To avoid timeout in this case, we use the txq_trans_cond_update() interface to update txq->trans_start to jiffies so that watchdog won't generate a transmit timeout warning. Fixes: 6d6b39f1 ("net: fec: add initial XDP support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://lore.kernel.org/r/20230721083559.2857312-1-wei.fang@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Maciej Żenczykowski authored
currently on 6.4 net/main: # ip link add dummy1 type dummy # echo 1 > /proc/sys/net/ipv6/conf/dummy1/use_tempaddr # ip link set dummy1 up # ip -6 addr add 2000::1/64 mngtmpaddr dev dummy1 # ip -6 addr show dev dummy1 11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 inet6 2000::44f3:581c:8ca:3983/64 scope global temporary dynamic valid_lft 604800sec preferred_lft 86172sec inet6 2000::1/64 scope global mngtmpaddr valid_lft forever preferred_lft forever inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link valid_lft forever preferred_lft forever # ip -6 addr del 2000::44f3:581c:8ca:3983/64 dev dummy1 (can wait a few seconds if you want to, the above delete isn't [directly] the problem) # ip -6 addr show dev dummy1 11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 inet6 2000::1/64 scope global mngtmpaddr valid_lft forever preferred_lft forever inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link valid_lft forever preferred_lft forever # ip -6 addr del 2000::1/64 mngtmpaddr dev dummy1 # ip -6 addr show dev dummy1 11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 inet6 2000::81c9:56b7:f51a:b98f/64 scope global temporary dynamic valid_lft 604797sec preferred_lft 86169sec inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link valid_lft forever preferred_lft forever This patch prevents this new 'global temporary dynamic' address from being created by the deletion of the related (same subnet prefix) 'mngtmpaddr' (which is triggered by there already being no temporary addresses). Cc: Jiri Pirko <jiri@resnulli.us> Fixes: 53bd6749 ("ipv6 addrconf: introduce IFA_F_MANAGETEMPADDR to tell kernel to manage temporary addresses") Reported-by: Xiao Ma <xiaom@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230720160022.1887942-1-maze@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yuanjun Gong authored
in atl1e_tso_csum, it should check the return value of pskb_trim(), and return an error code if an unexpected value is returned by pskb_trim(). Fixes: a6a53252 ("atl1e: Atheros L1E Gigabit Ethernet driver") Signed-off-by: Yuanjun Gong <ruc_gongyuanjun@163.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230720144219.39285-1-ruc_gongyuanjun@163.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yuanjun Gong authored
in atl1_tso(), it should check the return value of pskb_trim(), and return an error code if an unexpected value is returned by pskb_trim(). Fixes: 401c0aab ("atl1: simplify tx packet descriptor") Signed-off-by: Yuanjun Gong <ruc_gongyuanjun@163.com> Link: https://lore.kernel.org/r/20230722142511.12448-1-ruc_gongyuanjun@163.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
David S. Miller authored
Jiri Benc says: ==================== vxlan: fix GRO with VXLAN-GPE The first patch generalizes code for the second patch, which is a fix for broken VXLAN-GPE GRO. Thanks to Paolo for noticing the bug. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-