Commits · fce68b03086fd00eb5a8ba4744f36f0d007d0f9d · Kirill Smelkov / linux

23 Aug, 2023 6 commits

mptcp: add scheduled in mptcp_subflow_context · fce68b03

Geliang Tang authored Aug 21, 2023

This patch adds a new member scheduled in struct mptcp_subflow_context,
which will be set in the MPTCP scheduler context when the scheduler
picks this subflow to send data.

Add a new helper mptcp_subflow_set_scheduled() to set this flag using
WRITE_ONCE().
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230821-upstream-net-next-20230818-v1-6-0c860fb256a8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: add sched in mptcp_sock · 1730b2b2

Geliang Tang authored Aug 21, 2023

This patch adds a new struct member sched in struct mptcp_sock, and
two helpers, mptcp_init_sched() and mptcp_release_sched(), to
initialize and release it.

Init it with the sysctl scheduler in mptcp_init_sock(), copy the
scheduler from the parent in mptcp_sk_clone(), and release it in
__mptcp_destroy_sock().
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230821-upstream-net-next-20230818-v1-5-0c860fb256a8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: add a new sysctl scheduler · e3b2870b

Geliang Tang authored Aug 21, 2023

This patch adds a new sysctl, named scheduler, to support selecting
among different schedulers. Export the mptcp_get_scheduler() helper to
read this sysctl.
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230821-upstream-net-next-20230818-v1-4-0c860fb256a8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: add struct mptcp_sched_ops · 740ebe35

Geliang Tang authored Aug 21, 2023

This patch defines struct mptcp_sched_ops, which has three struct members,
name, owner and list, and three function pointers: init(), release() and
get_subflow().

The scheduler function get_subflow() has a struct mptcp_sched_data
parameter, which contains a reinject flag indicating whether the call is
for a retransmission, a subflow count and an mptcp_subflow_context array.

Add the scheduler registering, unregistering and finding functions to add,
delete and find a packet scheduler on the global list mptcp_sched_list.
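
The register, unregister and find operations described above can be modeled as a small name-keyed registry. The following is a hedged Python sketch, not the kernel code: `SchedOps`, `SchedData`, `sched_list` and the three functions are illustrative stand-ins for struct mptcp_sched_ops, struct mptcp_sched_data and mptcp_sched_list, and rejecting a duplicate name is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SchedData:
    """Illustrative stand-in for struct mptcp_sched_data."""
    reinject: bool          # True when called for a retransmission
    contexts: List[dict]    # each dict stands in for a subflow context


@dataclass
class SchedOps:
    """Illustrative stand-in for struct mptcp_sched_ops."""
    name: str
    get_subflow: Callable[[SchedData], int]
    init: Optional[Callable[[], None]] = None
    release: Optional[Callable[[], None]] = None


# Stand-in for the global list mptcp_sched_list.
sched_list: List[SchedOps] = []


def find_scheduler(name: str) -> Optional[SchedOps]:
    """Return the registered scheduler with this name, or None."""
    for ops in sched_list:
        if ops.name == name:
            return ops
    return None


def register_scheduler(ops: SchedOps) -> bool:
    """Add a scheduler; assumption: a duplicate name is rejected."""
    if find_scheduler(ops.name) is not None:
        return False
    sched_list.append(ops)
    return True


def unregister_scheduler(ops: SchedOps) -> None:
    """Remove a scheduler from the list if it is present."""
    if ops in sched_list:
        sched_list.remove(ops)


def first_get_subflow(data: SchedData) -> int:
    """Hypothetical scheduler: always mark the first subflow as scheduled."""
    data.contexts[0]["scheduled"] = True
    return 0
```

A caller would register `SchedOps(name="first", get_subflow=first_get_subflow)` once and later look it up by name with `find_scheduler("first")`.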
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230821-upstream-net-next-20230818-v1-3-0c860fb256a8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: drop last_snd and MPTCP_RESET_SCHEDULER · ebc1e08f

Geliang Tang authored Aug 21, 2023

Since the burst check conditions have moved out of the function
mptcp_subflow_get_send(), all uses of msk->last_snd are now useless.
This patch drops them as well as the macro MPTCP_RESET_SCHEDULER.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230821-upstream-net-next-20230818-v1-2-0c860fb256a8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: refactor push_pending logic · c5b4297d

Geliang Tang authored Aug 21, 2023

To support redundant packet schedulers more easily, this patch refactors
__mptcp_push_pending() logic from:

For each dfrag:
	While sends succeed:
		Call the scheduler (selects subflow and msk->snd_burst)
		Update subflow locks (push/release/acquire as needed)
		Send the dfrag data with mptcp_sendmsg_frag()
		Update already_sent, snd_nxt, snd_burst
	Update msk->first_pending
Push/release on final subflow

->

While first_pending isn't empty:
	Call the scheduler (selects subflow and msk->snd_burst)
	Update subflow locks (push/release/acquire as needed)
	For each pending dfrag:
		While sends succeed:
			Send the dfrag data with mptcp_sendmsg_frag()
			Update already_sent, snd_nxt, snd_burst
		Update msk->first_pending
		Break if required by msk->snd_burst / etc
	Push/release on final subflow

It also refactors the __mptcp_subflow_push_pending() logic from:

For each dfrag:
	While sends succeed:
		Call the scheduler (selects subflow and msk->snd_burst)
		Send the dfrag data with mptcp_subflow_delegate(), break
		Send the dfrag data with mptcp_sendmsg_frag()
		Update dfrag->already_sent, msk->snd_nxt, msk->snd_burst
	Update msk->first_pending

->

While first_pending isn't empty:
	Call the scheduler (selects subflow and msk->snd_burst)
	Send the dfrag data with mptcp_subflow_delegate(), break
	Send the dfrag data with mptcp_sendmsg_frag()
	For each pending dfrag:
		While sends succeed:
			Send the dfrag data with mptcp_sendmsg_frag()
			Update already_sent, snd_nxt, snd_burst
		Update msk->first_pending
		Break if required by msk->snd_burst / etc

Move the duplicate code from __mptcp_push_pending() and
__mptcp_subflow_push_pending() into a new helper function, named
__subflow_push_pending(). Simplify __mptcp_push_pending() and
__mptcp_subflow_push_pending() by invoking this helper.

Also move the burst check conditions out of the function
mptcp_subflow_get_send(), check them in __subflow_push_pending() in
the inner "for each pending dfrag" loop.
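
The refactored loop shape (schedule once per round, then drain pending dfrags until the burst budget runs out) can be sketched as follows. This is a simplified Python model under stated assumptions, not the kernel implementation: `push_pending`, `pick_subflow`, `send_frag` and the `burst` byte budget are hypothetical names, locking is omitted, and `send_frag` is assumed never to accept more bytes than it is offered.

```python
from collections import deque


def push_pending(pending, pick_subflow, send_frag, burst):
    """Model of the refactored loop: call the scheduler, then drain dfrags.

    pending      -- deque of dicts {"data": bytes, "already_sent": int}
    pick_subflow -- callable returning a subflow id, or None if none is usable
    send_frag    -- callable(subflow, chunk) -> number of bytes accepted
    burst        -- positive byte budget per round (models msk->snd_burst)
    """
    if burst <= 0:
        raise ValueError("burst must be positive")
    log = []
    while pending:                              # while first_pending isn't empty
        subflow = pick_subflow()                # call the scheduler
        if subflow is None:
            break
        budget = burst
        stalled = False
        while pending and budget > 0 and not stalled:   # for each pending dfrag
            dfrag = pending[0]
            while dfrag["already_sent"] < len(dfrag["data"]) and budget > 0:
                start = dfrag["already_sent"]
                n = send_frag(subflow, dfrag["data"][start:start + budget])
                if n <= 0:                      # send failed: stop pushing
                    stalled = True
                    break
                dfrag["already_sent"] += n      # update already_sent
                budget -= n                     # update the burst budget
                log.append((subflow, n))
            if dfrag["already_sent"] == len(dfrag["data"]):
                pending.popleft()               # update first_pending
        if stalled:
            break
    return log
```

The point of the restructuring is visible in the model: the scheduler is consulted once per outer iteration rather than once per send, so the same inner loop can be reused for whichever subflow a scheduler selects.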
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230821-upstream-net-next-20230818-v1-1-0c860fb256a8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

22 Aug, 2023 12 commits

Merge tag 'mlx5-updates-2023-08-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 98173633

Jakub Kicinski authored Aug 22, 2023

Saeed Mahameed says:

====================
mlx5-updates-2023-08-16

1) aRFS ethtool stats

Improve aRFS observability by adding a new set of counters. Each Rx
ring has the set of counters listed below.
These counters are exposed through ethtool -S.

1.1) arfs_add: number of times a new rule has been created.
1.2) arfs_request_in: number of times a rule was requested to move from
   its current Rx ring to a new Rx ring (incremented on the destination
   Rx ring).
1.3) arfs_request_out: number of times a rule was requested to move out
   from its current Rx ring (incremented on source/current Rx ring).
1.4) arfs_expired: number of times a rule has been expired by the
   kernel and removed from HW.
1.5) arfs_err: number of times a rule creation or modification has
   failed.

2) Supporting inline WQE when possible in SW steering

3) Misc cleanups and fixups to net-next branch

* tag 'mlx5-updates-2023-08-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Devcom, only use devcom after NULL check in mlx5_devcom_send_event()
  net/mlx5: DR, Supporting inline WQE when possible
  net/mlx5: Rename devlink port ops struct for PFs/VFs
  net/mlx5: Remove VPORT_UPLINK handling from devlink_port.c
  net/mlx5: Call mlx5_esw_offloads_rep_load/unload() for uplink port directly
  net/mlx5: Update dead links in Kconfig documentation
  net/mlx5: Remove health syndrome enum duplication
  net/mlx5: DR, Remove unneeded local variable
  net/mlx5: DR, Fix code indentation
  net/mlx5: IRQ, consolidate irq and affinity mask allocation
  net/mlx5e: Fix spelling mistake "Faided" -> "Failed"
  net/mlx5e: aRFS, Introduce ethtool stats
  net/mlx5e: aRFS, Warn if aRFS table does not exist for aRFS rule
  net/mlx5e: aRFS, Prevent repeated kernel rule migrations requests
====================

Link: https://lore.kernel.org/r/20230821175739.81188-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vrf: Remove unnecessary RCU-bh critical section · 504fc6f4

Ido Schimmel authored Aug 21, 2023

dev_queue_xmit_nit() already uses rcu_read_lock() / rcu_read_unlock()
and nothing suggests that softIRQs should be disabled around it.
Therefore, remove the rcu_read_lock_bh() / rcu_read_unlock_bh()
surrounding it.

Tested using [1] with lockdep enabled.

[1]
 #!/bin/bash

 ip link add name vrf1 up type vrf table 100
 ip link add name veth0 type veth peer name veth1
 ip link set dev veth1 master vrf1
 ip link set dev veth0 up
 ip link set dev veth1 up
 ip address add 192.0.2.1/24 dev veth0
 ip address add 192.0.2.2/24 dev veth1
 ip rule add pref 32765 table local
 ip rule del pref 0
 tcpdump -i vrf1 -c 20 -w /dev/null &
 sleep 10
 ping -i 0.1 -c 10 -q 192.0.2.2
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230821142339.1889961-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

vxlan: vnifilter: Use GFP_KERNEL instead of GFP_ATOMIC · 63c11dc2

Ido Schimmel authored Aug 21, 2023

The function is not called from an atomic context so use GFP_KERNEL
instead of GFP_ATOMIC. The allocation of the per-CPU stats is already
performed with GFP_KERNEL.

Tested using test_vxlan_vnifiltering.sh with CONFIG_DEBUG_ATOMIC_SLEEP.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230821141923.1889776-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ethernet: ti: Remove unused declarations · a491add1

Yue Haibing authored Aug 21, 2023

Commit e8609e69 ("net: ethernet: ti: am65-cpsw: Convert to PHYLINK")
removed am65_cpsw_nuss_adjust_link() but not its declaration.
Commit 84640e27 ("net: netcp: Add Keystone NetCP core ethernet driver")
declared but never implemented netcp_device_find_module().
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Roger Quadros <rogerq@kernel.org>
Link: https://lore.kernel.org/r/20230821134029.40084-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: microchip: Remove unused declarations · dff96d7c

Yue Haibing authored Aug 21, 2023

Commit 264a9c5c ("net: sparx5: Remove unused GLAG handling in PGID")
removed sparx5_pgid_alloc_glag() but not its declaration.
Commit 27d293cc ("net: microchip: sparx5: Add support for rule count by cookie")
removed vcap_rule_iter() but not its declaration.
Commit 8beef08f ("net: microchip: sparx5: Adding initial VCAP API support")
declared but never implemented vcap_api_set_client().
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230821135556.43224-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ionic: Remove unused declarations · efa47e80

Yue Haibing authored Aug 21, 2023

Commit fbfb8031 ("ionic: Add hardware init and device commands")
declared but never implemented ionic_q_rewind()/ionic_set_dma_mask().
Commit 969f8439 ("ionic: sync the filters in the work task")
declared but never implemented ionic_rx_filters_need_sync().
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20230821134717.51936-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: mscc: ocelot: Remove unused declarations · 49e62a04

Yue Haibing authored Aug 21, 2023

Commit 6c30384e ("net: mscc: ocelot: register devlink ports")
declared but never implemented ocelot_devlink_init() and
ocelot_devlink_teardown().
Commit 20968054 ("net: mscc: ocelot: automatically detect VCAP constants")
declared but never implemented ocelot_detect_vcap_constants().
Commit 403ffc2c ("net: mscc: ocelot: add support for preemptible traffic classes")
declared but never implemented ocelot_port_update_preemptible_tcs().
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230821130218.19096-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: microchip: Remove unused declarations · 73582f09

Yue Haibing authored Aug 21, 2023

Commit 91a98917 ("net: dsa: microchip: move switch chip_id detection to ksz_common")
removed ksz8_switch_detect() but not its declaration.
Commit 6ec23aaa ("net: dsa: microchip: move ksz_dev_ops to ksz_common.c")
declared but never implemented other functions.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20230821125501.19624-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bonding: update port speed when getting bond speed · 691b2bf1

Hangbin Liu authored Aug 21, 2023

Andrew reported a bonding issue: if an active-backup bond is put on top
of an 802.3ad bond interface and the 802.3ad bond's speed/duplex changes
dynamically, the upper bonding interface's speed/duplex is not updated at
the same time, so it shows an incorrect speed.

Fix it by updating the port speed when calling ethtool.
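
The stale-speed problem and the fix can be illustrated with a toy model. This is a hedged Python sketch, not the bonding driver: `Port`, `update_speed` and `bond_speed_active_backup` are hypothetical names, and the model only covers the idea of refreshing a cached port speed at query time.

```python
class Port:
    """Toy model of a bond port that caches the speed of its lower device."""

    def __init__(self, read_speed):
        self._read_speed = read_speed      # callable querying the lower device
        self.cached_speed = read_speed()   # cached when the port is added

    def update_speed(self):
        """Refresh the cached speed from the lower device."""
        self.cached_speed = self._read_speed()


def bond_speed_active_backup(active_port, refresh=True):
    """Report the bond speed as the speed of the active port.

    With refresh=False this models the old behavior (cached value only);
    with refresh=True it models updating the port speed at query time.
    """
    if refresh:
        active_port.update_speed()
    return active_port.cached_speed
```

If the lower device (for example an 802.3ad bond) changes speed after the port was added, only the refreshing variant reports the new value.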
Reported-by: Andrew Schorr <ajschorr@alumni.princeton.edu>
Closes: https://lore.kernel.org/netdev/ZEt3hvyREPVdbesO@Laptop-X1/
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20230821101008.797482-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: remove unnecessary input parameter 'how' in ifdown function · 43c28172

Zhengchao Shao authored Aug 21, 2023

When the ifdown function in the dst_ops structure is called, the input
parameter 'how' is always true. Among the current implementations of the
ifdown interface, ip6_dst_ifdown does not use the input parameter 'how',
while xfrm6_dst_ifdown and xfrm4_dst_ifdown receive it as 'unregister'.
Since 'unregister' is always true, the check for a false 'unregister' in
xfrm6_dst_ifdown and xfrm4_dst_ifdown never succeeds, so remove the input
parameter 'how' from the ifdown function.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230821084104.3812233-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

alx: fix OOB-read compiler warning · 3a198c95

GONG, Ruiqi authored Aug 21, 2023

The following message shows up when compiling with W=1:

In function ‘fortify_memcpy_chk’,
    inlined from ‘alx_get_ethtool_stats’ at drivers/net/ethernet/atheros/alx/ethtool.c:297:2:
./include/linux/fortify-string.h:592:4: error: call to ‘__read_overflow2_field’
declared with attribute warning: detected read beyond size of field (2nd parameter);
maybe use struct_group()? [-Werror=attribute-warning]
  592 |    __read_overflow2_field(q_size_field, size);
      |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In order to copy all the alx stats at once, alx_get_ethtool_stats() reads
beyond hw->stats.rx_ok. Fix this warning by directly copying hw->stats,
and, while at it, refactor the unnecessarily complicated BUILD_BUG_ON.
Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230821013218.1614265-1-gongruiqi@huaweicloud.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: pcs: lynxi: implement pcs_disable op · 90308679

Daniel Golle authored Aug 18, 2023

When switching from 10GBase-R/5GBase-R/USXGMII to one of the interface
modes provided by mtk-pcs-lynxi we need to make sure to always perform
a full configuration of the PHYA.

Implement the pcs_disable op, which resets the stored interface mode to
PHY_INTERFACE_MODE_NA. This triggers a full reconfiguration when the
LynxI PCS driver is selected again after having been deselected in favor
of another PCS driver, such as the to-be-added driver for the USXGMII
PCS found in MT7988.
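
The effect of resetting the stored interface mode can be shown with a small state model. This is a hedged Python sketch, not the mtk-pcs-lynxi driver: `LynxiPcsModel` and its `full_configs` counter are hypothetical, and it assumes that a full PHYA configuration runs only when the requested mode differs from the cached one.

```python
PHY_INTERFACE_MODE_NA = "na"   # stand-in for the kernel constant


class LynxiPcsModel:
    """Toy model of a PCS that caches the last configured interface mode."""

    def __init__(self):
        self.interface = PHY_INTERFACE_MODE_NA
        self.full_configs = 0   # counts full PHYA configurations

    def config(self, interface):
        # Assumption: a full configuration is only performed when the
        # requested mode differs from the cached one.
        if interface != self.interface:
            self.full_configs += 1
            self.interface = interface

    def disable(self):
        # Forget the cached mode so the next config() is a full one.
        self.interface = PHY_INTERFACE_MODE_NA
```

Without `disable()`, switching to another PCS and back to the same mode would leave the cached mode unchanged, so the second `config()` would skip the full configuration.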
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://lore.kernel.org/r/f23d1a60d2c9d2fb72e32dcb0eaa5f7e867a3d68.1692327891.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

21 Aug, 2023 18 commits

Revert "pds_core: Fix some kernel-doc comments" · 7eb6deb3

Jakub Kicinski authored Aug 21, 2023

This reverts commit cb39c357.
The patch was applied too hastily; the problem is already fixed
in Alex's vfio tree:
https://lore.kernel.org/all/20230821112237.105872b5.alex.williamson@redhat.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Devcom, only use devcom after NULL check in mlx5_devcom_send_event() · 7d7c6e8c

Li Zetao authored Aug 14, 2023

There is a warning reported by kernel test robot:

smatch warnings:
drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c:264
	mlx5_devcom_send_event() warn: variable dereferenced before
		IS_ERR check devcom (see line 259)

The warning arises because the pointer is used before it is checked. Move
the assignment to comp after the devcom check to silence the warning.

Fixes: 88d162b4 ("net/mlx5: Devcom, Infrastructure changes")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202308041028.AkXYDwJ6-lkp@intel.com/
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: DR, Supporting inline WQE when possible · 95c337cc

Itamar Gozlan authored Aug 07, 2023

In a WQE (Work Queue Entry), the two types of data segment memory are
pointers and inline data, where inline data is passed directly as
part of the WQE.
For software steering, the maximal inline size should be less than
2*MLX5_SEND_WQE_BB, i.e., the potential data must fit with the required
inline WQE headers.

Two consecutive blocks (MLX5_SEND_WQE_BB) are not guaranteed to reside
on the same memory page. Hence, writes to MLX5_SEND_WQE_BB should be
done separately, i.e., each MLX5_SEND_WQE_BB should be obtained using
the mlx5_wq_cyc_get_wqe macro.
Signed-off-by: Itamar Gozlan <igozlan@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Rename devlink port ops struct for PFs/VFs · df3822f5

Jiri Pirko authored May 26, 2023

As this struct is only used for devlink ports created for PF/VF,
add it to the name of the variable to distinguish from the SF related
ops struct.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Remove VPORT_UPLINK handling from devlink_port.c · 52020903

Jiri Pirko authored May 25, 2023

It is not possible that the functions in devlink_port.c are called for
uplink port. Remove this leftover code.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Call mlx5_esw_offloads_rep_load/unload() for uplink port directly · ba3d85f0

Jiri Pirko authored May 24, 2023

For the uplink port, mlx5_esw_offloads_load/unload_rep() are currently
called. There are two checks inside, which effectively make the
functions simple wrappers of mlx5_esw_offloads_rep_load/unload()
for the uplink port. So avoid one check and indirection and call
mlx5_esw_offloads_rep_load/unload() for the uplink port directly.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Update dead links in Kconfig documentation · 6c8f7c43

Rahul Rameshbabu authored Jul 10, 2023

Point to NVIDIA documentation for device specific information now that the
Mellanox documentation site is deprecated. Refer to kernel documentation
sources for generic information not specific to mlx5 devices.
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Remove health syndrome enum duplication · ab943e2e

Gal Pressman authored Jun 26, 2023

Health syndrome enum values were duplicated in mlx5_ifc and health.c;
the correct place for them is mlx5_ifc.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: DR, Remove unneeded local variable · a15e472f

Yevgeny Kliteynik authored Aug 02, 2023

Remove local variable that is already defined outside of
the scope of this block.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: DR, Fix code indentation · f83e2d8a

Yevgeny Kliteynik authored Aug 02, 2023

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: IRQ, consolidate irq and affinity mask allocation · 9e9ff54e

Saeed Mahameed authored Jun 09, 2023

Consolidate the mlx5_irq and mlx5_irq->mask allocation, to simplify
error flows and to match the deallocation sequence @irq_release for
symmetry.
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>

net/mlx5e: Fix spelling mistake "Faided" -> "Failed" · d7cea02a

Colin Ian King authored Aug 04, 2023

There is a spelling mistake in a warning message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: aRFS, Introduce ethtool stats · f98e5158

Adham Faris authored Mar 08, 2023

Improve aRFS observability by adding a new set of counters. Each Rx
ring has the set of counters listed below.
These counters are exposed through ethtool -S.

1) arfs_add: number of times a new rule has been created.
2) arfs_request_in: number of times a rule was requested to move from
   its current Rx ring to a new Rx ring (incremented on the destination
   Rx ring).
3) arfs_request_out: number of times a rule was requested to move out
   from its current Rx ring (incremented on source/current Rx ring).
4) arfs_expired: number of times a rule has been expired by the
   kernel and removed from HW.
5) arfs_err: number of times a rule creation or modification has
   failed.

This patch removes rx[i]_xsk_arfs_err counter and its documentation in
mlx5/counters.rst since aRFS activity does not occur in XSK RQ's.
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>

net/mlx5e: aRFS, Warn if aRFS table does not exist for aRFS rule · 7653d806

Adham Faris authored May 08, 2023

aRFS tables should be allocated and exist in advance. The driver shouldn't
reach a point where it tries to add an aRFS rule to a table that does not
exist.

Add a warning if the driver encounters such a situation.
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: aRFS, Prevent repeated kernel rule migrations requests · 7a73cf0b

Adham Faris authored Mar 08, 2023

aRFS rule movement requests from one Rx ring to another Rx ring arrive
from the kernel to ensure that packets are steered to the right Rx ring.
In the time interval until satisfying such a request, several more
requests might follow, for the same flow.

This patch detects and prevents repeated aRFS rules movement requests.

In the mlx5e_rx_flow_steer() ndo, after finding the aRFS rule that the
kernel has requested to move, check whether a move is already in flight
by calling work_busy(&arfs_rule->arfs_work). IOW, if this request is
pending execution (in the work queue) or is executing now but hasn't
finished yet, return the current filter ID and don't issue new
transition work.
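
The duplicate-request check can be sketched as a small model. This is a hedged Python illustration, not the mlx5e code: `ArfsRule`, `rx_flow_steer`, the `rules` dict and the `queue_work` callback are hypothetical, the boolean `work_busy` attribute stands in for work_busy(&arfs_rule->arfs_work), and returning early when the rule already targets the requested ring is an assumption of the sketch.

```python
class ArfsRule:
    """Toy model of an aRFS rule with an associated work item."""

    def __init__(self, filter_id, rxq):
        self.filter_id = filter_id
        self.rxq = rxq
        self.work_busy = False   # True while the work is pending or running


def rx_flow_steer(rules, flow, rxq, queue_work):
    """Handle a request to steer `flow` to Rx ring `rxq`.

    rules      -- dict mapping a flow key to its ArfsRule
    queue_work -- callable(rule) that schedules the transition work
    Returns the filter ID for the flow.
    """
    rule = rules.get(flow)
    if rule is not None:
        # Nothing to do if the rule already targets this ring (assumption),
        # or if an earlier request is still pending or executing.
        if rule.rxq == rxq or rule.work_busy:
            return rule.filter_id
        rule.rxq = rxq
    else:
        rule = ArfsRule(filter_id=len(rules), rxq=rxq)
        rules[flow] = rule
    rule.work_busy = True        # the work item is now queued
    queue_work(rule)
    return rule.filter_id
```

In this model a second request for the same flow that arrives while the first is still busy returns the existing filter ID and queues nothing.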
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

pds_core: Fix some kernel-doc comments · cb39c357

Yang Li authored Aug 21, 2023

Fix some kernel-doc comments to silence the warnings:

drivers/net/ethernet/amd/pds_core/auxbus.c:18: warning: Function parameter or member 'pf' not described in 'pds_client_register'
drivers/net/ethernet/amd/pds_core/auxbus.c:18: warning: Excess function parameter 'pf_pdev' description in 'pds_client_register'
drivers/net/ethernet/amd/pds_core/auxbus.c:58: warning: Function parameter or member 'pf' not described in 'pds_client_unregister'
drivers/net/ethernet/amd/pds_core/auxbus.c:58: warning: Excess function parameter 'pf_pdev' description in 'pds_client_unregister'
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: annotate data-races around sk->sk_lingertime · bc1fb82a

Eric Dumazet authored Aug 19, 2023

sk_getsockopt() runs locklessly. This means sk->sk_lingertime
can be read while other threads are changing its value.

Other reads also happen without socket lock being held,
and must be annotated.

Remove preprocessor logic using BITS_PER_LONG; compilers
are smart enough to figure this out by themselves.

v2: fixed a clang W=1 (-Wtautological-constant-out-of-range-compare) warning
    (Jakub)

Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

IPv4: add extack info for IPv4 address add/delete · b4672c73

Hangbin Liu authored Aug 18, 2023

Add extack info for IPv4 address add/delete, which would be useful for
users to understand the problem without having to read kernel code.

No extack message for the ifa_local checking in __inet_insert_ifa() as
it has been checked in find_matching_ifa().
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Aug, 2023 4 commits

net: stmmac: Check more MAC HW features for XGMAC Core 3.20 · 669a5556

Furong Xu authored Aug 19, 2023

1. XGMAC Core does not have hash_filter definition, it uses
vlhash(VLAN Hash Filtering) instead, skip hash_filter when XGMAC.
2. Show exact size of Hash Table instead of raw register value.
3. Show full description of safety features defined by Synopsys Databook.
4. When safety feature is configured with no parity, or ECC only,
keep FSM Parity Checking disabled.
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ipv6-update-route-when-delete-saddr' · 43bc9bd6

David S. Miller authored Aug 20, 2023

Hangbin Liu says:

====================
ipv6: update route when delete source address

Currently, when an address is removed, the preferred source address is
not removed from IPv6 routes that are bound to another device. Fix this
issue and add related tests as Ido and David suggested.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: fib_test: add a test case for IPv6 source address delete · 429b55b4

Hangbin Liu authored Aug 18, 2023

Add a test case for IPv6 source address delete.

As David suggested, add tests:
- Single device using src address
- Two devices with the same source address
- VRF with single device using src address
- VRF with two devices using src address

As Ido points out, in IPv6, the preferred source address is looked up in
the same VRF as the first nexthop device. This will give us similar results
to IPv4 if the route is installed in the same VRF as the nexthop device, but
not when the nexthop device is enslaved to a different VRF. So add tests:
- src address and nexthop dev in same VRF
- src address and nexthop device in different VRF

The link local address delete logic is different from the global address.
It should only affect the device it is bound to. So add test cases
for link local address testing.

Here is the test result:

IPv6 delete address route tests
    Single device using src address
    TEST: Prefsrc removed when src address removed on other device      [ OK ]
    Two devices with the same source address
    TEST: Prefsrc not removed when src address exist on other device    [ OK ]
    TEST: Prefsrc removed when src address removed on all devices       [ OK ]
    VRF with single device using src address
    TEST: Prefsrc removed when src address removed on other device      [ OK ]
    VRF with two devices using src address
    TEST: Prefsrc not removed when src address exist on other device    [ OK ]
    TEST: Prefsrc removed when src address removed on all devices       [ OK ]
    src address and nexthop dev in same VRF
    TEST: Prefsrc removed from VRF when source address deleted          [ OK ]
    TEST: Prefsrc in default VRF not removed                            [ OK ]
    TEST: Prefsrc not removed from VRF when source address exist        [ OK ]
    TEST: Prefsrc in default VRF removed                                [ OK ]
    src address and nexthop device in different VRF
    TEST: Prefsrc not removed from VRF when nexthop dev in diff VRF     [ OK ]
    TEST: Prefsrc not removed in default VRF                            [ OK ]
    TEST: Prefsrc removed from VRF when nexthop dev in diff VRF         [ OK ]
    TEST: Prefsrc removed in default VRF                                [ OK ]
    Table ID 0
    TEST: Prefsrc removed from default VRF when source address deleted  [ OK ]
    Link local source route
    TEST: Prefsrc not removed when delete ll addr from other dev        [ OK ]
    TEST: Prefsrc removed when delete ll addr                           [ OK ]
    TEST: Prefsrc not removed when delete ll addr from other dev        [ OK ]
    TEST: Prefsrc removed even ll addr still exist on other dev         [ OK ]

Tests passed:  19
Tests failed:   0
Suggested-by: Ido Schimmel <idosch@idosch.org>
Suggested-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: do not match device when remove source route · b358f57f

Hangbin Liu authored Aug 18, 2023

After deleting an IPv6 address on an interface and cleaning up the
related preferred source entries, it is important to ensure that all
routes associated with the deleted address are properly cleared. The
current implementation of rt6_remove_prefsrc() only checks the preferred
source addresses bound to the current device. However, there may be
routes that are bound to other devices but still utilize the same
preferred source address.

To address this issue, it is necessary to also delete entries that are
bound to other interfaces but share the same source address with the
current device. Failure to delete these entries would leave stale
routes that still use the deleted address. Here is an example reproducer
(I have omitted unrelated routes):

+ ip link add dummy1 type dummy
+ ip link add dummy2 type dummy
+ ip link set dummy1 up
+ ip link set dummy2 up
+ ip addr add 1:2:3:4::5/64 dev dummy1
+ ip route add 7:7:7:0::1 dev dummy1 src 1:2:3:4::5
+ ip route add 7:7:7:0::2 dev dummy2 src 1:2:3:4::5
+ ip -6 route show
1:2:3:4::/64 dev dummy1 proto kernel metric 256 pref medium
7:7:7::1 dev dummy1 src 1:2:3:4::5 metric 1024 pref medium
7:7:7::2 dev dummy2 src 1:2:3:4::5 metric 1024 pref medium
+ ip addr del 1:2:3:4::5/64 dev dummy1
+ ip -6 route show
7:7:7::1 dev dummy1 metric 1024 pref medium
7:7:7::2 dev dummy2 src 1:2:3:4::5 metric 1024 pref medium

As Ido pointed out, in IPv6 the preferred source address is looked up in
the same VRF as the first nexthop device, which differs from IPv4.
So, while removing the device check, we also need to add an
ipv6_chk_addr() check to make sure the address does not exist on the other
devices of the rt nexthop device's VRF.

After fix:
+ ip addr del 1:2:3:4::5/64 dev dummy1
+ ip -6 route show
7:7:7::1 dev dummy1 metric 1024 pref medium
7:7:7::2 dev dummy2 metric 1024 pref medium
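
The rule described above (clear the preferred source on every matching route regardless of device, unless the address still exists in the VRF of the route's nexthop device) can be modeled as follows. This is a hedged Python sketch, not rt6_remove_prefsrc(): the function name `remove_prefsrc`, the dict-based route records and the `vrf_of` mapping are all hypothetical, and the `any(...)` expression only stands in for the ipv6_chk_addr() check.

```python
def remove_prefsrc(routes, remaining_addrs, deleted_addr, vrf_of):
    """Clear prefsrc on every route that used the deleted address.

    routes          -- list of dicts with keys "dst", "dev", "prefsrc"
    remaining_addrs -- set of (dev, addr) pairs still configured
    deleted_addr    -- the address that was just removed
    vrf_of          -- dict mapping a device name to its VRF name
    """
    for rt in routes:
        # No device match here: routes on other devices are checked too.
        if rt["prefsrc"] != deleted_addr:
            continue
        vrf = vrf_of[rt["dev"]]
        # Models the ipv6_chk_addr() check: is the address still present
        # on some device in the VRF of the route's nexthop device?
        still_present = any(
            addr == deleted_addr and vrf_of[dev] == vrf
            for dev, addr in remaining_addrs
        )
        if not still_present:
            rt["prefsrc"] = None
    return routes
```

Applied to the reproducer above, with no remaining copy of 1:2:3:4::5, both the dummy1 and the dummy2 route lose their preferred source, matching the "After fix" output.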
Reported-by: Thomas Haller <thaller@redhat.com>
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2170513
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
