Commits · 678eb448055a45ea9dc63377c9f0ad7f2dd90f98 · Kirill Smelkov / linux

07 Mar, 2024 20 commits

net/mlx5: SD, Implement basic query and instantiation · 678eb448

Tariq Toukan authored Feb 14, 2024

Add implementation for querying the MPIR register for Socket-Direct
attributes, and instantiating a SD struct accordingly.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

678eb448

net/mlx5: SD, Introduce SD lib · 75a54396

Tariq Toukan authored Feb 14, 2024

Add Socket-Direct API with empty/minimal implementation.
We fill-in the implementation gradually in downstream patches.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

75a54396

net/mlx5: Add MPIR bit in mcam_access_reg · a0873a5d

Tariq Toukan authored Feb 14, 2024

Add a cap bit in mcam_access_reg to check for MPIR support.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

a0873a5d

ethtool: remove ethtool_eee_use_linkmodes · d7933a2c

Heiner Kallweit authored Mar 05, 2024

After 292fac46 ("net: ethtool: eee: Remove legacy _u32 from keee")
this function has no user any longer.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/b4ff9b51-092b-4d44-bfce-c95342a05b51@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

d7933a2c

mlxbf_gige: add support to display pause frame counters · c2234161

David Thompson authored Mar 05, 2024

This patch updates the mlxbf_gige driver to support the
"get_pause_stats()" callback, which enables display of
pause frame counters via "ethtool -I -a oob_net0".

The pause frame counters are only enabled if the "counters_en"
bit is asserted in the LLU general config register. The driver
will only report stats, and thus overwrite the default stats
state of ETHTOOL_STAT_NOT_SET, if "counters_en" is asserted.
Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/20240305212137.3525-1-davthompson@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

c2234161

net: phy: qca807x: fix compilation when CONFIG_GPIOLIB is not set · 1677293e

Robert Marko authored Mar 05, 2024

Kernel bot has discovered that if CONFIG_GPIOLIB is not set compilation
will fail.

Upon investigation the issue is that qca807x_gpio() is guarded by a
preprocessor check but then it is called under
if (IS_ENABLED(CONFIG_GPIOLIB)) in the probe call so the compiler will
error out since qca807x_gpio() has not been declared if CONFIG_GPIOLIB has
not been set.

Fixes: d1cb613e ("net: phy: qcom: add support for QCA807x PHY Family")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202403031332.IGAbZzwq-lkp@intel.com/Signed-off-by: Robert Marko <robimarko@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://lore.kernel.org/r/20240305142113.795005-1-robimarko@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

1677293e

net: geneve: Remove generic .ndo_get_stats64 · 771d791d

Breno Leitao authored Mar 05, 2024

Commit 3e2f544d ("net: get stats64 if device if driver is
configured") moved the callback to dev_get_tstats64() to net core, so,
unless the driver is doing some custom stats collection, it does not
need to set .ndo_get_stats64.

Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it
doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64
function pointer.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240305172911.502058-2-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

771d791d

net: geneve: Leverage core stats allocator · f5f07d06

Breno Leitao authored Mar 05, 2024

With commit 34d21de9 ("net: Move {l,t,d}stats allocation to core and
convert veth & vrf"), stats allocation could be done on net core
instead of in this driver.

With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.

Remove the allocation in the geneve driver and leverage the network
core allocation instead.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240305172911.502058-1-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

f5f07d06

net: gtp: Move net_device assigned in setup · 81154bb8

Breno Leitao authored Mar 05, 2024

Assign netdev to gtp->dev at setup time, so, we can get rid of
gtp_dev_init() completely.
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://lore.kernel.org/r/20240305121524.2254533-3-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

81154bb8

net: gtp: Remove generic .ndo_get_stats64 · 13957a0b

Breno Leitao authored Mar 05, 2024

Commit 3e2f544d ("net: get stats64 if device if driver is
configured") moved the callback to dev_get_tstats64() to net core, so,
unless the driver is doing some custom stats collection, it does not
need to set .ndo_get_stats64.

Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it
doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64
function pointer.
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://lore.kernel.org/r/20240305121524.2254533-2-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

13957a0b

net: gtp: Leverage core stats allocator · 660e5aae

Breno Leitao authored Mar 05, 2024

With commit 34d21de9 ("net: Move {l,t,d}stats allocation to core and
convert veth & vrf"), stats allocation could be done on net core
instead of in this driver.

With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.

Remove the allocation in the gtp driver and leverage the network
core allocation instead.
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://lore.kernel.org/r/20240305121524.2254533-1-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

660e5aae

net: macsec: Leverage core stats allocator · 1d03d51e

Breno Leitao authored Mar 05, 2024

With commit 34d21de9 ("net: Move {l,t,d}stats allocation to core and
convert veth & vrf"), stats allocation could be done on net core
instead of in this driver.

With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.

Remove the allocation in the macsec driver and leverage the network
core allocation instead.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/20240305113728.1974944-1-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

1d03d51e

dt-bindings: net: renesas,etheravb: Add support for R-Car V4M · d6620629

Thanh Quan authored Mar 05, 2024

Document support for the Renesas Ethernet AVB (EtherAVB-IF) block in the
Renesas R-Car V4M (R8A779H0) SoC.
Signed-off-by: Thanh Quan <thanh.quan.xn@renesas.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/0212b57ba1005bb9b5a922f8f25cc67a7bc15f30.1709631152.git.geert+renesas@glider.beSigned-off-by: Jakub Kicinski <kuba@kernel.org>

d6620629

sr9800: Add check for usbnet_get_endpoints · 07161b24

Chen Ni authored Mar 05, 2024

Add check for usbnet_get_endpoints() and return the error if it fails
in order to transfer the error.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 19a38d8e ("USB2NET : SR9800 : One chip USB2.0 USB2NET SR9800 Device Driver Support")
Link: https://lore.kernel.org/r/20240305075927.261284-1-nichen@iscas.ac.cnSigned-off-by: Jakub Kicinski <kuba@kernel.org>

07161b24

selftests/harness: Fix TEST_F()'s vfork handling · 41cca054

Mickaël Salaün authored Mar 05, 2024

Always run fixture setup in the grandchild process, and by default also
run the teardown in the same process.  However, this change makes it
possible to run the teardown in a parent process when
_metadata->teardown_parent is set to true (e.g. in fixture setup).

Fix TEST_SIGNAL() by forwarding grandchild's signal to its parent.  Fix
seccomp tests by running the test setup in the parent of the test
thread, as expected by the related test code.  Fix Landlock tests by
waiting for the grandchild before processing _metadata.

Use of exit(3) in tests should be OK because the environment in which
the vfork(2) call happen is already dedicated to the running test (with
flushed stdio, setpgrp() call), see __run_test() and the call to fork(2)
just before running the setup/test/teardown.  Even if the test
configures its own exit handlers, they will not be run by the parent
because it never calls exit(3), and the test function either ends with a
call to _exit(2) or a signal.

Cc: Günther Noack <gnoack@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Fixes: 0710a1a7 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()")
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Reported-by: Mark Brown <broonie@kernel.org>
Tested-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240305201029.1331333-1-mic@digikod.netSigned-off-by: Jakub Kicinski <kuba@kernel.org>

41cca054

Merge branch 'mptcp-some-clean-up-patches' · a2f24c8a

Jakub Kicinski authored Mar 06, 2024

Matthieu Baerts says:

====================
mptcp: some clean-up patches

Here are some clean-up patches for MPTCP:

- Patch 1 drops duplicated header inclusions.

- Patch 2 updates PM 'set_flags' interface, to make it more similar to
  others.

- Patch 3 adds some error messages for the PM 'set_flags' command to
  help the userspace understanding what's wrong in case of error.

- Patch 4 simplifies __lookup_addr() function from pm_netlink.c.

Except for the 3rd patch, the behaviour is not supposed to be modified.
====================

Link: https://lore.kernel.org/r/20240305-upstream-net-next-20240304-mptcp-misc-cleanup-v1-0-c436ba5e569b@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

a2f24c8a

mptcp: drop lookup_by_id in lookup_addr · af250c27

Geliang Tang authored Mar 05, 2024

When the lookup_by_id parameter of __lookup_addr() is true, it's the same
as __lookup_addr_by_id(), it can be replaced by __lookup_addr_by_id()
directly. So drop this parameter, let __lookup_addr() only looks up address
on the local address list by comparing addresses in it, not address ids.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240305-upstream-net-next-20240304-mptcp-misc-cleanup-v1-4-c436ba5e569b@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

af250c27

mptcp: set error messages for set_flags · a4d68b16

Geliang Tang authored Mar 05, 2024

In addition to returning the error value, this patch also sets an error
messages with GENL_SET_ERR_MSG or NL_SET_ERR_MSG_ATTR both for pm_netlink.c
and pm_userspace.c. It will help the userspace to identify the issue.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240305-upstream-net-next-20240304-mptcp-misc-cleanup-v1-3-c436ba5e569b@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

a4d68b16

mptcp: update set_flags interfaces · 6a42477f

Geliang Tang authored Mar 05, 2024

This patch updates set_flags interfaces, make it more similar to the
interfaces of dump_addr and get_addr:

mptcp_pm_set_flags(struct sk_buff *skb, struct genl_info *info)
mptcp_pm_nl_set_flags(struct sk_buff *skb, struct genl_info *info)
mptcp_userspace_pm_set_flags(struct sk_buff *skb, struct genl_info *info)
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240305-upstream-net-next-20240304-mptcp-misc-cleanup-v1-2-c436ba5e569b@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

6a42477f

mptcp: drop duplicate header inclusions · d5dfbfa2

Geliang Tang authored Mar 05, 2024

The headers net/tcp.h, net/genetlink.h and uapi/linux/mptcp.h are included
in protocol.h already, no need to include them again directly. This patch
removes these duplicate header inclusions.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240305-upstream-net-next-20240304-mptcp-misc-cleanup-v1-1-c436ba5e569b@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

d5dfbfa2

06 Mar, 2024 20 commits

inet: Add getsockopt support for IP_ROUTER_ALERT and IPV6_ROUTER_ALERT · eeb78df4

Juntong Deng authored Mar 04, 2024

Currently getsockopt does not support IP_ROUTER_ALERT and
IPV6_ROUTER_ALERT, and we are unable to get the values of these two
socket options through getsockopt.

This patch adds getsockopt support for IP_ROUTER_ALERT and
IPV6_ROUTER_ALERT.
Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eeb78df4

Merge branch 'ynl-small-recv' · edf7468d

David S. Miller authored Mar 06, 2024

Jakub Kicinski says:

====================
tools: ynl: add --dbg-small-recv for easier kernel testing

When testing netlink dumps I usually hack some user space up
to constrain its user space buffer size (iproute2, ethtool or ynl).
Netlink will try to fill the messages up, so since these apps use
large buffers by default, the dumps are rarely fragmented.

I was hoping to figure out a way to create a selftest for dump
testing, but so far I have no idea how to do that in a useful
and generic way.

Until someone does that, make manual dump testing easier with YNL.
Create a special option for limiting the buffer size, so I don't
have to make the same edits each time, and maybe others will benefit,
too :)

Example:

  $ ./cli.py [...] --dbg-small-recv >/dev/null
  Recv: read 3712 bytes, 29 messages
     nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
    [...]
     nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
  Recv: read 3968 bytes, 31 messages
     nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
    [...]
     nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
  Recv: read 532 bytes, 5 messages
     nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
    [...]
     nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
     nl_len = 20 (4) nl_flags = 0x2 nl_type = 3

Now let's make the DONE not fit in the last message:

  $ ./cli.py [...] --dbg-small-recv 4499 >/dev/null
  Recv: read 3712 bytes, 29 messages
     nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
    [...]
     nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
  Recv: read 4480 bytes, 35 messages
     nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
    [...]
     nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
  Recv: read 20 bytes, 1 messages
     nl_len = 20 (4) nl_flags = 0x2 nl_type = 3

A real test would also have to check the messages are complete
and not duplicated. That part has to be done manually right now.

Note that the first message is always conservatively sized by the kernel.
Still, I think this is good enough to be useful.

v2:
 - patch 2:
   - move the recv_size setting up
   - change the default to 0 so that cli.py doesn't have to worry
     what the "unset" value is
v1: https://lore.kernel.org/all/20240301230542.116823-1-kuba@kernel.org/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

edf7468d

tools: ynl: add --dbg-small-recv for easier kernel testing · c0111878

Jakub Kicinski authored Mar 04, 2024

Most "production" netlink clients use large buffers to
make dump efficient, which means that handling of dump
continuation in the kernel is not very well tested.

Add an option for debugging / testing handling of dumps.
It enables printing of extra netlink-level debug and
lowers the recv() buffer size in one go. When used
without any argument (--dbg-small-recv) it picks
a very small default (4000), explicit size can be set,
too (--dbg-small-recv 5000).

Example:

$ ./cli.py [...] --dbg-small-recv
Recv: read 3712 bytes, 29 messages
   nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
 [...]
   nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
Recv: read 3968 bytes, 31 messages
   nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
 [...]
   nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
Recv: read 532 bytes, 5 messages
   nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
 [...]
   nl_len = 128 (112) nl_flags = 0x0 nl_type = 19
   nl_len = 20 (4) nl_flags = 0x2 nl_type = 3

(the [...] are edits to shorten the commit message).

Note that the first message of the dump is sized conservatively
by the kernel.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c0111878

tools: ynl: support debug printing messages · a6a41521

Jakub Kicinski authored Mar 04, 2024

For manual debug, allow printing the netlink level messages
to stderr.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a6a41521

tools: ynl: allow setting recv() size · 7c93a887

Jakub Kicinski authored Mar 04, 2024

Make the size of the buffer we use for recv() configurable.
The details of the buffer sizing in netlink are somewhat
arcane, we could spend a lot of time polishing this API.
Let's just leave some hopefully helpful comments for now.
This is a for-developers-only feature, anyway.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c93a887

tools: ynl: move the new line in NlMsg __repr__ · 7df7231d

Jakub Kicinski authored Mar 04, 2024

We add the new line even if message has no error or extack,
which leads to print(nl_msg) ending with two new lines.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

7df7231d

Merge branch 'tools-ynl-make-clean' · b206acf1

David S. Miller authored Mar 06, 2024

Jakub Kicinski says:

====================
tools: ynl: clean up make clean

First change renames the clean target which removes build results,
to a more common name. Second one add missing .PHONY targets.
Third one ensures that clean deletes __pycache__.

v2: add patch 2
v1: https://lore.kernel.org/all/20240301235609.147572-1-kuba@kernel.org/

====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b206acf1

tools: ynl: remove __pycache__ during clean · 72fa191b

Jakub Kicinski authored Mar 04, 2024

Build process uses python to generate the user space code.
Remove __pycache__ on make clean.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

72fa191b

tools: ynl: add distclean to .PHONY in all makefiles · 1d8617b2

Jakub Kicinski authored Mar 04, 2024

Donald points out most YNL makefiles are missing distclean
in .PHONY, even tho generated/Makefile does list it.
Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1d8617b2

tools: ynl: rename make hardclean -> distclean · 4e887471

Jakub Kicinski authored Mar 04, 2024

The make target to remove all generated files used to be called
"hardclean" because it deleted files which were tracked by git.
We no longer track generated user space files, so use the more
common "distclean" name.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4e887471

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · db72b6fc

David S. Miller authored Mar 06, 2024

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-03-04 (ice)

This series contains updates to ice driver only.

Jake changes the driver to use relative VSI index for VF VSIs as the VF
driver has no direct use of the VSI number on ice hardware. He also
reworks some Tx/Rx functions to clarify their uses, cleans up some style
issues, and utilizes kernel helper functions.

Maciej removes a redundant call to disable Tx queues on ifdown and
removes some unnecessary devm usages.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

db72b6fc

Merge branch 'ravb-cleanups' · 39a096d6

David S. Miller authored Mar 06, 2024

Niklas Söderlund says:

====================
ravb: Align Rx descriptor setup and maintenance

When RZ/G2L support was added the Rx code path was split in two, one to
support R-Car and one to support RZ/G2L. One reason for this is that
R-Car uses the extended Rx descriptor format, while RZ/G2L uses the
normal descriptor format.

In many aspects this is not needed as the extended descriptor format is
just a normal descriptor with extra metadata (timestamsp) appended. And
the R-Car SoCs can also use normal descriptors if hardware timestamps
were not desired. This split has led to RZ/G2L gaining support for
split descriptors in the Rx path while R-Car still lacks this.

This series is the first step in trying to merge the R-Car and RZ/G2L Rx
paths so features and bugs corrected in one will benefit the other.

The first patch in the series clarifies that the driver now supports
either normal or extended descriptors, not both at the same time by
grouping them in a union. This is the foundation that later patches will
build on the aligning the two Rx paths.

Patches 2-5 deals with correcting small issues in the Rx frame and
descriptor sizes that either were incorrect at the time they were added
in 2017 (my bad) or concepts built on-top of this initial incorrect
design.

While finally patch 6 merges the R-Car and RZ/G2L for Rx descriptor
setup and maintenance.

When this work has landed I plan to follow up with more work aligning
the rest of the Rx code paths and hopefully bring split descriptor
support to the R-Car SoCs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

39a096d6

ravb: Unify Rx ring maintenance code paths · 644d037b

Niklas Söderlund authored Mar 04, 2024

The R-Car and RZ/G2L Rx code paths were split in two separate
implementations when support for RZ/G2L was added due to the fact that
R-Car uses the extended descriptor format while RZ/G2L uses normal
descriptors. This has led to a duplication of Rx logic with the only
difference being the different Rx descriptors types used. The
implementation however neglects to take into account that extended
descriptors are normal descriptors with additional metadata at the end
to carry hardware timestamp information.

The hardware timestamp information is only consumed in the R-Car Rx
loop and all the maintenance code around the Rx ring can be shared
between the two implementations if the difference in descriptor length
is carefully considered.

This change merges the two implementations for Rx ring maintenance by
adding a method to access both types of descriptors as normal
descriptors, as this part covers all the fields needed for Rx ring
maintenance the only difference between using normal or extended
descriptor is the size of the memory region to allocate/free and the
step size between each descriptor in the ring.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

644d037b

ravb: Move maximum Rx descriptor data usage to info struct · 555419b2

Niklas Söderlund authored Mar 04, 2024

To make it possible to merge the R-Car and RZ/G2L code paths move the
maximum usable size of a single Rx descriptor data slice into the
hardware information instead of using two different defines in the two
different code paths.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

555419b2

ravb: Use the max frame size from hardware info for RZ/G2L · 49686338

Niklas Söderlund authored Mar 04, 2024

Remove the define describing the RZ/G2L maximum frame size and only use
the information in the hardware information struct. This will make it
easier to merge the R-Car and RZ/G2L code paths.

There is no functional change as both the define and the maximum frame
length in the hardware information is set to 8K.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

49686338

ravb: Create helper to allocate skb and align it · cfbad647

Niklas Söderlund authored Mar 04, 2024

The EtherAVB device requires the SKB data to be aligned to 128 bytes.
The alignment is done by allocating an skb 128 bytes larger than the
maximum frame size supported by the device and adjusting the headroom to
fit the requirement.

This code has been refactored a few times and small issues have been
added along the way. The issues are not harmful but prevent merging
parts of the Rx code which have been split in two implementations with
the addition of RZ/G2L support, a device that supports larger frame
sizes.

This change removes the need for duplicated and somewhat inaccurate
hardware alignment constrains stored in the hardware information struct
by creating a helper to handle the allocation of an skb and alignment of
an skb data.

For the R-Car class of devices the maximum frame size is 4K and each
descriptor is limited to 2K of data. The current implementation does not
support split descriptors, this limits the frame size to 2K. The
current hardware information however records the descriptor size just
under 2K due to bad understanding of the device when larger MTUs where
added.

For the RZ/G2L device the maximum frame size is 8K and each descriptor
is limited to 4K of data. The current hardware information records this
correctly, but it gets the alignment constrains wrong as just aligns it
by 128, it does not extend it by 128 bytes to allow the full frame to be
stored. This works because the RZ/G2L device supports split descriptors
and allocates each skb to 8K and aligns each 4K descriptor in this
space.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

cfbad647

ravb: Make it clear the information relates to maximum frame size · e82700b8

Niklas Söderlund authored Mar 04, 2024

The struct member rx_max_buf_size was added before split descriptor
support was added. It is unclear if the value describes the full skb
frame buffer or the data descriptor buffer which can be combined into a
single skb.

Rename it to make it clear it referees to the maximum frame size and can
cover multiple descriptors.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

e82700b8

ravb: Group descriptor types used in Rx ring · 4123c3fb

Niklas Söderlund authored Mar 04, 2024

The Rx ring can either be made up of normal or extended descriptors, not
a mix of the two at the same time. Make this explicit by grouping the
two variables in a rx_ring union.

The extension of the storage for more than one queue of normal
descriptors from a single to NUM_RX_QUEUE queues have no practical
effect. But aids in making the code readable as the code that uses it
already piggyback on other members of struct ravb_private that are
arrays of max length NUM_RX_QUEUE, e.g. rx_desc_dma. This will also make
further refactoring easier.

While at it, rename the normal descriptor Rx ring to make it clear it's
not strictly related to the GbEthernet E-MAC IP found in RZ/G2L, normal
descriptors could be used on R-Car SoCs too.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

4123c3fb

Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · dbb0b6ca

David S. Miller authored Mar 06, 2024

From: Tony Nguyen <anthony.l.nguyen@intel.com>
To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
	edumazet@google.com, netdev@vger.kernel.org
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>, alan.brady@intel.com
Tony Nguyen says:

====================
idpf: refactor virtchnl messages

Alan Brady says:

The motivation for this series has two primary goals. We want to enable
support of multiple simultaneous messages and make the channel more
robust. The way it works right now, the driver can only send and receive
a single message at a time and if something goes really wrong, it can
lead to data corruption and strange bugs.

To start the series, we introduce an idpf_virtchnl.h file. This reduces
the burden on idpf.h which is overloaded with struct and function
declarations.

The conversion works by conceptualizing a send and receive as a
"virtchnl transaction" (idpf_vc_xn) and introducing a "transaction
manager" (idpf_vc_xn_manager). The vcxn_mngr will init a ring of
transactions from which the driver will pop from a bitmap of free
transactions to track in-flight messages. Instead of needing to handle a
complicated send/recv for every a message, the driver now just needs to
fill out a xn_params struct and hand it over to idpf_vc_xn_exec which
will take care of all the messy bits. Once a message is sent and
receives a reply, we leverage the completion API to signal the received
buffer is ready to be used (assuming success, or an error code
otherwise).

At a low-level, this implements the "sw cookie" field of the virtchnl
message descriptor to enable this. We have 16 bits we can put whatever
we want and the recipient is required to apply the same cookie to the
reply for that message.  We use the first 8 bits as an index into the
array of transactions to enable fast lookups and we use the second 8
bits as a salt to make sure each cookie is unique for that message. As
transactions are received in arbitrary order, it's possible to reuse a
transaction index and the salt guards against index conflicts to make
certain the lookup is correct. As a primitive example, say index 1 is
used with salt 1. The message times out without receiving a reply so
index 1 is renewed to be ready for a new transaction, we report the
timeout, and send the message again. Since index 1 is free to be used
again now, index 1 is again sent but now salt is 2. This time we do get
a reply, however it could be that the reply is _actually_ for the
previous send index 1 with salt 1.  Without the salt we would have no
way of knowing for sure if it's the correct reply, but with we will know
for certain.

Through this conversion we also get several other benefits. We can now
more appropriately handle asynchronously sent messages by providing
space for a callback to be defined. This notably allows us to handle MAC
filter failures better; previously we could potentially have stale,
failed filters in our list, which shouldn't really have a major impact
but is obviously not correct. I also managed to remove fairly
significant more lines than I added which is a win in my book.

Additionally, this converts some variables to use auto-variables where
appropriate. This makes the alloc paths much cleaner and less prone to
memory leaks. We also fix a few virtchnl related bugs while we're here.

====================
Signed-off-by: David S. Miller <davem@davemloft.net>

dbb0b6ca

Merge branch 'netlink-emsgsize' · 784ee615

David S. Miller authored Mar 06, 2024

Jakub Kicinski says:

====================
netlink: handle EMSGSIZE errors in the core

Ido discovered some time back that we usually force NLMSG_DONE
to be delivered in a separate recv() syscall, even if it would
fit into the same skb as data messages. He made nexthop try
to fit DONE with data in commit 8743aeff ("nexthop: Fix
infinite nexthop bucket dump when using maximum nexthop ID"),
and nobody has complained so far.

We have since also tried to follow the same pattern in new
genetlink families, but explaining to people, or even remembering
the correct handling ourselves is tedious.

Let the netlink socket layer consume -EMSGSIZE errors.
Practically speaking most families use this error code
as "dump needs more space", anyway.

v2:
 - init err to 0 in last patch
v1: https://lore.kernel.org/all/20240301012845.2951053-1-kuba@kernel.org/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

784ee615