Commits · 17c8da5a3423c9f3800a256dbaa685bae18e1264 · Kirill Smelkov / linux

28 Aug, 2023 5 commits

net/mlx5: Add IFC bits to support IPsec enable/disable · 17c8da5a

Leon Romanovsky authored Aug 24, 2023

Add hardware definitions to allow to control IPSec capabilities.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-6-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

17c8da5a

net/mlx5e: Rewrite IPsec vs. TC block interface · e2537341

Leon Romanovsky authored Aug 24, 2023

In the commit 366e4624 ("net/mlx5e: Make IPsec offload work together
with eswitch and TC"), new API to block IPsec vs. TC creation was introduced.

Internally, that API used devlink lock to avoid races with userspace, but it is
not really needed as dev->priv.eswitch is stable and can't be changed. So remove
dependency on devlink lock and move block encap code back to its original place.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-5-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

e2537341

net/mlx5: Drop extra layer of locks in IPsec · c46fb773

Leon Romanovsky authored Aug 24, 2023

There is no need in holding devlink lock as it gives nothing
compared to already used write mode_lock.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-4-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

c46fb773

devlink: Expose port function commands to control IPsec packet offloads · 390a24cb

Dima Chumak authored Aug 24, 2023

Expose port function commands to enable / disable IPsec packet offloads,
this is used to control the port IPsec capabilities.

When IPsec packet is disabled for a function of the port (default),
function cannot offload IPsec packet operations (encapsulation and XFRM
policy offload). When enabled, IPsec packet operations can be offloaded
by the function of the port, which includes crypto operation
(Encrypt/Decrypt), IPsec encapsulation and XFRM state and policy
offload.

Example of a PCI VF port which supports IPsec packet offloads:

$ devlink port show pci/0000:06:00.0/1
    pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
        function:
        hw_addr 00:00:00:00:00:00 roce enable ipsec_packet disable

$ devlink port function set pci/0000:06:00.0/1 ipsec_packet enable

$ devlink port show pci/0000:06:00.0/1
    pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
        function:
        hw_addr 00:00:00:00:00:00 roce enable ipsec_packet enable
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-3-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

390a24cb

devlink: Expose port function commands to control IPsec crypto offloads · 62b6442c

Dima Chumak authored Aug 24, 2023

Expose port function commands to enable / disable IPsec crypto offloads,
this is used to control the port IPsec capabilities.

When IPsec crypto is disabled for a function of the port (default),
function cannot offload any IPsec crypto operations (Encrypt/Decrypt and
XFRM state offloading). When enabled, IPsec crypto operations can be
offloaded by the function of the port.

Example of a PCI VF port which supports IPsec crypto offloads:

$ devlink port show pci/0000:06:00.0/1
    pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
        function:
        hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable

$ devlink port function set pci/0000:06:00.0/1 ipsec_crypto enable

$ devlink port show pci/0000:06:00.0/1
    pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
        function:
        hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto enable
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-2-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

62b6442c

27 Aug, 2023 13 commits

Merge branch 'iep-drver-timestamping-support' · aa05346d

David S. Miller authored Aug 27, 2023

MD Danish Anwar says:

====================
Introduce IEP driver and packet timestamping support

This series introduces Industrial Ethernet Peripheral (IEP) driver to
support timestamping of ethernet packets and thus support PTP and PPS
for PRU ICSSG ethernet ports.

This series also adds 10M full duplex support for ICSSG ethernet driver.

There are two IEP instances. IEP0 is used for packet timestamping while IEP1
is used for 10M full duplex support.

This is v7 of the series [v1]. It addresses comments made on [v6].
This series is based on linux-next(#next-20230823).

Changes from v6 to v7:
*) Dropped blank line in example section of patch 1.
*) Patch 1 previously had three examples, removed two examples and kept only
   one example as asked by Krzysztof.
*) Added Jacob Keller's RB tag in patch 5.
*) Dropped Roger's RB tags from the patches that he has authored (Patch 3 and 4)

Changes from v5 to v6:
*) Added description of IEP in commit messages of patch 2 as asked by Rob.
*) Described the items constraints properly for iep property in patch 2 as
   asked by Rob.
*) Added Roger and Simon's RB tags.

Changes from v4 to v5:
*) Added comments on why we are using readl / writel instead of regmap_read()
   / write() in icss_iep_gettime() / settime() APIs as asked by Roger.
*) Added Conor's RB tag in patch 1 and 2.

Change from v3 to v4:
*) Changed compatible in iep dt bindings. Now each SoC has their own compatible
   in the binding with "ti,am654-icss-iep" as a fallback as asked by Conor.
*) Addressed Andew's comments and removed helper APIs icss_iep_readl() /
   writel(). Now the settime/gettime APIs directly use readl() / writel().
*) Moved selecting TI_ICSS_IEP in Kconfig from patch 3 to patch 4.
*) Removed forward declaration of icss_iep_of_match in patch 3.
*) Replaced use of of_device_get_match_data() to device_get_match_data() in
   patch 3.
*) Removed of_match_ptr() from patch 3 as it is not needed.

Changes from v2 to v3:
*) Addressed Roger's comment and moved IEP1 related changes in patch 5.
*) Addressed Roger's comment and moved icss_iep.c / .h changes from patch 4
   to patch 3.
*) Added support for multiple timestamping in patch 4 as asked by Roger.
*) Addressed Andrew's comment and added comment in case SPEED_10 in
   icssg_config_ipg() API.
*) Kept compatible as "ti,am654-icss-iep" for all TI K3 SoCs

Changes from v1 to v2:
*) Addressed Simon's comment to fix reverse xmas tree declaration. Some APIs
   in patch 3 and 4 were not following reverse xmas tree variable declaration.
   Fixed it in this version.
*) Addressed Conor's comments and removed unsupported SoCs from compatible
   comment in patch 1.
*) Addded patch 2 which was not part of v1. Patch 2, adds IEP node to dt
   bindings for ICSSG.

[v1] https://lore.kernel.org/all/20230803110153.3309577-1-danishanwar@ti.com/
[v2] https://lore.kernel.org/all/20230807110048.2611456-1-danishanwar@ti.com/
[v3] https://lore.kernel.org/all/20230809114906.21866-1-danishanwar@ti.com/
[v4] https://lore.kernel.org/all/20230814100847.3531480-1-danishanwar@ti.com/
[v5] https://lore.kernel.org/all/20230817114527.1585631-1-danishanwar@ti.com/
[v6] https://lore.kernel.org/all/20230823113254.292603-1-danishanwar@ti.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

aa05346d

net: ti: icssg-prueth: am65x SR2.0 add 10M full duplex support · 443a2367

Grygorii Strashko authored Aug 24, 2023

For AM65x SR2.0 it's required to enable IEP1 in raw 64bit mode which is
used by PRU FW to monitor the link and apply w/a for 10M link issue.
Note. No public errata available yet.

Without this w/a the PRU FW will stuck if link state changes under TX
traffic pressure.

Hence, add support for 10M full duplex for AM65x SR2.0:
 - add new IEP API to enable IEP, but without PTP support
 - add pdata quirk_10m_link_issue to enable 10M link issue w/a.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

443a2367

net: ti: icssg-prueth: add packet timestamping and ptp support · 186734c1

Roger Quadros authored Aug 24, 2023

Add packet timestamping TS and PTP PHC clock support.

For AM65x and AM64x:
 - IEP1 is not used
 - IEP0 is configured in shadow mode with 1ms cycle and shared between
Linux and FW. It provides time and TS in number cycles, so special
conversation in ns is required.
 - IEP0 shared between PRUeth ports.
 - IEP0 supports PPS, periodic output.
 - IEP0 settime() and enabling PPS required FW interraction.
 - RX TS provided with each packet in CPPI5 descriptor.
 - TX TS returned through separate ICSSG hw queues for each port. TX TS
readiness is signaled by INTC IRQ. Only one packet at time can be requested
for TX TS.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Co-developed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

186734c1

net: ti: icss-iep: Add IEP driver · c1e0230e

Roger Quadros authored Aug 24, 2023

Add a driver for Industrial Ethernet Peripheral (IEP) block of PRUSS to
support timestamping of ethernet packets and thus support PTP and PPS
for PRU ethernet ports.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c1e0230e

dt-bindings: net: Add IEP property in ICSSG · b1205627

MD Danish Anwar authored Aug 24, 2023

Add IEP property in ICSSG hardware DT binding document.
ICSSG uses IEP (Industrial Ethernet Peripheral) to support timestamping
of ethernet packets, PTP and PPS.
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1205627

dt-bindings: net: Add ICSS IEP · f0035689

MD Danish Anwar authored Aug 24, 2023

Add a DT binding document for the ICSS Industrial Ethernet Peripheral(IEP)
hardware. IEP supports packet timestamping, PTP and PPS.
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f0035689

Merge branch 'sfc-pedit-offloads' · 2cc88bbc

David S. Miller authored Aug 27, 2023

Pieter Jansen van Vuuren says:

====================
sfc: introduce eth, ipv4 and ipv6 pedit offloads

This set introduces mac source and destination pedit set action offloads.
It also adds offload for ipv4 ttl and ipv6 hop limit pedit set action as
well pedit add actions that would result in the same semantics as
decrementing the ttl and hop limit.

v2:
- fix 'efx_tc_mangle' kdoc which was orphaned when adding 'efx_tc_pedit_add'.
- add description of 'match' in 'efx_tc_mangle' kdoc.
- correct some inconsistent kdoc indentation.

v1: https://lore.kernel.org/netdev/20230823111725.28090-1-pieter.jansen-van-vuuren@amd.com/
====================
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

2cc88bbc

sfc: extend pedit add action to handle decrement ipv6 hop limit · e8e0bd60

Pieter Jansen van Vuuren authored Aug 24, 2023

Extend the pedit add actions to handle this case for ipv6. Similar to ipv4
dec ttl, decrementing ipv6 hop limit can be achieved by adding 0xff to the
hop limit field.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e8e0bd60

sfc: introduce pedit add actions on the ipv4 ttl field · 64848f06

Pieter Jansen van Vuuren authored Aug 24, 2023

Introduce pedit add actions and use it to achieve decrement ttl offload.
Decrement ttl can be achieved by adding 0xff to the ttl field.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

64848f06

sfc: add decrement ipv6 hop limit by offloading set hop limit actions · 9dbc8d2b

Pieter Jansen van Vuuren authored Aug 24, 2023

Offload pedit set ipv6 hop limit, where the hop limit has already been
matched and the new value is one less, by translating it to a decrement.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9dbc8d2b

sfc: add decrement ttl by offloading set ipv4 ttl actions · 66f72887

Pieter Jansen van Vuuren authored Aug 24, 2023

Offload pedit set ipv4 ttl field, where the ttl field has already been
matched and the new value is one less, by translating it to a decrement.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

66f72887

sfc: add mac source and destination pedit action offload · 0c676503

Pieter Jansen van Vuuren authored Aug 24, 2023

Introduce the first pedit set offload functionality for the sfc driver.
In addition to this, add offload functionality for both mac source and
destination pedit set actions.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0c676503

sfc: introduce ethernet pedit set action infrastructure · 439c4be9

Pieter Jansen van Vuuren authored Aug 24, 2023

Introduce the initial ethernet pedit set action infrastructure in
preparation for adding mac src and dst pedit action offloads.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

439c4be9

26 Aug, 2023 18 commits

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · b32add2d

Jakub Kicinski authored Aug 25, 2023

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-08-24 (igc, e1000e)

This series contains updates to igc and e1000e drivers.

Vinicius adds support for utilizing multiple PTP registers on igc.

Sasha reduces interval time for PTM on igc and adds new device support
on e1000e.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  e1000e: Add support for the next LOM generation
  igc: Decrease PTM short interval from 10 us to 1 us
  igc: Add support for multiple in-flight TX timestamps
====================

Link: https://lore.kernel.org/r/20230824204418.1551093-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

b32add2d

doc/netlink: Add delete operation to ovs_vport spec · 52d08fda

Donald Hunter authored Aug 24, 2023

Add del operation to the spec to help with testing.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824142221.71339-1-donald.hunter@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

52d08fda

tools: ynl-gen: fix uAPI generation after tempfile changes · a02430c0

Jakub Kicinski authored Aug 24, 2023

We use a tempfile for code generation, to avoid wiping the target
file out if the code generator crashes. File contents are copied
from tempfile to actual destination at the end of main().

uAPI generation is relatively simple so when generating the uAPI
header we return from main() early, and never reach the "copy code
over" stage. Since commit under Fixes uAPI headers are not updated
by ynl-gen.

Move the copy/commit of the code into CodeWriter, to make it
easier to call at any point in time. Hook it into the destructor
to make sure we don't miss calling it.

Fixes: f65f305a ("tools: ynl-gen: use temporary file for rendering")
Link: https://lore.kernel.org/r/20230824212431.1683612-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

a02430c0

Merge branch 'stmmac-cleanups' · f5e17b47

Jakub Kicinski authored Aug 25, 2023

Russell King says:

====================
stmmac cleanups

One of the comments I had on Feiyang Chen's series was concerning the
initialisation of phylink... and so I've decided to do something about
it, cleaning it up a bit.

This series:

1) adds a new phylink function to limit the MAC capabilities according
   to a maximum speed. This allows us to greatly simplify stmmac's
   initialisation of phylink's mac capabilities.

2) everywhere that uses priv->plat->phylink_node first converts this
   to a fwnode before doing anything with it. This is silly. Let's
   instead store it as a fwnode to eliminate these conversions in
   multiple places.

3) clean up passing the fwnode to phylink - it might as well happen
   at the phylink_create() callsite, rather than being scattered
   throughout the entire function.

4) same for mdio_bus_data

5) use phylink_limit_mac_speed() to handle the priv->plat->max_speed
   restriction.

6) add a method to get the MAC-specific capabilities from the code
   dealing with the MACs, and arrange to call it at an appropriate
   time.

7) convert the gmac4 users to use the MAC specific method.

8) same for xgmac.

9) group all the simple phylink_config initialisations together.

10) convert half-duplex logic to being positive logic.

While looking into all of this, this raised eyebrows:

        if (priv->plat->tx_queues_to_use > 1)
                priv->phylink_config.mac_capabilities &=
                        ~(MAC_10HD | MAC_100HD | MAC_1000HD);

priv->plat->tx_queues_to_use is initialised by platforms to either 1,
4 or 8, and can be controlled from userspace via the --set-channels
ethtool op. The implementation of this op in this driver limits the
number of channels to priv->dma_cap.number_tx_queues, which is derived
from the DMA hwcap.

So, the obvious questions are:

1) what guarantees that the static initialisation of tx_queues_to_use
will always be less than or equal to number_tx_queues from the DMA hw
cap?

2) tx_queues_to_use starts off as 1, but number_tx_queues is larger,
we will leave the half-duplex capabilities in place, but userspace can
increase tx_queues_to_use above 1. Does that mean half-duplex is then
not supported?

3) Should we be basing the decision whether half-duplex is supported
off the DMA capabilities?

4) What about priv->dma_cap.half_duplex? Doesn't that get a say in
whether half-duplex is supported or not? Why isn't this used? Why is
it only reported via debugfs? If it's not being used by the driver,
what's the point of reporting it via debugfs?
====================

Link: https://lore.kernel.org/r/ZOddFH22PWmOmbT5@shell.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

f5e17b47

net: stmmac: convert half-duplex support to positive logic · 76649fc9

Russell King (Oracle) authored Aug 24, 2023

Rather than detecting when half-duplex is not supported, and clearing
the MAC capabilities, reverse the if() condition and use it to set the
capabilities instead.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXn-005pUb-SP@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

76649fc9

net: stmmac: move priv->phylink_config.mac_managed_pm · 64961f1b

Russell King (Oracle) authored Aug 24, 2023

Move priv->phylink_config.mac_managed_pm to be along side the other
phylink initialisations.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXi-005pUV-Nq@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

64961f1b

net: stmmac: move xgmac specific phylink caps to dwxgmac2 core · bedf9b81

Russell King (Oracle) authored Aug 24, 2023

Move the xgmac specific phylink capabilities to the dwxgmac2 support
core.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXd-005pUP-JL@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

bedf9b81

net: stmmac: move gmac4 specific phylink capabilities to gmac4 · f1dae3d2

Russell King (Oracle) authored Aug 24, 2023

Move the setup of gmac4 speicifc phylink capabilities into gmac4 code.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXY-005pUJ-Ez@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

f1dae3d2

net: stmmac: provide stmmac_mac_phylink_get_caps() · d42ca04e

Russell King (Oracle) authored Aug 24, 2023

Allow MACs to provide their own capabilities via the MAC operations
struct.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXT-005pUD-Aj@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

d42ca04e

net: stmmac: use phylink_limit_mac_speed() · a4ac612b

Russell King (Oracle) authored Aug 24, 2023

Use phylink_limit_mac_speed() to limit the MAC capabilities rather
than coding this for each speed.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXO-005pU7-61@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

a4ac612b

net: stmmac: use "mdio_bus_data" local variable · 2b070cdd

Russell King (Oracle) authored Aug 24, 2023

We have a local variable for priv->plat->mdio_bus_data, which we use
later in the conditional if() block, but we evaluate the above within
the conditional expression. Use mdio_bus_data instead. Since these
will be the only two users of this local variable, move its assignment
just before the if().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXJ-005pU1-1z@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

2b070cdd

net: stmmac: clean up passing fwnode to phylink · 1a37c1c1

Russell King (Oracle) authored Aug 24, 2023

Move the initialisation of the fwnode variable closer to its use
site, rather than scattered throughout stmmac_phy_setup().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXD-005pTv-TN@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

1a37c1c1

net: stmmac: convert plat->phylink_node to fwnode · e80af2ac

Russell King (Oracle) authored Aug 24, 2023

All users of plat->phylink_node first convert it to a fwnode. Rather
than repeatedly convert to a fwnode, store it as a fwnode. To reflect
this change, call it plat->port_node instead - it is used for more
than just phylink.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAX8-005pTo-OT@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

e80af2ac

net: phylink: add phylink_limit_mac_speed() · 70934c7c

Russell King (Oracle) authored Aug 24, 2023

Add a function which can be used to limit the phylink MAC capabilities
to an upper speed limit.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAX3-005pTi-K1@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>

70934c7c

veth: Avoid NAPI scheduling on failed SKB forwarding · 215eb9f9

Liang Chen authored Aug 24, 2023

When an skb fails to be forwarded to the peer(e.g., skb data buffer
length exceeds MTU), it will not be added to the peer's receive queue.
Therefore, we should schedule the peer's NAPI poll function only when
skb forwarding is successful to avoid unnecessary overhead.
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Link: https://lore.kernel.org/r/20230824123131.7673-1-liangchen.linux@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

215eb9f9

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · bebfbf07

Jakub Kicinski authored Aug 25, 2023

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-08-25

We've added 87 non-merge commits during the last 8 day(s) which contain
a total of 104 files changed, 3719 insertions(+), 4212 deletions(-).

The main changes are:

1) Add multi uprobe BPF links for attaching multiple uprobes
   and usdt probes, which is significantly faster and saves extra fds,
   from Jiri Olsa.

2) Add support BPF cpu v4 instructions for arm64 JIT compiler,
   from Xu Kuohai.

3) Add support BPF cpu v4 instructions for riscv64 JIT compiler,
   from Pu Lehui.

4) Fix LWT BPF xmit hooks wrt their return values where propagating
   the result from skb_do_redirect() would trigger a use-after-free,
   from Yan Zhai.

5) Fix a BPF verifier issue related to bpf_kptr_xchg() with local kptr
   where the map's value kptr type and locally allocated obj type
   mismatch, from Yonghong Song.

6) Fix BPF verifier's check_func_arg_reg_off() function wrt graph
   root/node which bypassed reg->off == 0 enforcement,
   from Kumar Kartikeya Dwivedi.

7) Lift BPF verifier restriction in networking BPF programs to treat
   comparison of packet pointers not as a pointer leak,
   from Yafang Shao.

8) Remove unmaintained XDP BPF samples as they are maintained
   in xdp-tools repository out of tree, from Toke Høiland-Jørgensen.

9) Batch of fixes for the tracing programs from BPF samples in order
   to make them more libbpf-aware, from Daniel T. Lee.

10) Fix a libbpf signedness determination bug in the CO-RE relocation
    handling logic, from Andrii Nakryiko.

11) Extend libbpf to support CO-RE kfunc relocations. Also follow-up
    fixes for bpf_refcount shared ownership implementation,
    both from Dave Marchevsky.

12) Add a new bpf_object__unpin() API function to libbpf,
    from Daniel Xu.

13) Fix a memory leak in libbpf to also free btf_vmlinux
    when the bpf_object gets closed, from Hao Luo.

14) Small error output improvements to test_bpf module, from Helge Deller.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (87 commits)
  selftests/bpf: Add tests for rbtree API interaction in sleepable progs
  bpf: Allow bpf_spin_{lock,unlock} in sleepable progs
  bpf: Consider non-owning refs to refcounted nodes RCU protected
  bpf: Reenable bpf_refcount_acquire
  bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes
  bpf: Consider non-owning refs trusted
  bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire
  selftests/bpf: Enable cpu v4 tests for RV64
  riscv, bpf: Support unconditional bswap insn
  riscv, bpf: Support signed div/mod insns
  riscv, bpf: Support 32-bit offset jmp insn
  riscv, bpf: Support sign-extension mov insns
  riscv, bpf: Support sign-extension load insns
  riscv, bpf: Fix missing exception handling and redundant zext for LDX_B/H/W
  samples/bpf: Add note to README about the XDP utilities moved to xdp-tools
  samples/bpf: Cleanup .gitignore
  samples/bpf: Remove the xdp_sample_pkts utility
  samples/bpf: Remove the xdp1 and xdp2 utilities
  samples/bpf: Remove the xdp_rxq_info utility
  samples/bpf: Remove the xdp_redirect* utilities
  ...
====================

Link: https://lore.kernel.org/r/20230825194319.12727-1-daniel@iogearbox.netSigned-off-by: Jakub Kicinski <kuba@kernel.org>

bebfbf07

Merge tag 'wireless-next-2023-08-25' of... · 1fa6ffad

Jakub Kicinski authored Aug 25, 2023

Merge tag 'wireless-next-2023-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.6

The second pull request for v6.6, this time with both stack and driver
changes. Unusually we have only one major new feature but lots of
small cleanup all over, I guess this is due to people have been on
vacation the last month.

Major changes:

rtw89
 - Introduce Time Averaged SAR (TAS) support

* tag 'wireless-next-2023-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (114 commits)
  wifi: rtlwifi: rtl8723: Remove unused function rtl8723_cmd_send_packet()
  wifi: rtw88: usb: kill and free rx urbs on probe failure
  wifi: rtw89: Fix clang -Wimplicit-fallthrough in rtw89_query_sar()
  wifi: rtw89: phy: modify register setting of ENV_MNTR, PHYSTS and DIG
  wifi: rtw89: phy: add phy_gen_def::cr_base to support WiFi 7 chips
  wifi: rtw89: mac: define register address of rx_filter to generalize code
  wifi: rtw89: mac: define internal memory address for WiFi 7 chip
  wifi: rtw89: mac: generalize code to indirectly access WiFi internal memory
  wifi: rtw89: mac: add mac_gen_def::band1_offset to map MAC band1 register address
  wifi: wlcore: sdio: Use module_sdio_driver macro to simplify the code
  wifi: rtw89: initialize multi-channel handling
  wifi: rtw89: provide functions to configure NoA for beacon update
  wifi: rtw89: call rtw89_chan_get() by vif chanctx if aware of vif
  wifi: rtw89: sar: let caller decide the center frequency to query
  wifi: rtw89: refine rtw89_correct_cck_chan() by rtw89_hw_to_nl80211_band()
  wifi: rtw89: add function prototype for coex request duration
  Fix nomenclature for USB and PCI wireless devices
  wifi: ath: Use is_multicast_ether_addr() to check multicast Ether address
  wifi: ath12k: Remove unused declarations
  wifi: ath12k: add check max message length while scanning with extraie
  ...
====================

Link: https://lore.kernel.org/r/20230825132230.A0833C433C8@smtp.kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

1fa6ffad

Merge tag 'for-net-next-2023-08-24' of... · 3db34747

Jakub Kicinski authored Aug 25, 2023

Merge tag 'for-net-next-2023-08-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Luiz Augusto von Dentz says:

====================
bluetooth-next pull request for net-next:

 - Introduce HCI_QUIRK_BROKEN_LE_CODED
 - Add support for PA/BIG sync
 - Add support for NXP IW624 chipset
 - Add support for Qualcomm WCN7850

* tag 'for-net-next-2023-08-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next:
  Bluetooth: btusb: Do not call kfree_skb() under spin_lock_irqsave()
  Bluetooth: btusb: Fix quirks table naming
  Bluetooth: HCI: Introduce HCI_QUIRK_BROKEN_LE_CODED
  Bluetooth: btintel: Send new command for PPAG
  Bluetooth: ISO: Add support for periodic adv reports processing
  Bluetooth: hci_conn: fail SCO/ISO via hci_conn_failed if ACL gone early
  Bluetooth: hci_core: Fix missing instances using HCI_MAX_AD_LENGTH
  Bluetooth: ISO: Use defer setup to separate PA sync and BIG sync
  Bluetooth: qca: add support for WCN7850
  Bluetooth: qca: use switch case for soc type behavior
  dt-bindings: net: bluetooth: qualcomm: document WCN7850 chipset
  Bluetooth: hci_conn: Fix sending BT_HCI_CMD_LE_CREATE_CONN_CANCEL
  Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync
  Bluetooth: btnxpuart: Improve inband Independent Reset handling
  Bluetooth: btnxpuart: Add support for IW624 chipset
  Bluetooth: btnxpuart: Remove check for CTS low after FW download
====================

Link: https://lore.kernel.org/r/20230824201458.2577-1-luiz.dentz@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

3db34747

25 Aug, 2023 4 commits

Merge branch 'bpf-refcount-followups-3-bpf_mem_free_rcu-refcounted-nodes' · ec0ded2e

Alexei Starovoitov authored Aug 25, 2023

Dave Marchevsky says:

====================
BPF Refcount followups 3: bpf_mem_free_rcu refcounted nodes

This series is the third of three (or more) followups to address issues
in the bpf_refcount shared ownership implementation discovered by Kumar.
This series addresses the use-after-free scenario described in [0]. The
first followup series ([1]) also attempted to address the same
use-after-free, but only got rid of the splat without addressing the
underlying issue. After this series the underyling issue is fixed and
bpf_refcount_acquire can be re-enabled.

The main fix here is migration of bpf_obj_drop to use
bpf_mem_free_rcu. To understand why this fixes the issue, let us consider
the example interleaving provided by Kumar in [0]:

CPU 0                                   CPU 1
n = bpf_obj_new
lock(lock1)
bpf_rbtree_add(rbtree1, n)
m = bpf_rbtree_acquire(n)
unlock(lock1)

kptr_xchg(map, m) // move to map
// at this point, refcount = 2
					m = kptr_xchg(map, NULL)
					lock(lock2)
lock(lock1)				bpf_rbtree_add(rbtree2, m)
p = bpf_rbtree_first(rbtree1)			if (!RB_EMPTY_NODE) bpf_obj_drop_impl(m) // A
bpf_rbtree_remove(rbtree1, p)
unlock(lock1)
bpf_obj_drop(p) // B
					bpf_refcount_acquire(m) // use-after-free
					...

Before this series, bpf_obj_drop returns memory to the allocator using
bpf_mem_free. At this point (B in the example) there might be some
non-owning references to that memory which the verifier believes are valid,
but where the underlying memory was reused for some other allocation.
Commit 7793fc3b ("bpf: Make bpf_refcount_acquire fallible for
non-owning refs") attempted to fix this by doing refcount_inc_non_zero
on refcount_acquire in instead of refcount_inc under the assumption that
preventing erroneous incr-on-0 would be sufficient. This isn't true,
though: refcount_inc_non_zero must *check* if the refcount is zero, and
the memory it's checking could have been reused, so the check may look
at and incr random reused bytes.

If we wait to reuse this memory until all non-owning refs that could
point to it are gone, there is no possibility of this scenario
happening. Migrating bpf_obj_drop to use bpf_mem_free_rcu for refcounted
nodes accomplishes this.

For such nodes, the validity of their underlying memory is now tied to
RCU critical section. This matches MEM_RCU trustedness
expectations, so the series takes the opportunity to more explicitly
mark this trustedness state.

The functional effects of trustedness changes here are rather small.
This is largely due to local kptrs having separate verifier handling -
with implicit trustedness assumptions - than arbitrary kptrs.
Regardless, let's take the opportunity to move towards a world where
trustedness is more explicitly handled.

Changelog:

v1 -> v2: https://lore.kernel.org/bpf/20230801203630.3581291-1-davemarchevsky@fb.com/

Patch 1 ("bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire")
  * Spent some time experimenting with a better approach as per convo w/
    Yonghong on v1's patch. It started getting too complex, so left unchanged
    for now. Yonghong was fine with this approach being shipped.

Patch 2 ("bpf: Consider non-owning refs trusted")
  * Add Yonghong ack
Patch 3 ("bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes")
  * Add Yonghong ack
Patch 4 ("bpf: Reenable bpf_refcount_acquire")
  * Add Yonghong ack

Patch 5 ("bpf: Consider non-owning refs to refcounted nodes RCU protected")
  * Undo a nonfunctional whitespace change that shouldn't have been included
    (Yonghong)
  * Better logging message when complaining about rcu_read_{lock,unlock} in
    rbtree cb (Alexei)
  * Don't invalidate_non_owning_refs when processing bpf_rcu_read_unlock
    (Yonghong, Alexei)

Patch 6 ("[RFC] bpf: Allow bpf_spin_{lock,unlock} in sleepable prog's RCU CS")
  * preempt_{disable,enable} in __bpf_spin_{lock,unlock} (Alexei)
    * Due to this we can consider spin_lock CS an RCU-sched read-side CS (per
      RCU/Design/Requirements/Requirements.rst). Modify in_rcu_cs accordingly.
  * no need to check for !in_rcu_cs before allowing bpf_spin_{lock,unlock}
    (Alexei)
  * RFC tag removed and renamed to "bpf: Allow bpf_spin_{lock,unlock} in
    sleepable progs"

Patch 7 ("selftests/bpf: Add tests for rbtree API interaction in sleepable progs")
  * Remove "no explicit bpf_rcu_read_lock" failure test, add similar success
    test (Alexei)

Summary of patch contents, with sub-bullets being leading questions and
comments I think are worth reviewer attention:

  * Patches 1 and 2 are moreso documententation - and
    enforcement, in patch 1's case - of existing semantics / expectations

  * Patch 3 changes bpf_obj_drop behavior for refcounted nodes such that
    their underlying memory is not reused until RCU grace period elapses
    * Perhaps it makes sense to move to mem_free_rcu for _all_
      non-owning refs in the future, not just refcounted. This might
      allow custom non-owning ref lifetime + invalidation logic to be
      entirely subsumed by MEM_RCU handling. IMO this needs a bit more
      thought and should be tackled outside of a fix series, so it's not
      attempted here.

  * Patch 4 re-enables bpf_refcount_acquire as changes in patch 3 fix
    the remaining use-after-free
    * One might expect this patch to be last in the series, or last
      before selftest changes. Patches 5 and 6 don't change
      verification or runtime behavior for existing BPF progs, though.

  * Patch 5 brings the verifier's understanding of refcounted node
    trustedness in line with Patch 4's changes

  * Patch 6 allows some bpf_spin_{lock, unlock} calls in sleepable
    progs. Marked RFC for a few reasons:
    * bpf_spin_{lock,unlock} haven't been usable in sleepable progs
      since before the introduction of bpf linked list and rbtree. As
      such this feels more like a new feature that may not belong in
      this fixes series.

  * Patch 7 adds tests

  [0]: https://lore.kernel.org/bpf/atfviesiidev4hu53hzravmtlau3wdodm2vqs7rd7tnwft34e3@xktodqeqevir/
  [1]: https://lore.kernel.org/bpf/20230602022647.1571784-1-davemarchevsky@fb.com/
====================

Link: https://lore.kernel.org/r/20230821193311.3290257-1-davemarchevsky@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

ec0ded2e

selftests/bpf: Add tests for rbtree API interaction in sleepable progs · 312aa5bd

Dave Marchevsky authored Aug 21, 2023

Confirm that the following sleepable prog states fail verification:
  * bpf_rcu_read_unlock before bpf_spin_unlock
     * RCU CS will last at least as long as spin_lock CS

Also confirm that correct usage passes verification, specifically:
  * Explicit use of bpf_rcu_read_{lock, unlock} in sleepable test prog
  * Implied RCU CS due to spin_lock CS

None of the selftest progs actually attach to bpf_testmod's
bpf_testmod_test_read.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230821193311.3290257-8-davemarchevsky@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

312aa5bd

bpf: Allow bpf_spin_{lock,unlock} in sleepable progs · 5861d1e8

Dave Marchevsky authored Aug 21, 2023

Commit 9e7a4d98 ("bpf: Allow LSM programs to use bpf spin locks")
disabled bpf_spin_lock usage in sleepable progs, stating:

Sleepable LSM programs can be preempted which means that allowng spin
locks will need more work (disabling preemption and the verifier
ensuring that no sleepable helpers are called when a spin lock is
held).

This patch disables preemption before grabbing bpf_spin_lock. The second
requirement above "no sleepable helpers are called when a spin lock is
held" is implicitly enforced by current verifier logic due to helper
calls in spin_lock CS being disabled except for a few exceptions, none
of which sleep.

Due to above preemption changes, bpf_spin_lock CS can also be considered
a RCU CS, so verifier's in_rcu_cs check is modified to account for this.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230821193311.3290257-7-davemarchevsky@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

5861d1e8

bpf: Consider non-owning refs to refcounted nodes RCU protected · 0816b8c6

Dave Marchevsky authored Aug 21, 2023

An earlier patch in the series ensures that the underlying memory of
nodes with bpf_refcount - which can have multiple owners - is not reused
until RCU grace period has elapsed. This prevents
use-after-free with non-owning references that may point to
recently-freed memory. While RCU read lock is held, it's safe to
dereference such a non-owning ref, as by definition RCU GP couldn't have
elapsed and therefore underlying memory couldn't have been reused.

From the perspective of verifier "trustedness" non-owning refs to
refcounted nodes are now trusted only in RCU CS and therefore should no
longer pass is_trusted_reg, but rather is_rcu_reg. Let's mark them
MEM_RCU in order to reflect this new state.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230821193311.3290257-6-davemarchevsky@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

0816b8c6