Commits · 4facbe3d4426720b65354be1d0f5608c65eebc20 · Kirill Smelkov / linux

21 Apr, 2022 1 commit

drivers: net: davinci_mdio: using pm_runtime_resume_and_get instead of pm_runtime_get_sync · 4facbe3d

Minghao Chi authored Apr 18, 2022

Using pm_runtime_resume_and_get is more appropriate
for simplifing code
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Link: https://lore.kernel.org/r/20220418062921.2557884-1-chi.minghao@zte.com.cnSigned-off-by: Paolo Abeni <pabeni@redhat.com>

4facbe3d

20 Apr, 2022 38 commits

Merge branch 'mlxsw-line-card-status-tracking' · 365014f5

David S. Miller authored Apr 20, 2022

Ido Schimmel says:

====================
mlxsw: Line cards status tracking

When a line card is provisioned, netdevs corresponding to the ports
found on the line card are registered. User space can then perform
various logical configurations (e.g., splitting, setting MTU) on these
netdevs.

However, since the line card is not present / powered on (i.e., it is
not in 'active' state), user space cannot access the various components
found on the line card. For example, user space cannot read the
temperature of gearboxes or transceiver modules found on the line card
via hwmon / thermal. Similarly, it cannot dump the EEPROM contents of
these transceiver modules. The above is only possible when the line card
becomes active.

This patchset solves the problem by tracking the status of each line
card and invoking callbacks from interested parties when a line card
becomes active / inactive.

Patchset overview:

Patch #1 adds the infrastructure in the line cards core that allows
users to registers a set of callbacks that are invoked when a line card
becomes active / inactive. To avoid races, if a line card is already
active during registration, the got_active() callback is invoked.

Patches #2-#3 are preparations.

Patch #4 changes the port module core to register a set of callbacks
with the line cards core. See detailed description with examples in the
commit message.

Patches #5-#6 do the same with regards to thermal / hwmon support, so
that user space will be able to monitor the temperature of various
components on the line card when it becomes active.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

365014f5

mlxsw: core_hwmon: Add interfaces for line card initialization and de-initialization · 99a03b31

Vadim Pasternak authored Apr 19, 2022

Add callback functions for line card 'hwmon' initialization and
de-initialization. Each line card is associated with the relevant
'hwmon' device, which may contain thermal attributes for the cages
and gearboxes found on this line card.

The line card 'hwmon' initialization / de-initialization APIs are to be
called when line card is set to active / inactive state by
got_active() / got_inactive() callbacks from line card state machine.

For example cage temperature for module #9 located at line card #7 will
be exposed by utility 'sensors' like:
linecard#07
front panel 009:	+32.0C  (crit = +70.0C, emerg = +80.0C)
And temperature for gearbox #3 located at line card #5 will be exposed
like:
linecard#05
gearbox 003:		+41.0C  (highest = +41.0C)
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

99a03b31

mlxsw: core_thermal: Add interfaces for line card initialization and de-initialization · f11a323d

Vadim Pasternak authored Apr 19, 2022

Add callback functions for line card thermal area initialization and
de-initialization. Each line card is associated with the relevant
thermal area, which may contain thermal zones for cages and gearboxes
found on this line card.

The line card thermal initialization / de-initialization APIs are to be
called when line card is set to active / inactive state by
got_active() / got_inactive() callbacks from line card state machine.

For example thermal zone for module #9 located at line card #7 will
have type:
mlxsw-lc7-module9.
And thermal zone for gearbox #2 located at line card #5 will have type:
mlxsw-lc5-gearbox2.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f11a323d

mlxsw: core_env: Add interfaces for line card initialization and de-initialization · 06a0fc43

Vadim Pasternak authored Apr 19, 2022

Netdevs for ports found on line cards are registered upon provisioning.
However, user space is not allowed to access the transceiver modules
found on a line card until the line card becomes active.

Therefore, register event operations with the line card core to get
notifications whenever a line card becomes active or inactive.

When user space tries to dump the EEPROM of a transceiver module or reset
it and the corresponding line card is inactive, emit an error
message:
ethtool -m enp1s0nl7p9
netlink error: mlxsw_core: Cannot read EEPROM of module on an inactive line card
netlink error: Input/output error

When user space tries to set the power mode policy of such a transceiver,
cache the configuration and apply it when the line card becomes active. This
is consistent with other port configuration (e.g., MTU setting) that user space
is able to perform while the line card is provisioned, but inactive.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06a0fc43

mlxsw: core_env: Split module power mode setting to a separate function · a11e1ec1

Vadim Pasternak authored Apr 19, 2022

Move the code that applies the module power mode to the device to a
separate function. This function will be invoked by the next patch to
set the power mode on transceiver modules found on a line card when the
line card becomes active.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a11e1ec1

mlxsw: core: Add bus argument to environment init API · 7b261af9

Vadim Pasternak authored Apr 19, 2022

Pass bus argument to mlxsw_env_init(). The purpose is to get access to
device handle, which is to be provided to error message in case of line
card activation failure.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7b261af9

mlxsw: core_linecards: Introduce ops for linecards status change tracking · de28976d

Jiri Pirko authored Apr 19, 2022

Introduce an infrastructure allowing users to register a set
of operations which are to be called whenever a line card gets
active/inactive.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

de28976d

Merge tag 'linux-can-next-for-5.19-20220419' of... · 85ef87ba

David S. Miller authored Apr 20, 2022

Merge tag 'linux-can-next-for-5.19-20220419' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2022-04-19

this is a pull request of 17 patches for net-next/master.

The first 2 patches are by me and target the CAN driver
infrastructure. One patch renames a function in the rx_offload helper
the other one updates the CAN bitrate calculation to prefer small bit
rate pre-scalers over larger ones, which is encouraged by the CAN in
Automation.

Kris Bahnsen contributes a patch to fix the links to Technologic
Systems web resources in the sja1000 driver.

Christophe Leroy's patch prepares the mpc5xxx_can driver for upcoming
powerpc header cleanup.

Minghao Chi's patch converts the flexcan driver to use
pm_runtime_resume_and_get().

The next 2 patches target the Xilinx CAN driver. Lukas Bulwahn's patch
fixes an entry in the MAINTAINERS file. A patch by me marks the bit
timing constants as const.

Wolfram Sang's patch documents r8a77961 support on the
renesas,rcar-canfd bindings document.

The next 2 patches are by me and add support for the mcp251863 chip to
the mcp251xfd driver.

The last 7 patches are by Pavel Pisa, Martin Jerabek et al. and add
the ctucanfd driver for the CTU CAN FD IP Core.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

85ef87ba

Merge branch 'net-sched-flower-num-vlan-tags' · c1f6f1e6

David S. Miller authored Apr 20, 2022

Boris Sukholitko says:

====================
net/sched: flower: match on the number of vlan tags

Our customers in the fiber telecom world have network configurations
where they would like to control their traffic according to the number
of tags appearing in the packet.

For example, TR247 GPON conformance test suite specification mostly
talks about untagged, single, double tagged packets and gives lax
guidelines on the vlan protocol vs. number of vlan tags.

This is different from the common IT networks where 802.1Q and 802.1ad
protocols are usually describe single and double tagged packet. GPON
configurations that we work with have arbitrary mix the above protocols
and number of vlan tags in the packet.

The following patch series implement number of vlans flower filter. They
add num_of_vlans flower filter as an alternative to vlan ethtype protocol
matching. The end result is that the following command becomes possible:

tc filter add dev eth1 ingress flower \
  num_of_vlans 1 vlan_prio 5 action drop

Also, from our logs, we have redirect rules such that:

tc filter add dev $GPON ingress flower num_of_vlans $N \
     action mirred egress redirect dev $DEV

where N can range from 0 to 3 and $DEV is the function of $N.

Also there are rules setting skb mark based on the number of vlans:

tc filter add dev $GPON ingress flower num_of_vlans $N vlan_prio \
    $P action skbedit mark $M

More about the patch series:
  - patches 1-2 remove duplicate code by introducing is_key_vlan
    helper.
  - patch 3, 4 implement num_of_vlans in the dissector and in the
    flower.
  - patch 5 uses the num_of_vlans filter to allow further matching on
    vlan attributes.

Complementary iproute2 patches are being sent separately.

Thanks,
Boris.

- v4: rebased to the latest net-next
- v3:
    - more example commands in patch 3 description (request by Jamal)
    - patch 5 description made clearer (thanks to Jiri)
- v2:
    - add suitable subject prefixes
    - more evolved patch 5 description
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c1f6f1e6

net/sched: flower: Consider the number of tags for vlan filters · 99fdb22b

Boris Sukholitko authored Apr 19, 2022

Before this patch the existence of vlan filters was conditional on the vlan
protocol being matched in the tc rule. For example, the following rule:

tc filter add dev eth1 ingress flower vlan_prio 5

was illegal because vlan protocol (e.g. 802.1q) does not appear in the rule.

Remove the above restriction by looking at the num_of_vlans filter to
allow further matching on vlan attributes. The following rule becomes
legal as a result of this commit:

tc filter add dev eth1 ingress flower num_of_vlans 1 vlan_prio 5

because having num_of_vlans==1 implies that the packet is single tagged.

Change is_vlan_key helper to look at the number of vlans in addition to
the vlan ethertype. The outcome of this change is that outer (e.g. vlan_prio)
and inner (e.g. cvlan_prio) tag vlan filters require the number of vlan
tags to be greater then 0 and 1 accordingly.

As a result of is_vlan_key change, the ethertype may be set to 0 when
matching on the number of vlans. Update fl_set_key_vlan to avoid setting
key, mask vlan_tpid for the 0 ethertype.
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

99fdb22b

net/sched: flower: Add number of vlan tags filter · b4000312

Boris Sukholitko authored Apr 19, 2022

These are bookkeeping parts of the new num_of_vlans filter.
Defines, dump, load and set are being done here.
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b4000312

flow_dissector: Add number of vlan tags dissector · 34951fcf

Boris Sukholitko authored Apr 19, 2022

Our customers in the fiber telecom world have network configurations
where they would like to control their traffic according to the number
of tags appearing in the packet.

For example, TR247 GPON conformance test suite specification mostly
talks about untagged, single, double tagged packets and gives lax
guidelines on the vlan protocol vs. number of vlan tags.

This is different from the common IT networks where 802.1Q and 802.1ad
protocols are usually describe single and double tagged packet. GPON
configurations that we work with have arbitrary mix the above protocols
and number of vlan tags in the packet.

The goal is to make the following TC commands possible:

tc filter add dev eth1 ingress flower \
  num_of_vlans 1 vlan_prio 5 action drop

From our logs, we have redirect rules such that:

tc filter add dev $GPON ingress flower num_of_vlans $N \
     action mirred egress redirect dev $DEV

where N can range from 0 to 3 and $DEV is the function of $N.

Also there are rules setting skb mark based on the number of vlans:

tc filter add dev $GPON ingress flower num_of_vlans $N vlan_prio \
    $P action skbedit mark $M

This new dissector allows extracting the number of vlan tags existing in
the packet.
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

34951fcf

net/sched: flower: Reduce identation after is_key_vlan refactoring · 6ee59e55

Boris Sukholitko authored Apr 19, 2022

Whitespace only.
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6ee59e55

net/sched: flower: Helper function for vlan ethtype checks · 285ba06b

Boris Sukholitko authored Apr 19, 2022

There are somewhat repetitive ethertype checks in fl_set_key. Refactor
them into is_vlan_key helper function.

To make the changes clearer, avoid touching identation levels. This is
the job for the next patch in the series.
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

285ba06b

ar5523: Use kzalloc instead of kmalloc/memset · e63dd412

Haowen Bai authored Apr 19, 2022

Use kzalloc rather than duplicating its implementation, which
makes code simple and easy to understand.
Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e63dd412

net: dsa: realtek: remove realtek,rtl8367s string · fcd30c96

Luiz Angelo Daros de Luca authored Apr 18, 2022

There is no need to add new compatible strings for each new supported
chip version. The compatible string is used only to select the subdriver
(rtl8365mb.c or rtl8366rb.c). Once in the subdriver, it will detect the
chip model by itself, ignoring which compatible string was used.

Link: https://lore.kernel.org/netdev/20220414014055.m4wbmr7tdz6hsa3m@bang-olufsen.dk/Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fcd30c96

dt-bindings: net: dsa: realtek: cleanup compatible strings · 6f2d04cc

Luiz Angelo Daros de Luca authored Apr 18, 2022

Compatible strings are used to help the driver find the chip ID/version
register for each chip family. After that, the driver can setup the
switch accordingly. Keep only the first supported model for each family
as a compatible string and reference other chip models in the
description.

The removed compatible strings have never been used in a released kernel.

CC: devicetree@vger.kernel.org
Link: https://lore.kernel.org/netdev/20220414014055.m4wbmr7tdz6hsa3m@bang-olufsen.dk/Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f2d04cc

Merge branch 'hns3-next' · e92453b9

David S. Miller authored Apr 20, 2022

Guangbin Huang says:

====================
net: hns3: updates for -next

This series includes some updates for the HNS3 ethernet driver.

Change logs:
V1 -> V2:
 - Fix failed to apply to net-next problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e92453b9

net: hns3: remove unnecessary line wrap for hns3_set_tunable · 29c17cb6

Hao Chen authored Apr 19, 2022

Remove unnecessary line wrap for hns3_set_tunable to improve
function readability.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29c17cb6

net: hns3: replace magic value by HCLGE_RING_REG_OFFSET · 350cb440

Peng Li authored Apr 19, 2022

Magic values are not recommended.

Signed-off-by: Peng Li<lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

350cb440

net: hns3: fix the wrong words in comments · 9c657cbc

Peng Li authored Apr 19, 2022

This patch fixes wrong words in comments.

Signed-off-by: Peng Li<lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9c657cbc

net: hns3: update the comment of function hclgevf_get_mbx_resp · 2e0f5388

Peng Li authored Apr 19, 2022

The param of function hclgevf_get_mbx_resp has been changed but the
comments not upodated. This patch updates it.

Signed-off-by: Peng Li<lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2e0f5388

net: hns3: add log for setting tx spare buf size · 2373b35c

Hao Chen authored Apr 19, 2022

For the active tx spare buffer size maybe changed according
to the page size, so add log to notice it.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2373b35c

net: hns3: add failure logs in hclge_set_vport_mtu · bcc7a98f

Jie Wang authored Apr 19, 2022

Currently, There is a low probability that pf mtu configuration fails, but
the information in logs is insufficient for problem locating when the VF
mtu value is illegally modified.

So record the vf index and vf mtu value at the failure scenario.
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bcc7a98f

net: hns3: refine the definition for struct hclge_pf_to_vf_msg · 6fde96df

Jian Shen authored Apr 19, 2022

The struct hclge_pf_to_vf_msg is used for mailbox message from
PF to VF, including both response and request. But its definition
can only indicate respone, which makes the message data copy in
function hclge_send_mbx_msg() unreadable. So refine it by edding
a general message definition into it.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6fde96df

net: hns3: refactor hns3_set_ringparam() · 07fdc163

Hao Chen authored Apr 19, 2022

Use struct hns3_ring_param to replace variable new/old_xxx and
add hns3_is_ringparam_changed() to judge them if is changed to
improve code readability.
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07fdc163

net: hns3: add ethtool parameter check for CQE/EQE mode · 286c61e7

Yufeng Mo authored Apr 19, 2022

For DEVICE_VERSION_V2, the hardware does not support the CQE mode.
So add capability bit for coalesce CQE mode and add parameter check
for it in ethtool.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

286c61e7

Merge branch 'atlantic-xdp-multi-buffer' · e97e917b

David S. Miller authored Apr 20, 2022

[PATCH net-next v5 0/3] net: atlantic: Add XDP support
@ 2022-04-17 10:12 Taehee Yoo
  2022-04-17 10:12 ` [PATCH net-next v5 1/3] net: atlantic: Implement xdp control plane Taehee Yoo
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Taehee Yoo @ 2022-04-17 10:12 UTC (permalink / raw)
  To: davem, kuba, pabeni, netdev, irusskikh, ast, daniel, hawk,
	john.fastabend, andrii, kafai, songliubraving, yhs, kpsingh, bpf
  Cc: ap420073

This patchset is to make atlantic to support multi-buffer XDP.

The first patch implement control plane of xdp.
The aq_xdp(), callback of .xdp_bpf is added.

The second patch implements data plane of xdp.
XDP_TX, XDP_DROP, and XDP_PASS is supported.
__aq_ring_xdp_clean() is added to receive and execute xdp program.
aq_nic_xmit_xdpf() is added to send packet by XDP.

The third patch implements callback of .ndo_xdp_xmit.
aq_xdp_xmit() is added to send redirected packets and it internally
calls aq_nic_xmit_xdpf().

Memory model is MEM_TYPE_PAGE_SHARED.

Order-2 page allocation is used when XDP is enabled.

LRO will be disabled if XDP program doesn't supports multi buffer.

AQC chip supports 32 multi-queues and 8 vectors(irq).
There are two options.
1. under 8 cores and maximum 4 tx queues per core.
2. under 4 cores and maximum 8 tx queues per core.

Like other drivers, these tx queues can be used only for XDP_TX,
XDP_REDIRECT queue. If so, no tx_lock is needed.
But this patchset doesn't use this strategy because getting hardware tx
queue index cost is too high.
So, tx_lock is used in the aq_nic_xmit_xdpf().

single-core, single queue, 80% cpu utilization.

  32.30%  [kernel]                  [k] aq_get_rxpages_xdp
  10.44%  [kernel]                  [k] aq_hw_read_reg <---------- here
   9.86%  bpf_prog_xxx_xdp_prog_tx  [k] bpf_prog_xxx_xdp_prog_tx
   5.51%  [kernel]                  [k] aq_ring_rx_clean

single-core, 8 queues, 100% cpu utilization, half PPS.

  52.03%  [kernel]                  [k] aq_hw_read_reg <---------- here
  18.24%  [kernel]                  [k] aq_get_rxpages_xdp
   4.30%  [kernel]                  [k] hw_atl_b0_hw_ring_rx_receive
   4.24%  bpf_prog_xxx_xdp_prog_tx  [k] bpf_prog_xxx_xdp_prog_tx
   2.79%  [kernel]                  [k] aq_ring_rx_clean

Performance result(64 Byte)
1. XDP_TX
  a. xdp_geieric, single core
    - 2.5Mpps, 100% cpu
  b. xdp_driver, single core
    - 4.5Mpps, 80% cpu
  c. xdp_generic, 8 core(hyper thread)
    - 6.3Mpps, 40% cpu
  d. xdp_driver, 8 core(hyper thread)
    - 6.3Mpps, 30% cpu

2. XDP_REDIRECT
  a. xdp_generic, single core
    - 2.3Mpps
  b. xdp_driver, single core
    - 4.5Mpps

v5:
 - Use MEM_TYPE_PAGE_SHARED instead of MEM_TYPE_PAGE_ORDER0
 - Use 2K frame size instead of 3K
 - Use order-2 page allocation instead of order-0
 - Rename aq_get_rxpage() to aq_alloc_rxpages()
 - Add missing PageFree stats for ethtool
 - Remove aq_unset_rxpage_xdp(), introduced by v2 patch due to
   change of memory model
 - Fix wrong last parameter value of xdp_prepare_buff()
 - Add aq_get_rxpages_xdp() to increase page reference count

v4:
 - Fix compile warning

v3:
 - Change wrong PPS performance result 40% -> 80% in single
   core(Intel i3-12100)
 - Separate aq_nic_map_xdp() from aq_nic_map_skb()
 - Drop multi buffer packets if single buffer XDP is attached
 - Disable LRO when single buffer XDP is attached
 - Use xdp_get_{frame/buff}_len()

v2:
 - Do not use inline in C file

Taehee Yoo (3):
  net: atlantic: Implement xdp control plane
  net: atlantic: Implement xdp data plane
  net: atlantic: Implement .ndo_xdp_xmit handler

 .../net/ethernet/aquantia/atlantic/aq_cfg.h   |   1 +
 .../ethernet/aquantia/atlantic/aq_ethtool.c   |   9 +
 .../net/ethernet/aquantia/atlantic/aq_main.c  |  87 ++++
 .../net/ethernet/aquantia/atlantic/aq_main.h  |   2 +
 .../net/ethernet/aquantia/atlantic/aq_nic.c   | 136 ++++++
 .../net/ethernet/aquantia/atlantic/aq_nic.h   |   5 +
 .../net/ethernet/aquantia/atlantic/aq_ring.c  | 409 ++++++++++++++++--
 .../net/ethernet/aquantia/atlantic/aq_ring.h  |  21 +-
 .../net/ethernet/aquantia/atlantic/aq_vec.c   |  23 +-
 .../net/ethernet/aquantia/atlantic/aq_vec.h   |   6 +
 .../aquantia/atlantic/hw_atl/hw_atl_a0.c      |   6 +-
 .../aquantia/atlantic/hw_atl/hw_atl_b0.c      |  10 +-
 12 files changed, 670 insertions(+), 45 deletions(-)

--
2.17.1

^ permalink raw reply	[flat|nested] 4+ messages in thread
* [PATCH net-next v5 1/3] net: atlantic: Implement xdp control plane
  2022-04-17 10:12 [PATCH net-next v5 0/3] net: atlantic: Add XDP support Taehee Yoo
@ 2022-04-17 10:12 ` Taehee Yoo
  2022-04-17 10:12 ` [PATCH net-next v5 2/3] net: atlantic: Implement xdp data plane Taehee Yoo
  2022-04-17 10:12 ` [PATCH net-next v5 3/3] net: atlantic: Implement .ndo_xdp_xmit handler Taehee Yoo
  2 siblings, 0 replies; 4+ messages in thread
From: Taehee Yoo @ 2022-04-17 10:12 UTC (permalink / raw)
  To: davem, kuba, pabeni, netdev, irusskikh, ast, daniel, hawk,
	john.fastabend, andrii, kafai, songliubraving, yhs, kpsingh, bpf
  Cc: ap420073

aq_xdp() is a xdp setup callback function for Atlantic driver.
When XDP is attached or detached, the device will be restarted because
it uses different headroom, tailroom, and page order value.

If XDP enabled, it switches default page order value from 0 to 2.
Because the default maximum frame size is still 2K and it needs
additional area for headroom and tailroom.
The total size(headroom + frame size + tailroom) is 2624.
So, 1472Bytes will be always wasted for every frame.
But when order-2 is used, these pages can be used 6 times
with flip strategy.
It means only about 106Bytes per frame will be wasted.

Also, It supports xdp fragment feature.
MTU can be 16K if xdp prog supports xdp fragment.
If not, MTU can not exceed 2K - ETH_HLEN - ETH_FCS.

And a static key is added and It will be used to call the xdp_clean
handler in ->poll(). data plane implementation will be contained
the followed patch.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v5:
 - Use MEM_TYPE_PAGE_SHARED instead of MEM_TYPE_PAGE_ORDER0
 - Use 2K frame size instead of 3K
 - Use order-2 page allocation instead of order-0
 - Rename aq_get_rxpage() to aq_alloc_rxpages()

v4:
 - No changed

v3:
 - Disable LRO when single buffer XDP is attached

v2:
 - No changed

e97e917b

net: atlantic: Implement .ndo_xdp_xmit handler · 45638f01

Taehee Yoo authored Apr 17, 2022

aq_xdp_xmit() is the callback function of .ndo_xdp_xmit.
It internally calls aq_nic_xmit_xdpf() to send packet.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

45638f01

net: atlantic: Implement xdp data plane · 26efaef7

Taehee Yoo authored Apr 17, 2022

It supports XDP_PASS, XDP_DROP and multi buffer.

The new function aq_nic_xmit_xdpf() is used to send packet with
xdp_frame and internally it calls aq_nic_map_xdp().

AQC chip supports 32 multi-queues and 8 vectors(irq).
there are two option
1. under 8 cores and 4 tx queues per core.
2. under 4 cores and 8 tx queues per core.

Like ixgbe, these tx queues can be used only for XDP_TX, XDP_REDIRECT
queue. If so, no tx_lock is needed.
But this patchset doesn't use this strategy because getting hardware tx
queue index cost is too high.
So, tx_lock is used in the aq_nic_xmit_xdpf().

single-core, single queue, 80% cpu utilization.

  30.75%  bpf_prog_xxx_xdp_prog_tx  [k] bpf_prog_xxx_xdp_prog_tx
  10.35%  [kernel]                  [k] aq_hw_read_reg <---------- here
   4.38%  [kernel]                  [k] get_page_from_freelist

single-core, 8 queues, 100% cpu utilization, half PPS.

  45.56%  [kernel]                  [k] aq_hw_read_reg <---------- here
  17.58%  bpf_prog_xxx_xdp_prog_tx  [k] bpf_prog_xxx_xdp_prog_tx
   4.72%  [kernel]                  [k] hw_atl_b0_hw_ring_rx_receive

The new function __aq_ring_xdp_clean() is a xdp rx handler and this is
called only when XDP is attached.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

26efaef7

net: atlantic: Implement xdp control plane · 0d14657f

Taehee Yoo authored Apr 17, 2022

aq_xdp() is a xdp setup callback function for Atlantic driver.
When XDP is attached or detached, the device will be restarted because
it uses different headroom, tailroom, and page order value.

If XDP enabled, it switches default page order value from 0 to 2.
Because the default maximum frame size is still 2K and it needs
additional area for headroom and tailroom.
The total size(headroom + frame size + tailroom) is 2624.
So, 1472Bytes will be always wasted for every frame.
But when order-2 is used, these pages can be used 6 times
with flip strategy.
It means only about 106Bytes per frame will be wasted.

Also, It supports xdp fragment feature.
MTU can be 16K if xdp prog supports xdp fragment.
If not, MTU can not exceed 2K - ETH_HLEN - ETH_FCS.

And a static key is added and It will be used to call the xdp_clean
handler in ->poll(). data plane implementation will be contained
the followed patch.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0d14657f

Merge branch 'dsa-cross-chip-notifier-cleanup' · 8ab38ed7

David S. Miller authored Apr 20, 2022

Vladimir Oltean says:

====================
DSA cross-chip notifier cleanups

This patch set makes the following improvements:

- Cross-chip notifiers pass a switch index, port index, sometimes tree
  index, all as integers. Sometimes we need to recover the struct
  dsa_port based on those integers. That recovery involves traversing a
  list. By passing directly a pointer to the struct dsa_port we can
  avoid that, and the indices passed previously can still be obtained
  from the passed struct dsa_port.

- Resetting VLAN filtering on a switch has explicit code to make it run
  on a single switch, so it has no place to stay in the cross-chip
  notifier code. Move it out.

- Changing the MTU on a user port affects only that single port, yet the
  code passes through the cross-chip notifier layer where all switches
  are notified. Avoid that.

- Other related cosmetic changes in the MTU changing procedure.

Apart from the slight improvement in performance given by
(a) doing less work in cross-chip notifiers
(b) emitting less cross-chip notifiers
we also end up with about 100 less lines of code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

8ab38ed7

net: dsa: don't emit targeted cross-chip notifiers for MTU change · be6ff966

Vladimir Oltean authored Apr 15, 2022

A cross-chip notifier with "targeted_match=true" is one that matches
only the local port of the switch that emitted it. In other words,
passing through the cross-chip notifier layer serves no purpose.

Eliminate this concept by calling directly ds->ops->port_change_mtu
instead of emitting a targeted cross-chip notifier. This leaves the
DSA_NOTIFIER_MTU event being emitted only for MTU updates on the CPU
port, which need to be reflected also across all DSA links.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be6ff966

net: dsa: drop dsa_slave_priv from dsa_slave_change_mtu · 4715029f

Vladimir Oltean authored Apr 15, 2022

We can get a hold of the "ds" pointer directly from "dp", no need for
the dsa_slave_priv.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4715029f

net: dsa: avoid one dsa_to_port() in dsa_slave_change_mtu · cf1c39d3

Vladimir Oltean authored Apr 15, 2022

We could retrieve the cpu_dp pointer directly from the "dp" we already
have, no need to resort to dsa_to_port(ds, port).

This change also removes the need for an "int port", so that is also
deleted.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cf1c39d3

net: dsa: use dsa_tree_for_each_user_port in dsa_slave_change_mtu · b2033a05

Vladimir Oltean authored Apr 15, 2022

Use the more conventional iterator over user ports instead of explicitly
ignoring them, and use the more conventional name "other_dp" instead of
"dp_iter", for readability.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b2033a05

net: dsa: make cross-chip notifiers more efficient for host events · 726816a1

Vladimir Oltean authored Apr 15, 2022

To determine whether a given port should react to the port targeted by
the notifier, dsa_port_host_vlan_match() and dsa_port_host_address_match()
look at the positioning of the switch port currently executing the
notifier relative to the switch port for which the notifier was emitted.

To maintain stylistic compatibility with the other match functions from
switch.c, the host address and host VLAN match functions take the
notifier information about targeted port, switch and tree indices as
argument. However, these functions only use that information to retrieve
the struct dsa_port *targeted_dp, which is an invariant for the outer
loop that calls them. So it makes more sense to calculate the targeted
dp only once, and pass it to them as argument.

But furthermore, the targeted dp is actually known at the time the call
to dsa_port_notify() is made. It is just that we decide to only save the
indices of the port, switch and tree in the notifier structure, just to
retrace our steps and find the dp again using dsa_switch_find() and
dsa_to_port().

But both the above functions are relatively expensive, since they need
to iterate through lists. It appears more straightforward to make all
notifiers just pass the targeted dp inside their info structure, and
have the code that needs the indices to look at info->dp->index instead
of info->port, or info->dp->ds->index instead of info->sw_index, or
info->dp->ds->dst->index instead of info->tree_index.

For the sake of consistency, all cross-chip notifiers are converted to
pass the "dp" directly.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

726816a1

net: dsa: move reset of VLAN filtering to dsa_port_switchdev_unsync_attrs · 8e9e678e

Vladimir Oltean authored Apr 15, 2022

In dsa_port_switchdev_unsync_attrs() there is a comment that resetting
the VLAN filtering isn't done where it is expected. And since commit
108dc874 ("net: dsa: Avoid cross-chip syncing of VLAN filtering"),
there is no reason to handle this in switch.c either.

Therefore, move the logic to port.c, and adapt it slightly to the data
structures and naming conventions from there.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8e9e678e

19 Apr, 2022 1 commit

MAINTAINERS: Add maintainers for CTU CAN FD IP core driver · cfdb2f36

Pavel Pisa authored Mar 22, 2022

This patch adds an entry for the CTU CAN FD IP to the maintainers
file.

Link: https://lore.kernel.org/all/2cc77e2999d9688bed155e4c7f7807e46d1bf9e3.1647904780.git.pisa@cmp.felk.cvut.czSigned-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

cfdb2f36