Commits · be6b41c15dc09c067492bd23668763f551747e4e · Kirill Smelkov / linux

17 Feb, 2022 40 commits

ipv6/addrconf: ensure addrconf_verify_rtnl() has completed · be6b41c1

Eric Dumazet authored Feb 16, 2022

Before freeing the hash table in addrconf_exit_net(),
we need to make sure the work queue has completed,
or risk NULL dereference or UAF.

Thus, use cancel_delayed_work_sync() to enforce this.
We do not hold RTNL in addrconf_exit_net(), making this safe.

Fixes: 8805d13f ("ipv6/addrconf: use one delayed work per netns")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220216182037.3742-1-eric.dumazet@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

be6b41c1

net: allow out-of-order netdev unregistration · faab39f6

Jakub Kicinski authored Feb 15, 2022

Sprinkle for each loops to allow netdevices to be unregistered
out of order, as their refs are released.

This prevents problems caused by dependencies between netdevs
which want to release references in their ->priv_destructor.
See commit d6ff94af ("vlan: move dev_put into vlan_dev_uninit")
for example.

Eric has removed the only known ordering requirement in
commit c002496b ("Merge branch 'ipv6-loopback'")
so let's try this and see if anything explodes...
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20220215225310.3679266-2-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

faab39f6

net: transition netdev reg state earlier in run_todo · ae68db14

Jakub Kicinski authored Feb 15, 2022

In prep for unregistering netdevs out of order move the netdev
state validation and change outside of the loop.

While at it modernize this code and use WARN() instead of
pr_err() + dump_stack().
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20220215225310.3679266-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

ae68db14

Merge branch 'ping6-SOL_IPV6' · 4d449bdc

David S. Miller authored Feb 17, 2022

Jakub Kicinski says:

====================
net: ping6: support setting basic SOL_IPV6 options via cmsg

Support for IPV6_HOPLIMIT, IPV6_TCLASS, IPV6_DONTFRAG on ICMPv6
sockets and associated tests. I have no immediate plans to
implement IPV6_FLOWINFO and all the extension header stuff.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4d449bdc

selftests: net: basic test for IPV6_2292* · a22982c3

Jakub Kicinski authored Feb 16, 2022

Add a basic test to make sure ping sockets don't crash
with IPV6_2292* options.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a22982c3

selftests: net: test IPV6_HOPLIMIT · 05ae83d5

Jakub Kicinski authored Feb 16, 2022

Test setting IPV6_HOPLIMIT via setsockopt and cmsg
across socket types.

Output without the kernel support (this series):

  Case HOPLIMIT ICMP cmsg - packet data returned 1, expected 0
  Case HOPLIMIT ICMP diff - packet data returned 1, expected 0
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

05ae83d5

selftests: net: test IPV6_TCLASS · 9657ad09

Jakub Kicinski authored Feb 16, 2022

Test setting IPV6_TCLASS via setsockopt and cmsg
across socket types.

Output without the kernel support (this series):

  Case TCLASS ICMP cmsg - packet data returned 1, expected 0
  Case TCLASS ICMP cmsg - rejection returned 0, expected 1
  Case TCLASS ICMP diff - pass returned 1, expected 0
  Case TCLASS ICMP diff - packet data returned 1, expected 0
  Case TCLASS ICMP diff - rejection returned 0, expected 1
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9657ad09

selftests: net: test IPV6_DONTFRAG · 6f97c7c6

Jakub Kicinski authored Feb 16, 2022

Test setting IPV6_DONTFRAG via setsockopt and cmsg
across socket types.

Output without the kernel support (this series):

    Case DONTFRAG ICMP setsock returned 0, expected 1
    Case DONTFRAG ICMP cmsg returned 0, expected 1
    Case DONTFRAG ICMP both returned 0, expected 1
    Case DONTFRAG ICMP diff returned 0, expected 1
  FAIL - 4/24 cases failed
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

6f97c7c6

net: ping6: support setting basic SOL_IPV6 options via cmsg · 13651224

Jakub Kicinski authored Feb 16, 2022

Support setting IPV6_HOPLIMIT, IPV6_TCLASS, IPV6_DONTFRAG
during sendmsg via SOL_IPV6 cmsgs.

tclass and dontfrag are init'ed from struct ipv6_pinfo in
ipcm6_init_sk(), while hlimit is inited to -1, so we need
to handle it being populated via cmsg explicitly.

Leave extension headers and flowlabel unimplemented.
Those are slightly more laborious to test and users
seem to primarily care about IPV6_TCLASS.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

13651224

Merge branch 'switchdev-BRENTRY' · d54f16c7

David S. Miller authored Feb 17, 2022

Vladimir Oltean says:

====================
kRemove BRENTRY checks from switchdev drivers

As discussed here:
https://patchwork.kernel.org/project/netdevbpf/patch/20220214233111.1586715-2-vladimir.oltean@nxp.com/#24738869

no switchdev driver makes use of VLAN port objects that lack the
BRIDGE_VLAN_INFO_BRENTRY flag. Notifying them in the first place rather
seems like an omission of commit 9c86ce2c ("net: bridge: Notify
about bridge VLANs").

Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev
master VLANs without BRENTRY flag") that was just merged, the bridge no
longer notifies switchdev upon creation of these VLANs, so we can remove
the checks from drivers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d54f16c7

net: ti: cpsw: remove guards against !BRIDGE_VLAN_INFO_BRENTRY · 5edb65ea

Vladimir Oltean authored Feb 16, 2022

Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev
master VLANs without BRENTRY flag"), the bridge no longer emits
switchdev notifiers for VLANs that don't have the
BRIDGE_VLAN_INFO_BRENTRY flag, so these checks are dead code.
Remove them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5edb65ea

net: ti: am65-cpsw-nuss: remove guards against !BRIDGE_VLAN_INFO_BRENTRY · 1d21c327

Vladimir Oltean authored Feb 16, 2022

Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev
master VLANs without BRENTRY flag"), the bridge no longer emits
switchdev notifiers for VLANs that don't have the
BRIDGE_VLAN_INFO_BRENTRY flag, so these checks are dead code.
Remove them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1d21c327

net: sparx5: remove guards against !BRIDGE_VLAN_INFO_BRENTRY · 318994d3

Vladimir Oltean authored Feb 16, 2022

Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev
master VLANs without BRENTRY flag"), the bridge no longer emits
switchdev notifiers for VLANs that don't have the
BRIDGE_VLAN_INFO_BRENTRY flag, so these checks are dead code.
Remove them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

318994d3

net: lan966x: remove guards against !BRIDGE_VLAN_INFO_BRENTRY · ba43b547

Vladimir Oltean authored Feb 16, 2022

Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev
master VLANs without BRENTRY flag"), the bridge no longer emits
switchdev notifiers for VLANs that don't have the
BRIDGE_VLAN_INFO_BRENTRY flag, so these checks are dead code.
Remove them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba43b547

mlxsw: spectrum: remove guards against !BRIDGE_VLAN_INFO_BRENTRY · ddaff504

Vladimir Oltean authored Feb 16, 2022

Since commit 3116ad06 ("net: bridge: vlan: don't notify to switchdev
master VLANs without BRENTRY flag"), the bridge no longer emits
switchdev notifiers for VLANs that don't have the
BRIDGE_VLAN_INFO_BRENTRY flag, so these checks are dead code.
Remove them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ddaff504

Merge branch 'ptp-over-udp-dsa' · 5da1033b

David S. Miller authored Feb 17, 2022

Vladimir Oltean says:

====================
Support PTP over UDP with the ocelot-8021q DSA tagging protocol

The alternative tag_8021q-based tagger for Ocelot switches, added here:
https://patchwork.kernel.org/project/netdevbpf/cover/20210129010009.3959398-1-olteanv@gmail.com/

gained support for PTP over L2 here:
https://patchwork.kernel.org/project/netdevbpf/cover/20210213223801.1334216-1-olteanv@gmail.com/

mostly as a minimum viable requirement. That PTP support was mostly
self-contained code that installed some rules to replicate PTP packets
on the CPU queue, in felix_setup_mmio_filtering().

However ocelot-8021q starts to look more interesting for general purpose
usage, so it is now time to reduce the technical debt by integrating the
PTP traps used by Felix for tag_8021q with the rest of the Ocelot driver.

There is further consolidation of traps to be done. The cookies used by
MRP traps overlap with the cookies used for tag_8021q PTP traps, so
those features could not be used at the same time.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

5da1033b

net: dsa: tag_ocelot_8021q: calculate TX checksum in software for deferred packets · 29940ce3

Vladimir Oltean authored Feb 16, 2022

DSA inherits NETIF_F_CSUM_MASK from master->vlan_features, and the
expectation is that TX checksumming is offloaded and not done in
software.

Normally the DSA master takes care of this, but packets handled by
ocelot_defer_xmit() are a very special exception, because they are
actually injected into the switch through register-based MMIO. So the
DSA master is not involved at all for these packets => no one calculates
the checksum.

This allows PTP over UDP to work using the ocelot-8021q tagging
protocol.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29940ce3

net: dsa: felix: update destinations of existing traps with ocelot-8021q · 99348004

Vladimir Oltean authored Feb 16, 2022

Historically, the felix DSA driver has installed special traps such that
PTP over L2 works with the ocelot-8021q tagging protocol; commit
0a6f17c6 ("net: dsa: tag_ocelot_8021q: add support for PTP
timestamping") has the details.

Then the ocelot switch library also gained more comprehensive support
for PTP traps through commit 96ca08c0 ("net: mscc: ocelot: set up
traps for PTP packets").

Right now, PTP over L2 works using ocelot-8021q via the traps it has set
for itself, but nothing else does. Consolidating the two code blocks
would make ocelot-8021q gain support for PTP over L4 and tc-flower
traps, and at the same time avoid some code and TCAM duplication.

The traps are similar in intent, but different in execution, so some
explanation is required. The traps set up by felix_setup_mmio_filtering()
are VCAP IS1 filters, which have a PAG that chains them to a VCAP IS2
filter, and the IS2 is where the 'trap' action resides. The traps set up
by ocelot_trap_add(), on the other hand, have a single filter, in VCAP
IS2. The reason for chaining VCAP IS1 and IS2 in Felix was to ensure
that the hardcoded traps take precedence and cannot be overridden by the
Ocelot switch library.

So in principle, the PTP traps needed for ocelot-8021q in the Felix
driver can rely on ocelot_trap_add(), but the filters need to be patched
to account for a quirk that LS1028A has: the quirk_no_xtr_irq described
in commit 0a6f17c6 ("net: dsa: tag_ocelot_8021q: add support for PTP
timestamping"). Live-patching is done by iterating through the trap list
every time we know it has been updated, and transforming a trap into a
redirect + CPU copy if ocelot-8021q is in use.

Making the DSA ocelot-8021q tagger work with the Ocelot traps means we
can eliminate the dedicated OCELOT_VCAP_IS1_TAG_8021Q_PTP_MMIO and
OCELOT_VCAP_IS2_TAG_8021Q_PTP_MMIO cookies. To minimize the patch delta,
OCELOT_VCAP_IS2_MRP_TRAP takes the place of OCELOT_VCAP_IS2_TAG_8021Q_PTP_MMIO
(the alternative would have been to left-shift all cookie numbers by 1).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

99348004

net: dsa: felix: remove dead code in felix_setup_mmio_filtering() · d78637a8

Vladimir Oltean authored Feb 16, 2022

There has been some controversy related to the sanity check that a CPU
port exists, and commit e8b1d769 ("net: dsa: felix: Fix memory leak
in felix_setup_mmio_filtering") even "corrected" an apparent memory leak
as static analysis tools see it.

However, the check is completely dead code, since the earliest point at
which felix_setup_mmio_filtering() can be called is:

felix_pci_probe
-> dsa_register_switch
   -> dsa_switch_probe
      -> dsa_tree_setup
         -> dsa_tree_setup_cpu_ports
            -> dsa_tree_setup_default_cpu
               -> contains the "DSA: tree %d has no CPU port\n" check
         -> dsa_tree_setup_master
            -> dsa_master_setup
               -> sysfs_create_group(&dev->dev.kobj, &dsa_group);
                  -> makes tagging_store() callable
                     -> dsa_tree_change_tag_proto
                        -> dsa_tree_notify
                           -> dsa_switch_event
                              -> dsa_switch_change_tag_proto
                                 -> ds->ops->change_tag_protocol
                                    -> felix_change_tag_protocol
                                       -> felix_set_tag_protocol
                                          -> felix_setup_tag_8021q
                                             -> felix_setup_mmio_filtering
                                                -> breaks at first CPU port

So probing would have failed earlier if there wasn't any CPU port
defined.

To avoid all confusion, delete the dead code and replace it with a
comment.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d78637a8

net: mscc: ocelot: annotate which traps need PTP timestamping · 9d75b881

Vladimir Oltean authored Feb 16, 2022

The ocelot switch library does not need this information, but the felix
DSA driver does.

As a reminder, the VSC9959 switch in LS1028A doesn't have an IRQ line
for packet extraction, so to be notified that a PTP packet needs to be
dequeued, it receives that packet also over Ethernet, by setting up a
packet trap. The Felix driver needs to install special kinds of traps
for packets in need of RX timestamps, such that the packets are
replicated both over Ethernet and over the CPU port module.

But the Ocelot switch library sets up more than one trap for PTP event
messages; it also traps PTP general messages, MRP control messages etc.
Those packets don't need PTP timestamps, so there's no reason for the
Felix driver to send them to the CPU port module.

By knowing which traps need PTP timestamps, the Felix driver can
adjust the traps installed using ocelot_trap_add() such that only those
will actually get delivered to the CPU port module.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9d75b881

net: mscc: ocelot: keep traps in a list · e42bd4ed

Vladimir Oltean authored Feb 16, 2022

When using the ocelot-8021q tagging protocol, the CPU port isn't
configured as an NPI port, but is a regular port. So a "trap to CPU"
operation is actually a "redirect" operation. So DSA needs to set up the
trapping action one way or another, depending on the tagging protocol in
use.

To ease DSA's work of modifying the action, keep all currently installed
traps in a list, so that DSA can live-patch them when the tagging
protocol changes.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e42bd4ed

net: dsa: felix: use DSA port iteration helpers · 2960bb14

Vladimir Oltean authored Feb 16, 2022

Use the helpers that avoid the quadratic complexity associated with
calling dsa_to_port() indirectly: dsa_is_unused_port(),
dsa_is_cpu_port().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2960bb14

net: mscc: ocelot: avoid overlap in VCAP IS2 between PTP and MRP traps · 85ea0daa

Vladimir Oltean authored Feb 16, 2022

OCELOT_VCAP_IS2_TAG_8021Q_TXVLAN overlaps with OCELOT_VCAP_IS2_MRP_REDIRECT.
To avoid this, make OCELOT_VCAP_IS2_MRP_REDIRECT take the cookie region
from N to 2 * N - 1 (where N is ocelot->num_phys_ports).

To avoid any risk that the singleton (not per port) VCAP IS2 filters
overlap with per-port VCAP IS2 filters, we must ensure that the number
of singleton filters is smaller than the number of physical ports.
This is true right now, but may change in the future as switches with
less ports get supported, or more singleton filters get added. So to be
future-proof, let's move the singleton filters at the end of the range,
where they won't overlap with anything to their right.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

85ea0daa

net: mscc: ocelot: use a single VCAP filter for all MRP traps · b9bace6e

Vladimir Oltean authored Feb 16, 2022

The MRP assist code installs a VCAP IS2 trapping rule for each port, but
since the key and the action is the same, just the ingress port mask
differs, there isn't any need to do this. We can save some space in the
TCAM by using a single filter and adjusting the ingress port mask.

Reuse the ocelot_trap_add() and ocelot_trap_del() functions for this
purpose.

Now that the cookies are no longer per port, we need to change the
allocation scheme such that MRP traps use a fixed number.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9bace6e

net: mscc: ocelot: delete OCELOT_MRP_CPUQ · 36fac35b

Vladimir Oltean authored Feb 16, 2022

MRP frames are configured to be trapped to the CPU queue 7, and this
number is reflected in the extraction header. However, the information
isn't used anywhere, so just leave MRP frames to go to CPU queue 0
unless needed otherwise.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

36fac35b

net: mscc: ocelot: consolidate cookie allocation for private VCAP rules · c518afec

Vladimir Oltean authored Feb 16, 2022

Every use case that needed VCAP filters (in order: DSA tag_8021q, MRP,
PTP traps) has hardcoded filter identifiers that worked well enough for
that use case alone. But when two or more of those use cases would be
used together, some of those identifiers would overlap, leading to
breakage.

Add definitions for each cookie and centralize them in ocelot_vcap.h,
such that the overlaps are more obvious.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c518afec

net: mscc: ocelot: use a consistent cookie for MRP traps · e3c02b7c

Vladimir Oltean authored Feb 16, 2022

The driver uses an identifier equal to (ocelot->num_phys_ports + port)
for MRP traps installed when the system is in the role of an MRC, and an
identifier equal to (port) otherwise.

Use the same identifier in both cases as a consolidation for the various
cookie values spread throughout the driver.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3c02b7c

Merge tag 'mlx5-updates-2022-02-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · c8b441d2

David S. Miller authored Feb 17, 2022

Saeed Mahameed says:

====================
mlx5-updates-2022-02-16

Misc updates for mlx5:
1) Alex Liu Adds support for using xdp->data_meta

2) Aya Levin Adds PTP counters and port time stamp mode for representors and
switchdev mode.

3) Tariq Toukan, Striding RQ simple improvements.

4) Roi Dayan (7): Create multiple attr instances per flow

Some TC actions use post actions for their implementation.
For example CT and sample actions.

Create a new flow attr instance after each multi table action and
create a post action rule for it as a generic parsing step.
Now multi table actions like CT, sample don't require to do it.

When flow has multiple attr instances, the first flow attr is being
offloaded normally and linked to the next attr (post action rule) by
setting an id on reg_c for matching.
Post action rule (rule created from second attr instance) match the
id on reg_c and does rest of the actions.

Example rule with actions CT,goto will be created with 2 attr instances
as following: attr1(CT)->attr2(goto)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c8b441d2

net/mlx5e: TC, Allow sample action with CT · b070e703

Roi Dayan authored Dec 02, 2021

Allow sample+CT actions but still block sample+CT NAT
as it is not supported.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

b070e703

net/mlx5e: TC, Make post_act parse CT and sample actions · 7843bd60

Roi Dayan authored Dec 19, 2021

Before this commit post_act can be used for normal rules
and didn't handle special cases like CT and sample.
With this commit post_act rule can also handle the special cases
when needed.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

7843bd60

net/mlx5e: TC, Clean redundant counter flag from tc action parsers · 2a829fe2

Roi Dayan authored Nov 21, 2021

When tc actions being parsed only the last flow attr created needs the
counter flag and the previous flags being reset.
Clean the flag from the tc action parsers.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

2a829fe2

net/mlx5e: Use multi table support for CT and sample actions · a8128326

Roi Dayan authored Dec 01, 2021

CT and sample actions use post actions for their implementation.
Flag those actions as multi table actions so the post act infrastructure
will handle the post actions allocation.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

a8128326

net/mlx5e: Create new flow attr for multi table actions · 8300f225

Roi Dayan authored Aug 08, 2021

Some TC actions use post actions for their implementation.
For example CT and sample actions.

Create a new flow attr after each multi table action and
create a post action rule for it.

First flow attr being offloaded normally and linked to the next
attr (post action rule) with setting an id on reg_c.
Post action rules match the id on reg_c and continue to the next one.

The flow counter is allocated on the last rule.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

8300f225

net/mlx5e: Add post act offload/unoffload API · 314e1105

Roi Dayan authored Oct 14, 2021

Introduce mlx5e_tc_post_act_offload() and mlx5e_tc_post_act_unoffload()
to be able to unoffload and reoffload existing post action rules handles.
For example in neigh update events, the driver removes and readds rules in
hardware.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

314e1105

net/mlx5e: Pass actions param to actions_match_supported() · 0610f8dc

Roi Dayan authored Sep 12, 2021

Currently the mlx5_flow object contains a single mlx5_attr instance.
However, multi table actions (e.g. CT) instantiate multiple attr instances.

Currently action_match_supported() reads the actions flag from the
flow's attribute instance. Modify the function to receive the action
flags as a parameter which is set by the calling function and
pass the aggregated actions to actions_match_supported().
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

0610f8dc

net/mlx5e: TC, Move flow hashtable to be per rep · d1a3138f

Paul Blakey authored Jan 24, 2022

To allow shared tc block offload between two or more reps of the
same eswitch, move the tc flow hashtable to be per rep, instead
of per eswitch.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

d1a3138f

net/mlx5e: E-Switch, Add support for tx_port_ts in switchdev mode · bfbdd77a

Aya Levin authored Jan 20, 2022

When turning on tx_port_ts (private flag) a PTP-SQ is created. Consider
this queue when adding rules matching SQs to VPORTs. Otherwise the
traffic on this queue won't reach the wire.
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

bfbdd77a

net/mlx5e: E-Switch, Add PTP counters for uplink representor · 7c5f940d

Aya Levin authored Jan 19, 2022

There is a configuration where the uplink interface is the synchronizer.
Add PTP counters for this interface for monitoring.
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

7c5f940d

net/mlx5e: RX, Restrict bulk size for small Striding RQs · 4b5fba4a

Tariq Toukan authored Jan 19, 2022

In RQs of type multi-packet WQE (Striding RQ), each WQE is relatively
large (typically 256KB) but their number is relatively small (8 in
default).

Re-mapping the descriptors' buffers before re-posting them is done via
UMR (User-Mode Memory Registration) operations.

On the one hand, posting UMR WQEs in bulks reduces communication overhead
with the HW and better utilizes its processing units.
On the other hand, delaying the WQE repost operations for a small RQ
(say, of 4 WQEs) might drastically hit its performance, causing packet
drops due to no receive buffer, for high or bursty incoming packets rate.

Here we restrict the bulk size for too small RQs. Effectively, with the current
constants, RQ of size 4 (minimum allowed) would have no bulking, while larger
RQs will continue working with bulks of 2.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

4b5fba4a

net/mlx5e: Default to Striding RQ when not conflicting with CQE compression · 1d5024f8

Tariq Toukan authored Jan 11, 2022

CQE compression is turned on by default on slow pci systems to help
reduce the load on pci.
In this case, Striding RQ was turned off as CQEs of packets that span
several strides were not compressed, significantly reducing the compression
effectiveness.
This issue does not exist when using the newer mini_cqe format "stride_index".
Hence, allow defaulting to Striding RQ in this case.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

1d5024f8