Commits · 30c2515b89f1a6361170961e72bebd375f611b9b · Kirill Smelkov / linux

05 Aug, 2021 12 commits

net: ipa: don't suspend/resume modem if not up · 30c2515b

Alex Elder authored Aug 04, 2021

The modem network device is set up by ipa_modem_start().  But its
TX queue is not actually started and endpoints enabled until it is
opened.

So avoid stopping the modem network device TX queue and disabling
endpoints on suspend or stop unless the netdev is marked UP.  And
skip attempting to resume unless it is UP.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
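
The guard described above can be sketched as a small user-space model.
This is not the driver's code: struct model_netdev, MODEL_IFF_UP and the
model_*() helpers are hypothetical stand-ins for the netdev, its IFF_UP
flag and the IPA suspend/resume paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_IFF_UP 0x1u	/* hypothetical stand-in for IFF_UP */

struct model_netdev {
	unsigned int flags;	/* MODEL_IFF_UP is set once the device is opened */
	bool tx_queue_running;	/* models the netdev TX queue state */
	bool endpoints_enabled;	/* models the modem TX/RX endpoints */
};

/* Opening the device is what starts the TX queue and enables endpoints. */
void model_open(struct model_netdev *nd)
{
	assert(nd != NULL);
	nd->endpoints_enabled = true;
	nd->tx_queue_running = true;
	nd->flags |= MODEL_IFF_UP;
}

/* Returns true if the queue and endpoints were touched, false if skipped. */
bool model_suspend(struct model_netdev *nd)
{
	assert(nd != NULL);
	if (!(nd->flags & MODEL_IFF_UP))
		return false;	/* never opened: nothing to stop */
	nd->tx_queue_running = false;
	nd->endpoints_enabled = false;
	return true;
}

/* Returns true if the queue and endpoints were restarted, false if skipped. */
bool model_resume(struct model_netdev *nd)
{
	assert(nd != NULL);
	if (!(nd->flags & MODEL_IFF_UP))
		return false;	/* never opened: nothing to restart */
	nd->endpoints_enabled = true;
	nd->tx_queue_running = true;
	return true;
}
```

On a device that was set up but never opened, both model_suspend() and
model_resume() return false and leave the state untouched.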

Merge branch 'sja1105-H' · 1f52247e

David S. Miller authored Aug 05, 2021

Vladimir Oltean says:

====================
NXP SJA1105 driver support for "H" switch topologies

Changes in v3:
Preserve the behavior of dsa_tree_setup_default_cpu() which is to pick
the first CPU port and not the last.

Changes in v2:
Send as non-RFC, drop the patches for discarding DSA-tagged packets on
user ports and DSA-untagged packets on DSA and CPU ports for now.

NXP builds boards like the Bluebox 3 where there are multiple SJA1110
switches connected to an LX2160A, but they are also connected to each
other. I call this topology an "H" tree because of the lateral
connection between switches. A piece extracted from a non-upstream
device tree looks like this:

&spi_bridge {
        /* SW1 */
        ethernet-switch@0 {
                compatible = "nxp,sja1110a";
                reg = <0>;
                dsa,member = <0 0>;

                ethernet-ports {
                        #address-cells = <1>;
                        #size-cells = <0>;

                        /* SW1_P1 */
                        port@1 {
                                reg = <1>;
                                label = "con_2x20";
                                phy-mode = "sgmii";

                                fixed-link {
                                        speed = <1000>;
                                        full-duplex;
                                };
                        };

                        port@2 {
                                reg = <2>;
                                ethernet = <&dpmac17>;
                                phy-mode = "rgmii-id";

                                fixed-link {
                                        speed = <1000>;
                                        full-duplex;
                                };
                        };

                        port@3 {
                                reg = <3>;
                                label = "1ge_p1";
                                phy-mode = "rgmii-id";
                                phy-handle = <&sw1_mii3_phy>;
                        };

                        sw1p4: port@4 {
                                reg = <4>;
                                link = <&sw2p1>;
                                phy-mode = "sgmii";

                                fixed-link {
                                        speed = <1000>;
                                        full-duplex;
                                };
                        };

                        port@5 {
                                reg = <5>;
                                label = "trx1";
                                phy-mode = "internal";
                                phy-handle = <&sw1_port5_base_t1_phy>;
                        };

                        port@6 {
                                reg = <6>;
                                label = "trx2";
                                phy-mode = "internal";
                                phy-handle = <&sw1_port6_base_t1_phy>;
                        };

                        port@7 {
                                reg = <7>;
                                label = "trx3";
                                phy-mode = "internal";
                                phy-handle = <&sw1_port7_base_t1_phy>;
                        };

                        port@8 {
                                reg = <8>;
                                label = "trx4";
                                phy-mode = "internal";
                                phy-handle = <&sw1_port8_base_t1_phy>;
                        };

                        port@9 {
                                reg = <9>;
                                label = "trx5";
                                phy-mode = "internal";
                                phy-handle = <&sw1_port9_base_t1_phy>;
                        };

                        port@a {
                                reg = <10>;
                                label = "trx6";
                                phy-mode = "internal";
                                phy-handle = <&sw1_port10_base_t1_phy>;
                        };
                };
        };

        /* SW2 */
        ethernet-switch@2 {
                compatible = "nxp,sja1110a";
                reg = <2>;
                dsa,member = <0 1>;

                ethernet-ports {
                        #address-cells = <1>;
                        #size-cells = <0>;

                        sw2p1: port@1 {
                                reg = <1>;
                                link = <&sw1p4>;
                                phy-mode = "sgmii";

                                fixed-link {
                                        speed = <1000>;
                                        full-duplex;
                                };
                        };

                        port@2 {
                                reg = <2>;
                                ethernet = <&dpmac18>;
                                phy-mode = "rgmii-id";

                                fixed-link {
                                        speed = <1000>;
                                        full-duplex;
                                };
                        };

                        port@3 {
                                reg = <3>;
                                label = "1ge_p2";
                                phy-mode = "rgmii-id";
                                phy-handle = <&sw2_mii3_phy>;
                        };

                        port@4 {
                                reg = <4>;
                                label = "to_sw3";
                                phy-mode = "2500base-x";

                                fixed-link {
                                        speed = <2500>;
                                        full-duplex;
                                };
                        };

                        port@5 {
                                reg = <5>;
                                label = "trx7";
                                phy-mode = "internal";
                                phy-handle = <&sw2_port5_base_t1_phy>;
                        };

                        port@6 {
                                reg = <6>;
                                label = "trx8";
                                phy-mode = "internal";
                                phy-handle = <&sw2_port6_base_t1_phy>;
                        };

                        port@7 {
                                reg = <7>;
                                label = "trx9";
                                phy-mode = "internal";
                                phy-handle = <&sw2_port7_base_t1_phy>;
                        };

                        port@8 {
                                reg = <8>;
                                label = "trx10";
                                phy-mode = "internal";
                                phy-handle = <&sw2_port8_base_t1_phy>;
                        };

                        port@9 {
                                reg = <9>;
                                label = "trx11";
                                phy-mode = "internal";
                                phy-handle = <&sw2_port9_base_t1_phy>;
                        };

                        port@a {
                                reg = <10>;
                                label = "trx12";
                                phy-mode = "internal";
                                phy-handle = <&sw2_port10_base_t1_phy>;
                        };
                };
        };
};

Basically it is a single DSA tree with 2 "ethernet" properties, i.e. a
multi-CPU-port system. There is also a DSA link between the switches,
but it is not a daisy chain topology, i.e. there is no "upstream" and
"downstream" switch. The DSA link is only to be used for the bridge data
plane (autonomous forwarding between switches, between the RJ-45 ports
and the automotive Ethernet ports); all other traffic that should
reach the host should do so through the dedicated CPU port of the switch.

Of course, plain forwarding in this topology is bound to create packet
loops. I have thought long and hard about strategies to cut forwarding
in such a way as to prevent loops but also not impede normal operation
of the network on such a system, and I believe I have found a solution
that does work as expected. This relies heavily on DSA's recent ability
to perform RX filtering towards the host by installing MAC addresses as
static FDB entries. Since we have 2 distinct DSA masters, we have 2
distinct MAC addresses, and if the bridge is configured to have its own
MAC address that makes it 3 distinct MAC addresses. The bridge core,
plus the switchdev_handle_fdb_add_to_device() extension, handle each MAC
address by replicating it to each port of the DSA switch tree. So the
end result is that both switch 1 and switch 2 will have static FDB
entries towards their respective CPU ports for the 3 MAC addresses
corresponding to the DSA masters and to the bridge net device (and of
course, towards any station learned on a foreign interface).

So I think the basic design works, and it is basically just as fragile
as any other multi-CPU-port system is bound to be in terms of reliance
on static FDB entries towards the host (if hardware address learning on
the CPU port were used instead, MAC addresses would randomly bounce
between one CPU port and the other). In fact, I think it is even
better to start DSA's support of multi-CPU-port systems with something
small like the NXP Bluebox 3, because we allow some time for the code
paths like dsa_switch_host_address_match(), which were specifically
designed for it, to break in, and this board needs no user space
configuration of CPU ports, like static assignments between user and CPU
ports, or bonding between the CPU ports/DSA masters.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: enable address learning on cascade ports · 81d45898

Vladimir Oltean authored Aug 04, 2021

Right now, address learning is disabled on DSA ports, which means that a
packet received over a DSA port from a cross-chip switch will be flooded
to unrelated ports.

It is desirable to eliminate that, but for that we need a breakdown of
the possibilities for the sja1105 driver. A DSA port can be:

- a downstream-facing cascade port. This is simple because it will
  always receive packets from a downstream switch, and there should be
  no other route to reach that downstream switch in the first place,
  which means it should be safe to learn that MAC address towards that
  switch.

- an upstream-facing cascade port. This receives packets either:
  * autonomously forwarded by an upstream switch (and therefore these
    packets belong to the data plane of a bridge, so address learning
    should be ok), or
  * injected from the CPU. This deserves further discussion, as normally,
    an upstream-facing cascade port is no different than the CPU port
    itself. But with "H" topologies (a DSA link towards a switch that
    has its own CPU port), these are more "laterally-facing" cascade
    ports than they are "upstream-facing". Here, there is a risk that
    the port might learn the host addresses on the wrong port (on the
    DSA port instead of on its own CPU port), but this is solved by
    DSA's RX filtering infrastructure, which installs the host addresses
    as static FDB entries on the CPU port of all switches in a "H" tree.
    So even if there will be an attempt from the switch to migrate the
    FDB entry from the CPU port to the laterally-facing cascade port, it
    will fail to do that, because the FDB entry that already exists is
    static and cannot migrate. So address learning should be safe for
    this configuration too.

Ok, so what about other MAC addresses coming from the host, not
necessarily the bridge local FDB entries? What about MAC addresses
dynamically learned on foreign interfaces, isn't there a risk that
cascade ports will learn these entries dynamically when they are
supposed to be delivered towards the CPU port? Well, that is correct,
and this is why we also need to enable the assisted learning feature, to
snoop for these addresses and write them to hardware as static FDB
entries towards the CPU, to make the switch's learning process on the
cascade ports ineffective for them. With assisted learning enabled, the
hardware learning on the CPU port must be disabled.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
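
The argument that a static FDB entry cannot migrate can be illustrated
with a toy FDB. This is a user-space sketch, not the sja1105 hardware
behavior verbatim: struct fdb, fdb_add_static() and fdb_learn() are
hypothetical names, and the table size is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define FDB_SIZE 8
#define MAC_LEN  6

struct fdb_entry {
	unsigned char mac[MAC_LEN];
	int port;		/* destination port for this address */
	bool is_static;		/* static entries never migrate */
	bool in_use;
};

struct fdb {
	struct fdb_entry entries[FDB_SIZE];
};

/* Find the entry for mac, or a free slot if none exists; NULL when full. */
struct fdb_entry *fdb_lookup_or_alloc(struct fdb *t, const unsigned char *mac)
{
	struct fdb_entry *free_slot = NULL;
	int i;

	assert(t && mac);
	for (i = 0; i < FDB_SIZE; i++) {
		struct fdb_entry *e = &t->entries[i];

		if (e->in_use && memcmp(e->mac, mac, MAC_LEN) == 0)
			return e;
		if (!e->in_use && !free_slot)
			free_slot = e;
	}
	return free_slot;
}

/* Software-installed static entry (RX filtering or assisted learning). */
int fdb_add_static(struct fdb *t, const unsigned char *mac, int port)
{
	struct fdb_entry *e = fdb_lookup_or_alloc(t, mac);

	if (!e)
		return -1;
	memcpy(e->mac, mac, MAC_LEN);
	e->port = port;
	e->is_static = true;
	e->in_use = true;
	return 0;
}

/*
 * Hardware learning attempt on ingress. Returns the port the address
 * maps to after the attempt, or -1 if the table is full.
 */
int fdb_learn(struct fdb *t, const unsigned char *mac, int ingress_port)
{
	struct fdb_entry *e = fdb_lookup_or_alloc(t, mac);

	if (!e)
		return -1;
	if (e->in_use && e->is_static)
		return e->port;	/* static entry wins, no migration */
	memcpy(e->mac, mac, MAC_LEN);
	e->port = ingress_port;
	e->is_static = false;
	e->in_use = true;
	return e->port;
}
```

With a host address installed statically on the CPU port, a later
learning attempt on the cascade port leaves the entry where it was,
while an ordinary dynamic entry follows the most recent ingress port.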

net: dsa: sja1105: suppress TX packets from looping back in "H" topologies · 0f9b762c

Vladimir Oltean authored Aug 04, 2021

H topologies like this one have a problem:

         eth0                                                     eth1
          |                                                        |
       CPU port                                                CPU port
          |                        DSA link                        |
 sw0p0  sw0p1  sw0p2  sw0p3  sw0p4 -------- sw1p4  sw1p3  sw1p2  sw1p1  sw1p0
   |             |      |                            |      |             |
 user          user   user                         user   user          user
 port          port   port                         port   port          port

Basically any packet sent by the eth0 DSA master can be flooded on the
interconnecting DSA link sw0p4 <-> sw1p4 and it will be received by the
eth1 DSA master too. Basically we are talking to ourselves.

In VLAN-unaware mode, these packets are encoded using a tag_8021q TX
VLAN, which dsa_8021q_rcv() rightfully cannot decode and complains.
Whereas in VLAN-aware mode, the packets are encoded with a bridge VLAN
which _can_ be decoded by the tagger running on eth1, so it will attempt
to reinject that packet into the network stack (the bridge, if there is
any port under eth1 that is under a bridge). In the case where the ports
under eth1 are under the same cross-chip bridge as the ports under eth0,
the TX packets will even be learned as RX packets. The only thing that
will prevent loops with the software bridging path, and therefore
disaster, is that the source port and the destination port are in the
same hardware domain, and the bridge will receive packets from the
driver with skb->offload_fwd_mark = true and will not forward between
the two.

The proper solution to this problem is to detect H topologies and
enforce that all packets are received through the local switch and we do
not attempt to receive packets on our CPU port from switches that have
their own. This is a viable solution which works thanks to the fact that
MAC addresses which should be filtered towards the host are installed by
DSA as static MAC addresses towards the CPU port of each switch.

TX from a CPU port towards the DSA port continues to be allowed. This is
because sja1105 supports bridge TX forwarding offload, and the skb->dev
used initially for xmit does not have any direct correlation with where
the station that will respond to that packet is connected. It may very
well happen that when we send a ping through a br0 interface that spans
all switch ports, the xmit packet will exit the system through a DSA
switch interface under eth1 (say sw1p2), but the destination station is
connected to a switch port under eth0, like sw0p0. So the switch under
eth1 needs to communicate on TX with the switch under eth0. The
response, however, will not follow the same path, but instead, this
patch enforces that the response is sent by the first switch directly to
its DSA master which is eth0.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
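
The asymmetry described above (CPU to DSA allowed, DSA to CPU cut when
the peer switch has its own CPU port) can be written down as a
forwarding predicate. This is only a model of the rule, not how the
driver programs its forwarding tables; struct model_switch and
may_forward() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PORTS 8

enum port_type { PORT_USER, PORT_CPU, PORT_DSA };

struct model_switch {
	enum port_type type[MAX_PORTS];
	int num_ports;
	bool peer_has_own_cpu;	/* true in an "H" topology */
};

/* May a frame received on port 'from' be forwarded to port 'to'? */
bool may_forward(const struct model_switch *sw, int from, int to)
{
	assert(sw != NULL && sw->num_ports <= MAX_PORTS);
	assert(from >= 0 && from < sw->num_ports);
	assert(to >= 0 && to < sw->num_ports);

	if (from == to)
		return false;	/* no hairpinning in this model */

	/*
	 * In an "H" topology, do not deliver to our CPU port what arrives
	 * over the DSA link: the peer switch terminates its own traffic.
	 */
	if (sw->peer_has_own_cpu &&
	    sw->type[from] == PORT_DSA && sw->type[to] == PORT_CPU)
		return false;

	return true;
}
```

For a switch laid out like sw0 in the diagram, may_forward(sw, 4, 1) is
false in the "H" case, while may_forward(sw, 1, 4) stays true.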

net: dsa: sja1105: increase MTU to account for VLAN header on DSA ports · 777e55e3

Vladimir Oltean authored Aug 04, 2021

Since all packets are transmitted as VLAN-tagged over a DSA link (this
VLAN tag represents the tag_8021q header), we need to increase the MTU
of these interfaces to account for the possibility that we are already
transporting a user-visible VLAN header.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
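
The arithmetic behind this can be sketched as follows. The header sizes
are the standard Ethernet and 802.1Q ones; the formula itself and the
choice to treat CPU ports like DSA ports are assumptions of this sketch,
not a copy of the driver, and max_frame_len() is a hypothetical name.

```c
#include <assert.h>

#define MODEL_ETH_HLEN		14	/* dst MAC + src MAC + EtherType */
#define MODEL_VLAN_HLEN		4	/* one 802.1Q tag */
#define MODEL_ETH_FCS_LEN	4	/* frame check sequence */

enum port_type { PORT_USER, PORT_CPU, PORT_DSA };

/* Largest frame, in bytes, a port must accept for a given L3 MTU. */
int max_frame_len(int mtu, enum port_type type)
{
	int len;

	assert(mtu >= 0);

	/* L2 header, one user-visible VLAN tag, and the FCS */
	len = mtu + MODEL_ETH_HLEN + MODEL_VLAN_HLEN + MODEL_ETH_FCS_LEN;

	/* Room for the tag_8021q header stacked on top of the user's VLAN */
	if (type == PORT_CPU || type == PORT_DSA)
		len += MODEL_VLAN_HLEN;

	return len;
}
```

For an MTU of 1500 this gives 1522 bytes on a user port and 1526 bytes
on a DSA port.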

net: dsa: sja1105: manage VLANs on cascade ports · c5130029

Vladimir Oltean authored Aug 04, 2021

Since commit ed040abc ("net: dsa: sja1105: use 4095 as the private
VLAN for untagged traffic"), this driver uses a reserved value as pvid
for the host port (DSA CPU port). Control packets which are sent as
untagged get classified to this VLAN, and all ports are members of it
(this is to be expected for control packets).

Manage all cascade ports in the same way and allow control packets to
egress everywhere.

Also, all VLANs need to be sent as egress-tagged on all cascade ports.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: manage the forwarding domain towards DSA ports · 3fa21270

Vladimir Oltean authored Aug 04, 2021

Manage DSA links towards other switches, be they host ports or cascade
ports, the same as the CPU port, i.e. allow forwarding and flooding
unconditionally from all user ports.

We send packets as always VLAN-tagged on a DSA port, and we rely on the
cross-chip notifiers from tag_8021q to install the RX VLAN of a switch
port only on the proper remote ports of another switch (the ports that
are in the same bridging domain). So if there is no cross-chip bridging
in the system, the flooded packets will be sent on the DSA ports too,
but they will be dropped by the remote switches due to either
(a) a lack of the RX VLAN in the VLAN table of the ingress DSA port, or
(b) a lack of valid destinations for those packets, due to a lack of the
    RX VLAN on the user ports of the switch

Note that switches which only transport packets in a cross-chip bridge,
but have no user ports of their own as part of that bridge, such as
switch 1 in this case:

                    DSA link                   DSA link
  sw0p0 sw0p1 sw0p2 -------- sw1p0 sw1p2 sw1p3 -------- sw2p0 sw2p2 sw2p3

ip link set sw0p0 master br0
ip link set sw2p3 master br0

will still work, because the tag_8021q cross-chip notifiers keep the RX
VLANs installed on all DSA ports.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
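
The two drop cases (a) and (b), and the transit case of switch 1, can be
modeled with a per-switch membership bitmask for one RX VLAN. This is a
simplified sketch: struct model_switch and flood_egress_mask() are
hypothetical, and real hardware has more inputs to its forwarding
decision.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 8

struct model_switch {
	unsigned int rx_vlan_members;	/* bit i set: port i is in the RX VLAN */
	int num_ports;
};

/*
 * Egress port mask for a flooded frame tagged with the RX VLAN that
 * arrives on DSA port 'ingress'. A result of 0 means the frame is dropped.
 */
unsigned int flood_egress_mask(const struct model_switch *sw, int ingress)
{
	unsigned int in_bit;

	assert(sw != NULL && sw->num_ports <= MAX_PORTS);
	assert(ingress >= 0 && ingress < sw->num_ports);

	in_bit = 1u << ingress;

	/* (a) the ingress DSA port is not a member of the RX VLAN */
	if (!(sw->rx_vlan_members & in_bit))
		return 0;

	/* (b) if no other port is a member, the mask is empty: dropped */
	return sw->rx_vlan_members & ~in_bit;
}
```

A transit switch with the RX VLAN installed only on its two DSA ports
forwards from one DSA port to the other, which is the switch 1 case in
the diagram above.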

net: dsa: sja1105: configure the cascade ports based on topology · 30a100e6

Vladimir Oltean authored Aug 04, 2021

The sja1105 switch family has a feature called "cascade ports" which can
be used in topologies where multiple SJA1105/SJA1110 switches are daisy
chained. Upstream switches set this bit for the DSA link towards the
downstream switches. This is used when the upstream switch receives a
control packet (PTP, STP) from a downstream switch, because if the
source port for a control packet is marked as a cascade port, then the
source port, switch ID and RX timestamp will not be taken again on the
upstream switch, it is assumed that this has already been done by the
downstream switch (the leaf port in the tree) and that the CPU has
everything it needs to decode the information from this packet.

We need to distinguish between an upstream-facing DSA link and a
downstream-facing DSA link, because the upstream-facing DSA links are
"host ports" for the SJA1105/SJA1110 switches, and the downstream-facing
DSA links are "cascade ports".

Note that SJA1105 supports a single cascade port, so only daisy chain
topologies work. With SJA1110, there can be more complex topologies such
as:

                    eth0
                     |
                 host port
                     |
 sw0p0    sw0p1    sw0p2    sw0p3    sw0p4
   |        |                 |        |
 cascade  cascade            user     user
  port     port              port     port
   |        |
   |        |
   |        |
   |       host
   |       port
   |        |
   |      sw1p0    sw1p1    sw1p2    sw1p3    sw1p4
   |                 |        |        |        |
   |                user     user     user     user
  host              port     port     port     port
  port
   |
 sw2p0    sw2p1    sw2p2    sw2p3    sw2p4
            |        |        |        |
           user     user     user     user
           port     port     port     port
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
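
The port classification described above can be sketched in a few lines.
The struct, the role names and the max_cascade parameter are
hypothetical; in particular, the per-chip limit is passed in by the
caller rather than taken from any datasheet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum hw_role { ROLE_NONE, ROLE_HOST_PORT, ROLE_CASCADE_PORT };

struct model_port {
	bool is_cpu;		/* DSA CPU port */
	bool is_dsa_link;	/* link towards another switch */
	bool faces_upstream;	/* only meaningful for DSA links */
};

/* Map a port's position in the tree to the role the hardware should use. */
enum hw_role port_hw_role(const struct model_port *p)
{
	assert(p != NULL);

	if (p->is_cpu)
		return ROLE_HOST_PORT;
	if (p->is_dsa_link)
		return p->faces_upstream ? ROLE_HOST_PORT : ROLE_CASCADE_PORT;
	return ROLE_NONE;	/* user port */
}

/* Does this switch need no more cascade ports than the chip supports? */
bool cascade_ports_fit(const struct model_port *ports, size_t n,
		       size_t max_cascade)
{
	size_t i, count = 0;

	assert(ports != NULL);
	for (i = 0; i < n; i++)
		if (port_hw_role(&ports[i]) == ROLE_CASCADE_PORT)
			count++;
	return count <= max_cascade;
}
```

For sw0 in the diagram, sw0p0 and sw0p1 come out as cascade ports and
sw0p2 as the host port, so a limit of one cascade port is exceeded.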

net: dsa: give preference to local CPU ports · 2c0b0325

Vladimir Oltean authored Aug 04, 2021

Be there an "H" switch topology, where there are 2 switches connected as
follows:

         eth0                                                     eth1
          |                                                        |
       CPU port                                                CPU port
          |                        DSA link                        |
 sw0p0  sw0p1  sw0p2  sw0p3  sw0p4 -------- sw1p4  sw1p3  sw1p2  sw1p1  sw1p0
   |             |      |                            |      |             |
 user          user   user                         user   user          user
 port          port   port                         port   port          port

basically one where each switch has its own CPU port for termination,
but there is also a DSA link in case packets need to be forwarded in
hardware between one switch and another.

DSA insists on seeing this as a daisy chain topology, basically registering
all network interfaces as sw0p0@eth0, ... sw1p0@eth0 and disregarding
eth1 as a valid DSA master.

This is only half the story, since when asked using dsa_port_is_cpu(),
DSA will respond that sw1p1 is a CPU port, however one which has no
dp->cpu_dp pointing to it. So sw1p1 is enabled, but not used.

Furthermore, be there a driver for switches which support only one
upstream port. This driver iterates through its ports and checks using
dsa_is_upstream_port() whether the current port is an upstream one.
For switch 1, two ports pass the "is upstream port" checks:

- sw1p4 is an upstream port because it is a routing port towards the
  dedicated CPU port assigned using dsa_tree_setup_default_cpu()

- sw1p1 is also an upstream port because it is a CPU port, albeit one
  that is disabled. This is because dsa_upstream_port() returns:

	if (!cpu_dp)
		return port;

  which means that if @dp does not have a ->cpu_dp pointer (which is a
  characteristic of CPU ports themselves as well as unused ports), then
  @dp is its own upstream port.

So the driver for switch 1 rightfully says: I have two upstream ports,
but I don't support multiple upstream ports! So let me error out, I
don't know which one to choose and what to do with the other one.

Generally I am against enforcing any default policy in the kernel in
terms of user to CPU port assignment (like round robin or such) but this
case is different. To solve the conundrum, one would have to:

- Disable sw1p1 in the device tree or mark it as "not a CPU port" in
  order to comply with DSA's view of this topology as a daisy chain,
  where the termination traffic from switch 1 must pass through switch 0.
  This is counter-productive because it wastes 1Gbps of termination
  throughput in switch 1.
- Disable the DSA link between sw0p4 and sw1p4 and do software
  forwarding between switch 0 and 1, and basically treat the switches as
  part of disjoint switch trees. This is counter-productive because it
  wastes 1Gbps of autonomous forwarding throughput between switch 0 and 1.
- Treat sw0p4 and sw1p4 as user ports instead of DSA links. This could
  work, but it makes cross-chip bridging impossible. In this setup we
  would need to have 2 separate bridges, br0 spanning the ports of
  switch 0, and br1 spanning the ports of switch 1, and the "DSA links
  treated as user ports" sw0p4 (part of br0) and sw1p4 (part of br1) are
  the gateway ports between one bridge and another. This is hard to
  manage from a user's perspective, who wants to have a unified view of
  the switching fabric and the ability to transparently add ports to the
  same bridge. VLANs would also need to be explicitly managed by the
  user on these gateway ports.

So it seems that the only reasonable thing to do is to make DSA prefer
CPU ports that are local to the switch. Meaning that by default, the
user and DSA ports of switch 0 will get assigned to the CPU port from
switch 0 (sw0p1) and the user and DSA ports of switch 1 will get
assigned to the CPU port from switch 1.

The way this solves the problem is that sw1p4 is no longer an upstream
port as far as switch 1 is concerned (it no longer views sw0p1 as its
dedicated CPU port).

So here we are, the first multi-CPU port that DSA supports is also
perhaps the most uneventful one: the individual switches don't support
multiple CPUs, however the DSA switch tree as a whole does have multiple
CPU ports. No user space assignment of user ports to CPU ports is
desirable, necessary, or possible.

Ports that do not have a local CPU port (say there was an extra switch
hanging off of sw0p0) default to the standard implementation of getting
assigned to the first CPU port of the DSA switch tree. Is that good
enough? Probably not (if the downstream switch was hanging off of switch
1, we would most certainly prefer its CPU port to be sw1p1), but in
order to support that use case too, we would need to traverse the
dst->rtable in search of an optimum dedicated CPU port, one that has the
smallest number of hops between dp->ds and dp->cpu_dp->ds. At the
moment, the DSA routing table structure does not keep the number of hops
between dl->dp and dl->link_dp, and while it is probably deducible,
there is zero justification to write that code now. Let's hope DSA will
never have to support that use case.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
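
The effect on the "how many upstream ports do I have" check can be
reproduced with a user-space model of the two-switch "H" tree. This is a
sketch under simplifying assumptions (one DSA link per switch, no unused
ports); struct model_port and all function names are hypothetical and
only mirror the logic quoted above for dsa_upstream_port().

```c
#include <assert.h>
#include <stddef.h>

enum port_type { PORT_USER, PORT_CPU, PORT_DSA };

struct model_port {
	int sw;				/* switch this port belongs to */
	int index;			/* port number within that switch */
	enum port_type type;
	struct model_port *cpu_dp;	/* dedicated CPU port, NULL for CPU ports */
};

/* Old policy: user and DSA ports all use the first CPU port of the tree. */
void assign_first_cpu(struct model_port *ports, size_t n)
{
	struct model_port *first = NULL;
	size_t i;

	assert(ports != NULL);
	for (i = 0; i < n && !first; i++)
		if (ports[i].type == PORT_CPU)
			first = &ports[i];
	for (i = 0; i < n; i++)
		ports[i].cpu_dp = ports[i].type == PORT_CPU ? NULL : first;
}

/* New policy: prefer a CPU port on the same switch, else keep the fallback. */
void assign_local_cpu(struct model_port *ports, size_t n)
{
	size_t i, j;

	assign_first_cpu(ports, n);
	for (i = 0; i < n; i++) {
		if (ports[i].type == PORT_CPU)
			continue;
		for (j = 0; j < n; j++) {
			if (ports[j].type == PORT_CPU &&
			    ports[j].sw == ports[i].sw) {
				ports[i].cpu_dp = &ports[j];
				break;
			}
		}
	}
}

/* Port on dp's switch that leads towards dp's dedicated CPU port. */
int upstream_port(const struct model_port *ports, size_t n,
		  const struct model_port *dp)
{
	size_t i;

	if (!dp->cpu_dp)
		return dp->index;		/* its own upstream port */
	if (dp->cpu_dp->sw == dp->sw)
		return dp->cpu_dp->index;	/* local CPU port */
	for (i = 0; i < n; i++)			/* routing port: the DSA link */
		if (ports[i].sw == dp->sw && ports[i].type == PORT_DSA)
			return ports[i].index;
	return -1;
}

/* Number of ports on switch 'sw' that pass the "is upstream port" check. */
int count_upstream_ports(const struct model_port *ports, size_t n, int sw)
{
	int count = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (ports[i].sw == sw &&
		    upstream_port(ports, n, &ports[i]) == ports[i].index)
			count++;
	return count;
}
```

With the topology from the diagram, assign_first_cpu() leaves switch 1
with two upstream ports (sw1p4 and sw1p1), while assign_local_cpu()
leaves it with one (sw1p1).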

net: dsa: rename teardown_default_cpu to teardown_cpu_ports · 0e8eb9a1

Vladimir Oltean authored Aug 04, 2021

Nothing in what dsa_tree_teardown_default_cpu() does is specific to
having a default CPU port. Even with multiple CPU ports,
it would do the same thing: iterate through the ports of this switch
tree and reset the ->cpu_dp pointer to NULL. So rename it accordingly.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: fix IPA v4.9 interconnects · 0fd75f57

Alex Elder authored Aug 04, 2021

Three interconnects are defined for IPA version 4.9, but there
should only be two.  They should also use names that match what's
used for other platforms (and specified in the Device Tree binding).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

mctp: remove duplicated assignment of pointer hdr · df7ba0eb

Colin Ian King authored Aug 04, 2021

The pointer hdr is being initialized and also re-assigned with the
same value from the call to function mctp_hdr. Static analysis reports
that the initialized value is unused. The second assignment is
duplicated and can be removed.

Addresses-Coverity: ("Unused value").
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Aug, 2021 28 commits

net: Replace deprecated CPU-hotplug functions. · 372bbdd5

Sebastian Andrzej Siewior authored Aug 03, 2021

The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

virtio_net: Replace deprecated CPU-hotplug functions. · a0d1d0f4

Sebastian Andrzej Siewior authored Aug 03, 2021

The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

pktgen: Remove redundant clone_skb override · c2eecaa1

Nick Richardson authored Aug 03, 2021

When the netif_receive xmit_mode is set, a line is supposed to set
clone_skb to a default 0 value. This line is made redundant due to a
preceding line that checks if clone_skb is more than zero and returns
-ENOTSUPP.

Overriding clone_skb to 0 does not make any difference to the behavior
because if it was positive we return an error. So it can be either 0 or
negative, and in both cases the behavior is the same.

Remove redundant line that sets clone_skb to zero.
Signed-off-by: Nick Richardson <richardsonnick@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ptp: ocp: Expose various resources on the timecard. · 773bda96

Jonathan Lemon authored Aug 03, 2021

The OpenCompute timecard driver has additional functionality besides
a clock.  Make the following resources available:

 - The external timestamp channels (ts0/ts1)
 - devlink support for flashing and health reporting
 - GPS and MAC serial ports
 - board serial number (obtained from i2c device)

Also add watchdog functionality for when GNSS goes into holdover.

The resources are collected under a timecard class directory:

  [jlemon@timecard ~]$ ls -g /sys/class/timecard/ocp1/
  total 0
  -r--r--r--. 1 root 4096 Aug  3 19:49 available_clock_sources
  -rw-r--r--. 1 root 4096 Aug  3 19:49 clock_source
  lrwxrwxrwx. 1 root    0 Aug  3 19:49 device -> ../../../0000:04:00.0/
  -r--r--r--. 1 root 4096 Aug  3 19:49 gps_sync
  lrwxrwxrwx. 1 root    0 Aug  3 19:49 i2c -> ../../xiic-i2c.1024/i2c-2/
  drwxr-xr-x. 2 root    0 Aug  3 19:49 power/
  lrwxrwxrwx. 1 root    0 Aug  3 19:49 pps -> ../../../../../virtual/pps/pps1/
  lrwxrwxrwx. 1 root    0 Aug  3 19:49 ptp -> ../../ptp/ptp2/
  -r--r--r--. 1 root 4096 Aug  3 19:49 serialnum
  lrwxrwxrwx. 1 root    0 Aug  3 19:49 subsystem -> ../../../../../../class/timecard/
  lrwxrwxrwx. 1 root    0 Aug  3 19:49 ttyGPS -> ../../tty/ttyS7/
  lrwxrwxrwx. 1 root    0 Aug  3 19:49 ttyMAC -> ../../tty/ttyS8/
  -rw-r--r--. 1 root 4096 Aug  3 19:39 uevent

The labeling is needed, at a minimum, to tell the serial
devices apart.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sock: allow reading and changing sk_userlocks with setsockopt · 04190bf8

Pavel Tikhomirov authored Aug 04, 2021

SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags disable automatic socket
buffers adjustment done by kernel (see tcp_fixup_rcvbuf() and
tcp_sndbuf_expand()). If we've just created a new socket this adjustment
is enabled on it, but if one changes the socket buffer size by
setsockopt(SO_{SND,RCV}BUF*) it becomes disabled.

CRIU needs to call setsockopt(SO_{SND,RCV}BUF*) on each socket on
restore as it first needs to increase buffer sizes for packet queues
restore and second it needs to restore back original buffer sizes. So
after CRIU restore all sockets become non-auto-adjustable, which can
decrease network performance of restored applications significantly.

CRIU needs to be able to restore sockets with adjustment enabled or
disabled, matching the state they were in before the dump, so let's add
a special setsockopt for it.

Let's also export SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags to uAPI so
that using this interface one can reenable automatic socket buffer
adjustment on their sockets.
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04190bf8
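
A restore tool could use the new option roughly as follows. This is a sketch under assumptions: the commit message does not name the option, so `SO_BUF_LOCK` and its value 72 are taken from the generic-architecture uAPI headers as I understand them (some architectures use different numbers), and the flag values 1 and 2 are likewise assumed. The helper names are hypothetical.

```python
import socket

# Assumed uAPI values for generic architectures; verify against
# <asm/socket.h> and <linux/socket.h> on the target system.
SO_BUF_LOCK = 72
SOCK_SNDBUF_LOCK = 1
SOCK_RCVBUF_LOCK = 2


def lock_names(flags):
    """Return the names of the buffer lock flags set in flags."""
    names = []
    if flags & SOCK_SNDBUF_LOCK:
        names.append("SOCK_SNDBUF_LOCK")
    if flags & SOCK_RCVBUF_LOCK:
        names.append("SOCK_RCVBUF_LOCK")
    return names


def restore_buffer_locks(sock, saved_flags):
    """Write back the lock flags saved at dump time.

    Returns the flags read back from the kernel, or None when the
    running kernel does not accept the option.
    """
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_BUF_LOCK, saved_flags)
        return sock.getsockopt(socket.SOL_SOCKET, SO_BUF_LOCK)
    except OSError:
        return None


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Resizing the buffer disables automatic adjustment on this socket...
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 65536)
    # ...and writing back a saved value of 0 asks the kernel to re-enable it.
    print(restore_buffer_locks(s, 0))
    s.close()
```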

tc-testing: Add control-plane selftests for sch_mq · 625af9f0

Peilin Ye authored Aug 03, 2021

Recently we added multi-queue support to netdevsim in commit d4861fc6
("netdevsim: Add multi-queue support"); add a few control-plane selftests
for sch_mq using this new feature.

Use nsPlugin.py to avoid network interface name collisions.
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

625af9f0

Revert "net: build all switchdev drivers as modules when the bridge is a module" · a54182b2

Vladimir Oltean authored Aug 03, 2021

This reverts commit b0e81817. Explicit
driver dependency on the bridge is no longer needed since
switchdev_bridge_port_{,un}offload() is no longer implemented by the
bridge driver but by switchdev.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a54182b2

net: make switchdev_bridge_port_{,un}offload loosely coupled with the bridge · 957e2235

Vladimir Oltean authored Aug 03, 2021

With the introduction of explicit offloading API in switchdev in commit
2f5dc00f ("net: bridge: switchdev: let drivers inform which bridge
ports are offloaded"), we started having Ethernet switch drivers calling
directly into a function defined in net/bridge/br_switchdev.c, i.e. a
symbol exported by the bridge driver.

This means that drivers that did not have an explicit dependency on the
bridge before, like cpsw and am65-cpsw, now do - otherwise it is not
possible to call a symbol exported by a driver that can be built as
module unless you are a module too.

There was an attempt to solve the dependency issue in the form of commit
b0e81817 ("net: build all switchdev drivers as modules when the
bridge is a module"). Grygorii Strashko, however, says about it:

| In my opinion, the problem is a bit bigger here than just fixing the
| build :(
|
| In case, of ^cpsw the switchdev mode is kinda optional and in many
| cases (especially for testing purposes, NFS) the multi-mac mode is
| still preferable mode.
|
| There were no such tight dependency between switchdev drivers and
| bridge core before and switchdev serviced as independent, notification
| based layer between them, so ^cpsw still can be "Y" and bridge can be
| "M". Now for mostly every kernel build configuration the CONFIG_BRIDGE
| will need to be set as "Y", or we will have to update drivers to
| support build with BRIDGE=n and maintain separate builds for
| networking vs non-networking testing. But is this enough? Wouldn't
| it cause 'chain reaction' required to add more and more "Y" options
| (like CONFIG_VLAN_8021Q)?
|
| PS. Just to be sure we on the same page - ARM builds will be forced
| (with this patch) to have CONFIG_TI_CPSW_SWITCHDEV=m and so all our
| automation testing will just fail with omap2plus_defconfig.

In the light of this, it would be desirable for some configurations to
avoid dependencies between switchdev drivers and the bridge, and have
the switchdev mode as completely optional within the driver.

Arnd Bergmann also tried to write a patch which better expressed the
build time dependency for Ethernet switch drivers where the switchdev
support is optional, like cpsw/am65-cpsw, and this made the drivers
follow the bridge (compile as module if the bridge is a module) only if
the optional switchdev support in the driver was enabled in the first
place:
https://patchwork.kernel.org/project/netdevbpf/patch/20210802144813.1152762-1-arnd@kernel.org/

but this still did not solve the fact that cpsw and am65-cpsw now must
be built as modules when the bridge is a module - it just expressed
correctly that optional dependency. But the new behavior is an apparent
regression from Grygorii's perspective.

So to support the use case where the Ethernet driver is built-in,
NET_SWITCHDEV (a bool option) is enabled, and the bridge is a module, we
need a framework that can handle the possible absence of the bridge from
the running system, i.e. runtime bloatware as opposed to build-time
bloatware.

Luckily we already have this framework, since switchdev has been using
it extensively. Events from the bridge side are transmitted to the
driver side using notifier chains - this was originally done so that
unrelated drivers could snoop for events emitted by the bridge towards
ports that are implemented by other drivers (think of a switch driver
with LAG offload that listens for switchdev events on a bonding/team
interface that it offloads).

There are also events which are transmitted from the driver side to the
bridge side, which again are modeled using notifiers.
SWITCHDEV_FDB_ADD_TO_BRIDGE is an example of this, and deals with
notifying the bridge that a MAC address has been dynamically learned.
So there is a precedent we can use for modeling the new framework.

The difference compared to SWITCHDEV_FDB_ADD_TO_BRIDGE is that the work
that the bridge needs to do when a port becomes offloaded is blocking in
its nature: replay VLANs, MDBs etc. The calling context is indeed
blocking (we are under rtnl_mutex), but the existing switchdev
notification chain that the bridge is subscribed to is only the atomic
one. So we need to subscribe the bridge to the blocking switchdev
notification chain too.

This patch:
- keeps the driver-side perception of the switchdev_bridge_port_{,un}offload
unchanged
- moves the implementation of switchdev_bridge_port_{,un}offload from
the bridge module into the switchdev module.
- makes everybody that is subscribed to the switchdev blocking notifier
chain "hear" offload & unoffload events
- makes the bridge driver subscribe and handle those events
- moves the bridge driver's handling of those events into 2 new
functions called br_switchdev_port_{,un}offload. These functions
contain in fact the core of the logic that was previously in
switchdev_bridge_port_{,un}offload, just that now we go through an
extra indirection layer to reach them.

Unlike all the other switchdev notification structures, the structure
used to carry the bridge port information, struct
switchdev_notifier_brport_info, does not contain a "bool handled".
This is because in the current usage pattern, we always know that a
switchdev bridge port offloading event will be handled by the bridge,
because the switchdev_bridge_port_offload() call was initiated by a
NETDEV_CHANGEUPPER event in the first place, where info->upper_dev is a
bridge. So if the bridge wasn't loaded, then the CHANGEUPPER event
couldn't have happened.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

957e2235
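
The indirection described in the commit above can be modeled in a few lines. This is a toy model in Python, not kernel code: the class names, the event strings and the return value are illustrative assumptions. It only shows why a built-in driver can call into switchdev whether or not the bridge is loaded.

```python
class BlockingNotifierChain:
    """Toy stand-in for a blocking notifier chain."""

    def __init__(self):
        self._subscribers = []

    def register(self, callback):
        self._subscribers.append(callback)

    def unregister(self, callback):
        self._subscribers.remove(callback)

    def call(self, event, info):
        # Every subscriber "hears" the event; return how many did.
        for callback in list(self._subscribers):
            callback(event, info)
        return len(self._subscribers)


# Illustrative event names for this model.
BRPORT_OFFLOADED = "brport_offloaded"
BRPORT_UNOFFLOADED = "brport_unoffloaded"

switchdev_blocking_chain = BlockingNotifierChain()


def switchdev_bridge_port_offload(brport):
    """Lives in the switchdev layer, so drivers can always call it."""
    return switchdev_blocking_chain.call(BRPORT_OFFLOADED, {"brport": brport})


def switchdev_bridge_port_unoffload(brport):
    return switchdev_blocking_chain.call(BRPORT_UNOFFLOADED, {"brport": brport})


class Bridge:
    """Models the bridge module subscribing once it is loaded."""

    def __init__(self):
        self.offloaded = set()

    def handle_event(self, event, info):
        if event == BRPORT_OFFLOADED:
            # Stands in for br_switchdev_port_offload(): replay VLANs, MDBs...
            self.offloaded.add(info["brport"])
        elif event == BRPORT_UNOFFLOADED:
            self.offloaded.discard(info["brport"])


if __name__ == "__main__":
    print(switchdev_bridge_port_offload("swp0"))  # no bridge loaded: 0 listeners
    bridge = Bridge()
    switchdev_blocking_chain.register(bridge.handle_event)
    switchdev_bridge_port_offload("swp0")
    print(sorted(bridge.offloaded))
```

With no subscriber registered the call simply reaches nobody, which is the runtime counterpart of the bridge module being absent.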

Merge tag 'linux-can-next-for-5.15-20210804' of... · 9c0532f9

David S. Miller authored Aug 04, 2021

Merge tag 'linux-can-next-for-5.15-20210804' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2021-08-04

this is a pull request of 5 patches for net-next/master.

The first patch is by me and fixes a typo in a comment in the CAN
J1939 protocol.

The next 2 patches are by Oleksij Rempel and update the CAN J1939
protocol to send RX status updates via the error queue mechanism.

The next patch is by me and adds a missing variable initialization to
the flexcan driver (the problem was introduced in the current net-next
cycle).

The last patch is by Aswath Govindraju and adds power-domains to the
Bosch m_can DT binding documentation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

9c0532f9

dt-bindings: net: can: Document power-domains property · d85165b2

Aswath Govindraju authored Aug 02, 2021

Document power-domains property for adding the Power domain provider.

Link: https://lore.kernel.org/r/20210802091822.16407-1-a-govindraju@ti.com
Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

d85165b2

can: flexcan: flexcan_clks_enable(): add missing variable initialization · 33626669

Marc Kleine-Budde authored Jul 28, 2021

This patch adds the missing initialization of the "err" variable in
the flexcan_clks_enable() function.

Fixes: d9cead75 ("can: flexcan: add mcf5441x support")
Link: https://lore.kernel.org/r/20210728075428.1493568-1-mkl@pengutronix.de
Reported-by: kernel test robot <lkp@intel.com>
Cc: Angelo Dureghello <angelo@kernel-space.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

33626669

can: j1939: extend UAPI to notify about RX status · 5b9272e9

Oleksij Rempel authored Jul 07, 2021

To be able to create applications with user-friendly feedback, we need
to be able to provide receive status information.

A typical ETP transfer may take seconds or even hours. To give the user
some clue or show a progress bar, the stack should push status updates.
Same as for the TX information, the socket error queue will be used with
the following new signals:
- J1939_EE_INFO_RX_RTS   - received and accepted request to send signal.
- J1939_EE_INFO_RX_DPO   - received data package offset signal
- J1939_EE_INFO_RX_ABORT - RX session was aborted

Instead of a completion signal, the user will get the data package.
To activate these signals, the application should set
SOF_TIMESTAMPING_RX_SOFTWARE in the SO_TIMESTAMPING socket option. This
will avoid unpredictable application behavior for old software.

Link: https://lore.kernel.org/r/20210707094854.30781-3-o.rempel@pengutronix.de
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

5b9272e9

can: j1939: rename J1939_ERRQUEUE_* to J1939_ERRQUEUE_TX_* · cd85d3ae

Oleksij Rempel authored Jul 07, 2021

Prepare the world for the J1939_ERRQUEUE_RX_ version

Link: https://lore.kernel.org/r/20210707094854.30781-2-o.rempel@pengutronix.de
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

cd85d3ae

ipv6: exthdrs: get rid of indirect calls in ip6_parse_tlv() · 51b8f812

Eric Dumazet authored Aug 03, 2021

As presented last month in our "BIG TCP" talk at netdev 0x15,
we plan to use IPv6 jumbograms.

One of the minor problems we talked about is the fact that
ip6_parse_tlv() is currently using tables to list known tlvs,
thus using potentially expensive indirect calls.

While we could mitigate this cost using macros from
indirect_call_wrapper.h, we also can get rid of the tables
and let the compiler emit optimized code.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Justin Iurman <justin.iurman@uliege.be>
Cc: Coco Li <lixiaoyan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

51b8f812
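
The shape of the change above, a handler table replaced by direct branches on the option type, can be sketched with a small TLV walker. This is a simplified Python model, not the kernel function: the option type numbers and lengths follow the IPv6 specifications (Pad1 = 0, PadN = 1, Router Alert = 5, Jumbo Payload = 194), the function name `parse_tlvs` is hypothetical, and unknown options are merely recorded here rather than handled according to their high-order bits.

```python
IPV6_TLV_PAD1 = 0
IPV6_TLV_PADN = 1
IPV6_TLV_ROUTERALERT = 5
IPV6_TLV_JUMBO = 194


def parse_tlvs(buf):
    """Walk an options area and return the non-padding options found."""
    found = []
    off = 0
    while off < len(buf):
        opt_type = buf[off]
        if opt_type == IPV6_TLV_PAD1:
            off += 1  # Pad1 is a single byte with no length field
            continue
        if off + 2 > len(buf):
            raise ValueError("truncated option header")
        opt_len = buf[off + 1]
        end = off + 2 + opt_len
        if end > len(buf):
            raise ValueError("option runs past the end of the buffer")
        value = buf[off + 2:end]
        # Direct branches on the type instead of a table of handlers.
        if opt_type == IPV6_TLV_PADN:
            pass
        elif opt_type == IPV6_TLV_JUMBO:
            if opt_len != 4:
                raise ValueError("bad jumbo option length")
            found.append(("jumbo", int.from_bytes(value, "big")))
        elif opt_type == IPV6_TLV_ROUTERALERT:
            if opt_len != 2:
                raise ValueError("bad router alert option length")
            found.append(("router_alert", int.from_bytes(value, "big")))
        else:
            found.append(("unknown", opt_type))
        off = end
    return found


if __name__ == "__main__":
    # Jumbo option carrying 65536, then PadN of length 0, then Pad1.
    print(parse_tlvs(bytes([194, 4, 0, 1, 0, 0, 1, 0, 0])))
```

In Python the branches are no faster than a dict of handlers; the point of the sketch is only the control-flow structure that lets a C compiler avoid indirect calls.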

Merge branch 'm7530-sw-fallback' · d8517985

David S. Miller authored Aug 04, 2021

DENG Qingfang says:

====================
mt7530 software fallback bridging fix

DSA core has gained software fallback support since commit 2f5dc00f,
but it does not work properly on mt7530. This patch series fixes the
issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d8517985

net: dsa: mt7530: always install FDB entries with IVL and FID 1 · 73c447ca

DENG Qingfang authored Aug 04, 2021

This reverts commit 7e777021 ("mt7530 mt7530_fdb_write only set ivl
bit vid larger than 1").

Before this series, the default value of all ports' PVID is 1, which is
copied into the FDB entry, even if the ports are VLAN unaware. So
`bridge fdb show` will show entries like `dev swp0 vlan 1 self` even on
a VLAN-unaware bridge.

The blamed commit does not solve that issue completely, instead it may
cause a new issue that FDB is inaccessible in a VLAN-aware bridge with
PVID 1.

This series sets PVID to 0 on VLAN-unaware ports, so `bridge fdb show`
will no longer print `vlan 1` on VLAN-unaware bridges, and that special
case in fdb_write is not required anymore.

Set FDB entries' filter ID to 1 to match the VLAN table.
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

73c447ca

net: dsa: mt7530: set STP state on filter ID 1 · a9e3f62d

DENG Qingfang authored Aug 04, 2021

As filter ID 1 is the only one used for bridges, set STP state on it.
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a9e3f62d

net: dsa: mt7530: use independent VLAN learning on VLAN-unaware bridges · 6087175b

DENG Qingfang authored Aug 04, 2021

Consider the following bridge configuration, where bond0 is not
offloaded:

         +-- br0 --+
        / /   |     \
       / /    |      \
      /  |    |     bond0
     /   |    |     /   \
   swp0 swp1 swp2 swp3 swp4
     .        .       .
     .        .       .
     A        B       C

Ideally, when the switch receives a packet from swp3 or swp4, it should
forward the packet to the CPU, according to the port matrix and unknown
unicast flood settings.

But packet loss will happen if the destination address is at one of the
offloaded ports (swp0~2). For example, when client C sends a packet to
A, the FDB lookup will indicate that it should be forwarded to swp0, but
the port matrix of swp3 and swp4 is configured to only allow the CPU to
be its destination, so it is dropped.

However, this issue does not happen if the bridge is VLAN-aware. That is
because VLAN-aware bridges use independent VLAN learning, i.e. use VID
for FDB lookup, on offloaded ports. As swp3 and swp4 are not offloaded,
shared VLAN learning with default filter ID of 0 is used instead. So the
lookup for A with filter ID 0 never hits and the packet can be forwarded
to the CPU.

In the current code, only two combinations were used to toggle user
ports' VLAN awareness: one is PCR.PORT_VLAN set to port matrix mode with
PVC.VLAN_ATTR set to transparent port, the other is PCR.PORT_VLAN set to
security mode with PVC.VLAN_ATTR set to user port.

It turns out that only PVC.VLAN_ATTR contributes to VLAN awareness, and
port matrix mode just skips the VLAN table lookup. The reference manual
is somewhat misleading when describing PORT_VLAN modes. It states that
PORT_MEM (VLAN port member) is used for destination if the VLAN table
lookup hits, but actually **PORT_MEM & PORT_MATRIX** (bitwise AND of
VLAN port member and port matrix) is used instead, which means we can
have two or more separate VLAN-aware bridges with the same PVID and
traffic won't leak between them.

Therefore, to solve this, enable independent VLAN learning with PVID 0
on VLAN-unaware bridges, by setting their PCR.PORT_VLAN to fallback
mode, while leaving standalone ports in port matrix mode. The CPU port
is always set to fallback mode to serve those bridges.

During testing, it was found that FDB lookup with filter ID of 0 will
also hit entries with VID 0 even with independent VLAN learning. To
avoid that, install all VLANs with filter ID of 1.
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6087175b
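
The destination rule described in the commit above, PORT_MEM & PORT_MATRIX, is easy to check with bitmasks. A minimal sketch with hypothetical helper names and an invented port layout (two bridges sharing one VLAN); it models only the masking, not the switch's FDB or registers.

```python
def port_mask(*ports):
    """Build a bitmask with one bit per port number."""
    mask = 0
    for p in ports:
        mask |= 1 << p
    return mask


def egress_ports(port_mem, port_matrix):
    """Candidate destinations when the VLAN table lookup hits:
    the bitwise AND of VLAN port member and port matrix."""
    return port_mem & port_matrix


if __name__ == "__main__":
    # Assumed layout: bridge A has ports 0-1, bridge B has ports 2-3,
    # and all four ports are members of the same VLAN.
    vlan_members = port_mask(0, 1, 2, 3)
    matrix_bridge_a = port_mask(0, 1)
    matrix_bridge_b = port_mask(2, 3)
    print(bin(egress_ports(vlan_members, matrix_bridge_a)))
    # A frame entering bridge A has no allowed destination in bridge B.
    print(egress_ports(vlan_members, matrix_bridge_a) & matrix_bridge_b)
```

Because the port matrix still restricts the result, two bridges can share a PVID without traffic leaking between them, which is the property the commit relies on.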

net: dsa: mt7530: enable assisted learning on CPU port · 0b69c54c

DENG Qingfang authored Aug 04, 2021

Consider the following bridge configuration, where bond0 is not
offloaded:

         +-- br0 --+
        / /   |     \
       / /    |      \
      /  |    |     bond0
     /   |    |     /   \
   swp0 swp1 swp2 swp3 swp4
     .        .       .
     .        .       .
     A        B       C

Address learning is enabled on offloaded ports (swp0~2) and the CPU
port, so when client A sends a packet to C, the following will happen:

1. The switch learns that client A can be reached at swp0.
2. The switch probably already knows that client C can be reached at the
   CPU port, so it forwards the packet to the CPU.
3. The bridge core knows client C can be reached at bond0, so it
   forwards the packet back to the switch.
4. The switch learns that client A can be reached at the CPU port.
5. The switch forwards the packet to either swp3 or swp4, according to
   the packet's tag.

That makes client A's MAC address flap between swp0 and the CPU port. If
client B sends a packet to A, it is possible that the packet is
forwarded to the CPU. With offload_fwd_mark = 1, the bridge core won't
forward it back to the switch, resulting in packet loss.

As we have the assisted_learning_on_cpu_port in DSA core now, enable
that and disable hardware learning on the CPU port.
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0b69c54c
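
The address flapping in steps 1 to 5 of the commit above can be reproduced with a toy FDB. This is a simplified Python model with hypothetical names; it tracks only where the switch believes client A lives and ignores tags, flooding and aging.

```python
def deliver_a_to_c(hw_learning_on_cpu_port):
    """Return where the switch FDB places client A after each learning step."""
    fdb = {"C": "cpu"}  # step 2 premise: C is already known behind the CPU port
    trace = []

    # Step 1: the packet from A enters at swp0 and its source is learned.
    fdb["A"] = "swp0"
    trace.append(fdb["A"])

    # Steps 2-3: the switch sends the packet to the CPU and the bridge
    # core forwards it back towards bond0 through the switch.
    # Step 4: the packet re-enters via the CPU port, still with source A.
    if hw_learning_on_cpu_port:
        fdb["A"] = "cpu"
    trace.append(fdb["A"])
    return trace


if __name__ == "__main__":
    print(deliver_a_to_c(True))   # A moves from swp0 to the CPU port
    print(deliver_a_to_c(False))  # A stays on swp0
```

With hardware learning disabled on the CPU port, the entry for A no longer moves, so a later packet from B to A is switched to swp0 instead of being sent to the CPU.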

Merge branch 'ipa-pm-irqs' · 8eceea41

David S. Miller authored Aug 04, 2021

Alex Elder says:

====================
net: ipa: prepare GSI interrupts for runtime PM

The last patch in this series arranges for GSI interrupts to be
disabled when the IPA hardware is suspended.  This ensures the clock
is always operational when a GSI interrupt fires.  Leading up to
that are patches that rearrange the code a bit to allow this to
be done.

The first two patches aren't *directly* related.  They remove some
flag arguments to some GSI suspend/resume related functions, using
the version field now present in the GSI structure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

8eceea41

net: ipa: disable GSI interrupts while suspended · 45a42a3c