Commit 1f9fc48f authored by Vladimir Oltean's avatar Vladimir Oltean Committed by Jakub Kicinski

net: dsa: sja1105: fix reception from VLAN-unaware bridges

The blamed commit introduced an unexpected regression in the sja1105
driver. Packets from VLAN-unaware bridge ports get received correctly,
but the protocol stack can't seem to decode them properly.

For ds->untag_bridge_pvid users (thus also sja1105), the blamed commit
did introduce a functional change: dsa_switch_rcv() used to call
dsa_untag_bridge_pvid(), which looked like this:

	err = br_vlan_get_proto(br, &proto);
	if (err)
		return skb;

	/* Move VLAN tag from data to hwaccel */
	if (!skb_vlan_tag_present(skb) && skb->protocol == htons(proto)) {
		skb = skb_vlan_untag(skb);
		if (!skb)
			return NULL;
	}

and now it calls dsa_software_vlan_untag() which has just this:

	/* Move VLAN tag from data to hwaccel */
	if (!skb_vlan_tag_present(skb)) {
		skb = skb_vlan_untag(skb);
		if (!skb)
			return NULL;
	}

thus lacks any skb->protocol == bridge VLAN protocol check. That check
is deferred until a later check for skb->vlan_proto (in the hwaccel area).

The new code is problematic because, for VLAN-untagged packets,
skb_vlan_untag() blindly takes the 4 bytes starting with the EtherType
and turns them into a hwaccel VLAN tag. This is what breaks the protocol
stack.

It would be tempting to "make it work as before" and only call
skb_vlan_untag() for those packets with the skb->protocol actually
representing a VLAN.

But the premise of the newly introduced dsa_software_vlan_untag() core
function is not wrong. Drivers set ds->untag_bridge_pvid or
ds->untag_vlan_aware_bridge_pvid presumably because they send all
traffic to the CPU reception path as VLAN-tagged. So why should we spend
any additional CPU cycles assuming that the packet may be VLAN-untagged?
And why does the sja1105 driver opt into ds->untag_bridge_pvid if it
doesn't always deliver packets to the CPU as VLAN-tagged?

The answer to the latter question is indeed more interesting: it doesn't
need to. This got done in commit 884be12f ("net: dsa: sja1105: add
support for imprecise RX"), because I thought it would be needed, but I
didn't realize that it doesn't actually make a difference.

As explained in the commit message of the blamed patch, ds->untag_bridge_pvid
only makes a difference in the VLAN-untagged receive path of a bridge port.
However, in that operating mode, tag_sja1105.c makes use of VLAN tags
with the ETH_P_SJA1105 TPID, and it decodes and consumes these VLAN tags
as if they were DSA tags (aka tag_8021q operation). Even if commit
884be12f ("net: dsa: sja1105: add support for imprecise RX") added
this logic in sja1105_bridge_vlan_add():

	/* Always install bridge VLANs as egress-tagged on the CPU port. */
	if (dsa_is_cpu_port(ds, port))
		flags = 0;

that was for _bridge_ VLANs, which are _not_ committed to hardware
in VLAN-unaware mode (aka the mode where ds->untag_bridge_pvid does
anything at all). Even prior to that change, the tag_8021q VLANs
were always installed as egress-tagged on the CPU port, see
dsa_switch_tag_8021q_vlan_add():

	u16 flags = 0; // egress-tagged, non-PVID

	if (dsa_port_is_user(dp))
		flags |= BRIDGE_VLAN_INFO_UNTAGGED |
			 BRIDGE_VLAN_INFO_PVID;

	err = dsa_port_do_tag_8021q_vlan_add(dp, info->vid,
					     flags);
	if (err)
		return err;

Whether the sja1105 driver needs the new flag, ds->untag_vlan_aware_bridge_pvid,
rather than ds->untag_bridge_pvid, is a separate discussion. To fix the
current bug in VLAN-unaware bridge mode, I would argue that the sja1105
driver should not request something it doesn't need, rather than
complicating the core DSA helper. Whereas before the blamed commit, this
setting was harmless, now it has caused breakage.

Fixes: 93e4649e ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges")
Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241001140206.50933-1-vladimir.oltean@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
parent 096c0fa4
...@@ -3158,7 +3158,6 @@ static int sja1105_setup(struct dsa_switch *ds) ...@@ -3158,7 +3158,6 @@ static int sja1105_setup(struct dsa_switch *ds)
* TPID is ETH_P_SJA1105, and the VLAN ID is the port pvid. * TPID is ETH_P_SJA1105, and the VLAN ID is the port pvid.
*/ */
ds->vlan_filtering_is_global = true; ds->vlan_filtering_is_global = true;
ds->untag_bridge_pvid = true;
ds->fdb_isolation = true; ds->fdb_isolation = true;
ds->max_num_bridges = DSA_TAG_8021Q_MAX_NUM_BRIDGES; ds->max_num_bridges = DSA_TAG_8021Q_MAX_NUM_BRIDGES;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment