Commit beeee08c authored by David S. Miller

Merge branch 'sja1105-bridge-port-traffic-termination'

Vladimir Oltean says:

====================
Traffic termination for sja1105 ports under VLAN-aware bridge

This set of patches updates the sja1105 DSA driver to be able to send
and receive network stack packets on behalf of a VLAN-aware upper bridge
interface.

The reasons why this has traditionally been a problem are explained in
the "Traffic support" section of Documentation/networking/dsa/sja1105.rst
(the entire documentation will be revised in a separate patch series).

The limitations that have prevented us from doing this so far have now
been partially lifted by the bridge's ability to send a packet with
skb->offload_fwd_mark = true, which means that the accelerator is
allowed to look up its hardware FDB when sending a packet and deliver it
to those destination ports. Basically skb->dev is now just a conduit to
the switchdev driver's ndo_start_xmit(), and does not guarantee that the
packet will really be transmitted on that port (but it will be
transmitted where it should, nonetheless).
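
The transmit decision described above can be sketched in userspace C. This is only a model under stated assumptions: `struct pkt`, `struct port`, `enum tx_action` and `choose_tx_action()` are hypothetical stand-ins and not kernel APIs. Only the order of the checks follows the tagger changes in this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's sk_buff and dsa_port. */
struct pkt {
	bool offload_fwd_mark;	/* set by the bridge for TX forwarding offload */
};

struct port {
	bool bridge_vlan_aware;	/* the port's upper bridge is VLAN-aware */
};

enum tx_action {
	TX_PRECISE_PORT,	/* steer the packet to exactly the xmit port */
	TX_BRIDGE_VLAN_FDB,	/* send as-is, the hardware FDB picks the ports */
	TX_BRIDGE_DOMAIN_VID,	/* tag with a per-bridge tag_8021q TX VLAN */
};

static enum tx_action choose_tx_action(const struct pkt *p,
				       const struct port *dp)
{
	assert(p && dp);

	/* Without the mark, the port in skb->dev is the real destination. */
	if (!p->offload_fwd_mark)
		return TX_PRECISE_PORT;

	/* VLAN-aware bridge: the packet's own VLAN selects the FDB. */
	if (dp->bridge_vlan_aware)
		return TX_BRIDGE_VLAN_FDB;

	/* VLAN-unaware bridge: target the bridge's broadcast domain. */
	return TX_BRIDGE_DOMAIN_VID;
}
```

In this model the port passed in plays the role of the conduit: for the two offloaded cases it only identifies which bridge the packet belongs to.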

Apart from the ability to perform IP termination on VLAN-aware bridges
on top of sja1105 interfaces, we also gain the following features:
- VLAN-aware software bridging between sja1105 ports and "foreign"
  (non-DSA) interfaces
- software bridging between sja1105 bridge ports, and software LAG
  uppers of sja1105 ports (as long as the bridge is VLAN-aware)

The only things that don't work are:
1. to create an AF_PACKET socket on top of a sja1105 port that is under
   a VLAN-aware bridge. This is because the "imprecise RX" procedure
   selects an RX port for data plane* packets based on the assumption
   that the packet will land in the bridge's data path.  If ebtables
   rules are added to remove some packets from the bridge's data path,
   that assumption will be broken. Nonetheless, this is not a limitation
   that negatively impacts the known use cases with this switch.  If
   there were a way to impose user space restrictions against creating
   AF_PACKET sockets on this particular configuration, I could be
   interested in adding those restrictions, but I think there are other
   known broken configs already which are not checked by the kernel
   today (for example, the bridge's rx_handler steals packets anyway
   from AF_PACKET sockets with exact-match ptype handlers, as opposed
   to ptype_all handlers, which are processed earlier; this is precisely
   the reason why ebtables rules are generally needed to avoid that).
2. to send traffic on behalf of an 8021q upper of a standalone interface,
   while other sja1105 ports are part of a VLAN-aware bridge. This is
   because sja1105 sets ds->vlan_filtering_is_global = true, so we
   cannot make the standalone port ignore the VLAN header from the
   packet on RX, so we cannot make tag_8021q enforce its own pvid for
   the packets belonging to that port's 8021q upper. So we cannot
   determine in the first place that packets come from that port, unless
   we iterate through all 8021q uppers of all ports, and enforce
   uniqueness of VLAN IDs. I am not sure if this is what I want / if it
   is worth it, so currently all 8021q uppers are denied, regardless of
   whether the switch has ports under a VLAN-aware bridge or not
   (otherwise it becomes complicated even to track the state).
   Nonetheless, the VID uniqueness of all 8021q uppers does raise
   another question: what to do with VID 0, which has no 8021q upper,
   but the 8021q module adds it to our RX filter with vlan_vid_add().
   I am honestly not sure what to do. The best I can do is enable a
   hardware bit in sja1105 which reclassifies VID 0 frames to the PVID,
   and they will be sent on the CPU port using either the tag_8021q pvid
   of standalone ports, or the bridge pvid of VLAN-aware ports. So at
   the very least, those packets are still 'kinda' processed as if they
   were untagged, but the VID 0 is lost. In my defence, Marvell
   appears to do the same thing with reclassifying VID 0 frames, see
   commit b8b79c41 ("net: dsa: mv88e6xxx: Fix adding vlan 0").

*Control packets (currently hardcoded in sja1105 as link-local packets
for MAC DA ranges 01-80-c2-xx-xx-xx and 01-1b-19-xx-xx-xx) are received
based on packet traps and their precise source port is always known.
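
As an illustration of the two trapped MAC DA ranges named in the footnote, here is a minimal userspace sketch of the prefix match. The helper name `is_control_da()` and the `trap_prefixes` table are hypothetical and are not taken from the driver, whose own check may be written differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* The two hardcoded ranges: 01-80-c2-xx-xx-xx and 01-1b-19-xx-xx-xx. */
static const uint8_t trap_prefixes[][3] = {
	{ 0x01, 0x80, 0xc2 },
	{ 0x01, 0x1b, 0x19 },
};

/* Hypothetical helper: true if the destination MAC is in a trapped range. */
static bool is_control_da(const uint8_t da[ETH_ALEN])
{
	size_t i;

	assert(da);
	for (i = 0; i < sizeof(trap_prefixes) / sizeof(trap_prefixes[0]); i++)
		if (!memcmp(da, trap_prefixes[i], 3))
			return true;
	return false;
}
```

Only the first three bytes are compared, which corresponds to the `xx-xx-xx` wildcard in the ranges above.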

I have taken one patch from Colin because my work conflicts with his,
and integrating it all through the same series avoids that.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
parents 9bff6684 edac6f63
......@@ -226,14 +226,6 @@ struct sja1105_flow_block {
int num_virtual_links;
};
struct sja1105_bridge_vlan {
struct list_head list;
int port;
u16 vid;
bool pvid;
bool untagged;
};
struct sja1105_private {
struct sja1105_static_config static_config;
bool rgmii_rx_delay[SJA1105_MAX_NUM_PORTS];
......@@ -249,8 +241,8 @@ struct sja1105_private {
struct gpio_desc *reset_gpio;
struct spi_device *spidev;
struct dsa_switch *ds;
struct list_head dsa_8021q_vlans;
struct list_head bridge_vlans;
u16 bridge_pvid[SJA1105_MAX_NUM_PORTS];
u16 tag_8021q_pvid[SJA1105_MAX_NUM_PORTS];
struct sja1105_flow_block flow_block;
struct sja1105_port ports[SJA1105_MAX_NUM_PORTS];
/* Serializes transmission of management frames so that
......
......@@ -35,6 +35,16 @@ struct sk_buff *dsa_8021q_xmit(struct sk_buff *skb, struct net_device *netdev,
void dsa_8021q_rcv(struct sk_buff *skb, int *source_port, int *switch_id);
int dsa_tag_8021q_bridge_tx_fwd_offload(struct dsa_switch *ds, int port,
struct net_device *br,
int bridge_num);
void dsa_tag_8021q_bridge_tx_fwd_unoffload(struct dsa_switch *ds, int port,
struct net_device *br,
int bridge_num);
u16 dsa_8021q_bridge_tx_fwd_offload_vid(int bridge_num);
u16 dsa_8021q_tx_vid(struct dsa_switch *ds, int port);
u16 dsa_8021q_rx_vid(struct dsa_switch *ds, int port);
......
......@@ -111,6 +111,8 @@ int br_vlan_get_pvid_rcu(const struct net_device *dev, u16 *p_pvid);
int br_vlan_get_proto(const struct net_device *dev, u16 *p_proto);
int br_vlan_get_info(const struct net_device *dev, u16 vid,
struct bridge_vlan_info *p_vinfo);
int br_vlan_get_info_rcu(const struct net_device *dev, u16 vid,
struct bridge_vlan_info *p_vinfo);
#else
static inline bool br_vlan_enabled(const struct net_device *dev)
{
......@@ -137,6 +139,12 @@ static inline int br_vlan_get_info(const struct net_device *dev, u16 vid,
{
return -EINVAL;
}
static inline int br_vlan_get_info_rcu(const struct net_device *dev, u16 vid,
struct bridge_vlan_info *p_vinfo)
{
return -EINVAL;
}
#endif
#if IS_ENABLED(CONFIG_BRIDGE)
......
......@@ -88,11 +88,6 @@ struct dsa_device_ops {
struct packet_type *pt);
void (*flow_dissect)(const struct sk_buff *skb, __be16 *proto,
int *offset);
/* Used to determine which traffic should match the DSA filter in
* eth_type_trans, and which, if any, should bypass it and be processed
* as regular on the master net device.
*/
bool (*filter)(const struct sk_buff *skb, struct net_device *dev);
unsigned int needed_headroom;
unsigned int needed_tailroom;
const char *name;
......@@ -246,7 +241,6 @@ struct dsa_port {
struct dsa_switch_tree *dst;
struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev,
struct packet_type *pt);
bool (*filter)(const struct sk_buff *skb, struct net_device *dev);
enum {
DSA_PORT_TYPE_UNUSED = 0,
......@@ -985,15 +979,6 @@ static inline bool netdev_uses_dsa(const struct net_device *dev)
return false;
}
static inline bool dsa_can_decode(const struct sk_buff *skb,
struct net_device *dev)
{
#if IS_ENABLED(CONFIG_NET_DSA)
return !dev->dsa_ptr->filter || dev->dsa_ptr->filter(skb, dev);
#endif
return false;
}
/* All DSA tags that push the EtherType to the right (basically all except tail
* tags, which don't break dissection) can be treated the same from the
* perspective of the flow dissector.
......
......@@ -840,11 +840,14 @@ int br_vlan_filter_toggle(struct net_bridge *br, unsigned long val,
if (br_opt_get(br, BROPT_VLAN_ENABLED) == !!val)
return 0;
br_opt_toggle(br, BROPT_VLAN_ENABLED, !!val);
err = switchdev_port_attr_set(br->dev, &attr, extack);
if (err && err != -EOPNOTSUPP)
if (err && err != -EOPNOTSUPP) {
br_opt_toggle(br, BROPT_VLAN_ENABLED, !val);
return err;
}
br_opt_toggle(br, BROPT_VLAN_ENABLED, !!val);
br_manage_promisc(br);
recalculate_group_addr(br);
br_recalculate_fwd_mask(br);
......@@ -1446,6 +1449,33 @@ int br_vlan_get_info(const struct net_device *dev, u16 vid,
}
EXPORT_SYMBOL_GPL(br_vlan_get_info);
int br_vlan_get_info_rcu(const struct net_device *dev, u16 vid,
struct bridge_vlan_info *p_vinfo)
{
struct net_bridge_vlan_group *vg;
struct net_bridge_vlan *v;
struct net_bridge_port *p;
p = br_port_get_check_rcu(dev);
if (p)
vg = nbp_vlan_group_rcu(p);
else if (netif_is_bridge_master(dev))
vg = br_vlan_group_rcu(netdev_priv(dev));
else
return -EINVAL;
v = br_vlan_find(vg, vid);
if (!v)
return -ENOENT;
p_vinfo->vid = vid;
p_vinfo->flags = v->flags;
if (vid == br_get_pvid(vg))
p_vinfo->flags |= BRIDGE_VLAN_INFO_PVID;
return 0;
}
EXPORT_SYMBOL_GPL(br_vlan_get_info_rcu);
static int br_vlan_is_bind_vlan_dev(const struct net_device *dev)
{
return is_vlan_dev(dev) &&
......
......@@ -397,6 +397,49 @@ static inline struct sk_buff *dsa_untag_bridge_pvid(struct sk_buff *skb)
return skb;
}
/* For switches without hardware support for DSA tagging to be able
* to support termination through the bridge.
*/
static inline struct net_device *
dsa_find_designated_bridge_port_by_vid(struct net_device *master, u16 vid)
{
struct dsa_port *cpu_dp = master->dsa_ptr;
struct dsa_switch_tree *dst = cpu_dp->dst;
struct bridge_vlan_info vinfo;
struct net_device *slave;
struct dsa_port *dp;
int err;
list_for_each_entry(dp, &dst->ports, list) {
if (dp->type != DSA_PORT_TYPE_USER)
continue;
if (!dp->bridge_dev)
continue;
if (dp->stp_state != BR_STATE_LEARNING &&
dp->stp_state != BR_STATE_FORWARDING)
continue;
/* Since the bridge might learn this packet, keep the CPU port
* affinity with the port that will be used for the reply on
* xmit.
*/
if (dp->cpu_dp != cpu_dp)
continue;
slave = dp->slave;
err = br_vlan_get_info_rcu(slave, vid, &vinfo);
if (err)
continue;
return slave;
}
return NULL;
}
/* switch.c */
int dsa_switch_register_notifier(struct dsa_switch *ds);
void dsa_switch_unregister_notifier(struct dsa_switch *ds);
......
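
The selection loop of dsa_find_designated_bridge_port_by_vid() in the hunk above can be modelled in userspace to make the filtering order explicit. This is a sketch under assumptions: `struct model_port`, `enum stp_state`, `port_has_vid()` and `find_designated_port()` are hypothetical, and the VLAN membership array stands in for the br_vlan_get_info_rcu() lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum stp_state { STP_DISABLED, STP_LISTENING, STP_LEARNING,
		 STP_FORWARDING, STP_BLOCKING };

/* Hypothetical flattened view of one switch port. */
struct model_port {
	bool is_user;		/* user port, as opposed to CPU or DSA link */
	bool bridged;		/* has a bridge upper */
	enum stp_state stp;
	int cpu_port;		/* index of the CPU port serving this port */
	const uint16_t *vids;	/* bridge VLANs this port is a member of */
	size_t num_vids;
};

static bool port_has_vid(const struct model_port *p, uint16_t vid)
{
	size_t i;

	for (i = 0; i < p->num_vids; i++)
		if (p->vids[i] == vid)
			return true;
	return false;
}

/* Returns the index of the first eligible port, or -1 if none matches. */
static int find_designated_port(const struct model_port *ports, size_t n,
				int cpu_port, uint16_t vid)
{
	size_t i;

	assert(ports || n == 0);
	for (i = 0; i < n; i++) {
		const struct model_port *p = &ports[i];

		if (!p->is_user || !p->bridged)
			continue;
		if (p->stp != STP_LEARNING && p->stp != STP_FORWARDING)
			continue;
		/* Keep affinity with the CPU port the packet arrived on. */
		if (p->cpu_port != cpu_port)
			continue;
		if (!port_has_vid(p, vid))
			continue;
		return (int)i;
	}
	return -1;
}
```

The first match wins, which is why the result is "imprecise": any bridged port in the right state that is a member of the VLAN is an acceptable stand-in for the real ingress port.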
......@@ -888,7 +888,6 @@ int dsa_port_mrp_del_ring_role(const struct dsa_port *dp,
void dsa_port_set_tag_protocol(struct dsa_port *cpu_dp,
const struct dsa_device_ops *tag_ops)
{
cpu_dp->filter = tag_ops->filter;
cpu_dp->rcv = tag_ops->rcv;
cpu_dp->tag_ops = tag_ops;
}
......
......@@ -17,7 +17,7 @@
*
* | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
* +-----------+-----+-----------------+-----------+-----------------------+
* | DIR | RSV | SWITCH_ID | RSV | PORT |
* | DIR | VBID| SWITCH_ID | VBID | PORT |
* +-----------+-----+-----------------+-----------+-----------------------+
*
* DIR - VID[11:10]:
......@@ -30,9 +30,10 @@
* SWITCH_ID - VID[8:6]:
* Index of switch within DSA tree. Must be between 0 and 7.
*
* RSV - VID[5:4]:
* To be used for further expansion of PORT or for other purposes.
* Must be transmitted as zero and ignored on receive.
* VBID - { VID[9], VID[5:4] }:
* Virtual bridge ID. If between 1 and 7, packet targets the broadcast
* domain of a bridge. If transmitted as zero, packet targets a single
* port. Field only valid on transmit, must be ignored on receive.
*
* PORT - VID[3:0]:
* Index of switch port. Must be between 0 and 15.
......@@ -50,11 +51,30 @@
#define DSA_8021Q_SWITCH_ID(x) (((x) << DSA_8021Q_SWITCH_ID_SHIFT) & \
DSA_8021Q_SWITCH_ID_MASK)
#define DSA_8021Q_VBID_HI_SHIFT 9
#define DSA_8021Q_VBID_HI_MASK GENMASK(9, 9)
#define DSA_8021Q_VBID_LO_SHIFT 4
#define DSA_8021Q_VBID_LO_MASK GENMASK(5, 4)
#define DSA_8021Q_VBID_HI(x) (((x) & GENMASK(2, 2)) >> 2)
#define DSA_8021Q_VBID_LO(x) ((x) & GENMASK(1, 0))
#define DSA_8021Q_VBID(x) \
(((DSA_8021Q_VBID_LO(x) << DSA_8021Q_VBID_LO_SHIFT) & \
DSA_8021Q_VBID_LO_MASK) | \
((DSA_8021Q_VBID_HI(x) << DSA_8021Q_VBID_HI_SHIFT) & \
DSA_8021Q_VBID_HI_MASK))
#define DSA_8021Q_PORT_SHIFT 0
#define DSA_8021Q_PORT_MASK GENMASK(3, 0)
#define DSA_8021Q_PORT(x) (((x) << DSA_8021Q_PORT_SHIFT) & \
DSA_8021Q_PORT_MASK)
u16 dsa_8021q_bridge_tx_fwd_offload_vid(int bridge_num)
{
/* The VBID value of 0 is reserved for precise TX */
return DSA_8021Q_DIR_TX | DSA_8021Q_VBID(bridge_num + 1);
}
EXPORT_SYMBOL_GPL(dsa_8021q_bridge_tx_fwd_offload_vid);
/* Returns the VID to be inserted into the frame from xmit for switch steering
* instructions on egress. Encodes switch ID and port ID.
*/
......@@ -387,6 +407,26 @@ int dsa_tag_8021q_bridge_leave(struct dsa_switch *ds,
return 0;
}
int dsa_tag_8021q_bridge_tx_fwd_offload(struct dsa_switch *ds, int port,
struct net_device *br,
int bridge_num)
{
u16 tx_vid = dsa_8021q_bridge_tx_fwd_offload_vid(bridge_num);
return dsa_port_tag_8021q_vlan_add(dsa_to_port(ds, port), tx_vid);
}
EXPORT_SYMBOL_GPL(dsa_tag_8021q_bridge_tx_fwd_offload);
void dsa_tag_8021q_bridge_tx_fwd_unoffload(struct dsa_switch *ds, int port,
struct net_device *br,
int bridge_num)
{
u16 tx_vid = dsa_8021q_bridge_tx_fwd_offload_vid(bridge_num);
dsa_port_tag_8021q_vlan_del(dsa_to_port(ds, port), tx_vid);
}
EXPORT_SYMBOL_GPL(dsa_tag_8021q_bridge_tx_fwd_unoffload);
/* Set up a port's tag_8021q RX and TX VLAN for standalone mode operation */
static int dsa_tag_8021q_port_setup(struct dsa_switch *ds, int port)
{
......
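
To make the split VBID field easier to follow, here is a standalone sketch of the bit arithmetic behind dsa_8021q_bridge_tx_fwd_offload_vid(). The names `vbid_to_vid_bits()` and `bridge_tx_vid()` are hypothetical. The value used for `DIR_TX` is an assumption: the hunk above shows that DIR occupies VID[11:10] but not how the TX direction is encoded.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the TX direction is encoded as 2 in VID[11:10]. */
#define DIR_TX		(2u << 10)

/* VBID is split: bit 2 goes to VID[9], bits 1:0 go to VID[5:4]. */
static uint16_t vbid_to_vid_bits(unsigned int vbid)
{
	unsigned int hi = (vbid >> 2) & 0x1;
	unsigned int lo = vbid & 0x3;

	return (uint16_t)((hi << 9) | (lo << 4));
}

/* Hypothetical userspace mirror of the bridge TX VID computation. */
static uint16_t bridge_tx_vid(int bridge_num)
{
	/* VBID 0 is reserved for precise TX, so only VBIDs 1..7 are usable. */
	assert(bridge_num >= 0 && bridge_num < 7);
	return (uint16_t)(DIR_TX | vbid_to_vid_bits((unsigned int)bridge_num + 1));
}
```

Under that assumption, bridge_num 0 (VBID 1) yields VID 0x810, and bridge_num 3 (VBID 4) yields 0xA00 because only the high VBID bit, placed at VID[9], is set.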
......@@ -115,40 +115,6 @@ static inline bool sja1105_is_meta_frame(const struct sk_buff *skb)
return true;
}
static bool sja1105_can_use_vlan_as_tags(const struct sk_buff *skb)
{
struct vlan_ethhdr *hdr = vlan_eth_hdr(skb);
u16 vlan_tci;
if (hdr->h_vlan_proto == htons(ETH_P_SJA1105))
return true;
if (hdr->h_vlan_proto != htons(ETH_P_8021Q) &&
!skb_vlan_tag_present(skb))
return false;
if (skb_vlan_tag_present(skb))
vlan_tci = skb_vlan_tag_get(skb);
else
vlan_tci = ntohs(hdr->h_vlan_TCI);
return vid_is_dsa_8021q(vlan_tci & VLAN_VID_MASK);
}
/* This is the first time the tagger sees the frame on RX.
* Figure out if we can decode it.
*/
static bool sja1105_filter(const struct sk_buff *skb, struct net_device *dev)
{
if (sja1105_can_use_vlan_as_tags(skb))
return true;
if (sja1105_is_link_local(skb))
return true;
if (sja1105_is_meta_frame(skb))
return true;
return false;
}
/* Calls sja1105_port_deferred_xmit in sja1105_main.c */
static struct sk_buff *sja1105_defer_xmit(struct sja1105_port *sp,
struct sk_buff *skb)
......@@ -167,6 +133,31 @@ static u16 sja1105_xmit_tpid(struct sja1105_port *sp)
return sp->xmit_tpid;
}
static struct sk_buff *sja1105_imprecise_xmit(struct sk_buff *skb,
struct net_device *netdev)
{
struct dsa_port *dp = dsa_slave_to_port(netdev);
struct net_device *br = dp->bridge_dev;
u16 tx_vid;
/* If the port is under a VLAN-aware bridge, just slide the
* VLAN-tagged packet into the FDB and hope for the best.
* This works because we support a single VLAN-aware bridge
* across the entire dst, and its VLANs cannot be shared with
* any standalone port.
*/
if (br_vlan_enabled(br))
return skb;
/* If the port is under a VLAN-unaware bridge, use an imprecise
* TX VLAN that targets the bridge's entire broadcast domain,
* instead of just the specific port.
*/
tx_vid = dsa_8021q_bridge_tx_fwd_offload_vid(dp->bridge_num);
return dsa_8021q_xmit(skb, netdev, sja1105_xmit_tpid(dp->priv), tx_vid);
}
static struct sk_buff *sja1105_xmit(struct sk_buff *skb,
struct net_device *netdev)
{
......@@ -175,6 +166,9 @@ static struct sk_buff *sja1105_xmit(struct sk_buff *skb,
u16 queue_mapping = skb_get_queue_mapping(skb);
u8 pcp = netdev_txq_to_tc(netdev, queue_mapping);
if (skb->offload_fwd_mark)
return sja1105_imprecise_xmit(skb, netdev);
/* Transmitting management traffic does not rely upon switch tagging,
* but instead SPI-installed management routes. Part 2 of this
* is the .port_deferred_xmit driver callback.
......@@ -199,6 +193,9 @@ static struct sk_buff *sja1110_xmit(struct sk_buff *skb,
__be16 *tx_header;
int trailer_pos;
if (skb->offload_fwd_mark)
return sja1105_imprecise_xmit(skb, netdev);
/* Transmitting control packets is done using in-band control
* extensions, while data packets are transmitted using
* tag_8021q TX VLANs.
......@@ -371,15 +368,42 @@ static bool sja1110_skb_has_inband_control_extension(const struct sk_buff *skb)
return ntohs(eth_hdr(skb)->h_proto) == ETH_P_SJA1110;
}
/* Returns true for imprecise RX and sets the @vid.
* Returns false for precise RX and sets @source_port and @switch_id.
*/
static bool sja1105_vlan_rcv(struct sk_buff *skb, int *source_port,
int *switch_id, u16 *vid)
{
struct vlan_ethhdr *hdr = (struct vlan_ethhdr *)skb_mac_header(skb);
u16 vlan_tci;
if (skb_vlan_tag_present(skb))
vlan_tci = skb_vlan_tag_get(skb);
else
vlan_tci = ntohs(hdr->h_vlan_TCI);
if (vid_is_dsa_8021q_rxvlan(vlan_tci & VLAN_VID_MASK)) {
dsa_8021q_rcv(skb, source_port, switch_id);
return false;
}
/* Try our best with imprecise RX */
*vid = vlan_tci & VLAN_VID_MASK;
return true;
}
static struct sk_buff *sja1105_rcv(struct sk_buff *skb,
struct net_device *netdev,
struct packet_type *pt)
{
int source_port = -1, switch_id = -1;
struct sja1105_meta meta = {0};
int source_port, switch_id;
bool imprecise_rx = false;
struct ethhdr *hdr;
bool is_link_local;
bool is_meta;
u16 vid;
hdr = eth_hdr(skb);
is_link_local = sja1105_is_link_local(skb);
......@@ -389,7 +413,8 @@ static struct sk_buff *sja1105_rcv(struct sk_buff *skb,
if (sja1105_skb_has_tag_8021q(skb)) {
/* Normal traffic path. */
dsa_8021q_rcv(skb, &source_port, &switch_id);
imprecise_rx = sja1105_vlan_rcv(skb, &source_port, &switch_id,
&vid);
} else if (is_link_local) {
/* Management traffic path. Switch embeds the switch ID and
* port ID into bytes of the destination MAC, courtesy of
......@@ -408,7 +433,10 @@ static struct sk_buff *sja1105_rcv(struct sk_buff *skb,
return NULL;
}
skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
if (imprecise_rx)
skb->dev = dsa_find_designated_bridge_port_by_vid(netdev, vid);
else
skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
if (!skb->dev) {
netdev_warn(netdev, "Couldn't decode source port\n");
return NULL;
......@@ -522,6 +550,8 @@ static struct sk_buff *sja1110_rcv(struct sk_buff *skb,
struct packet_type *pt)
{
int source_port = -1, switch_id = -1;
bool imprecise_rx = false;
u16 vid;
skb->offload_fwd_mark = 1;
......@@ -534,13 +564,15 @@ static struct sk_buff *sja1110_rcv(struct sk_buff *skb,
/* Packets with in-band control extensions might still have RX VLANs */
if (likely(sja1105_skb_has_tag_8021q(skb)))
dsa_8021q_rcv(skb, &source_port, &switch_id);
imprecise_rx = sja1105_vlan_rcv(skb, &source_port, &switch_id,
&vid);
skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
if (imprecise_rx)
skb->dev = dsa_find_designated_bridge_port_by_vid(netdev, vid);
else
skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
if (!skb->dev) {
netdev_warn(netdev,
"Couldn't decode source port %d and switch id %d\n",
source_port, switch_id);
netdev_warn(netdev, "Couldn't decode source port\n");
return NULL;
}
......@@ -576,7 +608,6 @@ static const struct dsa_device_ops sja1105_netdev_ops = {
.proto = DSA_TAG_PROTO_SJA1105,
.xmit = sja1105_xmit,
.rcv = sja1105_rcv,
.filter = sja1105_filter,
.needed_headroom = VLAN_HLEN,
.flow_dissect = sja1105_flow_dissect,
.promisc_on_master = true,
......@@ -590,7 +621,6 @@ static const struct dsa_device_ops sja1110_netdev_ops = {
.proto = DSA_TAG_PROTO_SJA1110,
.xmit = sja1110_xmit,
.rcv = sja1110_rcv,
.filter = sja1105_filter,
.flow_dissect = sja1110_flow_dissect,
.needed_headroom = SJA1110_HEADER_LEN + VLAN_HLEN,
.needed_tailroom = SJA1110_RX_TRAILER_LEN + SJA1110_MAX_PADDING_LEN,
......
......@@ -182,12 +182,8 @@ __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev)
* at all, so we check here whether one of those tagging
* variants has been configured on the receiving interface,
* and if so, set skb->protocol without looking at the packet.
* The DSA tagging protocol may be able to decode some but not all
* traffic (for example only for management). In that case give it the
* option to filter the packets from which it can decode source port
* information.
*/
if (unlikely(netdev_uses_dsa(dev)) && dsa_can_decode(skb, dev))
if (unlikely(netdev_uses_dsa(dev)))
return htons(ETH_P_XDSA);
if (likely(eth_proto_is_802_3(eth->h_proto)))
......