Commit c6514f36 authored by David S. Miller

Merge branch 'mlxsw-enslavement'

Petr Machata says:

====================
mlxsw: Permit enslavement to netdevices with uppers

The mlxsw driver currently makes the assumption that the user applies
configuration in a bottom-up manner. Thus netdevices need to be added to
the bridge before IP addresses are configured on that bridge or an SVI is
added on top of it. Enslaving a netdevice to another netdevice that
already has uppers is in fact forbidden by mlxsw for this reason. Despite
this safety measure, it is rather easy to get into situations where the
offloaded configuration is just plain wrong.

As an example, take a front panel port and configure an IP address on
it: it gets a RIF. Now enslave the port to a bridge, and the RIF is gone.
Remove the port from the bridge again, but the RIF never comes back.
There are a number of similar situations where changing the configuration
there and back utterly breaks the offload.
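
For illustration, a sequence along these lines reproduces the lost RIF
(a minimal sketch; the port name swp1 and the address are placeholders):

  ip address add 192.0.2.1/28 dev swp1  # swp1 gets a RIF
  ip link add name br0 type bridge
  ip link set dev swp1 master br0       # the RIF is gone
  ip link set dev swp1 nomaster         # the RIF never comes back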

Similarly, detaching a front panel port from a configured topology means
unoffloading that whole topology -- VLAN uppers, nexthops, etc. Attaching
the port back is then not permitted at all. If it were, it would not
result in a working configuration, because much of mlxsw is written to
react to changes in its immediate configuration. There is nothing that
would go visit the netdevices in the attached-to topology and offload
existing routes and VLAN memberships, for example.
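
For instance, with a LAG carrying a VLAN upper that has an address (a
rough sketch; swp1, bond1 and the addresses are placeholders):

  ip link add name bond1 type bond mode 802.3ad
  ip link set dev swp1 down
  ip link set dev swp1 master bond1
  ip link add link bond1 name bond1.10 type vlan id 10
  ip address add 192.0.2.17/28 dev bond1.10  # offloaded through swp1
  ip link set dev swp1 nomaster              # the whole topology is unoffloaded
  ip link set dev swp1 master bond1          # so far vetoed; even if allowed,
                                             # nothing would re-offload bond1.10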

In this patchset, introduce a number of replays to be invoked so that this
sort of post-hoc offload is supported. Then remove the vetoes that
disallowed enslavement of front panel ports to other netdevices with
uppers.

The patchset progresses as follows:

- In patch #1, fix an issue in the bridge driver. To my knowledge, the
  issue could not have resulted in buggy behavior previously, and thus it
  is packaged with this patchset instead of being sent separately to net.

- In patch #2, add a new helper to the switchdev code.

- In patch #3, drop mlxsw selftests that will no longer be relevant after
  this patchset.

- Patches #4, #5, #6, #7 and #8 prepare the codebase for smoother
  introduction of the rest of the code.

- Patches #9, #10, #11, #12, #13 and #14 replay various aspects of upper
  configuration when a front panel port is introduced into a topology.
  Individual patches take care of bridge and LAG RIF memberships, switchdev
  replay, nexthop and neighbor replay, and MACVLAN offload.

- Patches #15 and #16 introduce RIFs for newly-relevant netdevices when a
  front panel port is enslaved (in which case all uppers are newly
  relevant), or, respectively, deslaved (in which case the newly-relevant
  netdevice is the one being deslaved).

- Up until this point, the introduced scaffolding is not actually
  exercised, because mlxsw still forbids enslavement of mlxsw netdevices
  to netdevices that themselves have uppers. In patch #17, this
  restriction is finally relaxed, as the example below shows.
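
  As a rough sketch of what this newly permits (placeholder names again),
  a front panel port can now join a LAG that is already a bridge member:

    ip link add name bond1 type bond mode 802.3ad
    ip link add name br0 type bridge vlan_filtering 1
    ip link set dev bond1 master br0
    ip link set dev swp1 down
    ip link set dev swp1 master bond1  # previously rejected; the bridge and
                                       # LAG configuration is now replayed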

A sizable selftest suite is available to test all this new code. That will
be sent in a separate patchset.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
parents a5dc694e 2c5ffe8d
@@ -700,6 +700,8 @@ int mlxsw_sp_port_pvid_set(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid,
struct mlxsw_sp_port_vlan *
mlxsw_sp_port_vlan_create(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid);
void mlxsw_sp_port_vlan_destroy(struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan);
int mlxsw_sp_port_kill_vid(struct net_device *dev,
__be16 __always_unused proto, u16 vid);
int mlxsw_sp_port_vlan_set(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid_begin,
u16 vid_end, bool is_member, bool untagged);
int mlxsw_sp_flow_counter_get(struct mlxsw_sp *mlxsw_sp,
@@ -178,5 +178,12 @@ int mlxsw_sp_router_bridge_vlan_add(struct mlxsw_sp *mlxsw_sp,
int mlxsw_sp_router_port_join_lag(struct mlxsw_sp_port *mlxsw_sp_port,
struct net_device *lag_dev,
struct netlink_ext_ack *extack);
void mlxsw_sp_router_port_leave_lag(struct mlxsw_sp_port *mlxsw_sp_port,
struct net_device *lag_dev);
int mlxsw_sp_netdevice_enslavement_replay(struct mlxsw_sp *mlxsw_sp,
struct net_device *upper_dev,
struct netlink_ext_ack *extack);
void mlxsw_sp_netdevice_deslavement_replay(struct mlxsw_sp *mlxsw_sp,
struct net_device *dev);
#endif /* _MLXSW_ROUTER_H_*/
@@ -384,6 +384,91 @@ mlxsw_sp_bridge_port_find(struct mlxsw_sp_bridge *bridge,
return __mlxsw_sp_bridge_port_find(bridge_device, brport_dev);
}
static int mlxsw_sp_port_obj_add(struct net_device *dev, const void *ctx,
const struct switchdev_obj *obj,
struct netlink_ext_ack *extack);
static int mlxsw_sp_port_obj_del(struct net_device *dev, const void *ctx,
const struct switchdev_obj *obj);
struct mlxsw_sp_bridge_port_replay_switchdev_objs {
struct net_device *brport_dev;
struct mlxsw_sp_port *mlxsw_sp_port;
int done;
};
static int
mlxsw_sp_bridge_port_replay_switchdev_objs(struct notifier_block *nb,
unsigned long event, void *ptr)
{
struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
struct switchdev_notifier_port_obj_info *port_obj_info = ptr;
struct netlink_ext_ack *extack = port_obj_info->info.extack;
struct mlxsw_sp_bridge_port_replay_switchdev_objs *rso;
int err = 0;
rso = (void *)port_obj_info->info.ctx;
if (event != SWITCHDEV_PORT_OBJ_ADD ||
dev != rso->brport_dev)
goto out;
/* When a port is joining the bridge through a LAG, there likely are
* VLANs configured on that LAG already. The replay will thus attempt to
* have the given port-vlans join the corresponding FIDs. But the LAG
* netdevice has already called the ndo_vlan_rx_add_vid NDO for its VLAN
* memberships, back before CHANGEUPPER was distributed and netdevice
* master set. So now before propagating the VLAN events further, we
* first need to kill the corresponding VID at the mlxsw_sp_port.
*
* Note that this doesn't need to be rolled back on failure -- if the
* replay fails, the enslavement is off, and the VIDs would be killed by
* LAG anyway as part of its rollback.
*/
if (port_obj_info->obj->id == SWITCHDEV_OBJ_ID_PORT_VLAN) {
u16 vid = SWITCHDEV_OBJ_PORT_VLAN(port_obj_info->obj)->vid;
err = mlxsw_sp_port_kill_vid(rso->mlxsw_sp_port->dev, 0, vid);
if (err)
goto out;
}
++rso->done;
err = mlxsw_sp_port_obj_add(rso->mlxsw_sp_port->dev, NULL,
port_obj_info->obj, extack);
out:
return notifier_from_errno(err);
}
static struct notifier_block mlxsw_sp_bridge_port_replay_switchdev_objs_nb = {
.notifier_call = mlxsw_sp_bridge_port_replay_switchdev_objs,
};
static int
mlxsw_sp_bridge_port_unreplay_switchdev_objs(struct notifier_block *nb,
unsigned long event, void *ptr)
{
struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
struct switchdev_notifier_port_obj_info *port_obj_info = ptr;
struct mlxsw_sp_bridge_port_replay_switchdev_objs *rso;
rso = (void *)port_obj_info->info.ctx;
if (event != SWITCHDEV_PORT_OBJ_ADD ||
dev != rso->brport_dev)
return NOTIFY_DONE;
if (!rso->done--)
return NOTIFY_STOP;
mlxsw_sp_port_obj_del(rso->mlxsw_sp_port->dev, NULL,
port_obj_info->obj);
return NOTIFY_DONE;
}
static struct notifier_block mlxsw_sp_bridge_port_unreplay_switchdev_objs_nb = {
.notifier_call = mlxsw_sp_bridge_port_unreplay_switchdev_objs,
};
static struct mlxsw_sp_bridge_port *
mlxsw_sp_bridge_port_create(struct mlxsw_sp_bridge_device *bridge_device,
struct net_device *brport_dev,
@@ -2350,6 +2435,33 @@ static struct mlxsw_sp_port *mlxsw_sp_lag_rep_port(struct mlxsw_sp *mlxsw_sp,
return NULL;
}
static int
mlxsw_sp_bridge_port_replay(struct mlxsw_sp_bridge_port *bridge_port,
struct mlxsw_sp_port *mlxsw_sp_port,
struct netlink_ext_ack *extack)
{
struct mlxsw_sp_bridge_port_replay_switchdev_objs rso = {
.brport_dev = bridge_port->dev,
.mlxsw_sp_port = mlxsw_sp_port,
};
struct notifier_block *nb;
int err;
nb = &mlxsw_sp_bridge_port_replay_switchdev_objs_nb;
err = switchdev_bridge_port_replay(bridge_port->dev, mlxsw_sp_port->dev,
&rso, NULL, nb, extack);
if (err)
goto err_replay;
return 0;
err_replay:
nb = &mlxsw_sp_bridge_port_unreplay_switchdev_objs_nb;
switchdev_bridge_port_replay(bridge_port->dev, mlxsw_sp_port->dev,
&rso, NULL, nb, extack);
return err;
}
static int
mlxsw_sp_bridge_vlan_aware_port_join(struct mlxsw_sp_bridge_port *bridge_port,
struct mlxsw_sp_port *mlxsw_sp_port,
@@ -2364,7 +2476,7 @@ mlxsw_sp_bridge_vlan_aware_port_join(struct mlxsw_sp_bridge_port *bridge_port,
if (mlxsw_sp_port->default_vlan->fid)
mlxsw_sp_port_vlan_router_leave(mlxsw_sp_port->default_vlan);
return 0;
return mlxsw_sp_bridge_port_replay(bridge_port, mlxsw_sp_port, extack);
}
static int
@@ -2536,6 +2648,7 @@ mlxsw_sp_bridge_8021d_port_join(struct mlxsw_sp_bridge_device *bridge_device,
struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan;
struct net_device *dev = bridge_port->dev;
u16 vid;
int err;
vid = is_vlan_dev(dev) ? vlan_dev_vlan_id(dev) : MLXSW_SP_DEFAULT_VID;
mlxsw_sp_port_vlan = mlxsw_sp_port_vlan_find_by_vid(mlxsw_sp_port, vid);
@@ -2551,8 +2664,20 @@ mlxsw_sp_bridge_8021d_port_join(struct mlxsw_sp_bridge_device *bridge_device,
if (mlxsw_sp_port_vlan->fid)
mlxsw_sp_port_vlan_router_leave(mlxsw_sp_port_vlan);
return mlxsw_sp_port_vlan_bridge_join(mlxsw_sp_port_vlan, bridge_port,
err = mlxsw_sp_port_vlan_bridge_join(mlxsw_sp_port_vlan, bridge_port,
extack);
if (err)
return err;
err = mlxsw_sp_bridge_port_replay(bridge_port, mlxsw_sp_port, extack);
if (err)
goto err_replay;
return 0;
err_replay:
mlxsw_sp_port_vlan_bridge_leave(mlxsw_sp_port_vlan);
return err;
}
static void
@@ -2769,8 +2894,15 @@ int mlxsw_sp_port_bridge_join(struct mlxsw_sp_port *mlxsw_sp_port,
if (err)
goto err_port_join;
err = mlxsw_sp_netdevice_enslavement_replay(mlxsw_sp, br_dev, extack);
if (err)
goto err_replay;
return 0;
err_replay:
bridge_device->ops->port_leave(bridge_device, bridge_port,
mlxsw_sp_port);
err_port_join:
mlxsw_sp_bridge_port_put(mlxsw_sp->bridge, bridge_port);
return err;
@@ -231,6 +231,7 @@ enum switchdev_notifier_type {
SWITCHDEV_BRPORT_OFFLOADED,
SWITCHDEV_BRPORT_UNOFFLOADED,
SWITCHDEV_BRPORT_REPLAY,
};
struct switchdev_notifier_info {
@@ -299,6 +300,11 @@ void switchdev_bridge_port_unoffload(struct net_device *brport_dev,
const void *ctx,
struct notifier_block *atomic_nb,
struct notifier_block *blocking_nb);
int switchdev_bridge_port_replay(struct net_device *brport_dev,
struct net_device *dev, const void *ctx,
struct notifier_block *atomic_nb,
struct notifier_block *blocking_nb,
struct netlink_ext_ack *extack);
void switchdev_deferred_process(void);
int switchdev_port_attr_set(struct net_device *dev,
@@ -234,6 +234,14 @@ static int br_switchdev_blocking_event(struct notifier_block *nb,
br_switchdev_port_unoffload(p, b->ctx, b->atomic_nb,
b->blocking_nb);
break;
case SWITCHDEV_BRPORT_REPLAY:
brport_info = ptr;
b = &brport_info->brport;
err = br_switchdev_port_replay(p, b->dev, b->ctx, b->atomic_nb,
b->blocking_nb, extack);
err = notifier_from_errno(err);
break;
}
out:
@@ -2118,6 +2118,12 @@ void br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx,
struct notifier_block *atomic_nb,
struct notifier_block *blocking_nb);
int br_switchdev_port_replay(struct net_bridge_port *p,
struct net_device *dev, const void *ctx,
struct notifier_block *atomic_nb,
struct notifier_block *blocking_nb,
struct netlink_ext_ack *extack);
bool br_switchdev_frame_uses_tx_fwd_offload(struct sk_buff *skb);
void br_switchdev_frame_set_offload_fwd_mark(struct sk_buff *skb);
@@ -2168,6 +2174,16 @@ br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx,
{
}
static inline int
br_switchdev_port_replay(struct net_bridge_port *p,
struct net_device *dev, const void *ctx,
struct notifier_block *atomic_nb,
struct notifier_block *blocking_nb,
struct netlink_ext_ack *extack)
{
return -EOPNOTSUPP;
}
static inline bool br_switchdev_frame_uses_tx_fwd_offload(struct sk_buff *skb)
{
return false;
@@ -727,6 +727,8 @@ br_switchdev_mdb_replay(struct net_device *br_dev, struct net_device *dev,
err = br_switchdev_mdb_replay_one(nb, dev,
SWITCHDEV_OBJ_PORT_MDB(obj),
action, ctx, extack);
if (err == -EOPNOTSUPP)
err = 0;
if (err)
goto out_free_mdb;
}
@@ -759,8 +761,10 @@ static int nbp_switchdev_sync_objs(struct net_bridge_port *p, const void *ctx,
err = br_switchdev_mdb_replay(br_dev, dev, ctx, true, blocking_nb,
extack);
if (err && err != -EOPNOTSUPP)
if (err) {
/* -EOPNOTSUPP not propagated from MDB replay. */
return err;
}
err = br_switchdev_fdb_replay(br_dev, ctx, true, atomic_nb);
if (err && err != -EOPNOTSUPP)
@@ -825,3 +829,12 @@ void br_switchdev_port_unoffload(struct net_bridge_port *p, const void *ctx,
nbp_switchdev_del(p);
}
int br_switchdev_port_replay(struct net_bridge_port *p,
struct net_device *dev, const void *ctx,
struct notifier_block *atomic_nb,
struct notifier_block *blocking_nb,
struct netlink_ext_ack *extack)
{
return nbp_switchdev_sync_objs(p, ctx, atomic_nb, blocking_nb, extack);
}
@@ -862,3 +862,28 @@ void switchdev_bridge_port_unoffload(struct net_device *brport_dev,
NULL);
}
EXPORT_SYMBOL_GPL(switchdev_bridge_port_unoffload);
int switchdev_bridge_port_replay(struct net_device *brport_dev,
struct net_device *dev, const void *ctx,
struct notifier_block *atomic_nb,
struct notifier_block *blocking_nb,
struct netlink_ext_ack *extack)
{
struct switchdev_notifier_brport_info brport_info = {
.brport = {
.dev = dev,
.ctx = ctx,
.atomic_nb = atomic_nb,
.blocking_nb = blocking_nb,
},
};
int err;
ASSERT_RTNL();
err = call_switchdev_blocking_notifiers(SWITCHDEV_BRPORT_REPLAY,
brport_dev, &brport_info.info,
extack);
return notifier_to_errno(err);
}
EXPORT_SYMBOL_GPL(switchdev_bridge_port_replay);
@@ -16,7 +16,6 @@ ALL_TESTS="
bridge_deletion_test
bridge_vlan_flags_test
vlan_1_test
lag_bridge_upper_test
duplicate_vlans_test
vlan_rif_refcount_test
subport_rif_refcount_test
@@ -211,33 +210,6 @@ vlan_1_test()
ip link del dev $swp1.1
}
lag_bridge_upper_test()
{
# Test that ports cannot be enslaved to LAG devices that have uppers
# and that failure is handled gracefully. See commit b3529af6bb0d
# ("spectrum: Reference count VLAN entries") for more details
RET=0
ip link add name bond1 type bond mode 802.3ad
ip link add name br0 type bridge vlan_filtering 1
ip link set dev bond1 master br0
ip link set dev $swp1 down
ip link set dev $swp1 master bond1 &> /dev/null
check_fail $? "managed to enslave port to lag when should not"
# This might generate a trace, if we did not handle the failure
# correctly
ip -6 address add 2001:db8:1::1/64 dev $swp1
ip -6 address del 2001:db8:1::1/64 dev $swp1
log_test "lag with bridge upper"
ip link del dev br0
ip link del dev bond1
}
duplicate_vlans_test()
{
# Test that on a given port a VLAN is only used once. Either as VLAN
@@ -510,9 +482,6 @@ vlan_interface_uppers_test()
ip link set dev $swp1 master br0
ip link add link br0 name br0.10 type vlan id 10
ip link add link br0.10 name macvlan0 \
type macvlan mode private &> /dev/null
check_fail $? "managed to create a macvlan when should not"
ip -6 address add 2001:db8:1::1/64 dev br0.10
ip link add link br0.10 name macvlan0 type macvlan mode private