    net: bridge: do not replay fdb entries pointing towards the bridge twice · cbb56b03
    Vladimir Oltean authored
    This simple script:
    
    ip link add br0 type bridge
    ip link set swp2 master br0
    ip link set br0 address 00:01:02:03:04:05
    ip link del br0
    
    produces this result on a DSA switch:
    
    [  421.306399] br0: port 1(swp2) entered blocking state
    [  421.311445] br0: port 1(swp2) entered disabled state
    [  421.472553] device swp2 entered promiscuous mode
    [  421.488986] device swp2 left promiscuous mode
    [  421.493508] br0: port 1(swp2) entered disabled state
    [  421.886107] sja1105 spi0.1: port 1 failed to delete 00:01:02:03:04:05 vid 1 from fdb: -ENOENT
    [  421.894374] sja1105 spi0.1: port 1 failed to delete 00:01:02:03:04:05 vid 0 from fdb: -ENOENT
    [  421.943982] br0: port 1(swp2) entered blocking state
    [  421.949030] br0: port 1(swp2) entered disabled state
    [  422.112504] device swp2 entered promiscuous mode
    
    A very simplified view of what happens is:
    
    (1) the bridge port is created, and the bridge device inherits its MAC
        address
    
    (2) when joining, the bridge port (DSA) requests a replay of the
        addition of all FDB entries towards this bridge port and towards the
        bridge device itself. In fact, DSA calls br_fdb_replay() twice:
    
    	br_fdb_replay(br, brport_dev);
    	br_fdb_replay(br, br);
    
        DSA uses reference counting for the FDB entries. So the MAC address
        of the bridge is simply kept with refcount 2. When the bridge port
        leaves under normal circumstances, everything cancels out since the
        replay of the FDB entry deletion is also done twice per VLAN.
    
    (3) when the bridge MAC address changes, switchdev is notified of the
        deletion of the old address and of the insertion of the new one.
        But the old address does not really go away, since it had refcount
        2, and the new address is added "only" with refcount 1.
    
    (4) when the bridge port leaves now, it will replay a deletion of the
        FDB entries pointing towards the bridge twice. Then DSA will
        complain that it can't delete something that no longer exists.
    
    It is clear that the problem is that the FDB entries towards the bridge
    are replayed too many times, so let's fix that problem.
    
    Fixes: 63c51453 ("net: dsa: replay the local bridge FDB entries pointing to the bridge dev too")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://lore.kernel.org/r/20210719093916.4099032-1-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>