1. 18 Mar, 2022 39 commits
  2. 17 Mar, 2022 1 commit
    • Jakub Kicinski's avatar
      Merge branch 'net-bridge-multiple-spanning-trees' · 82e94d41
      Jakub Kicinski authored
      Tobias Waldekranz says:
      
      ====================
      net: bridge: Multiple Spanning Trees
      
      The bridge has had per-VLAN STP support for a while now, since:
      
      https://lore.kernel.org/netdev/20200124114022.10883-1-nikolay@cumulusnetworks.com/
      
      The current implementation has some problems:
      
      - The mapping from VLAN to STP state is fixed as 1:1, i.e. each VLAN
        is managed independently. This is awkward from an MSTP (802.1Q-2018,
        Clause 13.5) point of view, where the model is that multiple VLANs
        are grouped into MST instances.
      
        Because of the way that the standard is written, presumably, this is
        also reflected in hardware implementations. It is not uncommon for a
        switch to support the full 4k range of VIDs, but that the pool of
        MST instances is much smaller. Some examples:
      
        Marvell LinkStreet (mv88e6xxx): 4k VLANs, but only 64 MSTIs
        Marvell Prestera: 4k VLANs, but only 128 MSTIs
        Microchip SparX-5i: 4k VLANs, but only 128 MSTIs
      
      - By default, the feature is enabled, and there is no way to disable
        it. This makes it hard to add offloading in a backwards compatible
        way, since any underlying switchdevs have no way to refuse the
        function if the hardware does not support it
      
      - The port-global STP state has precedence over per-VLAN states. In
        MSTP, as far as I understand it, all VLANs will use the common
        spanning tree (CST) by default - through traffic engineering you can
        then optimize your network to group subsets of VLANs to use
        different trees (MSTI). To my understanding, the way this is
        typically managed in silicon is roughly:
      
        Incoming packet:
        .----.----.--------------.----.-------------
        | DA | SA | 802.1Q VID=X | ET | Payload ...
        '----'----'--------------'----'-------------
                              |
                              '->|\     .----------------------------.
                                 | +--> | VID | Members | ... | MSTI |
                         PVID -->|/     |-----|---------|-----|------|
                                        |   1 | 0001001 | ... |    0 |
                                        |   2 | 0001010 | ... |   10 |
                                        |   3 | 0001100 | ... |   10 |
                                        '----------------------------'
                                                                   |
                                     .-----------------------------'
                                     |  .------------------------.
                                     '->| MSTI | Fwding | Lrning |
                                        |------|--------|--------|
                                        |    0 | 111110 | 111110 |
                                        |   10 | 110111 | 110111 |
                                        '------------------------'
      
        What this is trying to show is that the STP state (whether MSTP is
        used, or ye olde STP) is always accessed via the VLAN table. If STP
        is running, all MSTI pointers in that table will reference the same
        index in the STP stable - if MSTP is running, some VLANs may point
        to other trees (like in this example).
      
        The fact that in the Linux bridge, the global state (think: index 0
        in most hardware implementations) is supposed to override the
        per-VLAN state, is very awkward to offload. In effect, this means
        that when the global state changes to blocking, drivers will have to
        iterate over all MSTIs in use, and alter them all to match. This
        also means that you have to cache whether the hardware state is
        currently tracking the global state or the per-VLAN state. In the
        first case, you also have to cache the per-VLAN state so that you
        can restore it if the global state transitions back to forwarding.
      
      This series adds a new mst_enable bridge setting (as suggested by Nik)
      that can only be changed when no VLANs are configured on the
      bridge. Enabling this mode has the following effect:
      
      - The port-global STP state is used to represent the CST (Common
        Spanning Tree) (1/15)
      
      - Ingress STP filtering is deferred until the frame's VLAN has been
        resolved (1/15)
      
      - The preexisting per-VLAN states can no longer be controlled directly
        (1/15). They are instead placed under the MST module's control,
        which is managed using a new netlink interface (described in 3/15)
      
      - VLANs can br mapped to MSTIs in an arbitrary M:N fashion, using a
        new global VLAN option (2/15)
      
      Switchdev notifications are added so that a driver can track:
      - MST enabled state
      - VID to MSTI mappings
      - MST port states
      
      An offloading implementation is this provided for mv88e6xxx.
      ====================
      
      Link: https://lore.kernel.org/r/20220316150857.2442916-1-tobias@waldekranz.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      82e94d41