    net: dsa: flush switchdev workqueue on bridge join error path · 630fd482
    Vladimir Oltean authored
    There is a race between switchdev_bridge_port_offload() and the
    dsa_port_switchdev_sync_attrs() call right below it.
    
    When switchdev_bridge_port_offload() finishes, FDB entries have been
    replayed by the bridge, but are scheduled for deferred execution later.
    
    However, dsa_port_switchdev_sync_attrs() -> dsa_port_can_apply_vlan_filtering()
    may impose restrictions on the vlan_filtering attribute and refuse
    offloading.
    
    When this happens, the delayed FDB entries will dereference dp->bridge,
    which is a NULL pointer because we have stopped the process of
    offloading this bridge.
    
    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    Workqueue: dsa_ordered dsa_slave_switchdev_event_work
    pc : dsa_port_bridge_host_fdb_del+0x64/0x100
    lr : dsa_slave_switchdev_event_work+0x130/0x1bc
    Call trace:
     dsa_port_bridge_host_fdb_del+0x64/0x100
     dsa_slave_switchdev_event_work+0x130/0x1bc
     process_one_work+0x294/0x670
     worker_thread+0x80/0x460
    ---[ end trace 0000000000000000 ]---
    Error: dsa_core: Must first remove VLAN uppers having VIDs also present in bridge.
    
    Fix the bug by doing what we do on the normal bridge leave path as well,
    which is to wait until the deferred FDB entries complete executing, then
    exit.
    
    The placement of dsa_flush_workqueue() after switchdev_bridge_port_unoffload()
    guarantees that both the FDB additions and deletions on rollback are waited for.
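
The ordering argument above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name in it (model_port, model_defer_fdb, model_flush, model_bridge_join) is hypothetical and stands in for the kernel objects, the queue is a plain array drained synchronously rather than a real workqueue, and a NULL dp->bridge is counted instead of dereferenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
struct model_bridge { int id; };
struct model_port { struct model_bridge *bridge; };

#define MODEL_MAX_WORK 8

/* Deferred FDB work items: each one remembers only the port it targets. */
static struct model_port *model_queue[MODEL_MAX_WORK];
static size_t model_queued;

static void model_defer_fdb(struct model_port *dp)
{
	assert(model_queued < MODEL_MAX_WORK);
	model_queue[model_queued++] = dp;
}

/*
 * Run every pending item and empty the queue. Returns how many items found
 * dp->bridge == NULL, i.e. how many would have hit the NULL dereference.
 */
static int model_flush(void)
{
	int stale = 0;
	size_t i;

	for (i = 0; i < model_queued; i++)
		if (!model_queue[i]->bridge)
			stale++;
	model_queued = 0;
	return stale;
}

/* Models the join path; sync_ok == false models the refused attribute sync. */
static int model_bridge_join(struct model_port *dp, struct model_bridge *br,
			     bool sync_ok, bool flush_on_error)
{
	dp->bridge = br;
	model_defer_fdb(dp);	/* offload: replayed FDB addition, deferred */

	if (sync_ok)
		return 0;

	model_defer_fdb(dp);	/* unoffload: FDB deletion on rollback, deferred */
	if (flush_on_error)
		model_flush();	/* wait while dp->bridge is still valid */
	dp->bridge = NULL;	/* offloading abandoned */
	return -1;
}
```

In this model, a failed join with flush_on_error set leaves nothing queued once dp->bridge is cleared. Without the flush, both the addition and the deletion are still pending and later run against a NULL bridge pointer, which is why the flush has to come after the unoffload step rather than before it.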
    
    Fixes: d7d0d423 ("net: dsa: flush switchdev workqueue when leaving the bridge")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://lore.kernel.org/r/20220507134550.1849834-1-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
port.c 39.8 KB