    net/mlx5: Fix FTE cleanup · cefc2355
    Maor Gottlieb authored
    Currently, when an FTE is allocated, its refcount is decreased to 0
    so that it is not a stand-alone steering object; instead, every rule
    (destination) of the FTE increases the refcount.
    When mlx5_cleanup_fs is called before the steering users have deleted
    all of their rules, a refcount underflow is hit on the FTE: clean_tree
    calls tree_remove_node after the deleted rules have already decreased
    the refcount to 0.
    
    The FTE is no longer destroyed implicitly when the last rule
    (destination) is deleted. mlx5_del_flow_rules avoids that by increasing
    the refcount on the FTE and destroying it explicitly after all rules
    have been deleted. So we can avoid the refcount underflow by making the
    FTE a stand-alone object. In addition, del_hw_func needs to be set on
    the FTE so that the HW object is destroyed when the FTE is deleted from
    the cleanup_tree flow.
    
    refcount_t: underflow; use-after-free.
    WARNING: CPU: 2 PID: 15715 at lib/refcount.c:28 refcount_warn_saturate+0xd9/0xe0
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
    Call Trace:
     tree_put_node+0xf2/0x140 [mlx5_core]
     clean_tree+0x4e/0xf0 [mlx5_core]
     clean_tree+0x4e/0xf0 [mlx5_core]
     clean_tree+0x4e/0xf0 [mlx5_core]
     clean_tree+0x5f/0xf0 [mlx5_core]
     clean_tree+0x4e/0xf0 [mlx5_core]
     clean_tree+0x5f/0xf0 [mlx5_core]
     mlx5_cleanup_fs+0x26/0x270 [mlx5_core]
     mlx5_unload+0x2e/0xa0 [mlx5_core]
     mlx5_unload_one+0x51/0x120 [mlx5_core]
     mlx5_devlink_reload_down+0x51/0x90 [mlx5_core]
     devlink_reload+0x39/0x120
     ? devlink_nl_cmd_reload+0x43/0x220
     genl_rcv_msg+0x1e4/0x420
     ? genl_family_rcv_msg_attrs_parse+0x100/0x100
     netlink_rcv_skb+0x47/0x110
     genl_rcv+0x24/0x40
     netlink_unicast+0x217/0x2f0
     netlink_sendmsg+0x30f/0x430
     sock_sendmsg+0x30/0x40
     __sys_sendto+0x10e/0x140
     ? handle_mm_fault+0xc4/0x1f0
     ? do_page_fault+0x33f/0x630
     __x64_sys_sendto+0x24/0x30
     do_syscall_64+0x48/0x130
     entry_SYSCALL_64_after_hwframe+0x44/0xa9
    
    Fixes: 718ce4d6 ("net/mlx5: Consolidate update FTE for all removal changes")
    Fixes: bd71b08e ("net/mlx5: Support multiple updates of steering rules in parallel")
    Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
    Reviewed-by: Mark Bloch <mbloch@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
fs_core.c 82.9 KB