• Ido Schimmel's avatar
    mlxsw: spectrum_acl_tcam: Fix memory leak when canceling rehash work · fb4e2b70
    Ido Schimmel authored
    The rehash delayed work is rescheduled with a delay if the number of
    credits at end of the work is not negative as supposedly it means that
    the migration ended. Otherwise, it is rescheduled immediately.
    
    After "mlxsw: spectrum_acl_tcam: Fix possible use-after-free during
    rehash" the above is no longer accurate as a non-negative number of
    credits is no longer indicative of the migration being done. It can also
    happen if the work encountered an error in which case the migration will
    resume the next time the work is scheduled.
    
    The significance of the above is that it is possible for the work to be
    pending and associated with hints that were allocated when the migration
    started. This leads to the hints being leaked [1] when the work is
    canceled while pending as part of ACL region dismantle.
    
    Fix by freeing the hints if hints are associated with a work that was
    canceled while pending.
    
    Blame the original commit since the reliance on not having a pending
    work associated with hints is fragile.
    
    [1]
    unreferenced object 0xffff88810e7c3000 (size 256):
      comm "kworker/0:16", pid 176, jiffies 4295460353
      hex dump (first 32 bytes):
        00 30 95 11 81 88 ff ff 61 00 00 00 00 00 00 80  .0......a.......
        00 00 61 00 40 00 00 00 00 00 00 00 04 00 00 00  ..a.@...........
      backtrace (crc 2544ddb9):
        [<00000000cf8cfab3>] kmalloc_trace+0x23f/0x2a0
        [<000000004d9a1ad9>] objagg_hints_get+0x42/0x390
        [<000000000b143cf3>] mlxsw_sp_acl_erp_rehash_hints_get+0xca/0x400
        [<0000000059bdb60a>] mlxsw_sp_acl_tcam_vregion_rehash_work+0x868/0x1160
        [<00000000e81fd734>] process_one_work+0x59c/0xf20
        [<00000000ceee9e81>] worker_thread+0x799/0x12c0
        [<00000000bda6fe39>] kthread+0x246/0x300
        [<0000000070056d23>] ret_from_fork+0x34/0x70
        [<00000000dea2b93e>] ret_from_fork_asm+0x1a/0x30
    
    Fixes: c9c9af91 ("mlxsw: spectrum_acl: Allow to interrupt/continue rehash work")
    Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
    Tested-by: default avatarAlexander Zubkov <green@qrator.net>
    Signed-off-by: default avatarPetr Machata <petrm@nvidia.com>
    Reviewed-by: default avatarSimon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/0cc12ebb07c4d4c41a1265ee2c28b392ff997a86.1713797103.git.petrm@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
    fb4e2b70
spectrum_acl_tcam.c 55.1 KB