    net/mlx5e: Optimize modulo in mlx5e_select_queue · 3a9e5fff
    Maxim Mikityanskiy authored
    To improve the performance of the modulo operation (%), it's replaced by
    subtracting the divisor in a loop. The modulo is used to fix up an
    out-of-bounds value that might be returned by netdev_pick_tx or to
    convert the queue number to the channel number when num_tcs > 1. Both
    situations are unlikely, because XPS is configured not to pick higher
    queues (qid >= num_channels) by default, so under normal circumstances
    the flow won't go inside the loop, and it will be faster than %.
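
As a rough illustration of the idea above, here is a minimal userspace sketch. It is not the driver's code: the helper name `wrap_by_subtraction` and the use of `__builtin_expect` (a GCC/Clang builtin, standing in for the kernel's `unlikely()`) are assumptions made for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper (name chosen for this sketch): reduce qid into the
 * range [0, num_channels) by repeated subtraction instead of '%'.
 * num_channels must be non-zero, otherwise the loop never terminates.
 */
static inline uint16_t wrap_by_subtraction(uint16_t qid, uint16_t num_channels)
{
	/* Common case: qid < num_channels, so the loop body never runs
	 * and no division instruction is executed. */
	while (__builtin_expect(qid >= num_channels, 0))
		qid -= num_channels;
	return qid;
}
```

For inputs already in range the function costs a single compare and a branch that is hinted as not taken. Each multiple of `num_channels` above that costs one extra iteration.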
    
    num_tcs == 8 adds at most 7 iterations to the loop. PTP adds at most 1
    iteration to the loop. HTB would add at most 256 iterations (when
    num_channels == 1), so there is an additional boundary check in the HTB
    flow, which falls back to % if more than 7 iterations are expected.
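
The bounded variant described above could look roughly like the following userspace sketch. The helper name `wrap_bounded` is hypothetical, and the threshold is written as `8 * num_channels` on the assumption that "more than 7 iterations" means `qid >= 8 * num_channels`. The actual driver code may differ.

```c
#include <stdint.h>

/*
 * Hypothetical helper (name chosen for this sketch): like repeated
 * subtraction, but falls back to '%' when more than 7 subtractions
 * would be needed. num_channels must be non-zero.
 */
static inline uint16_t wrap_bounded(uint16_t qid, uint16_t num_channels)
{
	if (qid >= num_channels) {
		/* qid >= 8 * num_channels would take 8 or more iterations,
		 * so use one modulo instead. The comparison is widened to
		 * 32 bits to avoid overflow in the multiplication. */
		if ((uint32_t)qid >= (uint32_t)num_channels * 8) {
			qid %= num_channels;
		} else {
			/* At most 7 iterations on this path. */
			do {
				qid -= num_channels;
			} while (qid >= num_channels);
		}
	}
	return qid;
}
```

With `num_channels == 1`, an input of 7 takes the subtraction path (7 iterations), while an input of 256 takes the modulo path.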
    
    Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
    Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
selq.h 1.3 KB