    mlx4: avoid unnecessary dirtying of critical fields · dad42c30
    Eric Dumazet authored
    While stressing a 40Gbit mlx4 NIC with busy polling, I found false
    sharing in the mlx4 driver that can be easily avoided.
    
    This patch brings an additional 7% performance improvement in the
    UDP_RR workload.
    
    1) If we received no frame during one mlx4_en_process_rx_cq()
       invocation, there is no need to call mlx4_cq_set_ci() and/or
       dirty ring->cons.
    
    2) Do not refill rx buffers if we have plenty of them.
       This avoids false sharing and allows some bulk/batch optimizations.
       Page allocator and its locks will thank us.
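
To illustrate points 1) and 2), here is a minimal user-space sketch, not the driver code. Every name in it (toy_rx_ring, toy_process_rx, TOY_REFILL_BATCH) is hypothetical, and the batch threshold of 8 is an assumed value chosen for the example. The idea is to work on a local copy of the consumer index, write the shared field only when at least one frame was handled, and refill only once enough buffers are missing to make a batch worthwhile.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed batch threshold: refill only when this many buffers are missing. */
#define TOY_REFILL_BATCH 8

/* Hypothetical, simplified stand-in for the driver's RX ring state. */
struct toy_rx_ring {
    uint32_t cons;        /* consumer index; shared, so writes dirty a cache line */
    uint32_t prod;        /* producer index */
    uint32_t size;        /* number of buffers the ring should hold */
    unsigned ci_updates;  /* counts writes to cons (for the demonstration) */
    unsigned refills;     /* counts refill passes (for the demonstration) */
};

/* True when enough buffers are missing to justify a batched refill. */
static bool toy_refill_needed(const struct toy_rx_ring *ring)
{
    uint32_t posted = ring->prod - ring->cons;  /* unsigned math handles wrap */
    uint32_t missing = ring->size - posted;

    return missing >= TOY_REFILL_BATCH;
}

/* Consume up to `budget` of `frames_available` frames; return frames handled. */
static int toy_process_rx(struct toy_rx_ring *ring, int frames_available,
                          int budget)
{
    uint32_t cons = ring->cons;  /* local copy, no shared write yet */
    int polled = 0;

    assert(budget > 0);

    while (polled < budget && polled < frames_available) {
        cons++;
        polled++;
    }

    /* Point 1: touch the shared consumer index only if work was done. */
    if (polled) {
        ring->cons = cons;
        ring->ci_updates++;
    }

    /* Point 2: skip the refill while plenty of buffers remain posted. */
    if (toy_refill_needed(ring)) {
        ring->prod = ring->cons + ring->size;
        ring->refills++;
    }

    return polled;
}
```

With a full 64-entry ring, an empty poll leaves both counters untouched, and the refill only runs once eight buffers have been consumed.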
    
    Finally, mlx4_en_poll_rx_cq() should not return 0 if it determined
    that the CPU handling the NIC IRQ should be changed. We should
    return budget-1 instead, to not fool net_rx_action() and its
    netdev_budget.
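
The return-value rule can be sketched as a small pure function. This is an illustration under stated assumptions, not the driver's poll routine: toy_poll_return and its on_affine_cpu parameter are hypothetical names. It relies on the NAPI convention that returning the full budget asks to be polled again, while a smaller value means polling stops; the returned value is also what net_rx_action() charges against its overall budget, so reporting 0 after a full batch would undercount the work done.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide what a poll routine reports after handling
 * `done` frames out of `budget`. `on_affine_cpu` says whether the current
 * CPU still matches the IRQ affinity.
 */
static int toy_poll_return(int done, int budget, bool on_affine_cpu)
{
    assert(budget > 0);
    assert(done >= 0 && done <= budget);

    if (done == budget) {
        /* Budget exhausted on the right CPU: ask to be polled again. */
        if (on_affine_cpu)
            return budget;

        /*
         * Wrong CPU: polling must stop here so it can restart elsewhere,
         * but report nearly all the work instead of 0.
         */
        return budget - 1;
    }

    /* Budget not exhausted: report exactly what was handled. */
    return done;
}
```

For a budget of 64, a full batch on the wrong CPU reports 63, which is below the budget (so polling ends) yet still reflects the work performed.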
    
    v2: keep AVG_PERF_COUNTER(... polled) even if polled is 0
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Tariq Toukan <tariqt@mellanox.com>
    Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
en_rx.c 38.8 KB