    net: dev: introduce support for sch BYPASS for lockless qdisc · ba27b4cd
    Paolo Abeni authored
    With commit c5ad119f ("net: sched: pfifo_fast use skb_array")
    pfifo_fast no longer benefits from the TCQ_F_CAN_BYPASS optimization.
    Due to retpolines, the cost of the enqueue()/dequeue() pair has become
    relevant, and we observe a measurable regression in the uncontended
    scenario when the packet rate is below line rate.
    
    After commit 46b1c18f ("net: sched: put back q.qlen into a
    single location") we can check for an empty qdisc with a reasonably
    fast operation, even for nolock qdiscs.
    
    This change extends TCQ_F_CAN_BYPASS support to nolock qdiscs.
    The new chunk of code closely mirrors the existing one for traditional
    qdiscs, leveraging a newly introduced helper to atomically read the
    qdisc length.
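
    The fast path described above can be sketched in user space as
    follows. This is a minimal model under stated assumptions, not the
    kernel code: toy_qdisc, toy_init(), toy_qlen(), toy_run_begin(),
    toy_run_end(), toy_xmit() and the TOY_F_* flags are hypothetical
    stand-ins, and the real transmit and enqueue steps are reduced to
    counters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; they only mimic the roles of the kernel flags. */
#define TOY_F_CAN_BYPASS 0x1u
#define TOY_F_NOLOCK     0x2u

/* Toy stand-in for a lockless qdisc. */
struct toy_qdisc {
    unsigned int flags;
    atomic_int   qlen;        /* packets currently queued */
    atomic_flag  running;     /* set while one CPU owns the transmit path */
    int          direct_xmit; /* packets sent without touching the queue */
    int          enqueued;    /* packets that took the enqueue path */
};

static void toy_init(struct toy_qdisc *q, unsigned int flags)
{
    assert(q != NULL);
    q->flags = flags;
    atomic_init(&q->qlen, 0);
    atomic_flag_clear(&q->running);
    q->direct_xmit = 0;
    q->enqueued = 0;
}

/* Atomic read of the queue length: the "is it empty?" check. */
static int toy_qlen(struct toy_qdisc *q)
{
    return atomic_load(&q->qlen);
}

/* Try to become the single owner of the transmit path. */
static bool toy_run_begin(struct toy_qdisc *q)
{
    return !atomic_flag_test_and_set(&q->running);
}

static void toy_run_end(struct toy_qdisc *q)
{
    atomic_flag_clear(&q->running);
}

/* Returns true when the packet bypassed the queue. */
static bool toy_xmit(struct toy_qdisc *q)
{
    if ((q->flags & TOY_F_CAN_BYPASS) && toy_qlen(q) == 0 &&
        toy_run_begin(q)) {
        /* Empty queue and no other owner: send directly. */
        q->direct_xmit++;
        toy_run_end(q);
        return true;
    }

    /* Otherwise fall back to the regular enqueue path. */
    atomic_fetch_add(&q->qlen, 1);
    q->enqueued++;
    return false;
}
```

    In this model a packet skips the queue only when the bypass flag is
    set, the queue is empty, and no other sender currently owns the
    transmit path; in every other case it is enqueued as before.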
    
    Tested with pktgen in queue xmit mode, with pfifo_fast, a MQ
    device, and MQ root qdisc:
    
    threads         vanilla         patched
                    kpps            kpps
    1               2465            2889
    2               4304            5188
    4               7898            9589
    
    Same as above, but with a single queue device:
    
    threads         vanilla         patched
                    kpps            kpps
    1               2556            2827
    2               2900            2900
    4               5000            5000
    8               4700            4700
    
    No measurable changes in the contended scenarios, and more than 10%
    improvement in the uncontended ones.
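
    The relative gains can be recomputed from the tables above with a
    small helper. This is only an illustrative sketch; gain_pct() is a
    hypothetical name and is not part of the patch.

```c
#include <assert.h>

/* Relative throughput gain of patched over vanilla, in percent. */
static double gain_pct(double vanilla_kpps, double patched_kpps)
{
    assert(vanilla_kpps > 0.0); /* avoid dividing by zero */
    return (patched_kpps - vanilla_kpps) / vanilla_kpps * 100.0;
}
```

    With the figures reported above, gain_pct(2465, 2889) is roughly
    17.2 for one thread on the MQ device, gain_pct(2556, 2827) is
    roughly 10.6 for one thread on the single queue device, and
    gain_pct(2900, 2900) is 0 for the contended two-thread case.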
    
     v1 -> v2:
      - rebased after flag name change
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Tested-by: Ivan Vecera <ivecera@redhat.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Ivan Vecera <ivecera@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>