    net: gro: minor optimization for dev_gro_receive() · de5a1f3c
    Paolo Abeni authored
    While inspecting a perf report, I noticed that the compiler
    emits suboptimal code for the napi CB initialization, fetching
    and storing the memory for the flags bitfield multiple times.
    This is with gcc 10.3.1, but I observed the same with older compiler
    versions.
    
    We can help the compiler do a better job by clearing several
    fields at once through a u32 alias. The generated code is
    smaller (27 bytes in the measurement below), with the same
    number of conditionals.
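
    The idea can be sketched in userspace C as follows. This is only an
    illustration: struct demo_cb, its field names and both reset helpers
    are hypothetical and do not reproduce the layout in gro.h, and a plain
    anonymous union with a uint32_t member stands in for the kernel's
    struct_group() helper. The uint8_t bitfields rely on a GCC/Clang
    extension.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical control block (NOT the real struct napi_gro_cb).
 * The fields that must be cleared on every iteration are grouped in an
 * anonymous struct and overlaid with a 32-bit alias named "zeroed".
 */
struct demo_cb {
	void *frag0;                     /* not part of the zeroed area */
	union {
		struct {
			uint16_t remcsum_start;
			/* first flag byte: 1 + 1 + 1 + 3 + 2 = 8 bits */
			uint8_t  same_flow:1;
			uint8_t  encap_mark:1;
			uint8_t  csum_valid:1;
			uint8_t  csum_cnt:3;
			uint8_t  free:2;
			/* second flag byte: 1 + 1 + 1 + 4 + 1 = 8 bits */
			uint8_t  is_ipv6:1;
			uint8_t  is_fou:1;
			uint8_t  is_atomic:1;
			uint8_t  recursion_counter:4;
			uint8_t  is_flist:1;
		};
		uint32_t zeroed;         /* alias covering the whole group */
	};
};

/* Fail the build if the grouped fields ever outgrow the 32-bit alias. */
static_assert(sizeof(((struct demo_cb *)0)->zeroed) == sizeof(uint32_t),
	      "zeroed alias must be exactly 32 bits");

/* Field-by-field reset: the compiler may emit several read-modify-write
 * sequences on the bytes that hold the bitfields. */
static void demo_cb_reset_fieldwise(struct demo_cb *cb)
{
	cb->remcsum_start = 0;
	cb->same_flow = 0;
	cb->encap_mark = 0;
	cb->csum_valid = 0;
	cb->csum_cnt = 0;
	cb->free = 0;
	cb->is_ipv6 = 0;
	cb->is_fou = 0;
	cb->is_atomic = 0;
	cb->recursion_counter = 0;
	cb->is_flist = 0;
}

/* Aliased reset: a single 32-bit store clears the whole group. */
static void demo_cb_reset_aliased(struct demo_cb *cb)
{
	cb->zeroed = 0;
}
```

    Both helpers leave the grouped fields in the same state; the aliased
    variant just makes the single 32-bit store explicit instead of relying
    on the compiler to merge the bitfield writes. Going through a union
    member also avoids casting the struct address to a u32 pointer.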
    
    Before:
    objdump -t net/core/gro.o | grep " F .text"
    0000000000000bb0 l     F .text	0000000000000357 dev_gro_receive
    
    After:
    0000000000000bb0 l     F .text	000000000000033c dev_gro_receive
    
    v1  -> v2:
     - use struct_group (Alexander and Alex)
    
    RFC -> v1:
     - use __struct_group to delimit the zeroed area (Alexander)
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
gro.h 11.8 KB