    net: speedup dst_release() · ef711cf1
    Eric Dumazet authored
    During tbench/oprofile sessions, I found that dst_release() was in third position.
    
    CPU: Core 2, speed 2999.68 MHz (estimated)
    Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
    samples  %        symbol name
    483726    9.0185  __copy_user_zeroing_intel
    191466    3.5697  __copy_user_intel
    185475    3.4580  dst_release
    175114    3.2648  ip_queue_xmit
    153447    2.8608  tcp_sendmsg
    108775    2.0280  tcp_recvmsg
    102659    1.9140  sysenter_past_esp
    101450    1.8914  tcp_current_mss
    95067     1.7724  __copy_from_user_ll
    86531     1.6133  tcp_transmit_skb
    
    Of course, all CPUs fight over the dst_entry associated with 127.0.0.1.
    
    Instead of first checking the refcount value and then decrementing it,
    we use atomic_dec_return() to help the CPU issue the right memory transaction
    (i.e. getting the cache line in exclusive mode).
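
The two orderings can be sketched in userspace with C11 atomics. This is only an illustration, not the kernel code: struct entry and both function names are hypothetical, and atomic_fetch_sub() minus one stands in for the kernel's atomic_dec_return().

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct dst_entry; only the refcount matters. */
struct entry {
    atomic_int refcnt;
};

/*
 * Check first, then decrement: the load may bring the cache line in
 * shared state, and the following decrement must then upgrade it to
 * exclusive, costing a second memory transaction under contention.
 * Returns nonzero if the count was already below 1 before the release.
 */
int release_check_then_dec(struct entry *e)
{
    int bad = atomic_load(&e->refcnt) < 1;
    atomic_fetch_sub(&e->refcnt, 1);
    return bad;
}

/*
 * Decrement first, then check the returned value: a single
 * read-modify-write requests the line in exclusive mode up front.
 * Returns nonzero if the count dropped below 0 (an unbalanced release).
 */
int release_dec_then_check(struct entry *e)
{
    int newcnt = atomic_fetch_sub(&e->refcnt, 1) - 1;
    return newcnt < 0;
}
```

Both variants detect an unbalanced release; the second simply does so after the decrement, using the value the atomic operation already returned.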
    
    dst_release() is now in fifth position, and tbench is a little bit faster ;)
    
    CPU: Core 2, speed 3000.1 MHz (estimated)
    Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
    samples  %        symbol name
    647107    8.8072  __copy_user_zeroing_intel
    258840    3.5229  ip_queue_xmit
    258302    3.5155  __copy_user_intel
    209629    2.8531  tcp_sendmsg
    165632    2.2543  dst_release
    149232    2.0311  tcp_current_mss
    147821    2.0119  tcp_recvmsg
    137893    1.8767  sysenter_past_esp
    127473    1.7349  __copy_from_user_ll
    121308    1.6510  ip_finish_output
    118510    1.6129  tcp_transmit_skb
    109295    1.4875  tcp_v4_rcv
    Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
dst.c 8.07 KB