    sh_eth: RX checksum offload support · f8e022db
    Sergei Shtylyov authored
    Add support for the RX checksum offload. This is enabled by default and
    may be disabled and re-enabled using 'ethtool':
    
    # ethtool -K eth0 rx off
    # ethtool -K eth0 rx on
    
    Some Ether MACs provide a simple checksumming scheme which appears to be
    completely compatible with CHECKSUM_COMPLETE: the sum of all packet data
    after the L2 header is appended to the packet data; the driver can
    trivially read it and use it to update the skb accordingly. The same
    checksumming scheme is implemented in the EtherAVB MACs and is now
    supported by the 'ravb' driver.
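
The scheme described above can be modelled in plain userspace C. This is a minimal sketch and not the driver code: the helper names (`csum16`, `hw_append_csum`, `rx_take_csum`) are hypothetical, the 14-byte L2 header assumes an untagged Ethernet frame, and the byte order of the appended 16-bit word is an assumption of this model. It only illustrates the idea that the MAC appends a sum over everything after the L2 header, and the receive path reads that trailer and trims it off.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN 14 /* L2 header length; assumes an untagged frame */

/* 16-bit ones' complement sum over buf[0..len), taken as big-endian words. */
static uint16_t csum16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)
		sum += (uint32_t)buf[i] << 8; /* odd trailing byte, zero padded */
	while (sum >> 16) /* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Model of the MAC (hypothetical): sum the data after the L2 header and
 * append the result to the frame. The buffer needs 2 spare bytes.
 * Returns the new frame length.
 */
static size_t hw_append_csum(uint8_t *buf, size_t len)
{
	uint16_t sum;

	assert(len >= ETH_HLEN);
	sum = csum16(buf + ETH_HLEN, len - ETH_HLEN);
	buf[len] = (uint8_t)(sum >> 8);       /* byte order is assumed */
	buf[len + 1] = (uint8_t)(sum & 0xff);
	return len + 2;
}

/*
 * Model of the RX path (hypothetical): read the trailing 2 bytes as the
 * checksum and return the frame length with the trailer trimmed off.
 */
static size_t rx_take_csum(const uint8_t *buf, size_t len, uint16_t *csum)
{
	assert(len >= 2);
	*csum = (uint16_t)((buf[len - 2] << 8) | buf[len - 1]);
	return len - 2;
}
```

In a real driver the value read from the trailer would be stored in the skb together with the CHECKSUM_COMPLETE marking, so the stack does not have to walk the packet data again.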
    
    In terms of performance, throughput is close to gigabit line rate with the
    RX checksum offload both enabled and disabled.  The 'perf' output, however,
    appears to indicate that significantly less time is spent in do_csum() --
    this is as expected.
    
    Test results with RX checksum offload enabled:
    
    ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4
    TCP MAERTS TEST to 192.168.2.4
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec
    
    131072  16384  16384    10.01     933.93
    [ perf record: Woken up 8 times to write data ]
    [ perf record: Captured and wrote 1.955 MB perf.data (41940 samples) ]
    ~/netperf-2.2pl4# perf report
    Samples: 41K of event 'cycles:ppp', Event count (approx.): 9915302763
    Overhead  Command          Shared Object             Symbol
       9.44%  netperf          [kernel.kallsyms]         [k] __arch_copy_to_user
       7.75%  swapper          [kernel.kallsyms]         [k] _raw_spin_unlock_irq
       6.31%  swapper          [kernel.kallsyms]         [k] default_idle_call
       5.89%  swapper          [kernel.kallsyms]         [k] arch_cpu_idle
       4.37%  swapper          [kernel.kallsyms]         [k] tick_nohz_idle_exit
       4.02%  netperf          [kernel.kallsyms]         [k] _raw_spin_unlock_irq
       2.52%  netperf          [kernel.kallsyms]         [k] preempt_count_sub
       1.81%  netperf          [kernel.kallsyms]         [k] tcp_recvmsg
       1.80%  netperf          [kernel.kallsyms]         [k] _raw_spin_unlock_irqres
       1.78%  netperf          [kernel.kallsyms]         [k] preempt_count_add
       1.36%  netperf          [kernel.kallsyms]         [k] __tcp_transmit_skb
       1.20%  netperf          [kernel.kallsyms]         [k] __local_bh_enable_ip
       1.10%  netperf          [kernel.kallsyms]         [k] sh_eth_start_xmit
    
    Test results with RX checksum offload disabled:
    
    ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4
    TCP MAERTS TEST to 192.168.2.4
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec
    131072  16384  16384    10.01     932.04
    [ perf record: Woken up 14 times to write data ]
    [ perf record: Captured and wrote 3.642 MB perf.data (78817 samples) ]
    ~/netperf-2.2pl4# perf report
    Samples: 78K of event 'cycles:ppp', Event count (approx.): 18091442796
    Overhead  Command          Shared Object       Symbol
       7.00%  swapper          [kernel.kallsyms]   [k] do_csum
       3.94%  swapper          [kernel.kallsyms]   [k] sh_eth_poll
       3.83%  ksoftirqd/0      [kernel.kallsyms]   [k] do_csum
       3.23%  swapper          [kernel.kallsyms]   [k] _raw_spin_unlock_irq
       2.87%  netperf          [kernel.kallsyms]   [k] __arch_copy_to_user
       2.86%  swapper          [kernel.kallsyms]   [k] arch_cpu_idle
       2.13%  swapper          [kernel.kallsyms]   [k] default_idle_call
       2.12%  ksoftirqd/0      [kernel.kallsyms]   [k] sh_eth_poll
       2.02%  swapper          [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore
       1.84%  swapper          [kernel.kallsyms]   [k] __softirqentry_text_start
       1.64%  swapper          [kernel.kallsyms]   [k] tick_nohz_idle_exit
       1.53%  netperf          [kernel.kallsyms]   [k] _raw_spin_unlock_irq
       1.32%  netperf          [kernel.kallsyms]   [k] preempt_count_sub
       1.27%  swapper          [kernel.kallsyms]   [k] __pi___inval_dcache_area
       1.22%  swapper          [kernel.kallsyms]   [k] check_preemption_disabled
       1.01%  ksoftirqd/0      [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore
    
    The above results were collected on the R-Car V3H Starter Kit board.
    
    Based on the commit 4d86d381 ("ravb: RX checksum offload")...
    Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>