1. 14 Jan, 2013 9 commits
  2. 12 Jan, 2013 17 commits
  3. 11 Jan, 2013 9 commits
  4. 10 Jan, 2013 5 commits
    • softirq: reduce latencies · c10d7367
      Eric Dumazet authored
      In various network workloads, __do_softirq() latencies can be up
      to 20 ms if HZ=1000, and 200 ms if HZ=100.
      
      This is because we iterate 10 times in the softirq dispatcher,
      and some actions can consume a lot of cycles.
      
      This patch changes the condition for falling back to ksoftirqd to:
      
      - A time limit of 2 ms.
      - need_resched() being set on current task
      
      When either of these conditions is met, we wake up ksoftirqd for further
      softirq processing if we still have pending softirqs.
      
      Using need_resched() as the only condition can trigger RCU stalls,
      as we can keep BH disabled for too long.
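
The two exit conditions above can be sketched as a small user-space simulation. This is an illustrative model, not the kernel code: `struct sim_cpu`, `run_softirqs()` and the microsecond clock are hypothetical stand-ins, and only the 2 ms budget and the need_resched check come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2 ms budget from the commit message, expressed in microseconds. */
#define SOFTIRQ_BUDGET_US 2000u

/* Hypothetical stand-in for per-CPU state; not a kernel structure. */
struct sim_cpu {
    uint64_t now_us;        /* simulated clock */
    uint64_t round_cost_us; /* cost of one pass over the pending softirqs */
    unsigned pending;       /* passes of work still queued */
    bool need_resched;      /* scheduler wants the CPU back */
    bool ksoftirqd_woken;   /* set when the remaining work is deferred */
};

/* Handle pending work inline until it runs out, the time budget is
 * spent, or a reschedule is requested. Returns the passes run inline. */
static unsigned run_softirqs(struct sim_cpu *cpu)
{
    const uint64_t end = cpu->now_us + SOFTIRQ_BUDGET_US;
    unsigned passes = 0;

    while (cpu->pending > 0) {
        cpu->pending--;
        cpu->now_us += cpu->round_cost_us;
        passes++;

        if (cpu->pending == 0)
            break;                       /* nothing left to defer */
        if (cpu->now_us >= end || cpu->need_resched) {
            cpu->ksoftirqd_woken = true; /* hand the rest to the thread */
            break;
        }
    }
    return passes;
}
```

In this model, ten queued passes of 500 us each result in four passes handled inline and six deferred, since the budget is exhausted once the simulated clock reaches 2000 us.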
      
      I ran several benchmarks and got no significant difference in
      throughput, but a very significant reduction of latencies (one order
      of magnitude) :
      
      In the following benchmark, 200 antagonist "netperf -t TCP_RR" instances are
      started in the background, using all available CPUs.
      
      Then we start one "netperf -t TCP_RR", bound to the CPU handling the NIC
      IRQ (hard+soft).
      
      Before patch :
      
      # netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
      RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
      MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
      to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
      RT_LATENCY=550110.424
      MIN_LATENCY=146858
      MAX_LATENCY=997109
      P50_LATENCY=305000
      P90_LATENCY=550000
      P99_LATENCY=710000
      MEAN_LATENCY=376989.12
      STDDEV_LATENCY=184046.92
      
      After patch :
      
      # netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
      RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
      MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
      to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
      RT_LATENCY=40545.492
      MIN_LATENCY=9834
      MAX_LATENCY=78366
      P50_LATENCY=33583
      P90_LATENCY=59000
      P99_LATENCY=69000
      MEAN_LATENCY=38364.67
      STDDEV_LATENCY=12865.26
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: more precise pkt_len computation · 1def9238
      Eric Dumazet authored
      One long-standing problem with TSO/GSO/GRO packets is that skb->len
      doesn't represent the precise number of bytes on the wire.
      
      Headers are only accounted for in the first segment.
      For TCP, that's typically 66 bytes missing per 1448-byte segment,
      an error of about 4.5% for a normal MSS value.
      
      As a consequence:
      
      1) TBF/CBQ/HTB/NETEM/... can send more bytes than the assigned limits.
      2) Device stats are slightly underestimated as well.
      
      Fix this by taking account of headers in qdisc_skb_cb(skb)->pkt_len
      computation.
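
The correction described above amounts to adding one extra copy of the headers for every segment after the first. A minimal sketch of that arithmetic follows; `struct gso_pkt` and `wire_len()` are hypothetical names for illustration and deliberately do not mirror the kernel's sk_buff layout.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a GSO packet; not the kernel's sk_buff. */
struct gso_pkt {
    uint32_t len;      /* like skb->len: headers counted once, plus all payload */
    uint32_t hdr_len;  /* MAC + network + transport header bytes */
    uint32_t gso_segs; /* segments the packet will be split into; 0 or 1 if not GSO */
};

/* Estimate bytes on the wire: every segment after the first carries
 * its own copy of the headers, which len does not count. */
static uint32_t wire_len(const struct gso_pkt *p)
{
    if (p->gso_segs <= 1)
        return p->len;
    return p->len + (p->gso_segs - 1) * p->hdr_len;
}
```

For example, a packet of 45 segments with 1448 payload bytes each and 66 header bytes has len = 66 + 45 * 1448 = 65226, while the estimate is 65226 + 44 * 66 = 68130, which equals 45 full frames of 1514 bytes.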
      
      Packet schedulers should use qdisc pkt_len instead of skb->len for their
      bandwidth limitations, and TSO-enabled device drivers could use pkt_len
      if their statistics are not hardware assisted, and if they don't scratch
      the first word of skb->cb[].
      
      Both egress and ingress paths work, thanks to commit fda55eca
      (net: introduce skb_transport_header_was_set()) : If GRO built
      a GSO packet, it also set the transport header for us.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Paolo Valente <paolo.valente@unimore.it>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • doc: Clarify behavior when sysctl tcp_ecn = 1 · 3d55b323
      Vijay Subramanian authored
      A recent commit (7e3a2dc5, "doc: make the description of how tcp_ecn
      works more explicit and clear") clarified the behavior of the tcp_ecn sysctl
      variable, but the description is inconsistent. When requested by incoming
      connections, ECN is enabled not just with tcp_ecn = 2 but also with tcp_ecn = 1.
      
      This patch makes it clear that with tcp_ecn = 1, ECN is enabled when requested
      by incoming connections.
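
The behavior being documented can be summarized as two predicates over the sysctl value. This is a sketch only: the function names are hypothetical, and the outgoing-side rule (ECN is requested only when tcp_ecn = 1) is taken from the kernel's ip-sysctl documentation rather than from this commit message.

```c
#include <stdbool.h>

/* Is ECN enabled when an incoming connection requests it?
 * True for tcp_ecn = 1 and tcp_ecn = 2, false for tcp_ecn = 0. */
static bool ecn_accept_incoming(int tcp_ecn)
{
    return tcp_ecn == 1 || tcp_ecn == 2;
}

/* Do we request ECN on outgoing connection attempts?
 * Assumption from ip-sysctl documentation: only when tcp_ecn = 1. */
static bool ecn_request_outgoing(int tcp_ecn)
{
    return tcp_ecn == 1;
}
```

The point of the documentation fix is the first predicate: values 1 and 2 differ only on the outgoing side.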
      
      Also fix spelling of 'incoming'.
      Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: align define usage to satisfy static checkers · 23826850
      Ariel Elior authored
      Static checkers complained that the E1H_FUNC_MAX define is used
      incorrectly in bnx2x_pretend_func(). The complaint was justified,
      although it's not a real bug, as the first part of the conditional
      protects us in this case (a real bug would happen if a VF tried to
      use the pretend func, but there are no VFs in E1H chips).
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Ariel Elior <ariele@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • veth: fix a NULL deref in netif_carrier_off · 2efd32ee
      Eric Dumazet authored
      Since commit d0e2c55e (veth: avoid a NULL deref in veth_stats_one),
      we clear the peer pointers in veth_dellink().
      
      veth_close() must therefore check that the peer pointer is set before using it.
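
The shape of such a guard can be shown with a small user-space model. `struct sim_dev`, `sim_carrier_off()` and `sim_veth_close()` are hypothetical stand-ins for illustration; they are not the kernel's net_device or the actual veth functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a network device; not the kernel's net_device. */
struct sim_dev {
    bool carrier_on;
    struct sim_dev *peer; /* NULL once the link has been torn down */
};

static void sim_carrier_off(struct sim_dev *dev)
{
    dev->carrier_on = false;
}

/* Close one end of the pair: always drop our own carrier, and drop
 * the peer's only if the peer pointer is still set. */
static void sim_veth_close(struct sim_dev *dev)
{
    sim_carrier_off(dev);
    if (dev->peer)
        sim_carrier_off(dev->peer);
}
```

Without the check, closing a device whose peer pointer was already cleared would dereference NULL.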
      Reported-by: Tom Parkin <tom.parkin@gmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>