1. 01 Jun, 2012 5 commits
    • Eric Dumazet's avatar
      tcp: reflect SYN queue_mapping into SYNACK packets · fff32699
      Eric Dumazet authored
      While testing how linux behaves on SYNFLOOD attack on multiqueue device
      (ixgbe), I found that SYNACK messages were dropped at Qdisc level
      because we send them all on a single queue.
      
      Obvious choice is to reflect incoming SYN packet @queue_mapping to
      SYNACK packet.
      
      Under stress, my machine could only send 25.000 SYNACK per second (for
      200.000 incoming SYN per second). NIC : ixgbe with 16 rx/tx queues.
      
      After patch, not a single SYNACK is dropped.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fff32699
    • Eric Dumazet's avatar
      tcp: do not create inetpeer on SYNACK message · 7433819a
      Eric Dumazet authored
      Another problem on SYNFLOOD/DDOS attack is the inetpeer cache getting
      larger and larger, using lots of memory and cpu time.
      
      tcp_v4_send_synack()
      ->inet_csk_route_req()
       ->ip_route_output_flow()
        ->rt_set_nexthop()
         ->rt_init_metrics()
          ->inet_getpeer( create = true)
      
      This is a side effect of commit a4daad6b (net: Pre-COW metrics for
      TCP) added in 2.6.39
      
      Possible solution :
      
      Instruct inet_csk_route_req() to remove FLOWI_FLAG_PRECOW_METRICS
      
      Before patch :
      
      # grep peer /proc/slabinfo
      inet_peer_cache   4175430 4175430    192   42    2 : tunables    0    0    0 : slabdata  99415  99415      0
      
      Samples: 41K of event 'cycles', Event count (approx.): 30716565122
      +  20,24%      ksoftirqd/0  [kernel.kallsyms]           [k] inet_getpeer
      +   8,19%      ksoftirqd/0  [kernel.kallsyms]           [k] peer_avl_rebalance.isra.1
      +   4,81%      ksoftirqd/0  [kernel.kallsyms]           [k] sha_transform
      +   3,64%      ksoftirqd/0  [kernel.kallsyms]           [k] fib_table_lookup
      +   2,36%      ksoftirqd/0  [ixgbe]                     [k] ixgbe_poll
      +   2,16%      ksoftirqd/0  [kernel.kallsyms]           [k] __ip_route_output_key
      +   2,11%      ksoftirqd/0  [kernel.kallsyms]           [k] kernel_map_pages
      +   2,11%      ksoftirqd/0  [kernel.kallsyms]           [k] ip_route_input_common
      +   2,01%      ksoftirqd/0  [kernel.kallsyms]           [k] __inet_lookup_established
      +   1,83%      ksoftirqd/0  [kernel.kallsyms]           [k] md5_transform
      +   1,75%      ksoftirqd/0  [kernel.kallsyms]           [k] check_leaf.isra.9
      +   1,49%      ksoftirqd/0  [kernel.kallsyms]           [k] ipt_do_table
      +   1,46%      ksoftirqd/0  [kernel.kallsyms]           [k] hrtimer_interrupt
      +   1,45%      ksoftirqd/0  [kernel.kallsyms]           [k] kmem_cache_alloc
      +   1,29%      ksoftirqd/0  [kernel.kallsyms]           [k] inet_csk_search_req
      +   1,29%      ksoftirqd/0  [kernel.kallsyms]           [k] __netif_receive_skb
      +   1,16%      ksoftirqd/0  [kernel.kallsyms]           [k] copy_user_generic_string
      +   1,15%      ksoftirqd/0  [kernel.kallsyms]           [k] kmem_cache_free
      +   1,02%      ksoftirqd/0  [kernel.kallsyms]           [k] tcp_make_synack
      +   0,93%      ksoftirqd/0  [kernel.kallsyms]           [k] _raw_spin_lock_bh
      +   0,87%      ksoftirqd/0  [kernel.kallsyms]           [k] __call_rcu
      +   0,84%      ksoftirqd/0  [kernel.kallsyms]           [k] rt_garbage_collect
      +   0,84%      ksoftirqd/0  [kernel.kallsyms]           [k] fib_rules_lookup
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7433819a
    • Jason Wang's avatar
      8139cp/8139too: terminate the eeprom access with the right opmode · 0bc777bc
      Jason Wang authored
      Currently, we terminate the eeprom access through clearing the CS by:
      
      RTL_W8 (Cfg9346, ~EE_CS); or writeb (~EE_CS, ee_addr);
      
      This would left the eeprom into "Config. Register Write Enable:"
      state which is not expcted as the highest two bits were set to
      0x11 ( expected is the "Normal" mode (0x00)). Solving this by write
      0x0 instead of ~EE_CS when terminating the eeprom access.
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0bc777bc
    • Jason Wang's avatar
      8139cp: set ring address before enabling receiver · b01af457
      Jason Wang authored
      Currently, we enable the receiver before setting the ring address which could
      lead the card DMA into unexpected areas. Solving this by set the ring address
      before enabling the receiver.
      
      btw. I find and test this in qemu as I didn't have a 8139cp card in hand. please
      review it carefully.
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b01af457
    • Paul Moore's avatar
      cipso: handle CIPSO options correctly when NetLabel is disabled · 20e2a864
      Paul Moore authored
      When NetLabel is not enabled, e.g. CONFIG_NETLABEL=n, and the system
      receives a CIPSO tagged packet it is dropped (cipso_v4_validate()
      returns non-zero).  In most cases this is the correct and desired
      behavior, however, in the case where we are simply forwarding the
      traffic, e.g. acting as a network bridge, this becomes a problem.
      
      This patch fixes the forwarding problem by providing the basic CIPSO
      validation code directly in ip_options_compile() without the need for
      the NetLabel or CIPSO code.  The new validation code can not perform
      any of the CIPSO option label/value verification that
      cipso_v4_validate() does, but it can verify the basic CIPSO option
      format.
      
      The behavior when NetLabel is enabled is unchanged.
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      20e2a864
  2. 31 May, 2012 35 commits