1. 03 Sep, 2012 2 commits
  2. 02 Sep, 2012 2 commits
  3. 01 Sep, 2012 4 commits
    • David S. Miller's avatar
      Merge branch 'tcp_fastopen_server' · 1bed966c
      David S. Miller authored
      Jerry Chu says:
      
      ====================
      This patch series provides the server (passive open) side code
      for TCP Fast Open. Together with the earlier client side patches
      it completes the TCP Fast Open implementation.
      
      The server side Fast Open code accepts data carried in the SYN
      packet with a valid Fast Open cookie, and passes it to the
      application right away, allowing application to send back response
      data, all before TCP's 3-way handshake finishes.
      
      A simple cookie scheme together with capping the number of
      outstanding TFO requests (still in TCP_SYN_RECV state) to a limit
      per listener forms the main line of defense against spoofed SYN
      attacks.
      
      For more details about TCP Fast Open see our IETF internet draft
      at http://www.ietf.org/id/draft-ietf-tcpm-fastopen-01.txt
      and a research paper at
      http://conferences.sigcomm.org/co-next/2011/papers/1569470463.pdf
      
      A prototype implementation was first developed by Sivasankar
      Radhakrishnan (sivasankar@cs.ucsd.edu).
      
      A patch based on an older version of Linux kernel has been
      undergoing internal tests at Google for the past few months.
      
      Jerry Chu (3):
        tcp: TCP Fast Open Server - header & support functions
        tcp: TCP Fast Open Server - support TFO listeners
        tcp: TCP Fast Open Server - main code path
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1bed966c
    • Jerry Chu's avatar
      tcp: TCP Fast Open Server - main code path · 168a8f58
      Jerry Chu authored
      This patch adds the main processing path to complete the TFO server
      patches.
      
      A TFO request (i.e., SYN+data packet with a TFO cookie option) first
      gets processed in tcp_v4_conn_request(). If it passes the various TFO
      checks by tcp_fastopen_check(), a child socket will be created right
      away to be accepted by applications, rather than waiting for the 3WHS
      to finish.
      
      In additon to the use of TFO cookie, a simple max_qlen based scheme
      is put in place to fend off spoofed TFO attack.
      
      When a valid ACK comes back to tcp_rcv_state_process(), it will cause
      the state of the child socket to switch from either TCP_SYN_RECV to
      TCP_ESTABLISHED, or TCP_FIN_WAIT1 to TCP_FIN_WAIT2. At this time
      retransmission will resume for any unack'ed (data, FIN,...) segments.
      Signed-off-by: default avatarH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      168a8f58
    • Jerry Chu's avatar
      tcp: TCP Fast Open Server - support TFO listeners · 8336886f
      Jerry Chu authored
      This patch builds on top of the previous patch to add the support
      for TFO listeners. This includes -
      
      1. allocating, properly initializing, and managing the per listener
      fastopen_queue structure when TFO is enabled
      
      2. changes to the inet_csk_accept code to support TFO. E.g., the
      request_sock can no longer be freed upon accept(), not until 3WHS
      finishes
      
      3. allowing a TCP_SYN_RECV socket to properly poll() and sendmsg()
      if it's a TFO socket
      
      4. properly closing a TFO listener, and a TFO socket before 3WHS
      finishes
      
      5. supporting TCP_FASTOPEN socket option
      
      6. modifying tcp_check_req() to use to check a TFO socket as well
      as request_sock
      
      7. supporting TCP's TFO cookie option
      
      8. adding a new SYN-ACK retransmit handler to use the timer directly
      off the TFO socket rather than the listener socket. Note that TFO
      server side will not retransmit anything other than SYN-ACK until
      the 3WHS is completed.
      
      The patch also contains an important function
      "reqsk_fastopen_remove()" to manage the somewhat complex relation
      between a listener, its request_sock, and the corresponding child
      socket. See the comment above the function for the detail.
      Signed-off-by: default avatarH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8336886f
    • Jerry Chu's avatar
      tcp: TCP Fast Open Server - header & support functions · 10467163
      Jerry Chu authored
      This patch adds all the necessary data structure and support
      functions to implement TFO server side. It also documents a number
      of flags for the sysctl_tcp_fastopen knob, and adds a few Linux
      extension MIBs.
      
      In addition, it includes the following:
      
      1. a new TCP_FASTOPEN socket option an application must call to
      supply a max backlog allowed in order to enable TFO on its listener.
      
      2. A number of key data structures:
      "fastopen_rsk" in tcp_sock - for a big socket to access its
      request_sock for retransmission and ack processing purpose. It is
      non-NULL iff 3WHS not completed.
      
      "fastopenq" in request_sock_queue - points to a per Fast Open
      listener data structure "fastopen_queue" to keep track of qlen (# of
      outstanding Fast Open requests) and max_qlen, among other things.
      
      "listener" in tcp_request_sock - to point to the original listener
      for book-keeping purpose, i.e., to maintain qlen against max_qlen
      as part of defense against IP spoofing attack.
      
      3. various data structure and functions, many in tcp_fastopen.c, to
      support server side Fast Open cookie operations, including
      /proc/sys/net/ipv4/tcp_fastopen_key to allow manual rekeying.
      Signed-off-by: default avatarH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      10467163
  4. 31 Aug, 2012 26 commits
  5. 30 Aug, 2012 6 commits
    • Merav Sicron's avatar
      bnx2x: Correct the ndo_poll_controller call · 14a15d61
      Merav Sicron authored
      This patch correct poll_bnx2x (ndo_poll_controller call) which was not
      functioning well with MSI-X.
      Signed-off-by: default avatarMerav Sicron <meravs@broadcom.com>
      Signed-off-by: default avatarDmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: default avatarEilon Greenstein <eilong@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      14a15d61
    • Merav Sicron's avatar
      bnx2x: Move netif_napi_add to the open call · 26614ba5
      Merav Sicron authored
      Move netif_napi_add for all queues from the probe call to the open call, to
      avoid the case that napi objects are added for queues that may eventually not
      be initialized and activated. With the former behavior, the driver could crash
      when netpoll was calling ndo_poll_controller.
      Signed-off-by: default avatarMerav Sicron <meravs@broadcom.com>
      Signed-off-by: default avatarDmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: default avatarEilon Greenstein <eilong@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      26614ba5
    • Eric Dumazet's avatar
      ipv4: must use rcu protection while calling fib_lookup · c5ae7d41
      Eric Dumazet authored
      Following lockdep splat was reported by Pavel Roskin :
      
      [ 1570.586223] ===============================
      [ 1570.586225] [ INFO: suspicious RCU usage. ]
      [ 1570.586228] 3.6.0-rc3-wl-main #98 Not tainted
      [ 1570.586229] -------------------------------
      [ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage!
      [ 1570.586233]
      [ 1570.586233] other info that might help us debug this:
      [ 1570.586233]
      [ 1570.586236]
      [ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0
      [ 1570.586238] 2 locks held by Chrome_IOThread/4467:
      [ 1570.586240]  #0:  (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0
      [ 1570.586253]  #1:  (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270
      [ 1570.586260]
      [ 1570.586260] stack backtrace:
      [ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main #98
      [ 1570.586265] Call Trace:
      [ 1570.586271]  [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130
      [ 1570.586275]  [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270
      [ 1570.586278]  [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0
      [ 1570.586282]  [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90
      [ 1570.586285]  [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80
      [ 1570.586290]  [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0
      [ 1570.586293]  [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0
      [ 1570.586296]  [<ffffffff814f2c35>] release_sock+0x55/0xa0
      [ 1570.586300]  [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50
      [ 1570.586305]  [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230
      [ 1570.586308]  [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40
      [ 1570.586312]  [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0
      [ 1570.586315]  [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0
      [ 1570.586320]  [<ffffffff814ec435>] do_sock_write+0xc5/0xe0
      [ 1570.586323]  [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80
      [ 1570.586328]  [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0
      [ 1570.586332]  [<ffffffff8114c5a5>] vfs_write+0x165/0x180
      [ 1570.586335]  [<ffffffff8114c805>] sys_write+0x45/0x90
      [ 1570.586340]  [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarPavel Roskin <proski@gnu.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c5ae7d41
    • Timur Tabi's avatar
      net/fsl_pq_mdio: add support for the Fman 1G MDIO controller · 761743eb
      Timur Tabi authored
      The MDIO controller on the Frame Manager (Fman) is compatible with the
      QE and Gianfar MDIO controllers, but we don't care about the TBI because
      the Ethernet drivers (FMD) take care of programming it.
      Signed-off-by: default avatarTimur Tabi <timur@freescale.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      761743eb
    • Timur Tabi's avatar
      net/fsl-pq-mdio: coalesce multiple memory allocations into one · dd3b8a32
      Timur Tabi authored
      Take advantage of the new mdiobus_alloc_size() function to combine three
      different memory allocations into one.  This also simplies the error
      handling.
      Signed-off-by: default avatarTimur Tabi <timur@freescale.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dd3b8a32
    • Timur Tabi's avatar
      net/fsl_pq_mdio: streamline probing of MDIO nodes · afae5ad7
      Timur Tabi authored
      Make the device tree probe function more data-driven, so that it no longer
      searches the 'compatible' property more than once.  The of_device_id[] array
      allows for per-entry private data, so we use that to store details about each
      type of node that the driver supports.  This removes the need to check the
      'compatible' property inside the probe function.
      
      The driver supports four types on MDIO devices:
      
      1) Gianfar MDIO nodes that only map the MII registers
      2) Gianfar MDIO nodes that map the full MDIO register set
      3) eTSEC2 MDIO nodes (which map the full MDIO register set)
      4) QE MDIO nodes (which map only the MII registers)
      
      Gianfar, eTSEC2, and QE have different mappings for the TBIPA register, which
      is needed to initialize the TBI PHY.  In addition, the QE needs a special
      hack because of the way the device tree is ordered.
      
      All of this information is encapsulated in the fsl_pq_mdio_data structure,
      so when an MDIO node is probed, per-device data and functions are used
      to determine how to initialize the device.
      Signed-off-by: default avatarTimur Tabi <timur@freescale.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      afae5ad7