1. 19 Nov, 2016 5 commits
  2. 18 Nov, 2016 33 commits
  3. 17 Nov, 2016 2 commits
    • Merge branch 'rds-ha-failover-fixes' · fcd2b0da
      David S. Miller authored
      Sowmini Varadhan says:
      
      ====================
      RDS: TCP: HA/Failover fixes
      
      This series contains a set of fixes for bugs exposed when
      we ran the following in a loop between a test machine pair:
      
       while true; do
         # modprobe rds-tcp on test nodes
         # run rds-stress in bi-dir mode between test machine pair
         # modprobe -r rds-tcp on test nodes
       done
      
      rds-stress in bi-dir mode will cause both nodes to initiate
      RDS-TCP connections at almost the same instant, exposing the
      bugs fixed in this series.
      
      Without the fixes, rds-stress reports sporadic packet drops,
      and packets arriving out of sequence. After the fixes, we have
      been able to run the test overnight without any issues.
      
      Each patch has a detailed description of the root-cause fixed
      by the patch.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • RDS: TCP: Force every connection to be initiated by numerically smaller IP address · 1a0e100f
      Sowmini Varadhan authored
      When 2 RDS peers initiate an RDS-TCP connection simultaneously,
      there is a potential for "duelling syns" on either/both sides.
      See commit 241b2719 ("RDS-TCP: Reset tcp callbacks if re-using an
      outgoing socket in rds_tcp_accept_one()") for a description of this
      condition, and the arbitration logic which ensures that the
      numerically larger IP address in the TCP connection is bound to the
      RDS_TCP_PORT ("canonical ordering").
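The canonical-ordering rule described above can be sketched as a small user-space check. This is an illustration, not the kernel code: the helper names, the `SKETCH_RDS_TCP_PORT` value of 16385, and the choice to compare addresses in host byte order are all assumptions made for the example.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed port value for illustration; consult net/rds/tcp.h for the real one. */
#define SKETCH_RDS_TCP_PORT 16385

/* Parse a dotted-quad IPv4 string into a host-order integer.
 * Returns false if the text is not a valid IPv4 address. */
static bool sketch_parse_ipv4(const char *text, uint32_t *out)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, text, &addr) != 1)
        return false;
    *out = ntohl(addr.s_addr);
    return true;
}

/* Hypothetical helper: a connection is treated as canonically ordered when
 * the initiating side has the numerically smaller address and the accepting
 * side is bound to the well-known RDS-TCP port. */
static bool sketch_is_canonical(const char *initiator_ip,
                                const char *listener_ip,
                                uint16_t listener_port)
{
    uint32_t initiator, listener;

    if (!sketch_parse_ipv4(initiator_ip, &initiator) ||
        !sketch_parse_ipv4(listener_ip, &listener))
        return false;

    return initiator < listener && listener_port == SKETCH_RDS_TCP_PORT;
}
```

For example, `sketch_is_canonical("10.0.0.1", "10.0.0.2", 16385)` is true, while swapping the two addresses makes it false.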
      
      The rds_connection should not be marked as RDS_CONN_UP until the
      arbitration logic has converged, for the following reason. The sender
      may start transmitting RDS datagrams as soon as RDS_CONN_UP is set,
      and it removes datagrams from the rds_connection's cp_retrans queue
      based on TCP acks. If a TCP ack was sent from a TCP socket that got
      reset as part of duel arbitration (but before the data was delivered
      to the receiver's RDS socket layer), the sender may end up
      prematurely freeing the datagram, and the datagram is no longer
      reliably deliverable.
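The hazard can be illustrated with a toy model of an ack-driven retransmit queue. This is a simplified sketch and not the kernel's data structures: `struct sketch_sender`, the fixed-size array, and the helper names are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QUEUE_MAX 8

/* Toy sender: datagram ids that are kept until a TCP ack covers them. */
struct sketch_sender {
    int retrans[SKETCH_QUEUE_MAX];
    size_t count;
};

/* Queue a datagram id for possible retransmission; false when the queue is full. */
static bool sketch_send(struct sketch_sender *s, int id)
{
    if (s->count == SKETCH_QUEUE_MAX)
        return false;
    s->retrans[s->count++] = id;
    return true;
}

/* A TCP ack covering the first `acked` datagrams frees them from the queue,
 * whether or not the receiver's upper layer ever saw the data. */
static void sketch_tcp_ack(struct sketch_sender *s, size_t acked)
{
    if (acked > s->count)
        acked = s->count;
    for (size_t i = acked; i < s->count; i++)
        s->retrans[i - acked] = s->retrans[i];
    s->count -= acked;
}

/* True if the datagram is still queued and could be resent after a socket reset. */
static bool sketch_can_retransmit(const struct sketch_sender *s, int id)
{
    for (size_t i = 0; i < s->count; i++)
        if (s->retrans[i] == id)
            return true;
    return false;
}
```

In this model, once an ack from a soon-to-be-reset socket has been processed, `sketch_can_retransmit()` returns false for the acked datagram, so a reset that discards the data on the receiving side leaves nothing to resend.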
      
      This patch remedies that condition by making sure that, upon
      receipt of the TCP_ESTABLISHED state-change notification (completion
      of the TCP three-way handshake) in rds_tcp_state_change(), we mark
      the rds_connection as RDS_CONN_UP if, and only if, the IP addresses
      and ports for the connection are canonically ordered. In all other
      cases, rds_tcp_state_change() will force an rds_conn_path_drop(),
      and rds_queue_reconnect() on both peers will restart the connection
      to ensure canonical ordering.
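The decision made on reaching TCP_ESTABLISHED can be sketched as a small state machine. This is a user-space illustration under stated assumptions, not the patch itself: the types and names are hypothetical, addresses are assumed to be in host byte order, and only the actively opened (outgoing) socket is modelled.

```c
#include <stdint.h>

enum sketch_state { SKETCH_CONNECTING, SKETCH_UP, SKETCH_ERROR };

/* Toy stand-in for one connection path as seen by the initiating side. */
struct sketch_path {
    uint32_t local_addr;      /* host byte order (assumption) */
    uint32_t peer_addr;       /* host byte order (assumption) */
    enum sketch_state state;
    unsigned int reconnects;  /* how many times a reconnect was queued */
};

/* Called when the outgoing TCP socket reaches ESTABLISHED.
 * The path is marked up only if this side has the smaller address;
 * otherwise it is dropped and a reconnect is scheduled so that the
 * peer with the smaller address ends up as the initiator. */
static void sketch_on_established(struct sketch_path *p)
{
    if (p->state != SKETCH_CONNECTING)
        return;

    if (p->local_addr < p->peer_addr) {
        p->state = SKETCH_UP;
    } else {
        p->state = SKETCH_ERROR;   /* stands in for the path drop */
        p->reconnects++;           /* stands in for queueing a reconnect */
    }
}
```

With `local_addr` 10.0.0.1 and `peer_addr` 10.0.0.2 the path goes to `SKETCH_UP`. With the addresses swapped it goes to `SKETCH_ERROR` and one reconnect is counted.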
      
      A side-effect of enforcing this condition in rds_tcp_state_change()
      is that rds_tcp_accept_one_path() can now be refactored for simplicity.
      It is also no longer possible to encounter an RDS_CONN_UP connection in
      the arbitration logic in rds_tcp_accept_one().
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>