1. 02 Jul, 2018 11 commits
    • net: expose sk wmem in sock_exceed_buf_limit tracepoint · d6f19938
      Yafang Shao authored
      Currently trace_sock_exceed_buf_limit() only shows rmem info,
      but the wmem limit may also be hit.
      So expose wmem info in this tracepoint as well.
      
      Regarding memcg, I think it is better to introduce a new tracepoint (if
      that is needed), i.e. trace_memcg_limit_hit, rather than showing memcg
      info in trace_sock_exceed_buf_limit.
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: remove old PHY reset hack · 335c997d
      Heiner Kallweit authored
      This hack (affecting the non-PCIe models only) was introduced in 2004
      to deal with link negotiation failures in 1GBit mode. Based on a
      comment in the r8169 vendor driver I assume the issue affects RTL8169sb
      in combination with particular 1GBit switch models.
      
      Resetting the PHY every 10s and hoping that one fine day we will manage
      to establish the link seems very hacky to me. I'd say:
      If 1GBit doesn't work reliably in a user's environment then the user
      should remove 1GBit from the advertised modes, e.g. by using
      ethtool -s <if> advertise <10/100 modes>
      
      If the issue affects one chip version only and that with most link
      partners, then we could also think of removing 1GBit from the
      advertised modes for this chip version in the driver.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: fix sa_idx out of bounds check · c02462d8
      Colin Ian King authored
      Currently if sa_idx is equal to NSIM_IPSEC_MAX_SA_COUNT then
      an out-of-bounds read on ipsec->sa will occur. Fix the
      incorrect bounds check by using >= rather than >.
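
      The off-by-one described above can be illustrated with a small
      stand-alone sketch. It is not the netdevsim code: MAX_SA_COUNT and both
      helper names are hypothetical, and the table size is an arbitrary value
      chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table size, standing in for NSIM_IPSEC_MAX_SA_COUNT. */
#define MAX_SA_COUNT 33

/* Buggy form: only rejects indices strictly greater than the count, so
 * sa_idx == MAX_SA_COUNT slips through and would read one element past
 * the end of a MAX_SA_COUNT-element array. */
bool sa_idx_ok_buggy(size_t sa_idx)
{
    return !(sa_idx > MAX_SA_COUNT);
}

/* Fixed form: valid indices are 0 .. MAX_SA_COUNT - 1, so reject with >=. */
bool sa_idx_ok_fixed(size_t sa_idx)
{
    return !(sa_idx >= MAX_SA_COUNT);
}
```

      The two checks differ only for the single value sa_idx == MAX_SA_COUNT,
      which is exactly the case the fix addresses.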
      
      Detected by CoverityScan, CID#1470226 ("Out-of-bounds-read")
      
      Fixes: 7699353d ("netdevsim: add ipsec offload testing")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'xps-symmretric-queue-selection' · 97680ade
      David S. Miller authored
      Amritha Nambiar says:
      
      ====================
      Symmetric queue selection using XPS for Rx queues
      
      This patch series implements support for Tx queue selection based on
      Rx queue(s) map. This is done by configuring an Rx queue(s) map per Tx queue
      using a sysfs attribute. If the user configuration for Rx queues does
      not apply, then the Tx queue selection falls back to XPS using CPUs and
      finally to hashing.
      
      XPS is refactored to support Tx queue selection based on either the
      CPUs map or the Rx-queues map. The config option CONFIG_XPS needs to be
      enabled. By default no receive queues are configured for the Tx queue.
      
      - /sys/class/net/<dev>/queues/tx-*/xps_rxqs
      
      A set of receive queues can be mapped to a set of transmit queues (many:many),
      although the common use case is a 1:1 mapping. This enables sending packets
      on the Tx queue paired with the connection's Rx queue, which is useful for
      busy-polling multi-threaded workloads where it is not possible to pin the
      threads to a CPU. This is a rework of Sridhar's patch for symmetric queueing
      via socket option:
      https://www.spinics.net/lists/netdev/msg453106.html
      
      Testing Hints:
      Kernel:  Linux 4.17.0-rc7+
      Interface:
      driver: ixgbe
      version: 5.1.0-k
      firmware-version: 0x00015e0b
      
      Configuration:
      ethtool -L $iface combined 16
      ethtool -C $iface rx-usecs 1000
      sysctl net.core.busy_poll=1000
      ATR disabled:
      ethtool -K $iface ntuple on
      
      Workload:
      Modified memcached that changes the thread selection policy to be based
      on the incoming rx-queue of a connection, using the SO_INCOMING_NAPI_ID
      socket option. The default is round-robin.
      
      Default: No rxqs_map configured
      Symmetric queues: Enable rxqs_map for all queues 1:1 mapped to Tx queue
      
      System:
      Architecture:          x86_64
      CPU(s):                72
      Model name:            Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
      
      16 threads  400K requests/sec
      =============================
      -------------------------------------------------------------------------------
      Metric                                            Default         Symmetric queues
      -------------------------------------------------------------------------------
      RTT min/avg/max (usec)                            4/51/2215       2/30/5163
      intr/sec                                          26655           18606
      contextswitch/sec                                 5145            4044
      insn per cycle                                    0.43            0.72
      cache-misses (% of all cache refs)                6.919           4.310
      L1-dcache-load-misses (% of all L1-dcache hits)   4.49            3.29
      LLC-load-misses (% of all LL-cache hits)          13.26           8.96
      -------------------------------------------------------------------------------
      
      32 threads  400K requests/sec
      =============================
      -------------------------------------------------------------------------------
      Metric                                            Default         Symmetric queues
      -------------------------------------------------------------------------------
      RTT min/avg/max (usec)                            10/112/5562     9/46/4637
      intr/sec                                          30456           27666
      contextswitch/sec                                 7552            5133
      insn per cycle                                    0.41            0.49
      cache-misses (% of all cache refs)                9.357           2.769
      L1-dcache-load-misses (% of all L1-dcache hits)   4.09            3.98
      LLC-load-misses (% of all LL-cache hits)          12.96           3.96
      -------------------------------------------------------------------------------
      
      16 threads  800K requests/sec
      =============================
      -------------------------------------------------------------------------------
      Metric                                            Default         Symmetric queues
      -------------------------------------------------------------------------------
      RTT min/avg/max (usec)                            5/151/4989      9/69/2611
      intr/sec                                          35686           22907
      contextswitch/sec                                 25522           12281
      insn per cycle                                    0.67            0.74
      cache-misses (% of all cache refs)                8.652           6.38
      L1-dcache-load-misses (% of all L1-dcache hits)   3.19            2.86
      LLC-load-misses (% of all LL-cache hits)          16.53           11.99
      -------------------------------------------------------------------------------

      32 threads  800K requests/sec
      =============================
      -------------------------------------------------------------------------------
      Metric                                            Default         Symmetric queues
      -------------------------------------------------------------------------------
      RTT min/avg/max (usec)                            6/163/6152      8/88/4209
      intr/sec                                          47079           26548
      contextswitch/sec                                 42190           39168
      insn per cycle                                    0.45            0.54
      cache-misses (% of all cache refs)                8.798           4.668
      L1-dcache-load-misses (% of all L1-dcache hits)   6.55            6.29
      LLC-load-misses (% of all LL-cache hits)          13.91           10.44
      -------------------------------------------------------------------------------
      
      v6:
      - Changed the names of some functions to begin with net_if.
      - Cleaned up sk_tx_queue_set/sk_rx_queue_set functions.
      - Added sk_rx_queue_clear to make it consistent with tx_queue_mapping
        initialization.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-sysfs: Add interface for Rx queue(s) map per Tx queue · 8af2c06f
      Amritha Nambiar authored
      Extend transmit queue sysfs attribute to configure Rx queue(s) map
      per Tx queue. By default no receive queues are configured for the
      Tx queue.
      
      - /sys/class/net/eth0/queues/tx-*/xps_rxqs
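
      A minimal sketch of how a 1:1 mapping might be written to this attribute
      from C. It assumes the file accepts a hexadecimal bitmap of Rx queue
      indices, in the same style as xps_cpus; that input format is an
      assumption here, not something stated in this commit message. All three
      helper names are hypothetical, and the sketch only handles Rx queues
      0..31 so that the mask fits in one 32-bit hex word.

```c
#include <stddef.h>
#include <stdio.h>

/* Build "/sys/class/net/<dev>/queues/tx-<txq>/xps_rxqs".
 * Returns 0 on success, -1 if the path does not fit in buf. */
int xps_rxqs_path(const char *dev, unsigned int txq, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/sys/class/net/%s/queues/tx-%u/xps_rxqs",
                     dev, txq);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Format a hex bitmap with only bit <rxq> set (assumed input format).
 * Returns 0 on success, -1 if rxq is out of range or buf is too small. */
int xps_rxq_mask(unsigned int rxq, char *buf, size_t len)
{
    int n;

    if (rxq >= 32)
        return -1;
    n = snprintf(buf, len, "%x", 1u << rxq);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Map Tx queue <txq> to Rx queue <rxq>. Writing the attribute needs
 * sufficient privileges and a kernel built with CONFIG_XPS. */
int xps_rxqs_map_one(const char *dev, unsigned int txq, unsigned int rxq)
{
    char path[256], mask[16];
    FILE *f;
    int ok;

    if (xps_rxqs_path(dev, txq, path, sizeof(path)) != 0 ||
        xps_rxq_mask(rxq, mask, sizeof(mask)) != 0)
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fputs(mask, f) >= 0;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}
```

      Calling xps_rxqs_map_one("eth0", i, i) for each queue index i would give
      the 1:1 Tx-to-Rx association.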
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Enable Tx queue selection based on Rx queues · fc9bab24
      Amritha Nambiar authored
      This patch adds support to pick Tx queue based on the Rx queue(s) map
      configuration set by the admin through the sysfs attribute
      for each Tx queue. If the user configuration for receive queue(s) map
      does not apply, then the Tx queue selection falls back to CPU(s) map
      based selection and finally to hashing.
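
      The fallback order described above can be sketched in plain C as
      follows. This is an illustration only, not the kernel code: struct
      txq_maps, map_lookup(), pick_tx_queue() and NO_QUEUE are hypothetical
      names, and each map entry holds a single Tx queue, whereas the real maps
      can associate a set of Tx queues with one Rx queue or CPU.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_QUEUE (-1)   /* hypothetical marker: no mapping configured */

/* Hypothetical, simplified view of the two XPS maps. */
struct txq_maps {
    const int *rxq_to_txq;   /* indexed by Rx queue, may be NULL */
    unsigned int num_rxqs;
    const int *cpu_to_txq;   /* indexed by CPU, may be NULL */
    unsigned int num_cpus;
    unsigned int num_txqs;   /* must be non-zero */
};

/* Return the mapped Tx queue, or NO_QUEUE if the map does not apply. */
int map_lookup(const int *map, unsigned int n, int key)
{
    if (map == NULL || key < 0 || (unsigned int)key >= n)
        return NO_QUEUE;
    return map[key];
}

/* Try the Rx-queue map first, then the CPU map, then fall back to hashing. */
int pick_tx_queue(const struct txq_maps *m, int rx_queue, int cpu,
                  uint32_t flow_hash)
{
    int q = map_lookup(m->rxq_to_txq, m->num_rxqs, rx_queue);

    if (q == NO_QUEUE)
        q = map_lookup(m->cpu_to_txq, m->num_cpus, cpu);
    if (q == NO_QUEUE)
        q = (int)(flow_hash % m->num_txqs);
    return q;
}
```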
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Record receive queue number for a connection · c6345ce7
      Amritha Nambiar authored
      This patch adds a new field to sock_common 'skc_rx_queue_mapping'
      which holds the receive queue number for the connection. The Rx queue
      is marked in tcp_finish_connect() to allow a client app to do
      SO_INCOMING_NAPI_ID after a connect() call to get the right queue
      association for a socket. Rx queue is also marked in tcp_conn_request()
      to allow syn-ack to go on the right tx-queue associated with
      the queue on which syn is received.
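
      From user space, the query mentioned above might look like the sketch
      below. The wrapper name incoming_napi_id() is hypothetical. The fallback
      value 56 for SO_INCOMING_NAPI_ID is the asm-generic definition and is
      only used when the system headers lack the constant; some architectures
      use a different number, so treat it as an assumption.

```c
#include <sys/socket.h>

#ifndef SO_INCOMING_NAPI_ID
#define SO_INCOMING_NAPI_ID 56   /* asm-generic value; assumed for old headers */
#endif

/* Read the NAPI ID recorded for a socket, e.g. after connect() returns.
 * Returns 0 on success, or -1 with errno set by getsockopt(). */
int incoming_napi_id(int fd, unsigned int *napi_id)
{
    socklen_t len = sizeof(*napi_id);

    return getsockopt(fd, SOL_SOCKET, SO_INCOMING_NAPI_ID, napi_id, &len);
}
```

      An application could use the returned ID to hand the connection to the
      worker thread that busy-polls the matching queue.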
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sock: Change tx_queue_mapping in sock_common to unsigned short · 755c31cd
      Amritha Nambiar authored
      Change 'skc_tx_queue_mapping' field in sock_common structure from
      'int' to 'unsigned short' type with ~0 indicating unset and
      other positive queue values being set. This will accommodate adding
      a new 'unsigned short' field in sock_common in the next patch for
      rx_queue_mapping.
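
      The "~0 means unset" convention for an unsigned short field can be
      sketched as below. The struct and helper names are hypothetical and only
      illustrate the encoding; they are not the kernel's sock_common or its
      accessors.

```c
#include <limits.h>
#include <stdbool.h>

/* All bits set, i.e. (unsigned short)~0, marks "no queue recorded". */
#define NO_QUEUE_MAPPING USHRT_MAX

/* Hypothetical stand-in for the field described above. */
struct conn {
    unsigned short tx_queue_mapping;
};

void conn_tx_queue_clear(struct conn *c)
{
    c->tx_queue_mapping = NO_QUEUE_MAPPING;
}

/* Store a queue number. Values that are negative, do not fit in the
 * field, or would collide with the sentinel are rejected. */
bool conn_tx_queue_set(struct conn *c, int txq)
{
    if (txq < 0 || txq >= NO_QUEUE_MAPPING)
        return false;
    c->tx_queue_mapping = (unsigned short)txq;
    return true;
}

/* Return the stored queue number, or -1 if none is set. */
int conn_tx_queue_get(const struct conn *c)
{
    if (c->tx_queue_mapping == NO_QUEUE_MAPPING)
        return -1;
    return c->tx_queue_mapping;
}
```

      Unlike an 'int' field that can use -1 directly, the narrower type has to
      give up one value of its range to represent "unset".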
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Use static_key for XPS maps · 04157469
      Amritha Nambiar authored
      Use static_key for XPS maps to reduce the cost of extra map checks,
      similar to how it is used for RPS and RFS. This includes the static_key
      'xps_needed' for XPS and another, 'xps_rxqs_needed', for XPS using the
      Rx queues map.
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Refactor XPS for CPUs and Rx queues · 80d19669
      Amritha Nambiar authored
      Refactor XPS code to support Tx queue selection based on
      CPU(s) map or Rx queue(s) map.
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 30 Jun, 2018 29 commits