1. 15 Oct, 2020 6 commits
  2. 14 Oct, 2020 23 commits
  3. 13 Oct, 2020 4 commits
    • Merge branch 'macb-support-the-2-deep-Tx-queue-on-at91' · c93c5482
      Jakub Kicinski authored
      Willy Tarreau says:
      
      ====================
      macb: support the 2-deep Tx queue on at91
      
      While running some tests on my Breadbee board, I noticed poor network
      Tx performance. I had a look at the driver (macb, at91ether variant)
      and noticed that at91ether_start_xmit() immediately stops the queue
      after sending a frame and waits for the interrupt to restart the queue,
      causing a dead time after each packet is sent.
      
      The AT91RM9200 datasheet states that the controller supports two frames,
      one being sent and the other one being queued, so I performed minimal
      changes to support this. The transmit performance on my board has
      increased by 50% on medium-sized packets (HTTP traffic), and with large
      packets I can now reach line rate.
      
      Since this driver is shared by various platforms, I tried my best to
      isolate and limit the changes as much as possible and I think it's pretty
      reasonable as-is. I've run extensive tests and did not hit any
      unexpected situation (no stall, overflow, or lockup).
      
      There are 3 patches in this series. The first one adds the missing
      interrupt flag for RM9200 (TBRE, indicating the tx buffer is willing
      to take a new packet). The second one replaces the single skb with a
      2-element array and uses only index 0. It makes no other change; it only
      prepares the code for the third one. The third one implements the
      queue. Packets are added at the tail of the queue, the queue is
      stopped at 2 packets and the interrupt releases 0, 1 or 2 depending
      on what the transmit status register reports.
      ====================
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
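The queue behaviour described in the cover letter above can be sketched as a toy model. This is not the macb driver code: the class and method names are hypothetical, and the number of completed frames is passed in directly instead of being decoded from the transmit status register.

```python
from collections import deque


class TwoDeepTxQueue:
    """Toy model of a 2-deep transmit queue (hypothetical, not driver code)."""

    DEPTH = 2  # one frame being sent plus one frame queued behind it

    def __init__(self):
        self.pending = deque()  # frames handed to the hardware, oldest first
        self.stopped = False    # mirrors a stopped network queue

    def start_xmit(self, frame):
        """Add a frame at the tail; stop the queue once both slots are used."""
        if self.stopped:
            raise RuntimeError("queue is stopped")
        self.pending.append(frame)
        if len(self.pending) == self.DEPTH:
            self.stopped = True

    def tx_interrupt(self, completed):
        """Release 0, 1 or 2 frames, as a status read would report."""
        if not 0 <= completed <= len(self.pending):
            raise ValueError("cannot complete more frames than are pending")
        released = [self.pending.popleft() for _ in range(completed)]
        if released:
            self.stopped = False  # at least one slot is free again
        return released
```

With a single slot the queue would stop after every frame. With two slots the sender can hand over the next frame while the previous one is still on the wire, which is what removes the dead time after each packet.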
    • macb: support the two tx descriptors on at91rm9200 · 0a4e9ce1
      Willy Tarreau authored
      The at91rm9200 variant used by a few chips including the MSC313 supports
      two Tx descriptors (one frame being serialized and another one queued).
      However, the driver only implemented a single one, which adds a dead time
      after each transfer to receive and process the interrupt and wake the
      queue up, preventing the driver from reaching line rate.
      
      This patch implements a very basic 2-deep queue to address this limitation.
      The tests run on a Breadbee board equipped with an MSC313E show that at
      1 GHz, HTTP traffic on medium-sized objects (45kB) was limited to exactly
      50 Mbps before this patch, and jumped to 76 Mbps with this patch. And tests
      on a single TCP stream with an MTU of 576 jump from 10kpps to 15kpps. With
      1500 byte packets it's now possible to reach line rate versus 75 Mbps
      before.
      
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Daniel Palmer <daniel@0x0f.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Link: https://lore.kernel.org/r/20201011090944.10607-4-w@1wt.eu
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
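As a quick check of the figures quoted in this commit message, the relative gains can be computed as below. The helper name is hypothetical, and the 100 Mbps value for "line rate" is an assumption that the message does not state.

```python
def gain_percent(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the commit message.
http_gain = gain_percent(50, 76)  # HTTP traffic on 45kB objects, Mbps
pps_gain = gain_percent(10, 15)   # single TCP stream with MTU 576, kpps

# Assumption, not stated in the source: "line rate" means 100 Mbps.
ASSUMED_LINE_RATE_MBPS = 100
large_gain = gain_percent(75, ASSUMED_LINE_RATE_MBPS)  # 1500-byte packets

print(f"HTTP: +{http_gain:.0f}%")
print(f"small packets: +{pps_gain:.0f}%")
print(f"1500-byte packets: +{large_gain:.0f}% (under the assumed line rate)")
```

The first two results are consistent with the roughly 50% improvement mentioned in the series cover letter. The third one depends entirely on the assumed link speed.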
    • macb: prepare at91 to use a 2-frame TX queue · 73d74228
      Willy Tarreau authored
      The RM9200 supports one frame being sent while another one is waiting in
      queue. This avoids the dead time that follows the emission of a frame
      and which prevents one from reaching line speed.
      
      Right now the driver supports only a single skb, so we'll first replace
      the rm9200-specific skb info with an array of two macb_tx_skb (already
      used by other drivers). This patch only moves the skb_length to
      txq[0].size and skb_physaddr to txq[0].mapping but doesn't perform any
      other change. It already uses [desc] in order to minimize future changes.
      
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Daniel Palmer <daniel@0x0f.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Link: https://lore.kernel.org/r/20201011090944.10607-3-w@1wt.eu
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • macb: add RM9200's interrupt flag TBRE · fa6031df
      Willy Tarreau authored
      Transmit Buffer Register Empty replaces TXERR on RM9200 and signals the
      sender may try to send again because the last queued frame is no longer
      in queue (being transmitted or already transmitted).
      
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Daniel Palmer <daniel@0x0f.com>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Link: https://lore.kernel.org/r/20201011090944.10607-2-w@1wt.eu
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  4. 12 Oct, 2020 7 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · ccdf7fae
      Jakub Kicinski authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf-next 2020-10-12
      
      The main changes are:
      
      1) BPF verifier improvements to track register allocation patterns, from Alexei and Yonghong.
      
      2) libbpf relocation support for different size load/store, from Andrii.
      
      3) bpf_redirect_peer() helper and support for inner map array with different max_entries, from Daniel.
      
      4) BPF support for per-cpu variables, from Hao.
      
      5) sockmap improvements, from John.
      ====================
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/mlx5e: IPsec: Add Connect-X IPsec Tx data path offload · 5be01904
      Raed Salem authored
      In the TX data path, spot packets with xfrm stack IPsec offload
      indication.
      
      Fill Software-Parser segment in TX descriptor so that the hardware
      may parse the ESP protocol, and perform TX checksum offload on the
      inner payload.
      
      Support GSO, by providing the trailer data and ICV placeholder
      so HW can fill it post encryption operation.
      
      Padding alignment cannot be performed in HW (ConnectX-6Dx) due to
      a bug. Software can overcome this limitation by adding NETIF_F_HW_ESP to
      the gso_partial_features field in netdev so that the packets are
      aligned by the stack.
      
      l4_inner_checksum cannot be offloaded by HW for IPsec tunnel-type packets.
      
      Note that for GSO SKBs, the stack does not include an ESP trailer,
      unlike the non-GSO case.
      
      Below is the iperf3 performance report for two 24-core servers
      (Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz) with ConnectX6-DX.
      All bandwidth tests use iperf3 TCP traffic with a packet size of 128KB.
      Each tunnel uses one iperf3 stream with one thread (option -P1).
      TX crypto offload improves both bandwidth and CPU utilization.
      
      ----------------------------------------------------------------------
      Mode            |  Num tunnel | BW     | Send CPU util | Recv CPU util
                      |             | (Gbps) | (Average %)   | (Average %)
      ----------------------------------------------------------------------
      Crypto offload  |             |        |               |
      (RX only)       | 1           | 4.7    | 4.2           | 3.5
      ----------------------------------------------------------------------
      Crypto offload  |             |        |               |
      (RX only)       | 24          | 15.6   | 20            | 10
      ----------------------------------------------------------------------
      Non-offload     | 1           | 4.6    | 4             | 5
      ----------------------------------------------------------------------
      Non-offload     | 24          | 11.9   | 16            | 12
      ----------------------------------------------------------------------
      Crypto offload  |             |        |               |
      (TX & RX)       | 1           | 11.9   | 2.1           | 5.9
      ----------------------------------------------------------------------
      Crypto offload  |             |        |               |
      (TX & RX)       | 24          | 38     | 9.5           | 27.5
      ----------------------------------------------------------------------
      Crypto offload  |             |        |               |
      (TX only)       | 1           | 4.7    | 0.7           | 5
      ----------------------------------------------------------------------
      Crypto offload  |             |        |               |
      (TX only)       | 24          | 14.5   | 6             | 20
      
      Regression tests show no degradation on non-ipsec and
      non-offload-ipsec traffic. The packet rate test uses pktgen UDP to
      transmit on a single CPU; instructions and cycles are measured on
      the transmit CPU.
      
      before:
      ----------------------------------------------------------------------
      Non-offload             | 1           | 4.7    | 4.2           | 5.1
      ----------------------------------------------------------------------
      Non-offload             | 24          | 11.2   | 14            | 15
      ----------------------------------------------------------------------
      Non-ipsec               | 1           | 28     | 4             | 5.7
      ----------------------------------------------------------------------
      Non-ipsec               | 24          | 68.3   | 17.8          | 39.7
      ----------------------------------------------------------------------
      Non-ipsec packet rate(BURST=1000 BC=5 NCPUS=1 SIZE=60)
      13.56Mpps, 456 instructions/pkt, 191 cycles/pkt
      
      after:
      ----------------------------------------------------------------------
      Non-offload             | 1           | 4.69   | 4.2           | 5
      ----------------------------------------------------------------------
      Non-offload             | 24          | 11.9   | 13.5          | 15.1
      ----------------------------------------------------------------------
      Non-ipsec               | 1           | 29     | 3.2           | 5.5
      ----------------------------------------------------------------------
      Non-ipsec               | 24          | 68.2   | 18.5          | 39.8
      ----------------------------------------------------------------------
      Non-ipsec packet rate: 13.56Mpps, 472 instructions/pkt, 191 cycles/pkt
      Signed-off-by: Raed Salem <raeds@mellanox.com>
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
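To put the tables in this commit message into ratios, the short calculation below compares full (TX & RX) crypto offload against the non-offload rows, and the per-packet instruction counts before and after the patch. The numbers are copied from the message; the dictionary layout and function name are only illustrative.

```python
# Bandwidth in Gbps, copied from the first table in the commit message.
bandwidth_gbps = {
    ("non-offload", 1): 4.6,
    ("non-offload", 24): 11.9,
    ("offload tx+rx", 1): 11.9,
    ("offload tx+rx", 24): 38.0,
}


def speedup(tunnels):
    """Bandwidth ratio of TX & RX crypto offload over non-offload."""
    offloaded = bandwidth_gbps[("offload tx+rx", tunnels)]
    baseline = bandwidth_gbps[("non-offload", tunnels)]
    return offloaded / baseline


for tunnels in (1, 24):
    print(f"{tunnels} tunnel(s): {speedup(tunnels):.2f}x")

# Non-ipsec pktgen test: instructions per packet before and after the patch.
instr_before, instr_after = 456, 472
extra_instr_percent = (instr_after - instr_before) / instr_before * 100.0
print(f"extra instructions per packet: {extra_instr_percent:.1f}%")
```

This gives about 2.59x for one tunnel and 3.19x for 24 tunnels. The non-ipsec path executes about 3.5% more instructions per packet, while the reported cycles per packet and packet rate are unchanged.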
    • net/mlx5e: IPsec: Add TX steering rule per IPsec state · 9b9d454d
      Huy Nguyen authored
      Add new FTE in TX IPsec FT per IPsec state. It has the
      same matching criteria as the RX steering rule.
      
      The IPsec FT is created/destroyed when the first/last rule
      is added/deleted respectively.
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Reviewed-by: Boris Pismenny <borisp@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
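The create-on-first-rule, destroy-on-last-rule lifetime described in this commit message can be illustrated with a small model. This is a sketch only: the class, its methods, and the dictionary standing in for the flow table are hypothetical and do not reflect the mlx5 driver's data structures.

```python
class LazyFlowTable:
    """Toy model of a table that exists only while it holds a rule."""

    def __init__(self):
        self.table = None   # stands in for the IPsec flow table
        self.creations = 0  # how many times the table has been created

    def add_rule(self, state_id, match):
        """Add one rule per IPsec state; the first rule creates the table."""
        if self.table is None:
            self.table = {}
            self.creations += 1
        if state_id in self.table:
            raise KeyError(f"rule for state {state_id} already exists")
        self.table[state_id] = match

    def del_rule(self, state_id):
        """Delete a rule; deleting the last rule destroys the table."""
        if self.table is None or state_id not in self.table:
            raise KeyError(f"no rule for state {state_id}")
        del self.table[state_id]
        if not self.table:
            self.table = None
```

Tying the table's lifetime to its rules means that no table exists while no IPsec state is offloaded.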
    • net/mlx5: Add NIC TX domain namespace · ee92e4f1
      Huy Nguyen authored
      Add new namespace that represents the NIC TX domain.
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Signed-off-by: Raed Salem <raeds@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix uininitialized pointer read on pointer attr · 825f8b0b
      Colin Ian King authored
      Currently the error exit path err_free kfree's attr. In the case where
      flow and parse_attr failed to be allocated this return path will free
      the uninitialized pointer attr, which is not correct.  In the other
      case where attr fails to allocate attr does not need to be freed. So
      in both error exits via err_free attr should not be freed, so remove
      it.
      
      Addresses-Coverity: ("Uninitialized pointer read")
      Fixes: ff7ea04a ("net/mlx5e: Fix potential null pointer dereference")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · a308283f
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS updates for net-next
      
      The following patchset contains Netfilter/IPVS updates for net-next:
      
      1) Inspect the reply packets coming from DR/TUN and refresh connection
         state and timeout, from longguang yue and Julian Anastasov.
      
      2) Series to add support for the inet ingress chain type in nf_tables.
      ====================
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'bnxt_en-Updates-for-net-next' · 547848af
      Jakub Kicinski authored
      Michael Chan says:
      
      ====================
      bnxt_en: Updates for net-next.
      
      This series contains these main changes:
      
      1. Change of default message level to enable more logging.
      2. Some cleanups related to processing async events from firmware.
      3. Allow online ethtool selftest on multi-function PFs.
      4. Return stored firmware version information to devlink.
      
      v2:
      Patch 3: Change bnxt_reset_task() to silent mode.
      Patch 8 & 9: Ensure we copy NULL terminated fw strings to devlink.
      Patch 8 & 9: Return directly after the last bnxt_dl_info_put() call.
      Patch 9: If FW call to get stored dev info fails, return success to
               devlink without the stored versions.
      ====================
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>