1. 07 Jan, 2021 9 commits
    • bcm63xx_enet: convert to build_skb · d27de0ef
      Sieng Piaw Liew authored
      We can increase the efficiency of the rx path by using buffers to receive
      packets, then building SKBs around them just before passing them into the
      network stack. In contrast, preallocating SKBs too early reduces CPU cache
      efficiency.
      
      Check if we're in NAPI context when refilling RX. Normally we're almost
      always running in NAPI context. Dispatch to napi_alloc_frag directly
      instead of relying on netdev_alloc_frag, which does the same but
      with the overhead of local_bh_disable/enable.
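      The refill-then-build pattern described above can be sketched roughly as
      follows. This is an illustrative fragment based on this commit message,
      not the actual bcm63xx_enet code; names such as priv->rx_frag_size and
      the helper names are assumptions.

```c
/* Refill: allocate a raw page fragment instead of a full SKB.
 * (Sketch; priv->rx_frag_size is a placeholder field name.)
 */
static void *enet_refill_rx_frag(struct bcm_enet_priv *priv, bool napi_mode)
{
	/* In NAPI context we can call napi_alloc_frag() directly and skip
	 * the local_bh_disable/enable that netdev_alloc_frag() pays for.
	 */
	if (napi_mode)
		return napi_alloc_frag(priv->rx_frag_size);
	return netdev_alloc_frag(priv->rx_frag_size);
}

/* Rx completion: only now wrap the DMA'd buffer in an SKB. */
static struct sk_buff *enet_build_rx_skb(void *data, unsigned int frag_size,
					 unsigned int len)
{
	struct sk_buff *skb = build_skb(data, frag_size);

	if (!skb)
		return NULL;
	skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
	skb_put(skb, len);
	return skb;
}
```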
      
      Tested on a 320 MHz BCM6328 with iperf3 -M 512 to measure packet/sec
      performance. The netif_receive_skb_list and NET_IP_ALIGN optimizations
      are included.
      
      Before:
      [ ID] Interval           Transfer     Bandwidth       Retr
      [  4]   0.00-10.00  sec  49.9 MBytes  41.9 Mbits/sec  197         sender
      [  4]   0.00-10.00  sec  49.3 MBytes  41.3 Mbits/sec            receiver
      
      After:
      [ ID] Interval           Transfer     Bandwidth       Retr
      [  4]   0.00-30.00  sec   171 MBytes  47.8 Mbits/sec  272         sender
      [  4]   0.00-30.00  sec   170 MBytes  47.6 Mbits/sec            receiver
      Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d27de0ef
    • bcm63xx_enet: consolidate rx SKB ring cleanup code · 3d0b7265
      Sieng Piaw Liew authored
      The rx SKB ring uses the same cleanup code at various points.
      Combine it into a function to reduce lines of code.
      Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3d0b7265
    • bcm63xx_enet: alloc rx skb with NET_IP_ALIGN · c4a20786
      Sieng Piaw Liew authored
      Use netdev_alloc_skb_ip_align on newer SoCs with integrated switch
      (enetsw) when refilling RX. Increases packet processing performance
      by 30% (with netif_receive_skb_list).
      
      Non-enetsw SoCs cannot function with the extra pad, so they continue to
      use the regular netdev_alloc_skb.
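      The split described above can be sketched as follows. This is an
      illustrative fragment, not the actual driver code; the flag and field
      names (priv->enet_is_sw, priv->rx_skb_size) are assumptions.

```c
/* Sketch of the rx refill allocation choice described above. */
if (priv->enet_is_sw)
	/* netdev_alloc_skb_ip_align() reserves NET_IP_ALIGN bytes so the
	 * IP header lands on an aligned boundary after the 14-byte
	 * Ethernet header.
	 */
	skb = netdev_alloc_skb_ip_align(dev, priv->rx_skb_size);
else
	/* Older non-enetsw DMA cannot cope with the extra pad. */
	skb = netdev_alloc_skb(dev, priv->rx_skb_size);
```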
      
      Tested on a 320 MHz BCM6328 with iperf3 -M 512 to measure packet/sec
      performance.
      
      Before:
      [ ID] Interval           Transfer     Bandwidth       Retr
      [  4]   0.00-30.00  sec   120 MBytes  33.7 Mbits/sec  277         sender
      [  4]   0.00-30.00  sec   120 MBytes  33.5 Mbits/sec            receiver
      
      After (+netif_receive_skb_list):
      [ ID] Interval           Transfer     Bandwidth       Retr
      [  4]   0.00-30.00  sec   155 MBytes  43.3 Mbits/sec  354         sender
      [  4]   0.00-30.00  sec   154 MBytes  43.1 Mbits/sec            receiver
      Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c4a20786
    • bcm63xx_enet: add xmit_more support · 375281d3
      Sieng Piaw Liew authored
      Support bulking hardware TX queue by using netdev_xmit_more().
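      The usual netdev_xmit_more() pattern looks roughly like this. This is an
      illustrative sketch, not the actual driver code; enet_dma_kick() is a
      placeholder for the real doorbell register write.

```c
/* Sketch: only kick the hardware doorbell on the last skb of a batch. */
static netdev_tx_t enet_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	/* ... map the skb and queue its DMA descriptor ... */

	/* Defer the (expensive) doorbell write while the stack indicates
	 * that more packets are coming for this queue, unless the queue
	 * has been stopped in the meantime.
	 */
	if (!netdev_xmit_more() ||
	    netif_xmit_stopped(netdev_get_tx_queue(dev, 0)))
		enet_dma_kick(dev); /* placeholder for the doorbell write */

	return NETDEV_TX_OK;
}
```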
      Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      375281d3
    • bcm63xx_enet: add BQL support · 4c59b0f5
      Sieng Piaw Liew authored
      Add Byte Queue Limits support to reduce/remove bufferbloat in
      bcm63xx_enet.
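      BQL support boils down to three accounting calls; the following is an
      illustrative sketch of where they go in a driver like this, not the
      actual patch.

```c
/* In the xmit path, after queuing the descriptor: tell the stack how
 * many bytes are now in flight.
 */
netdev_sent_queue(dev, skb->len);

/* In the tx completion handler, after reclaiming descriptors: report
 * what completed so the stack can adjust the dynamic queue limit.
 */
netdev_completed_queue(dev, pkts_done, bytes_done);

/* On ring reset / interface down: clear the accounting state. */
netdev_reset_queue(dev);
```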
      Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4c59b0f5
    • bcm63xx_enet: batch process rx path · 9cbfea02
      Sieng Piaw Liew authored
      Use netif_receive_skb_list to batch process rx SKBs.
      Tested on a 320 MHz BCM6328 using iperf3 -M 512; performance increases
      by 12.5%.
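      The batching pattern can be sketched as follows; an illustrative
      fragment, with enet_poll_one_rx() as a placeholder for the driver's
      per-descriptor rx handling.

```c
/* Sketch: collect the SKBs from one NAPI poll on a list and hand them
 * to the stack in a single call, amortizing per-packet overhead.
 */
struct list_head rx_list;
struct sk_buff *skb;

INIT_LIST_HEAD(&rx_list);

while (budget-- && (skb = enet_poll_one_rx(priv))) {
	/* ... set skb->protocol, checksum state, etc. ... */
	list_add_tail(&skb->list, &rx_list);
}

/* One pass through the stack for the whole batch. */
netif_receive_skb_list(&rx_list);
```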
      
      Before:
      [ ID] Interval           Transfer     Bandwidth       Retr
      [  4]   0.00-30.00  sec   120 MBytes  33.7 Mbits/sec  277         sender
      [  4]   0.00-30.00  sec   120 MBytes  33.5 Mbits/sec            receiver
      
      After:
      [ ID] Interval           Transfer     Bandwidth       Retr
      [  4]   0.00-30.00  sec   136 MBytes  37.9 Mbits/sec  203         sender
      [  4]   0.00-30.00  sec   135 MBytes  37.7 Mbits/sec            receiver
      Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9cbfea02
    • qmi_wwan: Increase headroom for QMAP SKBs · 2e423387
      Kristian Evensen authored
      When measuring the throughput (iperf3 + TCP) while routing on a
      not-so-powerful device (Mediatek MT7621, 880MHz CPU), I noticed that I
      achieved significantly lower speeds with QMI-based modems than for
      example a USB LAN dongle. The CPU was saturated in all of my tests.
      
      With the dongle I got ~300 Mbit/s, while I only measured ~200 Mbit/s
      with the modems. All offloads, etc. were switched off for the dongle,
      and I configured the modems to use QMAP (16k aggregation). The tests
      with the dongle were performed in my local (gigabit) network, while the
      LTE network the modems were connected to delivers 700-800 Mbit/s.
      
      Profiling the kernel revealed the cause of the performance difference.
      In qmimux_rx_fixup(), an SKB is allocated for each packet contained in
      the URB. This SKB has too little headroom, causing the check in
      skb_cow() (called from ip_forward()) to fail. pskb_expand_head() is then
      called and the SKB is reallocated. In the output from perf, I see that a
      significant amount of time is spent in pskb_expand_head() + support
      functions.
      
      In order to ensure that the SKB has enough headroom, this commit
      increases the amount of memory allocated in qmimux_rx_fixup() by
      LL_MAX_HEADER. The reason for using LL_MAX_HEADER and not a more
      accurate value is that we do not know the type of the outgoing network
      interface. After making this change, I achieve the same throughput with
      the modems as with the dongle.
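      The fix described above amounts to over-allocating and reserving the
      extra bytes when the per-packet SKB is built. An illustrative sketch
      (variable names are assumptions, not the exact qmi_wwan code):

```c
/* Sketch of the headroom fix in the rx fixup path: allocate
 * LL_MAX_HEADER extra bytes and reserve them, so a later skb_cow()
 * in the forwarding path does not need pskb_expand_head().
 */
skbn = netdev_alloc_skb(net, pkt_len + LL_MAX_HEADER);
if (!skbn)
	return 0;
skb_reserve(skbn, LL_MAX_HEADER);
skb_put_data(skbn, skb->data + offset, pkt_len);
```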
      Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
      Acked-by: Bjørn Mork <bjorn@mork.no>
      Link: https://lore.kernel.org/r/20210106122403.1321180-1-kristian.evensen@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2e423387
    • Merge tag 'linux-can-next-for-5.12-20210106' of... · c10b377f
      Jakub Kicinski authored
      Merge tag 'linux-can-next-for-5.12-20210106' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can-next 2021-01-06
      
      The first 16 patches are by me and target the tcan4x5x SPI glue driver for the
      m_can CAN driver. First there are several cleanup commits, then the SPI
      regmap part is converted to 8 bits per word, to make it possible to use the
      driver on SPI controllers that only support 8-bit-per-word mode (such as
      the SPI cores on the Raspberry Pi).
      
      Oliver Hartkopp contributes a patch for the CAN_RAW protocol. The getsockopt()
      for CAN_RAW_FILTER is changed to return -ERANGE if the filterset does not fit
      into the provided user space buffer.
      
      The last two patches are by Joakim Zhang and add wakeup support to the flexcan
      driver for the i.MX8QM SoC. The dt-bindings docs are extended to describe the
      added property.
      
      * tag 'linux-can-next-for-5.12-20210106' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
        can: flexcan: add CAN wakeup function for i.MX8QM
        dt-bindings: can: fsl,flexcan: add fsl,scu-index property to indicate a resource
        can: raw: return -ERANGE when filterset does not fit into user space buffer
        can: tcan4x5x: add support for half-duplex controllers
        can: tcan4x5x: rework SPI access
        can: tcan4x5x: add {wr,rd}_table
        can: tcan4x5x: add max_raw_{read,write} of 256
        can: tcan4x5x: tcan4x5x_regmap: set reg_stride to 4
        can: tcan4x5x: fix max register value
        can: tcan4x5x: tcan4x5x_regmap_init(): use spi as context pointer
        can: tcan4x5x: tcan4x5x_regmap_write(): remove not needed casts and replace 4 by sizeof
        can: tcan4x5x: rename regmap_spi_gather_write() -> tcan4x5x_regmap_gather_write()
        can: tcan4x5x: remove regmap async support
        can: tcan4x5x: tcan4x5x_bus: remove not needed read_flag_mask
        can: tcan4x5x: mark struct regmap_bus tcan4x5x_bus as constant
        can: tcan4x5x: move regmap code into seperate file
        can: tcan4x5x: rename tcan4x5x.c -> tcan4x5x-core.c
        can: tcan4x5x: beautify indention of tcan4x5x_of_match and tcan4x5x_id_table
        can: tcan4x5x: replace DEVICE_NAME by KBUILD_MODNAME
      ====================
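      The CAN_RAW_FILTER getsockopt() change in this pull request can be
      exercised from user space roughly like this. An illustrative sketch
      using the standard SocketCAN headers; the helper name is made up.

```c
#include <errno.h>
#include <linux/can.h>
#include <linux/can/raw.h>
#include <sys/socket.h>

/* Fetch the filter set of a CAN_RAW socket. With this change, a buffer
 * too small for the whole filter set now fails with errno == ERANGE
 * instead of silently truncating.
 */
int get_can_filters(int sock, struct can_filter *filters, socklen_t *len)
{
	if (getsockopt(sock, SOL_CAN_RAW, CAN_RAW_FILTER, filters, len) < 0) {
		if (errno == ERANGE)
			return -1; /* caller must retry with a bigger buffer */
		return -2; /* some other error */
	}
	return 0; /* *len now holds the size actually returned */
}
```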
      
      Link: https://lore.kernel.org/r/20210107094900.173046-1-mkl@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c10b377f
    • net: dsa: print error on invalid port index · 8209f5bc
      Rafał Miłecki authored
      Looking for an -EINVAL all over the DSA code could take hours for
      inexperienced DSA users.
      Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20210106090915.21439-1-zajec5@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8209f5bc
  2. 06 Jan, 2021 28 commits
  3. 05 Jan, 2021 3 commits