1. 29 Mar, 2023 (30 commits)
  2. 28 Mar, 2023 (10 commits)
    • net/mlx5e: Fix build break on 32bit · 163c2c70
      Saeed Mahameed authored
      The cited commit caused the following build break in mlx5 due to a change
      in the size of MAX_SKB_FRAGS.
      
      error: format '%lu' expects argument of type 'long unsigned int',
             but argument 7 has type 'unsigned int' [-Werror=format=]
      
      Fix this by explicit casting.
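
      A minimal userspace sketch of the pattern (the real mlx5 call site
      differs): casting the config-dependent value to a fixed type keeps the
      format string correct regardless of how the macro is defined.

          #include <stdio.h>

          /* Hypothetical stand-in for the kernel macro; depending on the
           * configuration it can expand to an int-typed expression. */
          #define MAX_SKB_FRAGS 17

          int main(void)
          {
                  /* Without the cast, %lu mismatches the int-typed macro and
                   * -Werror=format= turns the warning into the build break
                   * quoted above. */
                  printf("max frags: %lu\n", (unsigned long)MAX_SKB_FRAGS);
                  return 0;
          }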
      
      Fixes: 3948b059 ("net: introduce a config option to tweak MAX_SKB_FRAGS")
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Link: https://lore.kernel.org/r/20230328200723.125122-1-saeed@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ethernet: ti: am65-cpsw: enable p0 host port rx_vlan_remap · 86e2eca4
      Grygorii Strashko authored
      By default, tagged ingress packets entering the switch from the host port
      P0 are assigned an internal switch priority equal to the number of the
      DMA CPPI channel they came from, unless CPSW_P0_CONTROL_REG.RX_REMAP_VLAN
      is enabled. This causes issues with applying QoS policies and mapping
      packets onto external port FIFOs, because the default configuration is
      vlan_aware and the DMA CPPI channels are shared between all external ports.
      
      Hence, enable CPSW_P0_CONTROL_REG.RX_REMAP_VLAN so that packets preserve
      the internal switch priority assigned according to their VLAN priority
      tag, no matter through which DMA CPPI channel they enter the switch.
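
      A loose sketch of the kind of register update this implies, using generic
      MMIO helpers; the offset and bit position below are placeholders for
      illustration, not the real CPSW definitions:

          #include <linux/io.h>
          #include <linux/bits.h>

          /* Placeholder offset and bit, for illustration only. */
          #define CPSW_P0_CONTROL_REG   0x004
          #define RX_REMAP_VLAN         BIT(16)

          static void p0_enable_rx_vlan_remap(void __iomem *port0_base)
          {
                  u32 val = readl(port0_base + CPSW_P0_CONTROL_REG);

                  /* Map ingress priority from the VLAN tag instead of the
                   * CPPI channel number. */
                  val |= RX_REMAP_VLAN;
                  writel(val, port0_base + CPSW_P0_CONTROL_REG);
          }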
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
      Link: https://lore.kernel.org/r/20230327092103.3256118-1-s-vadapalli@ti.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ethernet: ti: am65-cpsw: add .ndo to set dma per-queue rate · 5c8560c4
      Grygorii Strashko authored
      Enable rate limiting of TX DMA queues for the CPSW interface by
      configuring the rate in absolute Mb/s units per TX queue.
      
      Example:
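          # configure the interface to use 4 TX queues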
          ethtool -L eth0 tx 4
      
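          # cap each queue's transmit rate, in Mb/s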
          echo 100 > /sys/class/net/eth0/queues/tx-0/tx_maxrate
          echo 200 > /sys/class/net/eth0/queues/tx-1/tx_maxrate
          echo 50 > /sys/class/net/eth0/queues/tx-2/tx_maxrate
          echo 30 > /sys/class/net/eth0/queues/tx-3/tx_maxrate
      
          # disable
          echo 0 > /sys/class/net/eth0/queues/tx-0/tx_maxrate
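
      A sketch of how such a callback is wired up, using the standard
      .ndo_set_tx_maxrate hook; the body that actually programs the CPSW TX
      DMA channel is reduced to a stub here, and the function name is
      illustrative:

          #include <linux/netdevice.h>

          /* Sketch only: the real handler programs the rate into the TX DMA
           * channel; rate_mbps == 0 disables the limit on that queue. */
          static int am65_cpsw_set_tx_maxrate(struct net_device *ndev,
                                              int queue, u32 rate_mbps)
          {
                  /* program DMA channel 'queue' to 'rate_mbps' Mb/s */
                  return 0;
          }

          static const struct net_device_ops am65_cpsw_netdev_ops = {
                  .ndo_set_tx_maxrate = am65_cpsw_set_tx_maxrate,
                  /* other callbacks elided */
          };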
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
      Link: https://lore.kernel.org/r/20230327085758.3237155-1-s-vadapalli@ti.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'allocate-multiple-skbuffs-on-tx' · d8b0c963
      Paolo Abeni authored
      Arseniy Krasnov says:
      
      ====================
      allocate multiple skbuffs on tx
      
      This adds a small optimization to the tx path: instead of allocating a
      single skbuff on every call to the transport, allocate multiple skbuffs
      while credit space allows, thus trying to send as much data as possible
      without returning to af_vsock.c.
      
      This patchset also includes a second patch which adds an early return to
      'virtio_transport_get_credit()' and 'virtio_transport_put_credit()' when
      these functions are called with a zero argument. This is needed because a
      zero argument makes both functions behave as no-ops, yet both always try
      to acquire the spinlock. Moreover, the first patch always calls
      'virtio_transport_put_credit()' with a zero argument in the case of a
      successful packet transmission.
      ====================
      
      Link: https://lore.kernel.org/r/b0d15942-65ba-3a32-ba8d-fed64332d8f6@sberdevices.ru
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • virtio/vsock: check argument to avoid no effect call · e3ec366e
      Arseniy Krasnov authored
      Both of these functions have no effect when the input argument is 0, so
      to avoid useless spinlock accesses, check the argument before taking the
      lock.
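
      A sketch of the get side of this check, following the commit text; the
      field names of 'struct virtio_vsock_sock' used here may differ slightly
      from the actual tree:

          u32 virtio_transport_get_credit(struct virtio_vsock_sock *vvs,
                                          u32 credit)
          {
                  u32 ret;

                  /* Zero credit is a no-op: return early instead of taking
                   * the tx_lock for nothing. */
                  if (!credit)
                          return 0;

                  spin_lock_bh(&vvs->tx_lock);
                  ret = vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->fwd_cnt);
                  if (ret > credit)
                          ret = credit;
                  vvs->tx_cnt += ret;
                  spin_unlock_bh(&vvs->tx_lock);

                  return ret;
          }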
      Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • virtio/vsock: allocate multiple skbuffs on tx · b68ffb1b
      Arseniy Krasnov authored
      This adds a small optimization to the tx path: instead of allocating a
      single skbuff on every call to the transport, allocate multiple skbuffs
      while credit space allows, thus trying to send as much data as possible
      without returning to af_vsock.c.
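
      A loose sketch of the loop shape this describes, with illustrative
      names; the driver's own helpers and limits differ:

          /* Keep carving skbs out of the remaining payload while the peer's
           * credit allows, instead of sending a single skb and returning to
           * the caller. */
          rest_len = len;
          while (rest_len > 0) {
                  skb_len = min(max_skb_len, rest_len);

                  /* may grant less than requested, or nothing at all */
                  skb_len = virtio_transport_get_credit(vvs, skb_len);
                  if (!skb_len)
                          break;

                  /* allocate and queue one skb of skb_len bytes here */

                  rest_len -= skb_len;
          }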
      Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • atomics: Provide rcuref - scalable reference counting · ee1ee6db
      Thomas Gleixner authored
      atomic_t based reference counting, including refcount_t, uses
      atomic_inc_not_zero() for acquiring a reference. atomic_inc_not_zero() is
      implemented with an atomic_try_cmpxchg() loop. High contention on the
      reference count leads to retry loops and scales badly. There is nothing to
      improve on this implementation as the semantics have to be preserved.
      
      Provide rcuref as a scalable alternative solution which is suitable for RCU
      managed objects. Similar to refcount_t it comes with overflow and underflow
      detection and mitigation.
      
      rcuref treats the underlying atomic_t as an unsigned integer and partitions
      this space into zones:
      
        0x00000000 - 0x7FFFFFFF  valid zone (1 .. (INT_MAX + 1) references)
        0x80000000 - 0xBFFFFFFF  saturation zone
        0xC0000000 - 0xFFFFFFFE  dead zone
        0xFFFFFFFF               no reference
      
      rcuref_get() unconditionally increments the reference count with
      atomic_add_negative_relaxed(). rcuref_put() unconditionally decrements the
      reference count with atomic_add_negative_release().
      
      This unconditional increment avoids the inc_not_zero() problem, but
      requires a more complex implementation on the put() side when the count
      drops from 0 to -1.
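
      A sketch of the resulting get() fast path, following the description
      above (the real code lives in include/linux/rcuref.h and operates on an
      rcuref_t; the slow path is elided):

          /* Slow path for saturated/dead counts, elided in this sketch. */
          bool rcuref_get_slowpath(atomic_t *ref);

          static inline bool rcuref_get_sketch(atomic_t *ref)
          {
                  /* One unconditional atomic increment; the result is only
                   * negative once the count has left the valid zone, so the
                   * slow path is rarely taken. */
                  if (!atomic_add_negative_relaxed(1, ref))
                          return true;             /* valid zone */

                  return rcuref_get_slowpath(ref); /* saturated or dead */
          }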
      
      When this transition is detected, an attempt is made to mark the
      reference count dead, by setting it to the midpoint of the dead zone
      with a single atomic_cmpxchg_release() operation. This operation can
      fail due to a concurrent rcuref_get() elevating the reference count
      from -1 to 0 again.
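
      And the matching put() side, per the paragraph above; RCUREF_DEAD here
      is an illustrative dead-zone midpoint, not the kernel's exact constant:

          #define RCUREF_DEAD ((int)0xE0000000)  /* illustrative midpoint */

          static inline bool rcuref_put_sketch(atomic_t *ref)
          {
                  /* Unconditional decrement; a negative result means the
                   * count dropped from 0 to -1 (or was saturated/dead). */
                  if (!atomic_add_negative_release(-1, ref))
                          return false;  /* references remain */

                  /* Try to mark the count dead. A concurrent get() may have
                   * lifted it from -1 back to 0; then the cmpxchg fails and
                   * the object must not be freed. True means the caller may
                   * free the object once a grace period has elapsed. */
                  return atomic_cmpxchg_release(ref, -1, RCUREF_DEAD) == -1;
          }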
      
      If the unconditional increment in rcuref_get() hits a reference count
      which is marked dead (or saturated), it will detect this after the fact
      and bring the reference count back to the midpoint of the respective
      zone. The zones provide enough tolerance that it is practically
      impossible to escape from a zone.
      
      The racy implementation of rcuref_put() requires protecting rcuref_put()
      against a grace period ending, in order to prevent a subtle use after
      free. As RCU is the only mechanism which can protect against that, it is
      not possible to fully replace the atomic_inc_not_zero() based
      implementation of refcount_t with this scheme.
      
      The final drop is slightly more expensive than the atomic_dec_return()
      counterpart, but that's not the case which this is optimized for. The
      optimization is for high-frequency get()/put() pairs and their
      scalability.
      
      The performance of an uncontended rcuref_get()/put() pair where the put()
      is not dropping the last reference is still on par with the plain atomic
      operations, while at the same time providing overflow and underflow
      detection and mitigation.
      
      The performance of rcuref compared to plain atomic_inc_not_zero() and
      atomic_dec_return() based reference counting under contention:
      
       -  Micro benchmark: All CPUs running an increment/decrement loop on an
          elevated reference count, which means the 0 to -1 transition never
          happens.

          The performance gain depends on microarchitecture and the number of
          CPUs and has been observed in the range of 1.3X to 4.7X.
      
       -  Conversion of dst_entry::__refcnt to rcuref and testing with the
          localhost memtier/memcached benchmark. That benchmark shows the
          reference count contention prominently.
      
          The performance gain depends on microarchitecture and the number of
          CPUs and has been observed in the range of 1.1X to 2.6X over the
          previous fix for the false sharing issue vs. struct
          dst_entry::__refcnt.
      
          When memtier is run over a real 1Gb network connection, there is a
          small gain on top of the false sharing fix. The two changes combined
          result in a 2%-5% total gain for that networked test.
      Reported-by: Wangyang Guo <wangyang.guo@intel.com>
      Reported-by: Arjan Van De Ven <arjan.van.de.ven@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230323102800.158429195@linutronix.de
    • atomics: Provide atomic_add_negative() variants · e5ab9eff
      Thomas Gleixner authored
      atomic_add_negative() does not provide the relaxed/acquire/release
      variants.
      
      Provide them in preparation for a new scalable reference count algorithm.
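
      The resulting family of operations, each returning true when the
      addition makes the value negative (declarations sketched here; the
      kernel also generates instrumented wrappers):

          bool atomic_add_negative(int i, atomic_t *v);
          bool atomic_add_negative_relaxed(int i, atomic_t *v);
          bool atomic_add_negative_acquire(int i, atomic_t *v);
          bool atomic_add_negative_release(int i, atomic_t *v);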
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Link: https://lore.kernel.org/r/20230323102800.101763813@linutronix.de
    • Merge tag 'linux-can-next-for-6.4-20230327' of... · 4cee0fb9
      Jakub Kicinski authored
      Merge tag 'linux-can-next-for-6.4-20230327' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can-next 2023-03-27
      
      The first 2 patches by Geert Uytterhoeven add transceiver support and
      improve the error messages in the rcar_canfd driver.
      
      Cai Huoqing contributes 3 patches which remove a redundant call to
      pci_clear_master() in the c_can, ctucanfd and kvaser_pciefd drivers.
      
      Frank Jungclaus's patch replaces struct esd_usb_msg with a union in the
      esd_usb driver to improve readability.
      
      Markus Schneider-Pargmann contributes 5 patches to improve the
      performance in the m_can driver, especially for SPI attached
      controllers like the tcan4x5x.
      
      * tag 'linux-can-next-for-6.4-20230327' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
        can: m_can: Keep interrupts enabled during peripheral read
        can: m_can: Disable unused interrupts
        can: m_can: Remove double interrupt enable
        can: m_can: Always acknowledge all interrupts
        can: m_can: Remove repeated check for is_peripheral
        can: esd_usb: Improve code readability by means of replacing struct esd_usb_msg with a union
        can: kvaser_pciefd: Remove redundant pci_clear_master
        can: ctucanfd: Remove redundant pci_clear_master
        can: c_can: Remove redundant pci_clear_master
        can: rcar_canfd: Improve error messages
        can: rcar_canfd: Add transceiver support
      ====================
      
      Link: https://lore.kernel.org/r/20230327073354.1003134-1-mkl@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'add-tx-push-buf-len-param-to-ethtool' · da954ae1
      Jakub Kicinski authored
      Shay Agroskin says:
      
      ====================
      Add tx push buf len param to ethtool
      
      This patchset adds a new sub-configuration, called 'tx-push-buf-len', to
      the ethtool get/set queue params interface (ethtool -g).
      
      This configuration specifies the maximum number of bytes of a
      transmitted packet that a driver can push directly to the underlying
      device ('push' mode). Pushing some of the bytes to the device has the
      advantages of:
      
      - Allowing a smart device to take fast actions based on the packet's
        header
      - Reducing latency for small packets that can be copied completely into
        the device
      
      This new param is practically similar to the tx-copybreak value that can
      be set using ethtool's tunable interface, but it conceptually serves a
      different purpose. While tx-copybreak is used to reduce the overhead of
      DMA mapping and makes no sense to use if less than the whole segment
      gets copied, tx-push-buf-len allows improving performance by analyzing
      the packet's data (usually headers) before performing the DMA operation.
      
      The configuration can be queried and set using the commands:
      
          $ ethtool -g [interface]
      
          # ethtool -G [interface] tx-push-buf-len [number of bytes]
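
      (The '$' query needs no special privileges, while the '#' prompt marks
      the set operation, which requires root/CAP_NET_ADMIN.)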
      
      This patchset also adds support for the new configuration in the ENA
      driver, for which this parameter ensures efficient resource management
      on the device side.
      ====================
      
      Link: https://lore.kernel.org/r/20230323163610.1281468-1-shayagr@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>