Commits · b71441b7542d35d48b886b02d808e8544153adda · Kirill Smelkov / linux

02 Aug, 2024 32 commits

Merge branch 'ibmveth-rr-performance' · b71441b7

Jakub Kicinski authored Aug 02, 2024

Nick Child says:

====================
ibmveth RR performance

This patchset aims to increase the ibmveth driver's small packet
request response rate.

These 2 patches address:
1. NAPI rescheduling technique
2. Driver-side processing of small packets

Testing over several netperf tcp_rr connections, we saw a
30% increase in transactions per second. No regressions
were observed in other workloads.
====================

Link: https://patch.msgid.link/20240801211215.128101-1-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b71441b7

ibmveth: Recycle buffers during replenish phase · b5381a55

Nick Child authored Aug 01, 2024

When the length of a packet is under the rx_copybreak threshold, the
buffer is copied into a new skb and sent up the stack. This allows the
dma mapped memory to be recycled back to FW.

Previously, the reuse of the DMA space was handled immediately.
This means that further packet processing has to wait until
h_add_logical_lan finishes for this packet.

Therefore, when reusing a packet, offload the hcall to the replenish
function. As a result, much of the shared logic between the recycle and
replenish functions can be removed.

This change increases TCP_RR packet rate by another 15% (370k to 430k
txns). We can see the ftrace data supports this:
PREV: ibmveth_poll = 8078553.0 us / 190999.0 hits = AVG 42.3 us
NEW: ibmveth_poll = 7632787.0 us / 224060.0 hits = AVG 34.07 us
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20240801211215.128101-3-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b5381a55

ibmveth: Optimize poll rescheduling process · f128c7cf

Nick Child authored Aug 01, 2024

When the ibmveth driver processes less than the budget, it must call
napi_complete_done() to release the instance. This function will
return false if the driver should avoid rearming interrupts.
Previously, the driver was ignoring the return code of
napi_complete_done(). As a result, there were unnecessary calls to
enable the veth irq.
Therefore, use the return code of napi_complete_done() to determine if
irq rearm is necessary.

Additionally, in the event that new data is received immediately after
rearming interrupts, rather than just rescheduling napi, also jump
back to the poll processing loop since we are already in the poll
function (and know that we did not expend all of the budget).

This slight tweak results in a 15% increase in TCP_RR transaction rate
(320k to 370k txns). We can see the ftrace data supports this:
PREV: ibmveth_poll = 8818014.0 us / 182802.0 hits = AVG 48.24 us
NEW:  ibmveth_poll = 8082398.0 us / 191413.0 hits = AVG 42.22 us
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20240801211215.128101-2-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

f128c7cf
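
The per-call averages quoted in the two ibmveth commits above follow directly from the ftrace totals. A minimal Python sketch that recomputes them is shown here; the helper names `avg_us` and `summarize` and the table layout are illustrative assumptions, not part of the patches:

```python
# Recompute the ibmveth_poll averages from the ftrace totals quoted above.
# avg_us(), summarize() and the FTRACE table are illustrative, not kernel code.

def avg_us(total_us, hits):
    """Average time per ibmveth_poll call, in microseconds."""
    return total_us / hits

# (total microseconds, number of hits) as reported in each commit message.
FTRACE = {
    "f128c7cf": {"prev": (8818014.0, 182802.0), "new": (8082398.0, 191413.0)},
    "b5381a55": {"prev": (8078553.0, 190999.0), "new": (7632787.0, 224060.0)},
}

def summarize(samples):
    """Return (previous average, new average, percent of time saved per call)."""
    prev = avg_us(*samples["prev"])
    new = avg_us(*samples["new"])
    saved_pct = 100.0 * (prev - new) / prev
    return prev, new, saved_pct

for commit, samples in FTRACE.items():
    prev, new, saved_pct = summarize(samples)
    print(f"{commit}: {prev:.2f} us -> {new:.2f} us ({saved_pct:.1f}% less time per poll)")
```

Note that the time saved per poll call is a different metric from the TCP_RR transaction rate quoted in the commit messages, so the two percentages need not match.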

linkmode: Change return type of linkmode_andnot to bool · 7e1d512d

Simon Horman authored Aug 01, 2024

linkmode_andnot() simply returns the result of bitmap_andnot().
And the return type of bitmap_andnot() is bool.
So it makes sense for the return type of linkmode_andnot()
to also be bool.

I checked all call-sites and they either ignore the return
value or treat it as a bool.

Compile tested only.

Link: https://lore.kernel.org/netdev/68088998-4486-4930-90a4-96a32f08c490@lunn.ch/
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240801-linkfield-bowl-v1-1-d58f68967802@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

7e1d512d

Merge branch 'add-second-qdma-support-for-en7581-eth-controller' · d29dd11e

Jakub Kicinski authored Aug 02, 2024

Lorenzo Bianconi says:

====================
Add second QDMA support for EN7581 eth controller

EN7581 SoC supports two independent QDMA controllers to connect the
Ethernet Frame Engine (FE) to the CPU. Introduce support for the second
QDMA controller. This is a preliminary series to support multiple FE ports
(e.g. connected to a second PHY controller).
====================

Link: https://patch.msgid.link/cover.1722522582.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

d29dd11e

net: airoha: Link the gdm port to the selected qdma controller · 9304640f

Lorenzo Bianconi authored Aug 01, 2024

Link the running gdm port to the qdma controller used to connect with
the CPU. Moreover, load all QDMA controllers available on EN7581 SoC.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/95b515df34ba4727f7ae5b14a1d0462cceec84ff.1722522582.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

9304640f

net: airoha: Start all qdma NAPIs in airoha_probe() · 160231e3

Lorenzo Bianconi authored Aug 01, 2024

This is a preliminary patch to support multi-QDMA controllers.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/b51cf69c94d8cbc81e0a0b35587f024d01e6d9c0.1722522582.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

160231e3

net: airoha: Allow mapping IO region for multiple qdma controllers · e618447c

Lorenzo Bianconi authored Aug 01, 2024

Map MMIO regions of both qdma controllers available on EN7581 SoC.
Run the airoha_hw_cleanup routine for both QDMA controllers available on
EN7581 SoC when removing the airoha_eth module or in the airoha_probe error path.
This is a preliminary patch to support multi-QDMA controllers.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/a734ae608da14b67ae749b375d880dbbc70868ea.1722522582.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

e618447c

net: airoha: Use qdma pointer as private structure in airoha_irq_handler routine · e3d6bfdf

Lorenzo Bianconi authored Aug 01, 2024

This is a preliminary patch to support multi-QDMA controllers.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/1e40c3cb973881c0eb3c3c247c78550da62054ab.1722522582.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

e3d6bfdf

net: airoha: Add airoha_qdma pointer in airoha_tx_irq_queue/airoha_queue structures · 9a2500ab

Lorenzo Bianconi authored Aug 01, 2024

Move the airoha_eth pointer from the airoha_tx_irq_queue/airoha_queue
structures into the airoha_qdma structure. This is a preliminary patch to
introduce support for multi-QDMA controllers available on EN7581.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/074565b82fd0ceefe66e186f21133d825dbd48eb.1722522582.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

9a2500ab

net: airoha: Move irq_mask in airoha_qdma structure · 19e47fc2

Lorenzo Bianconi authored Aug 01, 2024

QDMA controllers have independent irq lines, so move irqmask in
airoha_qdma structure. This is a preliminary patch to support multiple
QDMA controllers.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/1c8a06e8be605278a7b2f3cd8ac06e74bf5ebf2b.1722522582.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

19e47fc2

net: airoha: Move airoha_queues in airoha_qdma · 245c7bc8

Lorenzo Bianconi authored Aug 01, 2024

QDMA controllers available in EN7581 SoC have independent tx/rx hw queues
so move them into the airoha_qdma structure.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/795fc4797bffbf7f0a1351308aa9bf0e65b5126e.1722522582.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

245c7bc8

net: airoha: Introduce airoha_qdma struct · 16874d1c

Lorenzo Bianconi authored Aug 01, 2024

Introduce airoha_qdma struct and move qdma IO register mapping in
airoha_qdma. This is a preliminary patch to enable both QDMA controllers
available on EN7581 SoC.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/7df163bdc72ee29c3d27a0cbf54522ffeeafe53c.1722522582.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

16874d1c

eth: fbnic: select DEVLINK and PAGE_POOL · 9a95b7a8

Simon Horman authored Aug 02, 2024

Build bot reports undefined references to devlink functions.
And local testing revealed undefined references to page_pool functions.

Based on a patch by Jakub Kicinski <kuba@kernel.org>

Fixes: 1a9d4889 ("eth: fbnic: Allocate core device specific structures and devlink interface")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202408011219.hiPmwwAs-lkp@intel.com/
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240802-fbnic-select-v2-1-41f82a3e0178@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

9a95b7a8

selftests: net: ksft: print more of the stack for checks · 46619175

Jakub Kicinski authored Aug 01, 2024

Print more stack frames and the failing line when check fails.
This helps when tests use helpers to do the checks.

Before:

  # At ./ksft/drivers/net/hw/rss_ctx.py line 92:
  # Check failed 1037698 >= 396893.0 traffic on other queues:[344612, 462380, 233020, 449174, 342298]
  not ok 8 rss_ctx.test_rss_context_queue_reconfigure

After:

  # Check| At ./ksft/drivers/net/hw/rss_ctx.py, line 387, in test_rss_context_queue_reconfigure:
  # Check|     test_rss_queue_reconfigure(cfg, main_ctx=False)
  # Check| At ./ksft/drivers/net/hw/rss_ctx.py, line 230, in test_rss_queue_reconfigure:
  # Check|     _send_traffic_check(cfg, port, ctx_ref, { 'target': (0, 3),
  # Check| At ./ksft/drivers/net/hw/rss_ctx.py, line 92, in _send_traffic_check:
  # Check|     ksft_lt(sum(cnts[i] for i in params['noise']), directed / 2,
  # Check failed 1045235 >= 405823.5 traffic on other queues (context 1)':[460068, 351995, 565970, 351579, 127270]
  not ok 8 rss_ctx.test_rss_context_queue_reconfigure

Link: https://patch.msgid.link/20240801232317.545577-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

46619175
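
The idea in the commit above, reporting every caller frame when a check fails, can be sketched with Python's standard `traceback` module. This is a minimal illustration and not the ksft implementation; `format_check_stack` and `ksft_lt_sketch` are hypothetical names, and the real helper may filter or format frames differently:

```python
import traceback

def format_check_stack(skip=1):
    """Return '# Check|' lines for the caller frames, outermost first.

    skip is the number of innermost frames to drop (this helper counts as one).
    """
    frames = traceback.extract_stack()[:-skip]
    lines = []
    for frame in frames:
        lines.append(f"# Check| At {frame.filename}, line {frame.lineno}, in {frame.name}:")
        # frame.line is empty when the source text is not available.
        lines.append(f"# Check|     {frame.line or ''}")
    return lines

def ksft_lt_sketch(a, b, comment=""):
    """Hypothetical stand-in for a 'less than' check; returns the report lines."""
    if a >= b:
        # Drop two frames: format_check_stack() and this check helper itself.
        report = format_check_stack(skip=2)
        report.append(f"# Check failed {a} >= {b} {comment}")
        return report
    return []
```

Because the report starts from the outermost frame, a failure inside a shared helper still shows which test case invoked it, which is the problem the "Before" output illustrates.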

selftests: net: ksft: replace 95 with errno.EOPNOTSUPP · a48395f2

Stanislav Fomichev authored Aug 01, 2024

Petr suggested using errno.EOPNOTSUPP instead of the hard-coded 95
in the new test case. Adjust existing ones to match this style.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20240802000309.2368-3-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

a48395f2
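
A short sketch of the style change described above: compare against the symbolic `errno.EOPNOTSUPP` constant rather than a literal 95. The helper `is_not_supported` and the use of a plain `OSError` are assumptions for illustration; the selftests themselves inspect their own netlink error objects:

```python
import errno
import os

def is_not_supported(exc):
    """True if exc reports 'operation not supported'."""
    # The symbolic constant documents intent; the literal 95 is only the
    # Linux value and differs on some other platforms.
    return isinstance(exc, OSError) and exc.errno == errno.EOPNOTSUPP

# Build an example error the way a failing syscall wrapper might report it.
example = OSError(errno.EOPNOTSUPP, os.strerror(errno.EOPNOTSUPP))
print(is_not_supported(example))
```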

selftests: net: ksft: support marking tests as disruptive · f8793068

Stanislav Fomichev authored Aug 01, 2024

Add new @ksft_disruptive decorator to mark the tests that might
be disruptive to the system. Depending on how well the previous
test works in the CI we might want to disable disruptive tests
by default and only let the developers run them manually.

The KSFT framework runs disruptive tests by default. The DISRUPTIVE=False
environment variable (or config file entry) can be used to disable these tests.
ksft_setup should be called by the test cases that want to use the
new decorator (ksft_setup is only called via NetDrvEnv/NetDrvEpEnv for now).

In the future we can add similar decorators to, for example, avoid
running slow tests all the time. And/or have some option to run
only 'fast' tests for some sort of smoke test scenario.

  $ DISRUPTIVE=False ./stats.py
  KTAP version 1
  1..5
  ok 1 stats.check_pause
  ok 2 stats.check_fec
  ok 3 stats.pkt_byte_sum
  ok 4 stats.qstat_by_ifindex
  ok 5 stats.check_down # SKIP marked as disruptive
  # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:1 error:0

v3:
- parse yes and properly treat non-zero nums as true (Petr)

v2:
- convert from cli argument to env variable (Jakub)
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20240802000309.2368-2-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

f8793068
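
The decorator behaviour described above can be sketched as follows. This is a simplified model under stated assumptions, not the ksft code: `SkipTest`, `parse_bool`, `disruptive_enabled` and `disruptive` are hypothetical names, the environment is read at call time rather than in a setup step, and the accepted spellings follow the v3 note (parse "yes", treat non-zero numbers as true):

```python
import functools
import os

class SkipTest(Exception):
    """Hypothetical stand-in for the framework's skip exception."""

def parse_bool(value):
    """Interpret yes/true and non-zero numbers as True, everything else as False."""
    text = str(value).strip().lower()
    if text in ("yes", "true"):
        return True
    try:
        return int(text) != 0
    except ValueError:
        return False

def disruptive_enabled():
    # Disruptive tests run by default; DISRUPTIVE=False opts out.
    return parse_bool(os.environ.get("DISRUPTIVE", "True"))

def disruptive(func):
    """Skip the wrapped test when disruptive tests are disabled."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not disruptive_enabled():
            raise SkipTest("marked as disruptive")
        return func(*args, **kwargs)
    return wrapper

@disruptive
def check_down():
    return "ran"
```

Reading the switch from the environment rather than from a command-line argument matches the v2 note in the commit message and lets a CI system disable the whole class of tests without touching individual scripts.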

selftests: net-drv: exercise queue stats when the device is down · ab100097

Stanislav Fomichev authored Aug 01, 2024

Verify that total device stats don't decrease after it has been turned down.
Also make sure the device doesn't crash when we access per-queue stats
when it's down (in case it tries to access some pointers that are NULL).

  KTAP version 1
  1..5
  ok 1 stats.check_pause
  ok 2 stats.check_fec
  ok 3 stats.pkt_byte_sum
  ok 4 stats.qstat_by_ifindex
  ok 5 stats.check_down
  # Totals: pass:5 fail:0 xfail:0 xpass:0 skip:0 error:0

v3:
- use errno.EOPNOTSUPP (Petr)
- move qstat[0] under try (Petr)

v2:
- KTAP output formatting (Jakub)
- defer instead of try/finally (Jakub)
- disappearing stats is an error (Jakub)
- ksft_ge instead of open coding (Jakub)
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20240802000309.2368-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ab100097

net: remove IFF_* re-definition · 49675f5b

Jakub Kicinski authored Aug 01, 2024

We re-define values of enum netdev_priv_flags as preprocessor
macros with the same name. I guess this was done to avoid breaking
out of tree modules which may use #ifdef X for kernel compatibility?
Commit 7aa98047 ("net: move net_device priv_flags out from UAPI")
which added the enum doesn't say. In any case, the flags with defines
are quite old now, and defines for new flags don't get added.
OOT drivers have to resort to code greps for compat detection, anyway.
Let's delete these defines, save LoC, help LXR link to the right place.
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20240801163401.378723-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

49675f5b

Merge branch 'axienet-coding-style' into main · ce21e520

David S. Miller authored Aug 02, 2024

Radhey Shyam Pandey says:

====================
net: axienet: Fix coding style issues

This patchset replaces all occurrences of (1<<x) by BIT(x) to get rid
of checkpatch.pl "CHECK" output "Prefer using the BIT macro".

It also removes unnecessary ftrace-like logging, adds a missing blank line
after declarations and removes unnecessary parentheses around 'ndev->mtu
<= XAE_JUMBO_MTU' and 'ndev->mtu > XAE_MTU'.

Changes for v2:
- Split each coding style change into a separate patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ce21e520

net: axienet: remove unnecessary parentheses · 48ba8a1d

Radhey Shyam Pandey authored Jul 31, 2024

Remove unnecessary parentheses around 'ndev->mtu
<= XAE_JUMBO_MTU' and 'ndev->mtu > XAE_MTU'. Reported
by checkpatch.

CHECK: Unnecessary parentheses around 'ndev->mtu > XAE_MTU'
+       if ((ndev->mtu > XAE_MTU) &&
+           (ndev->mtu <= XAE_JUMBO_MTU)) {

CHECK: Unnecessary parentheses around 'ndev->mtu <= XAE_JUMBO_MTU'
+       if ((ndev->mtu > XAE_MTU) &&
+           (ndev->mtu <= XAE_JUMBO_MTU)) {
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

48ba8a1d

net: axienet: remove unnecessary ftrace-like logging · f83828a0

Radhey Shyam Pandey authored Jul 31, 2024

Remove unnecessary ftrace-like logging. Also fixes the
checkpatch WARNING below.

WARNING: Unnecessary ftrace-like logging - prefer using ftrace
+       dev_dbg(&ndev->dev, "%s\n", __func__);
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

f83828a0

net: axienet: add missing blank line after declaration · f7061a3e

Radhey Shyam Pandey authored Jul 31, 2024

Add missing blank line after declaration. Fixes the
checkpatch warnings below.

WARNING: Missing a blank line after declarations
+       struct sockaddr *addr = p;
+       axienet_set_mac_address(ndev, addr->sa_data);

WARNING: Missing a blank line after declarations
+       struct axienet_local *lp = netdev_priv(ndev);
+       disable_irq(lp->tx_irq);
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

f7061a3e

net: axienet: Replace the occurrences of (1<<x) by BIT(x) · 3ff578c9

Appana Durga Kedareswara Rao authored Jul 31, 2024

Replace all occurrences of (1<<x) by BIT(x) to get rid of checkpatch.pl
"CHECK" output "Prefer using the BIT macro".
Signed-off-by: Appana Durga Kedareswara Rao <appana.durga.rao@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

3ff578c9

Merge branch 'vsock-virtio' into main · 3361a6ea

David S. Miller authored Aug 02, 2024

Luigi Leonardi says:

====================
vsock: avoid queuing on intermediate queue if possible

This series introduces an optimization for vsock/virtio to reduce latency
and increase the throughput: When the guest sends a packet to the host,
and the intermediate queue (send_pkt_queue) is empty, if there is enough
space, the packet is put directly in the virtqueue.

v3->v4
While running experiments on fio with 64B payload, I realized that there
was a mistake in my fio configuration, so I re-ran all the experiments
and now the latency numbers are indeed lower with the patch applied.
I also noticed that I was kicking the host without the lock.

- Fixed a configuration mistake on fio and re-ran all experiments.
- Fio latency measurement using 64B payload.
- virtio_transport_send_skb_fast_path sends kick with the tx_lock acquired
- Addressed all minor style changes requested by maintainer.
- Rebased on latest net-next
- Link to v3: https://lore.kernel.org/r/20240711-pinna-v3-0-697d4164fe80@outlook.com

v2->v3
- Performed more experiments using iperf3 using multiple streams
- Handling of reply packets removed from virtio_transport_send_skb,
  as is needed just by the worker.
- Removed atomic_inc/atomic_sub when queuing directly to the vq.
- Introduced virtio_transport_send_skb_fast_path that handles the
  steps for sending on the vq.
- Fixed a missing mutex_unlock in error path.
- Changed authorship of the second commit
- Rebased on latest net-next

v1->v2
In this v2 I replaced a mutex_lock with a mutex_trylock because it was
inside an RCU critical section. I also added a check on tx_run, so if the
module is being removed the packet is not queued. I'd like to thank Stefano
for reporting the tx_run issue.

Applied all Stefano's suggestions:
    - Minor code style changes
    - Minor commit text rewrite
Performed more experiments:
     - Check if all the packets go directly to the vq (Matias' suggestion)
     - Used iperf3 to see if there is any improvement in overall throughput
      from guest to host
     - Pinned the vhost process to a pCPU.
     - Run fio using 512B payload
Rebased on latest net-next
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

3361a6ea

test/vsock: add ioctl unsent bytes test · 18ee44ce

Luigi Leonardi authored Jul 30, 2024

Introduce two tests, one for SOCK_STREAM and one for SOCK_SEQPACKET,
which use SIOCOUTQ ioctl to check that the number of unsent bytes is
zero after delivering a packet.

vsock_connect and vsock_accept are no longer static: this is to
create more generic tests, allowing code to be reused for SEQPACKET
and STREAM.
Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18ee44ce

vsock/virtio: add SIOCOUTQ support for all virtio based transports · e6ab4500

Luigi Leonardi authored Jul 30, 2024

Introduce support for virtio_transport_unsent_bytes
ioctl for virtio_transport, vhost_vsock and vsock_loopback.

For all transports the unsent bytes counter is incremented
in virtio_transport_get_credit.

In virtio_transport (G2H) and in vhost-vsock (H2G) the counter
is decremented when the skbuff is consumed. In vsock_loopback the
same skbuff is passed from the transmitter to the receiver, so
the counter is decremented before queuing the skbuff to the
receiver.
Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e6ab4500

vsock: add support for SIOCOUTQ ioctl · 744500d8

Luigi Leonardi authored Jul 30, 2024

Add support for ioctl(s) in AF_VSOCK.
The only ioctl available is SIOCOUTQ/TIOCOUTQ, which returns the number
of unsent bytes in the socket. This information is transport-specific
and is delegated to them using a callback.
Suggested-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

744500d8
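
From userspace, the query added by the commit above is an ordinary ioctl call. The following is a hedged sketch of how such a query looks in Python; `unsent_bytes` is a hypothetical helper, and the demonstration uses an AF_UNIX socket pair on Linux so that it runs without a vsock device (the assumption being that an AF_VSOCK socket on a kernel with this series would be queried the same way):

```python
import fcntl
import socket
import struct
import termios

def unsent_bytes(sock):
    """Ask the kernel how much data is still queued for sending on sock."""
    # On Linux SIOCOUTQ is defined as TIOCOUTQ, which Python exposes
    # through the termios module. The result is a C int.
    raw = fcntl.ioctl(sock.fileno(), termios.TIOCOUTQ, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]

if __name__ == "__main__":
    left, right = socket.socketpair()
    left.sendall(b"ping")
    print("queued before read:", unsent_bytes(left))
    right.recv(4)
    print("queued after read:", unsent_bytes(left))
    left.close()
    right.close()
```

How the reported value is accounted differs between socket families, so the numbers printed for the AF_UNIX pair should not be read as what a vsock transport would report.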

net: Use of_property_read_bool() · 5fe164fb

Rob Herring (Arm) authored Jul 31, 2024

Use of_property_read_bool() to read boolean properties rather than
of_find_property(). This is part of a larger effort to remove callers
of of_find_property() and similar functions. of_find_property() leaks
the DT struct property and data pointers which is a problem for
dynamically allocated nodes which may be freed.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Link: https://patch.msgid.link/20240731191601.1714639-2-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5fe164fb

net: mdio: Use of_property_count_u32_elems() to get property length · 0b0e9cdb

Rob Herring (Arm) authored Jul 31, 2024

Replace of_get_property() with the type specific
of_property_count_u32_elems() to get the property length.

This is part of a larger effort to remove callers of of_get_property()
and similar functions. of_get_property() leaks the DT property data
pointer which is a problem for dynamically allocated nodes which may
be freed.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20240731201514.1839974-2-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

0b0e9cdb

net: phy: qca807x: Drop unnecessary and broken DT validation · 46e6acfe

Rob Herring (Arm) authored Jul 31, 2024

The check for "leds" and "gpio-controller" both being present is never
true because "leds" is a node, not a property. This could be fixed
with a check for child node, but there's really no need for the kernel
to validate a DT. Just continue ignoring the LEDs if GPIOs are present.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20240731201703.1842022-2-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

46e6acfe

net: mctp: Consistent peer address handling in ioctl tag allocation · 5fcf0801

John Wang authored Jul 30, 2024

When executing ioctl to allocate tags, if the peer address is 0,
mctp_alloc_local_tag now replaces it with 0xff. However, during tag
dropping, this replacement is not performed, potentially causing the key
not to be dropped as expected.
Signed-off-by: John Wang <wangzhiqiang02@ieisystem.com>
Reviewed-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://patch.msgid.link/20240730084636.184140-1-wangzhiqiang02@ieisystem.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5fcf0801

01 Aug, 2024 8 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 5fa35bd3

Jakub Kicinski authored Aug 01, 2024

Cross-merge networking fixes after downstream PR.

No conflicts or adjacent changes.

Link: https://patch.msgid.link/20240801131917.34494-1-pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5fa35bd3

Merge tag 'net-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 183d46ff

Linus Torvalds authored Aug 01, 2024

Pull networking fixes from Paolo Abeni:
 "Including fixes from wireless, bluetooth, BPF and netfilter.

  Current release - regressions:

   - core: drop bad gso csum_start and offset in virtio_net_hdr

   - wifi: mt76: fix null pointer access in mt792x_mac_link_bss_remove

   - eth: tun: add missing bpf_net_ctx_clear() in do_xdp_generic()

   - phy: aquantia: only poll GLOBAL_CFG regs on aqr113, aqr113c and
     aqr115c

  Current release - new code bugs:

   - smc: prevent UAF in inet_create()

   - bluetooth: btmtk: fix kernel crash when entering btmtk_usb_suspend

   - eth: bnxt: reject unsupported hash functions

  Previous releases - regressions:

   - sched: act_ct: take care of padding in struct zones_ht_key

   - netfilter: fix null-ptr-deref in iptable_nat_table_init().

   - tcp: adjust clamping window for applications specifying SO_RCVBUF

  Previous releases - always broken:

   - ethtool: rss: small fixes to spec and GET

   - mptcp:
      - fix signal endpoint re-add
      - pm: fix backup support in signal endpoints

   - wifi: ath12k: fix soft lockup on suspend

   - eth: bnxt_en: fix RSS logic in __bnxt_reserve_rings()

   - eth: ice: fix AF_XDP ZC timeout and concurrency issues

   - eth: mlx5:
      - fix missing lock on sync reset reload
      - fix error handling in irq_pool_request_irq"

* tag 'net-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits)
  mptcp: fix duplicate data handling
  mptcp: fix bad RCVPRUNED mib accounting
  ipv6: fix ndisc_is_useropt() handling for PIO
  igc: Fix double reset adapter triggered from a single taprio cmd
  net: MAINTAINERS: Demote Qualcomm IPA to "maintained"
  net: wan: fsl_qmc_hdlc: Discard received CRC
  net: wan: fsl_qmc_hdlc: Convert carrier_lock spinlock to a mutex
  net/mlx5e: Add a check for the return value from mlx5_port_set_eth_ptys
  net/mlx5e: Fix CT entry update leaks of modify header context
  net/mlx5e: Require mlx5 tc classifier action support for IPsec prio capability
  net/mlx5: Fix missing lock on sync reset reload
  net/mlx5: Lag, don't use the hardcoded value of the first port
  net/mlx5: DR, Fix 'stack guard page was hit' error in dr_rule
  net/mlx5: Fix error handling in irq_pool_request_irq
  net/mlx5: Always drain health in shutdown callback
  net: Add skbuff.h to MAINTAINERS
  r8169: don't increment tx_dropped in case of NETDEV_TX_BUSY
  netfilter: iptables: Fix potential null-ptr-deref in ip6table_nat_table_init().
  netfilter: iptables: Fix null-ptr-deref in iptable_nat_table_init().
  net: drop bad gso csum_start and offset in virtio_net_hdr
  ...

183d46ff

ethtool: Don't check for NULL info in prepare_data callbacks · 743ff021

Simon Horman authored Jul 31, 2024

Since commit f946270d ("ethtool: netlink: always pass genl_info to
.prepare_data") the info argument of prepare_data callbacks is never
NULL. Remove checks present in callback implementations.

Link: https://lore.kernel.org/netdev/20240703121237.3f8b9125@kernel.org/
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240731-prepare_data-null-check-v1-1-627f2320678f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

743ff021

RDS: IB: Remove unused declarations · f9c141fc

Yue Haibing authored Jul 31, 2024

Commit f4f943c9 ("RDS: IB: ack more receive completions to improve performance")
removed rds_ib_recv_tasklet_fn() implementation but not the declaration.
And commit ec16227e ("RDS/IB: Infiniband transport") declared but never implemented
other functions.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Link: https://patch.msgid.link/20240731063630.3592046-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

f9c141fc

net: ethernet: mtk_eth_soc: drop clocks unused by Ethernet driver · 887b1d1a

Daniel Golle authored Jul 30, 2024

Clocks for SerDes and PHY are going to be handled by standalone drivers
for each of those hardware components. Drop them from the Ethernet driver.

The clocks which are being removed by this patch are responsible for
the SerDes PCS and PHYs used for the 2nd and 3rd MAC, which are
not yet supported anyway. Hence backwards compatibility is not an issue.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patch.msgid.link/b5faaf69b5c6e3e155c64af03706c3c423c6a1c9.1722335682.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

887b1d1a

net/mlx5: Reclaim max 50K pages at once · 501c3005

Anand Khoje authored Jul 30, 2024

Outside of an FLR context, CX-5 at times requests the release of ~8 million FW pages.
This requires a huge number of cmd mailboxes, which have to be released once
the pages are reclaimed. Releasing that many cmd mailboxes consumes CPU time
running into many seconds, which on non-preemptible kernels leads to
critical processes starving on that CPU’s RQ.
On top of that, the FW does not use all the mailbox messages, as it has a
limit of releasing 50K pages at once per MLX5_CMD_OP_MANAGE_PAGES +
MLX5_PAGES_TAKE device command. Hence, allocating this many
mailboxes is unnecessary and only adds overhead.
To alleviate this, this change restricts the total number of pages
a worker will try to reclaim to a maximum of 50K pages in one go.

Our tests have shown significant benefit of this change in terms of
time consumed by dma_pool_free().
During a test where an event was raised by HCA
to release 1.3 Million pages, following observations were made:

- Without this change:
Number of mailbox messages allocated was around 20K, to accommodate
the DMA addresses of 1.3 million pages.
The average time spent by dma_pool_free() to free the DMA pool is between
16 usec and 32 usec.
           value  ------------- Distribution ------------- count
             256 |                                         0
             512 |@                                        287
            1024 |@@@                                      1332
            2048 |@                                        656
            4096 |@@@@@                                    2599
            8192 |@@@@@@@@@@                               4755
           16384 |@@@@@@@@@@@@@@@                          7545
           32768 |@@@@@                                    2501
           65536 |                                         0

- With this change:
Number of mailbox messages allocated was around 800; this was to
accommodate DMA addresses of only 50K pages.
The average time spent by dma_pool_free() to free the DMA pool in this case
lies between 1 usec and 2 usec.
           value  ------------- Distribution ------------- count
             256 |                                         0
             512 |@@@@@@@@@@@@@@@@@@                       346
            1024 |@@@@@@@@@@@@@@@@@@@@@@                   435
            2048 |                                         0
            4096 |                                         0
            8192 |                                         1
           16384 |                                         0
Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://patch.msgid.link/20240730073634.114407-1-anand.a.khoje@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

501c3005
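
The effect of the 50K cap in the commit above can be illustrated with a toy model. The sketch assumes that a large release request ends up being served in successive rounds of at most 50K pages; how the remaining pages are re-requested is not described in the commit message, and `MAX_RECLAIM_NPAGES` and `reclaim_batches` are hypothetical names:

```python
# Toy model of capping each reclaim round at 50K pages.
# The constant name and the looping behaviour are assumptions for illustration.
MAX_RECLAIM_NPAGES = 50_000

def reclaim_batches(requested, cap=MAX_RECLAIM_NPAGES):
    """Split a release request into rounds of at most cap pages."""
    batches = []
    remaining = requested
    while remaining > 0:
        step = min(remaining, cap)
        batches.append(step)
        remaining -= step
    return batches

# The 1.3 million page event from the test described in the commit message.
rounds = reclaim_batches(1_300_000)
print(len(rounds), "rounds, largest", max(rounds), "pages")
```

Each round then only needs enough mailbox messages for 50K page addresses, which is where the drop from roughly 20K to roughly 800 allocated mailboxes in the commit's measurements comes from.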

Merge branch 'mptcp-fix-duplicate-data-handling' · 25010bfd

Paolo Abeni authored Aug 01, 2024

Matthieu Baerts says:

====================
mptcp: fix duplicate data handling

In some cases, the subflow-level's copied_seq counter was incorrectly
increased, leading to an unexpected subflow reset.

Patch 1/2 fixes the RCVPRUNED MIB counter that was attached to the wrong
event since its introduction in v5.14, backported to v5.11.

Patch 2/2 fixes the copied_seq counter issue, which has been present since v5.10.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
====================

Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-dup-data-v1-0-bde833fa628a@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

25010bfd

mptcp: fix duplicate data handling · 68cc9247

Paolo Abeni authored Jul 31, 2024

When a subflow receives and discards duplicate data, the mptcp
stack assumes that the consumed offset inside the current skb is
zero.

With multiple subflows receiving data simultaneously such an assumption
does not hold true. As a result the subflow-level copied_seq will
be incorrectly increased and later on the same subflow will observe
a bad mapping, leading to subflow reset.

Address the issue by taking into account the skb consumed offset in
mptcp_subflow_discard_data().

Fixes: 04e4cd4f ("mptcp: cleanup mptcp_subflow_discard_data()")
Cc: stable@vger.kernel.org
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/501
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

68cc9247
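
The arithmetic behind the fix above can be shown with a small model. This is a sketch under simplifying assumptions, not the kernel code: it ignores FIN handling, and the function names and parameters are hypothetical. The point is that only the part of the skb that has not yet been consumed may be counted when advancing copied_seq:

```python
def discard_increment(skb_len, consumed_offset, limit):
    """Bytes to advance copied_seq by when discarding duplicate data.

    Toy model: only the bytes not yet consumed from the skb can be discarded.
    """
    if consumed_offset > skb_len:
        raise ValueError("consumed offset cannot exceed the skb length")
    available = skb_len - consumed_offset
    return min(limit, available)

def discard_increment_assuming_zero_offset(skb_len, limit):
    """The pre-fix assumption: nothing in the skb has been consumed yet."""
    return min(limit, skb_len)

# With 400 of 1000 bytes already consumed, the old assumption overshoots.
print(discard_increment_assuming_zero_offset(1000, 5000))
print(discard_increment(1000, 400, 5000))
```

In the model, the difference between the two results is exactly the consumed offset, which corresponds to the excess by which the subflow-level copied_seq would be advanced.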