- 07 Aug, 2023 16 commits
-
-
Maher Sanalla authored
In case the SF IRQ pool is not available due to setup limitations, SF currently relies on the already allocated PF IRQs to fulfill its IRQ vector requests. However, with the dynamic EQ allocation introduced in the next patch, it is possible that not all IRQs of the PF will be allocated after the driver is loaded. In such a case, if an SF requests a completion IRQ without having its own independent IRQ pool, the SF will lack a PF IRQ to utilize. To address this scenario, allocate an IRQ for the SF from the PF's IRQ pool on demand. The new IRQ will be shared between the SF and its PF. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maher Sanalla authored
To accurately represent its purpose, rename the function that retrieves the value of maximum vectors from mlx5_comp_vectors_count() to mlx5_comp_vectors_max(). Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maher Sanalla authored
Currently, once driver load completes, IRQs are requested for all vectors. However, as we move to support dynamic creation of EQs, this will not be the case, as some IRQs will not exist at this stage. Thus, in such a case, use the default CPU-to-IRQ mapping, which is the serial mapping based on IRQ vector index. Meaning, the n'th vector gets mapped to the n'th CPU. Introduce an API function mlx5_comp_vector_cpu() that takes an IRQ index and provides the corresponding CPU mapping. It utilizes the existing IRQ affinity if defined, or resorts to the default serialized CPU mapping otherwise. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maher Sanalla authored
For better code readability in the completion IRQ request code, factor the per-completion-vector CPU lookup logic into a separate function. The new helper, mlx5_cpumask_default_spread(), given a vector index 'n', returns the n'th CPU. This new helper will also be used in the next patch. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
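As a rough illustration of such a helper (a hedged sketch only; the function name and wrap-around behaviour here are assumptions, not the actual mlx5 code), the n'th online CPU can be found by walking cpu_online_mask:

#include <linux/cpumask.h>

/* Hypothetical sketch: return the n'th online CPU, wrapping around when
 * the index exceeds the number of online CPUs. */
static int default_spread_cpu(unsigned int n)
{
        unsigned int i, cpu = cpumask_first(cpu_online_mask);

        for (i = 0; i < n % num_online_cpus(); i++)
                cpu = cpumask_next(cpu, cpu_online_mask);

        return cpu;
}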
-
Maher Sanalla authored
Currently, the create_comp_eqs() function handles the creation of all completion EQs for all vectors on driver load, while on driver unload destroy_comp_eqs() performs the equivalent job. In preparation for dynamic EQ creation, replace create_comp_eqs() / destroy_comp_eqs() with create_comp_eq() / destroy_comp_eq() functions which receive a vector index and allocate/destroy an EQ for that specific vector, thus allowing more flexibility in the management of completion EQs. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maher Sanalla authored
Use xarray to store the completion EQs instead of a linked list. The xarray offers more scalability, reduced memory overhead, and facilitates the lookup of a certain EQ given a vector index. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
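For context, a minimal sketch of the xarray pattern described here (names and types are illustrative, not the actual mlx5 structures):

#include <linux/xarray.h>

struct example_eq { int eqn; };

static DEFINE_XARRAY(comp_eqs);         /* vector index -> EQ */

/* Hypothetical sketch: store an EQ under its vector index... */
static int example_store_eq(unsigned int vecidx, struct example_eq *eq)
{
        return xa_err(xa_store(&comp_eqs, vecidx, eq, GFP_KERNEL));
}

/* ...and look it up later given the same vector index. */
static struct example_eq *example_get_eq(unsigned int vecidx)
{
        return xa_load(&comp_eqs, vecidx);
}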
-
Maher Sanalla authored
Break the completion IRQ request/release functions into per-vector handlers for both PCI devices and SFs in the EQ layer. On EQ table creation, loop over all vectors and request an IRQ for each one using the new per-vector functions. Perform the symmetrical change when releasing IRQs on EQ table cleanup. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maher Sanalla authored
Use xarray to store the completion IRQs instead of a fixed-size allocated array as not all completion IRQs will be requested on driver load, but rather on demand when an EQ is created. The xarray offers more scalability, reduced memory overhead, and provides the ability to dynamically resize the array when needed. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maher Sanalla authored
Introduce a per-vector completion IRQ request API that requests a single IRQ for a given vector index, instead of the existing API that requests multiple IRQs at once. On driver load, loop over all completion vectors and request an IRQ for each one via the newly introduced API. Symmetrically, introduce a per-vector IRQ release API. On driver unload, loop over all vectors and release each completion IRQ via the new per-vector API. As IRQ vectors will be requested dynamically later in the patchset, add a cpumask of the bound CPUs to avoid possibly mapping two IRQs of the same device to the same CPU. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
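The CPU-avoidance idea can be sketched as follows (a hedged illustration with made-up names, not the mlx5 implementation): keep a cpumask of CPUs already bound to this device's IRQs and prefer a CPU outside that set for the next request.

#include <linux/cpumask.h>
#include <linux/slab.h>

/* Hypothetical sketch: pick a CPU for a new completion IRQ, skipping CPUs
 * already used by other IRQs of the same device whenever possible. */
static int pick_irq_cpu(const struct cpumask *used)
{
        cpumask_var_t avail;
        int cpu;

        if (!zalloc_cpumask_var(&avail, GFP_KERNEL))
                return cpumask_first(cpu_online_mask);

        cpumask_andnot(avail, cpu_online_mask, used);
        cpu = cpumask_empty(avail) ? cpumask_first(cpu_online_mask) :
                                     cpumask_first(avail);
        free_cpumask_var(avail);
        return cpu;
}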
-
Maher Sanalla authored
In preparation for allocating completion EQs, add a counter to track the number of completion EQs currently allocated. Store the maximum number of EQs in the max_comp_eqs variable. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Yue Haibing authored
Commit 6ba5a3c5 ("[UDP]: Make full use of proto.h.udp_hash innovation.") removed these implementations but left the declarations behind. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yue Haibing authored
Commit ce0aa27f ("sfp: add sfp-bus to bridge between network devices and sfp cages") declared this function but never implemented it. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yue Haibing authored
Commit f8572d8f ("sysctl net: Remove unused binary sysctl code") left behind this declaration. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yue Haibing authored
Commit acb67442 ("net: sched: introduce per-block callbacks") implemented these but never used them. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yue Haibing authored
pneigh_for_each() has never been implemented since the beginning of git history. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yue Haibing authored
Commit 3c4d7559 ("tls: kernel TLS support") declared but never implemented these functions. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 06 Aug, 2023 16 commits
-
-
Vladimir Oltean authored
Setting dev->priv_flags & IFF_SEE_ALL_HWTSTAMP_REQUESTS is only legal for drivers which were converted to ndo_hwtstamp_get() and ndo_hwtstamp_set(), and it is only there that we call ndo_hwtstamp_set() for a request that otherwise goes to phylib (for stuff like packet traps, which need to be undone if phylib failed, hence the old_cfg logic). The problem is that we end up calling ndo_hwtstamp_get() when we don't need to (even if the SIOCSHWTSTAMP wasn't intended for phylib, or if it was, but the driver didn't set IFF_SEE_ALL_HWTSTAMP_REQUESTS). For those unnecessary conditions, we share a code path with virtual drivers (vlan, macvlan, bonding) where ndo_hwtstamp_get() is implemented as generic_hwtstamp_get_lower(), and may be resolved through generic_hwtstamp_ioctl_lower() if the lower device is unconverted. I.e. this situation: $ ip link add link eno0 name eno0.100 type vlan id 100 $ hwstamp_ctl -i eno0.100 -t 1 We are unprepared to deal with this, because if ndo_hwtstamp_get() is resolved through a legacy ndo_eth_ioctl(SIOCGHWTSTAMP) lower_dev implementation, that needs a non-NULL old_cfg.ifr pointer, and we don't have it. But we don't even need to deal with it either. In the general case, drivers may not even implement SIOCGHWTSTAMP handling, only SIOCSHWTSTAMP, so it makes sense to completely avoid a SIOCGHWTSTAMP call if we can. The solution is to split the single "if" condition into 3 smaller ones, thus separating the decision to call ndo_hwtstamp_get() from the decision to call ndo_hwtstamp_set(). The third "if" condition is identical to the first one, and both are subsets of the second one. Thus, the "cfg" argument of kernel_hwtstamp_config_changed() is always valid. Reported-by: Eric Dumazet <edumazet@google.com> Closes: https://lore.kernel.org/netdev/CANn89iLOspJsvjPj+y8jikg7erXDomWe8sqHMdfL_2LQSFrPAg@mail.gmail.com/ Fixes: fd770e85 ("net: remove phy_has_hwtstamp() -> phy_mii_ioctl() decision from converted drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yang Yingliang authored
Use eth_broadcast_addr() to assign the broadcast address instead of using memset(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
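For illustration (a trivial sketch; the function and variable names are made up):

#include <linux/etherdevice.h>

static void example_set_bcast(u8 *mac)
{
        /* before: memset(mac, 0xff, ETH_ALEN); */
        eth_broadcast_addr(mac);        /* same result, clearer intent */
}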
-
Yu Liao authored
gcc with W=1 reports: drivers/net/ethernet/ibm/ibmvnic.c:194:13: warning: variable 'rc' set but not used [-Wunused-but-set-variable] This variable is not used so remove it. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308040609.zQsSXWXI-lkp@intel.com/ Signed-off-by: Yu Liao <liaoyu15@huawei.com> Reviewed-by: Nick Child <nnac123@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Haiyang Zhang authored
Add a page pool for RX buffers for a faster buffer cycle and reduced CPU usage. The standard page pool API is used. With an iperf test using 128 threads, this patch improved the throughput by 12-15%, and decreased the IRQ-associated CPU's usage from 99-100% to 10-50%. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
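A minimal sketch of the standard page pool API referred to here (parameter values are illustrative, and the header location has changed across kernel versions; this assumes the pre-split <net/page_pool.h>):

#include <linux/dma-direction.h>
#include <linux/numa.h>
#include <net/page_pool.h>

/* Hypothetical sketch: create a pool for one RX queue, then draw pages
 * from it on refill and return them when a buffer is recycled. */
static struct page_pool *example_create_pool(struct device *dev)
{
        struct page_pool_params pp = {
                .flags          = PP_FLAG_DMA_MAP,
                .order          = 0,
                .pool_size      = 1024,         /* illustrative */
                .nid            = NUMA_NO_NODE,
                .dev            = dev,
                .dma_dir        = DMA_FROM_DEVICE,
        };

        return page_pool_create(&pp);
}

static struct page *example_refill(struct page_pool *pool)
{
        return page_pool_dev_alloc_pages(pool);         /* refill path */
}

static void example_recycle(struct page_pool *pool, struct page *page)
{
        page_pool_put_full_page(pool, page, false);     /* recycle path */
}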
-
David S. Miller authored
Rushil Gupta says: ==================== gve: Add QPL mode for DQO descriptor format GVE supports QPL ("queue-page-list") mode where all data is communicated through a set of pre-registered pages. Adding this mode to DQO. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rushil Gupta authored
Add a note about QPL and RDA mode Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rushil Gupta authored
The RX path allocates the QPL page pool at queue creation, and tries to reuse these pages through page recycling. This patch ensures that on refill no non-QPL pages are posted to the device. When the driver is running low on free buffers, an on-demand allocation step kicks in that allocates a non-QPL page for SKB business, to free up the QPL page in use. gve_try_recycle_buf was moved to gve_rx_append_frags so that the driver does not attempt to mark the buffer as used if a non-QPL page was allocated on demand. Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rushil Gupta authored
Each QPL page is divided into GVE_TX_BUFS_PER_PAGE_DQO buffers. When a packet needs to be transmitted, we break the packet into chunks of at most GVE_TX_BUF_SIZE_DQO bytes and transmit each chunk using a TX descriptor. We allocate the TX buffers from the free list in dqo_tx. We store these TX buffer indices in an array in the pending_packet structure. The TX buffers are returned to the free list in dqo_compl after receiving packet completion or when removing packets from the miss completions list. Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
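To make the buffer-carving arithmetic concrete (a hedged sketch; the names and sizes below are stand-ins for GVE_TX_BUFS_PER_PAGE_DQO and GVE_TX_BUF_SIZE_DQO, not the driver's actual values):

/* Hypothetical sketch: each registered page holds BUFS_PER_PAGE fixed-size
 * buffers, so a free-list buffer index maps to a (page, offset) pair. */
#define EXAMPLE_BUF_SIZE        2048
#define EXAMPLE_BUFS_PER_PAGE   (4096 / EXAMPLE_BUF_SIZE)

static void example_buf_to_location(unsigned int buf_idx,
                                    unsigned int *page_idx,
                                    unsigned int *offset)
{
        *page_idx = buf_idx / EXAMPLE_BUFS_PER_PAGE;
        *offset = (buf_idx % EXAMPLE_BUFS_PER_PAGE) * EXAMPLE_BUF_SIZE;
}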
-
Rushil Gupta authored
GVE supports QPL ("queue-page-list") mode where all data is communicated through a set of pre-registered pages. Add this mode to the DQO descriptor format. Add checks, ABI changes and device options to support QPL mode for DQO in addition to GQI. Also, use the pages-per-qpl value supplied by the device option to control the size of the "queue-page-list". Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Eric Dumazet says: ==================== tcp: set few options locklessly This series avoids taking the socket lock for six TCP options. They are not heavily used, but this exercise can give ideas for other parts of the TCP/IP stack :) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
rskq_defer_accept field can be read/written without the need of holding the socket lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
tp->linger2 can be set locklessly as long as readers use READ_ONCE(). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
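The store/load pairing used throughout this series can be sketched as follows (a simplified, hypothetical illustration, not the actual TCP code; the structure and field are stand-ins):

#include <linux/compiler.h>

struct example_sock { int linger2; };

/* setsockopt() path: plain store, no lock_sock() needed */
static void example_set_linger2(struct example_sock *esk, int val)
{
        WRITE_ONCE(esk->linger2, val);
}

/* reader (e.g. timer or receive path): pair every read with READ_ONCE() */
static int example_get_linger2(const struct example_sock *esk)
{
        return READ_ONCE(esk->linger2);
}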
-
Eric Dumazet authored
tp->keepalive_probes can be set locklessly, readers are already taking care of this field being potentially set by other threads. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
tp->keepalive_intvl can be set locklessly, readers are already taking care of this field being potentially set by other threads. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
icsk->icsk_user_timeout can be set locklessly, if all read sides use READ_ONCE(). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
icsk->icsk_syn_retries can safely be set without locking the socket. We have to add READ_ONCE() annotations in tcp_fastopen_synack_timer() and tcp_write_timeout(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 05 Aug, 2023 7 commits
-
-
Jakub Kicinski authored
Merge tag 'wireless-next-2023-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.6 The first pull request for v6.6 and only driver patches this time. Nothing special really standing out, it has been quiet most likely due to vacations. Major changes: rtl8xxxu - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU), RTL8192EU and RTL8723BU mwifiex - allow moving to a different namespace mt76 - preparation for mt7925 support - mt7981 support ath12k - Extremely High Throughput (EHT) PHY support for Wi-Fi 7 * tag 'wireless-next-2023-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (172 commits) wifi: rtw89: return failure if needed firmware elements are not recognized wifi: rtw89: add to parse firmware elements of BB and RF tables wifi: rtw89: introduce infrastructure of firmware elements wifi: rtw89: add firmware suit for BB MCU 0/1 wifi: rtw89: add firmware parser for v1 format wifi: rtw89: introduce v1 format of firmware header wifi: rtw89: support firmware log with formatted text wifi: rtw89: recognize log format from firmware file wifi: ath12k: avoid deadlock by change ieee80211_queue_work for regd_update_work wifi: ath12k: add handler for scan event WMI_SCAN_EVENT_DEQUEUED wifi: ath12k: relax list iteration in ath12k_mac_vif_unref() wifi: ath12k: configure puncturing bitmap wifi: ath12k: parse WMI service ready ext2 event wifi: ath12k: add MLO header in peer association wifi: ath12k: peer assoc for 320 MHz wifi: ath12k: add WMI support for EHT peer wifi: ath12k: prepare EHT peer assoc parameters wifi: ath12k: add EHT PHY modes wifi: ath12k: propagate EHT capabilities to userspace wifi: ath12k: WMI support to process EHT capabilities ... ==================== Link: https://lore.kernel.org/r/87msz7j942.fsf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Kuniyuki Iwashima says: ==================== tcp: Disable header prediction for MD5. The 1st patch disables header prediction for MD5 flows and the 2nd patch updates the stale comment in tcp_parse_options(). ==================== Link: https://lore.kernel.org/r/20230803224552.69398-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
Since commit 9ea88a15 ("tcp: md5: check md5 signature without socket lock"), the MD5 option is checked in tcp_v[46]_rcv(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230803224552.69398-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
TCP socket saves the minimum required header length in tcp_header_len of struct tcp_sock, and later the value is used in __tcp_fast_path_on() to generate a part of TCP header in tcp_sock(sk)->pred_flags. In tcp_rcv_established(), if the incoming packet has the same pattern as pred_flags, we enter the fast path and skip full option parsing. The MD5 option is parsed in tcp_v[46]_rcv(), so we need not parse it again later in tcp_rcv_established() unless other options exist. We add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len in two paths to avoid the slow path. For passive open connections with MD5, we add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len in tcp_create_openreq_child() after 3WHS. On the other hand, we do it in tcp_connect_init() for active open connections. However, the value is overwritten while processing SYN+ACK or crossed SYN in tcp_rcv_synsent_state_process(). These two cases will have the wrong value in pred_flags and never go into the fast path. We could update tcp_header_len in tcp_rcv_synsent_state_process(), but a test with slightly modified netperf which uses MD5 for each flow shows that the slow path is actually a bit faster than the fast path. On c5.4xlarge EC2 instance (16 vCPU, 32 GiB mem) $ for i in {1..10}; do ./super_netperf $(nproc) -H localhost -l 10 -- -m 256 -M 256; done Avg of 10 * 36e68ead : 10.376 Gbps * all fast path : 10.374 Gbps (patch v2, See Link) * all slow path : 10.394 Gbps The header prediction is not worth adding complexity for MD5, so let's disable it for MD5. Link: https://lore.kernel.org/netdev/20230803042214.38309-1-kuniyu@amazon.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230803224552.69398-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
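For readers unfamiliar with header prediction, a simplified and hypothetical illustration (not the kernel's actual pred_flags code): the expected doff/flags/window word of a plain in-sequence segment is precomputed, and a segment whose header word differs, for example because an MD5 option enlarges the header, takes the slow path instead.

#include <linux/types.h>
#include <asm/byteorder.h>

/* Hypothetical sketch: build the expected 4th 32-bit word of the TCP
 * header (data offset, ACK flag, window) for the common case... */
static __be32 example_pred_flags(unsigned int header_len, u16 snd_wnd)
{
        /* data offset is in 32-bit words, stored in the top 4 bits */
        return cpu_to_be32(((header_len / 4) << 28) | (0x10 << 16) | snd_wnd);
}

/* ...and compare the same word of an incoming segment against it. */
static bool example_hits_fast_path(__be32 hdr_word, __be32 pred)
{
        return hdr_word == pred;
}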
-
Russell King (Oracle) authored
Move marking the PHY as being on an SFP module into the SFP code between getting the PHY device (and thus initialising the phy_device structure) and registering the discovered device. This means that PHY drivers can use phy_on_sfp() in their match and get_features methods. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/E1qRaga-001vKt-8X@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
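As an example of what this enables (a hypothetical PHY driver callback; the driver name and the restricted link mode are made up), phy_on_sfp() is now already valid when .get_features runs:

#include <linux/phy.h>

static int foo_phy_get_features(struct phy_device *phydev)
{
        int ret = genphy_read_abilities(phydev);

        if (ret)
                return ret;

        /* the SFP flag is set before the PHY is registered, so this works */
        if (phy_on_sfp(phydev))
                linkmode_clear_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT,
                                   phydev->supported);

        return 0;
}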
-
Yue Haibing authored
Commit c3d2ed93 ("mlxsw: Remove old parsing depth infrastructure") left behind mlxsw_sp_nve_inc_parsing_depth_get()/mlxsw_sp_nve_inc_parsing_depth_put(). And commit 532b49e4 ("mlxsw: spectrum_span: Derive SBIB from maximum port speed & MTU") removed mlxsw_sp_span_port_mtu_update()/mlxsw_sp_span_speed_update_work() but left the declarations. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20230803142047.42660-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yue Haibing authored
ixgbe_napi_add_all()/ixgbe_napi_del_all() were declared in commit 92915f71 ("ixgbevf: Driver main and ethool interface module and main header") but never implemented. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230803141904.15316-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 04 Aug, 2023 1 commit
-
-
Yue Haibing authored
Commit d021c344 ("VSOCK: Introduce VM Sockets") declared but never implemented vsock_release_pending(). Also, vsock_init_tap() has never been implemented since its introduction in commit 531b3748 ("VSOCK: Add vsockmon tap functions"). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20230803134507.22660-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-