Commits · 6af3b428cd02b09db79f0ee0265dcb8799d9327c · Kirill Smelkov / linux

04 Mar, 2022 29 commits

Merge branch 'skb-drop-reasons' · 6af3b428

David S. Miller authored Mar 04, 2022

Menglong Dong says:

====================
net: dev: add skb drop reasons to net/core/dev.c

In the commit c504e5c2 ("net: skb: introduce kfree_skb_reason()"),
we added the support of reporting the reasons of skb drops to kfree_skb
tracepoint. And in this series patches, reasons for skb drops are added
to the link layer, which means that 'net/core/dev.c' is our target.

Following functions are processed:

sch_handle_egress()
__dev_xmit_skb()
enqueue_to_backlog()
do_xdp_generic()
sch_handle_ingress()
__netif_receive_skb_core()

and following new drop reasons are added (what they mean can be see in
the document of them):

SKB_DROP_REASON_QDISC_EGRESS
SKB_DROP_REASON_QDISC_DROP
SKB_DROP_REASON_CPU_BACKLOG
SKB_DROP_REASON_XDP
SKB_DROP_REASON_QDISC_INGRESS
SKB_DROP_REASON_PTYPE_ABSENT

In order to add skb drop reasons to kfree_skb_list(), the function
kfree_skb_list_reason() is introduced in the 2th patch, which will be
used in __dev_xmit_skb() in the 3th patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

6af3b428

net: dev: use kfree_skb_reason() for __netif_receive_skb_core() · 6c2728b7

Menglong Dong authored Mar 04, 2022

Add reason for skb drops to __netif_receive_skb_core() when packet_type
not found to handle the skb. For this purpose, the drop reason
SKB_DROP_REASON_PTYPE_ABSENT is introduced. Take ether packets for
example, this case mainly happens when L3 protocol is not supported.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6c2728b7

net: dev: use kfree_skb_reason() for sch_handle_ingress() · a568aff2

Menglong Dong authored Mar 04, 2022

Replace kfree_skb() used in sch_handle_ingress() with
kfree_skb_reason(). Following drop reasons are introduced:

SKB_DROP_REASON_TC_INGRESS
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a568aff2

net: dev: use kfree_skb_reason() for do_xdp_generic() · 7e726ed8

Menglong Dong authored Mar 04, 2022

Replace kfree_skb() used in do_xdp_generic() with kfree_skb_reason().
The drop reason SKB_DROP_REASON_XDP is introduced for this case.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7e726ed8

net: dev: use kfree_skb_reason() for enqueue_to_backlog() · 44f0bd40

Menglong Dong authored Mar 04, 2022

Replace kfree_skb() used in enqueue_to_backlog() with
kfree_skb_reason(). The skb rop reason SKB_DROP_REASON_CPU_BACKLOG is
introduced for the case of failing to enqueue the skb to the per CPU
backlog queue. The further reason can be backlog queue full or RPS
flow limition, and I think we needn't to make further distinctions.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

44f0bd40

net: dev: add skb drop reasons to __dev_xmit_skb() · 7faef054

Menglong Dong authored Mar 04, 2022

Add reasons for skb drops to __dev_xmit_skb() by replacing
kfree_skb_list() with kfree_skb_list_reason(). The drop reason of
SKB_DROP_REASON_QDISC_DROP is introduced for qdisc enqueue fails.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7faef054

net: skb: introduce the function kfree_skb_list_reason() · 215b0f19

Menglong Dong authored Mar 04, 2022

To report reasons of skb drops, introduce the function
kfree_skb_list_reason() and make kfree_skb_list() an inline call to
it. This function will be used in the next commit in
__dev_xmit_skb().
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

215b0f19

net: dev: use kfree_skb_reason() for sch_handle_egress() · 98b4d7a4

Menglong Dong authored Mar 04, 2022

Replace kfree_skb() used in sch_handle_egress() with kfree_skb_reason().
The drop reason SKB_DROP_REASON_TC_EGRESS is introduced. Considering
the code path of tc egerss, we make it distinct with the drop reason
of SKB_DROP_REASON_QDISC_DROP in the next commit.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

98b4d7a4

Merge branch 'netif_rx' · 9f9919f7

David S. Miller authored Mar 04, 2022

Sebastian Andrzej Siewior says:

====================
net: Convert user to netif_rx().

This is the first batch of converting netif_rx_ni() caller to
netif_rx(). The change making this possible is net-next and
netif_rx_ni() is a wrapper around netif_rx(). This is a clean up in
order to remove netif_rx_ni().

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: bridge@lists.linux-foundation.org
Cc: Chris Zankel <chris@zankel.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kurt Kanzenbach <kurt@linutronix.de>
Cc: linux-doc@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: Łukasz Stelmach <l.stelmach@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Steve Wahl <steve.wahl@hpe.com>
Cc: UNGLinuxDriver@microchip.com
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Vladimir Oltean <olteanv@gmail.com>
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

9f9919f7

net: dev: Use netif_rx(). · ad0a043f

Sebastian Andrzej Siewior authored Mar 03, 2022

Since commit
   baebdf48 ("net: dev: Makes sure netif_rx() can be invoked in any context.")

the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.

Use netif_rx().
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ad0a043f

net: bridge: Use netif_rx(). · 2e83bdd5

Sebastian Andrzej Siewior authored Mar 03, 2022

Since commit
   baebdf48 ("net: dev: Makes sure netif_rx() can be invoked in any context.")

the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.

Use netif_rx().

Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: bridge@lists.linux-foundation.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

2e83bdd5

net: macvlan: Use netif_rx(). · 566214f4

Sebastian Andrzej Siewior authored Mar 03, 2022

Since commit
   baebdf48 ("net: dev: Makes sure netif_rx() can be invoked in any context.")

the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.

Use netif_rx().
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

566214f4

net: ethernet: Use netif_rx(). · 90f77c1c

Sebastian Andrzej Siewior authored Mar 03, 2022

Since commit
   baebdf48 ("net: dev: Makes sure netif_rx() can be invoked in any context.")

the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.

Use netif_rx().

Cc: Łukasz Stelmach <l.stelmach@samsung.com>
Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Cc: UNGLinuxDriver@microchip.com
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

90f77c1c

net: dsa: Use netif_rx(). · db00cc9d

Sebastian Andrzej Siewior authored Mar 03, 2022

Since commit
   baebdf48 ("net: dev: Makes sure netif_rx() can be invoked in any context.")

the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.

Use netif_rx().

Cc: Kurt Kanzenbach <kurt@linutronix.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Signed-off-by: David S. Miller <davem@davemloft.net>

db00cc9d

net: caif: Use netif_rx(). · 3fb4430e

Sebastian Andrzej Siewior authored Mar 03, 2022

Since commit
   baebdf48 ("net: dev: Makes sure netif_rx() can be invoked in any context.")

the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.

Use netif_rx().
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

3fb4430e

net: sgi-xp: Use netif_rx(). · 4343b866

Sebastian Andrzej Siewior authored Mar 03, 2022

Since commit
   baebdf48 ("net: dev: Makes sure netif_rx() can be invoked in any context.")

the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.

Use netif_rx().

Cc: Robin Holt <robinmholt@gmail.com>
Cc: Steve Wahl <steve.wahl@hpe.com>
Cc: Mike Travis <mike.travis@hpe.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

4343b866

net: xtensa: Use netif_rx(). · aa4e5761

Sebastian Andrzej Siewior authored Mar 03, 2022

Since commit
   baebdf48 ("net: dev: Makes sure netif_rx() can be invoked in any context.")

the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.

Use netif_rx().

Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: linux-xtensa@linux-xtensa.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aa4e5761

docs: networking: Use netif_rx(). · 21f95a88

Sebastian Andrzej Siewior authored Mar 03, 2022

Since commit
   baebdf48 ("net: dev: Makes sure netif_rx() can be invoked in any context.")

the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.

Use netif_rx().

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

21f95a88

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · 4ee508ff

David S. Miller authored Mar 04, 2022

Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2022-03-03

Jacob Keller says:

This series refactors the ice networking driver VF storage from a simple
static array to a hash table. It also introduces krefs and proper locking
and protection to prevent common use-after-free and concurrency issues.

There are two motivations for this work. First is to make the ice driver
more resilient by preventing a whole class of use-after-free bugs that can
occur around concurrent access to VF structures while removing VFs.

The second is to prepare the ice driver for future virtualization work to
support Scalable IOV, an alternative VF implementation compared to Single
Root IOV. The new VF implementation will allow for more dynamic VF creation
and removal, necessitating a more robust implementation for VF storage that
can't rely on the existing mechanisms to prevent concurrent access
violations.

The first few patches are cleanup and preparatory work needed to make the
conversion to the hash table safe. Following this preparatory work is a
patch to migrate the VF structures and variables to a new sub-structure for
code clarity. Next introduce new interface functions to abstract the VF
storage. Finally, the driver is actually converted to the hash table and
kref implementation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4ee508ff

Merge branch 'ocelot-felix-cleanups' · f2ecfa06

David S. Miller authored Mar 04, 2022

Vladimir Oltean says:

====================
Cleanups for ocelot/felix drivers

This patch set is an assorted collection of minor cleanups brought to
the felix DSA driver and ocelot switch library.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

f2ecfa06

net: dsa: felix: remove redundant assignment in felix_8021q_cpu_port_deinit · 162fbf6a

Vladimir Oltean authored Mar 03, 2022

Due to an apparently incorrect conflict resolution on my part in commit
54c31984 ("net: mscc: ocelot: enforce FDB isolation when
VLAN-unaware"), "ocelot->ports[port]->is_dsa_8021q_cpu = false" was
supposed to be replaced by "ocelot_port_unset_dsa_8021q_cpu(ocelot, port)"
which does the same thing, and more. But now we have both, so the direct
assignment is redundant. Remove it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

162fbf6a

net: dsa: felix: print error message in felix_check_xtr_pkt() · 5d3bb7dd

Vladimir Oltean authored Mar 03, 2022

Packet extraction failures over register-based MMIO are silent, and
difficult to pinpoint. Add an error message to remedy this.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5d3bb7dd

net: dsa: felix: initialize "err" to 0 in felix_check_xtr_pkt() · dbd03285

Vladimir Oltean authored Mar 03, 2022

Automated tools complain that felix_check_xtr_pkt() has logic to drain
the CPU queue on the reception of a PTP packet over Ethernet, yet it
returns an uninitialized error code in the case where the CPU queue was
empty.

This is not likely to happen (/possible if hardware works correctly),
but it isn't a fatal condition either. The PTP packet will be dequeued
from the CPU queue when the next PTP packet arrives. So initialize "err"
to 0 for the case where nothing was dequeued during this iteration.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dbd03285

net: dsa: felix: drop the ptp_type argument from felix_check_xtr_pkt() · d219b4b6

Vladimir Oltean authored Mar 03, 2022

The DSA ->port_rxtstamp() function is never called for PTP_CLASS_NONE:

dsa_skb_defer_rx_timestamp:

	if (type == PTP_CLASS_NONE)
		return false;

	if (likely(ds->ops->port_rxtstamp))
		return ds->ops->port_rxtstamp(ds, p->dp->index, skb, type);

So practically, the argument is unused, so remove it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d219b4b6

net: dsa: felix: remove ocelot->npi assignment from felix_8021q_cpu_port_init · 28c1305b

Vladimir Oltean authored Mar 03, 2022

This assignment is redundant, since ocelot->npi has already been set to
-1 by felix_npi_port_deinit(). Call path:

felix_change_tag_protocol
-> felix_del_tag_protocol(DSA_TAG_PROTO_OCELOT)
   -> felix_teardown_tag_npi
      -> felix_npi_port_deinit
-> felix_set_tag_protocol(DSA_TAG_PROTO_OCELOT_8021Q)
   -> felix_setup_tag_8021q
      -> felix_8021q_cpu_port_init
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28c1305b

net: mscc: ocelot: use pretty names for IPPROTO_UDP and IPPROTO_TCP · c3cde44f

Vladimir Oltean authored Mar 03, 2022

Hardcoding these IP protocol numbers in is2_entry_set() obscures the
purpose of the code, so replace the magic numbers with the definitions
from linux/in.h.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3cde44f

net: mscc: ocelot: use list_for_each_entry in ocelot_vcap_block_remove_filter · c5a0edae

Vladimir Oltean authored Mar 03, 2022

Simplify ocelot_vcap_block_remove_filter by using list_for_each_entry
instead of list_for_each.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c5a0edae

net/smc: fix document build WARNING from smc-sysctl.rst · f9f52c34

Dust Li authored Mar 03, 2022

Stephen reported the following warning messages from smc-sysctl.rst

Documentation/networking/smc-sysctl.rst:3: WARNING: Title overline
too short.
Documentation/networking/smc-sysctl.rst: WARNING: document isn't
included in any toctree

Fix the title overline and add smc-sysctl entry into
Documentation/networking/index.rst

Fixes: 12bbb0d1 ("net/smc: add sysctl for autocorking")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220303113527.62047-1-dust.li@linux.alibaba.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

f9f52c34

net: marvell: Use min() instead of doing it manually · 2f5e65de

Haowen Bai authored Mar 03, 2022

Fix following coccicheck warning:
drivers/net/ethernet/marvell/mv643xx_eth.c:1664:35-36: WARNING opportunity for min()
Signed-off-by: Haowen Bai <baihaowen88@gmail.com>
Link: https://lore.kernel.org/r/1646271529-7659-1-git-send-email-baihaowen88@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

2f5e65de

03 Mar, 2022 11 commits

ice: convert VF storage to hash table with krefs and RCU · 3d5985a1

Jacob Keller authored Feb 16, 2022

The ice driver stores VF structures in a simple array which is allocated
once at the time of VF creation. The VF structures are then accessed
from the array by their VF ID. The ID must be between 0 and the number
of allocated VFs.

Multiple threads can access this table:

 * .ndo operations such as .ndo_get_vf_cfg or .ndo_set_vf_trust
 * interrupts, such as due to messages from the VF using the virtchnl
   communication
 * processing such as device reset
 * commands to add or remove VFs

The current implementation does not keep track of when all threads are
done operating on a VF and can potentially result in use-after-free
issues caused by one thread accessing a VF structure after it has been
released when removing VFs. Some of these are prevented with various
state flags and checks.

In addition, this structure is quite static and does not support a
planned future where virtualization can be more dynamic. As we begin to
look at supporting Scalable IOV with the ice driver (as opposed to just
supporting Single Root IOV), this structure is not sufficient.

In the future, VFs will be able to be added and removed individually and
dynamically.

To allow for this, and to better protect against a whole class of
use-after-free bugs, replace the VF storage with a combination of a hash
table and krefs to reference track all of the accesses to VFs through
the hash table.

A hash table still allows efficient look up of the VF given its ID, but
also allows adding and removing VFs. It does not require contiguous VF
IDs.

The use of krefs allows the cleanup of the VF memory to be delayed until
after all threads have released their reference (by calling ice_put_vf).

To prevent corruption of the hash table, a combination of RCU and the
mutex table_lock are used. Addition and removal from the hash table use
the RCU-aware hash macros. This allows simple read-only look ups that
iterate to locate a single VF can be fast using RCU. Accesses which
modify the hash table, or which can't take RCU because they sleep, will
hold the mutex lock.

By using this design, we have a stronger guarantee that the VF structure
can't be released until after all threads are finished operating on it.
We also pave the way for the more dynamic Scalable IOV implementation in
the future.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

3d5985a1

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 80901bff

Jakub Kicinski authored Mar 03, 2022

net/batman-adv/hard-interface.c
commit 690bb6fb ("batman-adv: Request iflink once in batadv-on-batadv check")
commit 6ee3c393 ("batman-adv: Demote batadv-on-batadv skip error message")
https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/

net/smc/af_smc.c
commit 4d08b7b5 ("net/smc: Fix cleanup when register ULP fails")
commit 462791bb ("net/smc: add sysctl interface for SMC")
https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/Signed-off-by: Jakub Kicinski <kuba@kernel.org>

80901bff

Merge tag 'net-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · b949c21f

Linus Torvalds authored Mar 03, 2022

Pull networking fixes from Jakub Kicinski:
 "Including fixes from can, xfrm, wifi, bluetooth, and netfilter.

  Lots of various size fixes, the length of the tag speaks for itself.
  Most of the 5.17-relevant stuff comes from xfrm, wifi and bt trees
  which had been lagging as you pointed out previously. But there's also
  a larger than we'd like portion of fixes for bugs from previous
  releases.

  Three more fixes still under discussion, including and xfrm revert for
  uAPI error.

  Current release - regressions:

   - iwlwifi: don't advertise TWT support, prevent FW crash

   - xfrm: fix the if_id check in changelink

   - xen/netfront: destroy queues before real_num_tx_queues is zeroed

   - bluetooth: fix not checking MGMT cmd pending queue, make scanning
     work again

  Current release - new code bugs:

   - mptcp: make SIOCOUTQ accurate for fallback socket

   - bluetooth: access skb->len after null check

   - bluetooth: hci_sync: fix not using conn_timeout

   - smc: fix cleanup when register ULP fails

   - dsa: restore error path of dsa_tree_change_tag_proto

   - iwlwifi: fix build error for IWLMEI

   - iwlwifi: mvm: propagate error from request_ownership to the user

  Previous releases - regressions:

   - xfrm: fix pMTU regression when reported pMTU is too small

   - xfrm: fix TCP MSS calculation when pMTU is close to 1280

   - bluetooth: fix bt_skb_sendmmsg not allocating partial chunks

   - ipv6: ensure we call ipv6_mc_down() at most once, prevent leaks

   - ipv6: prevent leaks in igmp6 when input queues get full

   - fix up skbs delta_truesize in UDP GRO frag_list

   - eth: e1000e: fix possible HW unit hang after an s0ix exit

   - eth: e1000e: correct NVM checksum verification flow

   - ptp: ocp: fix large time adjustments

  Previous releases - always broken:

   - tcp: make tcp_read_sock() more robust in presence of urgent data

   - xfrm: distinguishing SAs and SPs by if_id in xfrm_migrate

   - xfrm: fix xfrm_migrate issues when address family changes

   - dcb: flush lingering app table entries for unregistered devices

   - smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error

   - mac80211: fix EAPoL rekey fail in 802.3 rx path

   - mac80211: fix forwarded mesh frames AC & queue selection

   - netfilter: nf_queue: fix socket access races and bugs

   - batman-adv: fix ToCToU iflink problems and check the result belongs
     to the expected net namespace

   - can: gs_usb, etas_es58x: fix opened_channel_cnt's accounting

   - can: rcar_canfd: register the CAN device when fully ready

   - eth: igb, igc: phy: drop premature return leaking HW semaphore

   - eth: ixgbe: xsk: change !netif_carrier_ok() handling in
     ixgbe_xmit_zc(), prevent live lock when link goes down

   - eth: stmmac: only enable DMA interrupts when ready

   - eth: sparx5: move vlan checks before any changes are made

   - eth: iavf: fix races around init, removal, resets and vlan ops

   - ibmvnic: more reset flow fixes

  Misc:

   - eth: fix return value of __setup handlers"

* tag 'net-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits)
  ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report()
  net: dsa: make dsa_tree_change_tag_proto actually unwind the tag proto change
  ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc()
  selftests: mlxsw: resource_scale: Fix return value
  selftests: mlxsw: tc_police_scale: Make test more robust
  net: dcb: disable softirqs in dcbnl_flush_dev()
  bnx2: Fix an error message
  sfc: extend the locking on mcdi->seqno
  net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error cause by server
  net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error generated by client
  net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe()
  tcp: make tcp_read_sock() more robust
  bpf, sockmap: Do not ignore orig_len parameter
  net: ipa: add an interconnect dependency
  net: fix up skbs delta_truesize in UDP GRO frag_list
  iwlwifi: mvm: return value for request_ownership
  nl80211: Update bss channel on channel switch for P2P_CLIENT
  iwlwifi: fix build error for IWLMEI
  ptp: ocp: Add ptp_ocp_adjtime_coarse for large adjustments
  batman-adv: Don't expect inter-netns unique iflink indices
  ...

b949c21f

ice: introduce VF accessor functions · fb916db1

Jacob Keller authored Feb 16, 2022

Before we switch the VF data structure storage mechanism to a hash,
introduce new accessor functions to define the new interface.

* ice_get_vf_by_id is a function used to obtain a reference to a VF from
  the table based on its VF ID
* ice_has_vfs is used to quickly check if any VFs are configured
* ice_get_num_vfs is used to get an exact count of how many VFs are
  configured

We can drop the old ice_validate_vf_id function, since every caller was
just going to immediately access the VF table to get a reference
anyways. This way we simply use the single ice_get_vf_by_id to both
validate the VF ID is within range and that there exists a VF with that
ID.

This change enables us to more easily convert the codebase to the hash
table since most callers now properly use the interface.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

fb916db1

ice: factor VF variables to separate structure · 000773c0

Jacob Keller authored Feb 16, 2022

We maintain a number of values for VFs within the ice_pf structure. This
includes the VF table, the number of allocated VFs, the maximum number
of supported SR-IOV VFs, the number of queue pairs per VF, the number of
MSI-X vectors per VF, and a bitmap of the VFs with detected MDD events.

We're about to add a few more variables to this list. Clean this up
first by extracting these members out into a new ice_vfs structure
defined in ice_virtchnl_pf.h
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

000773c0

Merge tag 'mips-fixes-5.17_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · e58bd49d

Linus Torvalds authored Mar 03, 2022

Pull MIPS fixes from Thomas Bogendoerfer:

 - Fix memory detection for MT7621 devices

 - Fix setnocoherentio kernel option

 - Fix warning when CONFIG_SCHED_CORE is enabled

* tag 'mips-fixes-5.17_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: ralink: mt7621: use bitwise NOT instead of logical
  mips: setup: fix setnocoherentio() boolean setting
  MIPS: smp: fill in sibling and core maps earlier
  MIPS: ralink: mt7621: do memory detection on KSEG1

e58bd49d

Merge tag 'auxdisplay-for-linus-v5.17-rc7' of git://github.com/ojeda/linux · 4d5ae234

Linus Torvalds authored Mar 03, 2022

Pull auxdisplay fixes from Miguel Ojeda:
 "A few lcd2s fixes from Andy Shevchenko"

* tag 'auxdisplay-for-linus-v5.17-rc7' of git://github.com/ojeda/linux:
  auxdisplay: lcd2s: Use proper API to free the instance of charlcd object
  auxdisplay: lcd2s: Fix memory leak in ->remove()
  auxdisplay: lcd2s: Fix lcd2s_redefine_char() feature

4d5ae234

ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report() · 2d3916f3

Eric Dumazet authored Mar 03, 2022

While investigating on why a synchronize_net() has been added recently
in ipv6_mc_down(), I found that igmp6_event_query() and igmp6_event_report()
might drop skbs in some cases.

Discussion about removing synchronize_net() from ipv6_mc_down()
will happen in a different thread.

Fixes: f185de28 ("mld: add new workqueues for process mld events")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220303173728.937869-1-eric.dumazet@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

2d3916f3

ice: convert ice_for_each_vf to include VF entry iterator · c4c2c7db

Jacob Keller authored Feb 16, 2022

The ice_for_each_vf macro is intended to be used to loop over all VFs.
The current implementation relies on an iterator that is the index into
the VF array in the PF structure. This forces all users to perform a
look up themselves.

This abstraction forces a lot of duplicate work on callers and leaks the
interface implementation to the caller. Replace this with an
implementation that includes the VF pointer the primary iterator. This
version simplifies callers which just want to iterate over every VF, as
they no longer need to perform their own lookup.

The "i" iterator value is replaced with a new unsigned int "bkt"
parameter, as this will match the necessary interface for replacing
the VF array with a hash table. For now, the bkt is the VF ID, but in
the future it will simply be the hash bucket index. Document that it
should not be treated as a VF ID.

This change aims to simplify switching from the array to a hash table. I
considered alternative implementations such as an xarray but decided
that the hash table was the simplest and most suitable implementation. I
also looked at methods to hide the bkt iterator entirely, but I couldn't
come up with a feasible solution that worked for hash table iterators.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

c4c2c7db

ice: use ice_for_each_vf for iteration during removal · 19281e86

Jacob Keller authored Feb 16, 2022

When removing VFs, the driver takes a weird approach of assigning
pf->num_alloc_vfs to 0 before iterating over the VFs using a temporary
variable.

This logic has been in the driver for a long time, and seems to have
been carried forward from i40e.

We want to refactor the way VFs are stored, and iterating over the data
structure without the ice_for_each_vf interface impedes this work.

The logic relies on implicitly using the num_alloc_vfs as a sort of
"safe guard" for accessing VF data.

While this sort of guard makes sense for Single Root IOV where all VFs
are added at once, the data structures don't work for VFs which can be
added and removed dynamically. We also have a separate state flag,
ICE_VF_DEINIT_IN_PROGRESS which is a stronger protection against
concurrent removal and access.

Avoid the custom tmp iteration and replace it with the standard
ice_for_each_vf iterator. Delay the assignment of num_alloc_vfs until
after this loop finishes.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

19281e86

ice: remove checks in ice_vc_send_msg_to_vf · 59e1f857

Jacob Keller authored Feb 16, 2022

The ice_vc_send_msg_to_vf function is used by the PF to send a response
to a VF. This function has overzealous checks to ensure its not passed a
NULL VF pointer and to ensure that the passed in struct ice_vf has a
valid vf_id sub-member.

These checks have existed since commit 1071a835 ("ice: Implement
virtchnl commands for AVF support") and function as simple sanity
checks.

We are planning to refactor the ice driver to use a hash table along
with appropriate locks in a future refactor. This change will modify how
the ice_validate_vf_id function works. Instead of a simple >= check to
ensure the VF ID is between some range, it will check the hash table to
see if the specified VF ID is actually in the table. This requires that
the function properly lock the table to prevent race conditions.

The checks may seem ok at first glance, but they don't really provide
much benefit.

In order for ice_vc_send_msg_to_vf to have these checks fail, the
callers must either (1) pass NULL as the VF, (2) construct an invalid VF
pointer manually, or (3) be using a VF pointer which becomes invalid
after they obtain it properly using ice_get_vf_by_id.

For (1), a cursory glance over callers of ice_vc_send_msg_to_vf can show
that in most cases the functions already operate assuming their VF
pointer is valid, such as by derferencing vf->pf or other members.

They obtain the VF pointer by accessing the VF array using the VF ID,
which can never produce a NULL value (since its a simple address
operation on the array it will not be NULL.

The sole exception for (1) is that ice_vc_process_vf_msg will forward a
NULL VF pointer to this function as part of its goto error handler
logic. This requires some minor cleanup to simply exit immediately when
an invalid VF ID is detected (Rather than use the same error flow as
the rest of the function).

For (2), it is unexpected for a flow to construct a VF pointer manually
instead of accessing the VF array. Defending against this is likely to
just hide bad programming.

For (3), it is definitely true that VF pointers could become invalid,
for example if a thread is processing a VF message while the VF gets
removed. However, the correct solution is not to add additional checks
like this which do not guarantee to prevent the race. Instead we plan to
solve the root of the problem by preventing the possibility entirely.

This solution will require the change to a hash table with proper
locking and reference counts of the VF structures. When this is done,
ice_validate_vf_id will require locking of the hash table. This will be
problematic because all of the callers of ice_vc_send_msg_to_vf will
already have to take the lock to obtain the VF pointer anyways. With a
mutex, this leads to a double lock that could hang the kernel thread.

Avoid this by removing the checks which don't provide much value, so
that we can safely add the necessary protections properly.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

59e1f857