Commits · 6215b608a8c4d4a478721e14a6faa0dc56e4a693 · Kirill Smelkov / linux

09 Sep, 2021 8 commits

sfc: last resort fallback for lack of xdp tx queues · 6215b608

Íñigo Huguet authored Sep 09, 2021

Previous patch addressed the situation of having some free resources for
xdp tx but not enough for one tx queue per CPU. This patch address the
worst case of not having resources at all for xdp tx.

Instead of using queues dedicated to xdp, normal queues used by network
stack are shared for both cases, using __netif_tx_lock for
synchronization. Also queue stop/restart must be considered in the xdp
path to avoid freezing the queue.

This is not the ideal situation we might want to be, and a performance
penalty is expected both for normal and xdp traffic, but at least XDP
will work in all possible situations (with a warning in the logs),
improving a bit the pain of not knowing in what situations we can use it
and in what situations we cannot.
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6215b608

sfc: fallback for lack of xdp tx queues · 41544618

Íñigo Huguet authored Sep 09, 2021

If there are not enough resources to allocate one TX queue per core for
XDP TX it was completely disabled.

This patch implements a fallback solution for sharing the available
queues using __netif_tx_lock for synchronization. In the normal case that
there is one TX queue per CPU, no locking is done, as it was before.

With this fallback solution, XDP TX will work in much more cases that
were failing, specially in machines with many CPUs. It's hard for XDP
users to know what features are supported across different NICs and
configurations, so they will benefit on having wider support.
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

41544618

net: stmmac: platform: fix build warning when with !CONFIG_PM_SLEEP · 2a48d96f

Joakim Zhang authored Sep 09, 2021

Use __maybe_unused for noirq_suspend()/noirq_resume() hooks to avoid
build warning with !CONFIG_PM_SLEEP:

>> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:796:12: error: 'stmmac_pltfr_noirq_resume' defined but not used [-Werror=unused-function]
     796 | static int stmmac_pltfr_noirq_resume(struct device *dev)
         |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:775:12: error: 'stmmac_pltfr_noirq_suspend' defined but not used [-Werror=unused-function]
     775 | static int stmmac_pltfr_noirq_suspend(struct device *dev)
         |            ^~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors

Fixes: 276aae37 ("net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2a48d96f

net/l2tp: Fix reference count leak in l2tp_udp_recv_core · 9b6ff7eb

Xiyu Yang authored Sep 09, 2021

The reference count leak issue may take place in an error handling
path. If both conditions of tunnel->version == L2TP_HDR_VER_3 and the
return value of l2tp_v3_ensure_opt_in_linear is nonzero, the function
would directly jump to label invalid, without decrementing the reference
count of the l2tp_session object session increased earlier by
l2tp_tunnel_get_session(). This may result in refcount leaks.

Fix this issue by decrease the reference count before jumping to the
label invalid.

Fixes: 4522a70d ("l2tp: fix reading optional fields of L2TPv3")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9b6ff7eb

net/af_unix: fix a data-race in unix_dgram_poll · 04f08eb4

Eric Dumazet authored Sep 08, 2021

syzbot reported another data-race in af_unix [1]

Lets change __skb_insert() to use WRITE_ONCE() when changing
skb head qlen.

Also, change unix_dgram_poll() to use lockless version
of unix_recvq_full()

It is verry possible we can switch all/most unix_recvq_full()
to the lockless version, this will be done in a future kernel version.

[1] HEAD commit: 8596e589

BUG: KCSAN: data-race in skb_queue_tail / unix_dgram_poll

write to 0xffff88814eeb24e0 of 4 bytes by task 25815 on cpu 0:
 __skb_insert include/linux/skbuff.h:1938 [inline]
 __skb_queue_before include/linux/skbuff.h:2043 [inline]
 __skb_queue_tail include/linux/skbuff.h:2076 [inline]
 skb_queue_tail+0x80/0xa0 net/core/skbuff.c:3264
 unix_dgram_sendmsg+0xff2/0x1600 net/unix/af_unix.c:1850
 sock_sendmsg_nosec net/socket.c:703 [inline]
 sock_sendmsg net/socket.c:723 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2392
 ___sys_sendmsg net/socket.c:2446 [inline]
 __sys_sendmmsg+0x315/0x4b0 net/socket.c:2532
 __do_sys_sendmmsg net/socket.c:2561 [inline]
 __se_sys_sendmmsg net/socket.c:2558 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2558
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff88814eeb24e0 of 4 bytes by task 25834 on cpu 1:
 skb_queue_len include/linux/skbuff.h:1869 [inline]
 unix_recvq_full net/unix/af_unix.c:194 [inline]
 unix_dgram_poll+0x2bc/0x3e0 net/unix/af_unix.c:2777
 sock_poll+0x23e/0x260 net/socket.c:1288
 vfs_poll include/linux/poll.h:90 [inline]
 ep_item_poll fs/eventpoll.c:846 [inline]
 ep_send_events fs/eventpoll.c:1683 [inline]
 ep_poll fs/eventpoll.c:1798 [inline]
 do_epoll_wait+0x6ad/0xf00 fs/eventpoll.c:2226
 __do_sys_epoll_wait fs/eventpoll.c:2238 [inline]
 __se_sys_epoll_wait fs/eventpoll.c:2233 [inline]
 __x64_sys_epoll_wait+0xf6/0x120 fs/eventpoll.c:2233
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x0000001b -> 0x00000001

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 25834 Comm: syz-executor.1 Tainted: G        W         5.14.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 86b18aaa ("skbuff: fix a data race in skb_queue_len()")
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04f08eb4

net: macb: fix use after free on rmmod · d82d5303

Tong Zhang authored Sep 08, 2021

plat_dev->dev->platform_data is released by platform_device_unregister(),
use of pclk and hclk is a use-after-free. Since device unregister won't
need a clk device we adjust the function call sequence to fix this issue.

[   31.261225] BUG: KASAN: use-after-free in macb_remove+0x77/0xc6 [macb_pci]
[   31.275563] Freed by task 306:
[   30.276782]  platform_device_release+0x25/0x80
Suggested-by: Nicolas Ferre <Nicolas.Ferre@microchip.com>
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d82d5303

ibmvnic: check failover_pending in login response · 273c29e9

Sukadev Bhattiprolu authored Sep 08, 2021

If a failover occurs before a login response is received, the login
response buffer maybe undefined. Check that there was no failover
before accessing the login response buffer.

Fixes: 032c5e82 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

273c29e9

vhost_net: fix OoB on sendmsg() failure. · 3c4cea8f

Paolo Abeni authored Sep 08, 2021

If the sendmsg() call in vhost_tx_batch() fails, both the 'batched_xdp'
and 'done_idx' indexes are left unchanged. If such failure happens
when batched_xdp == VHOST_NET_BATCH, the next call to
vhost_net_build_xdp() will access and write memory outside the xdp
buffers area.

Since sendmsg() can only error with EBADFD, this change addresses the
issue explicitly freeing the XDP buffers batch on error.

Fixes: 0a0be13b ("vhost_net: batch submitting XDP buffers to underlayer sockets")
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c4cea8f

08 Sep, 2021 9 commits

net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume · 276aae37

Joakim Zhang authored Sep 08, 2021

commit 5f585913 ("net: stmmac: delete the eee_ctrl_timer after
napi disabled"), this patch tries to fix system hang caused by eee_ctrl_timer,
unfortunately, it only can resolve it for system reboot stress test. System
hang also can be reproduced easily during system suspend/resume stess test
when mount NFS on i.MX8MP EVK board.

In stmmac driver, eee feature is combined to phylink framework. When do
system suspend, phylink_stop() would queue delayed work, it invokes
stmmac_mac_link_down(), where to deactivate eee_ctrl_timer synchronizly.
In above commit, try to fix issue by deactivating eee_ctrl_timer obviously,
but it is not enough. Looking into eee_ctrl_timer expire callback
stmmac_eee_ctrl_timer(), it could enable hareware eee mode again. What is
unexpected is that LPI interrupt (MAC_Interrupt_Enable.LPIEN bit) is always
asserted. This interrupt has chance to be issued when LPI state entry/exit
from the MAC, and at that time, clock could have been already disabled.
The result is that system hang when driver try to touch register from
interrupt handler.

The reason why above commit can fix system hang issue in stmmac_release()
is that, deactivate eee_ctrl_timer not just after napi disabled, further
after irq freed.

In conclusion, hardware would generate LPI interrupt when clock has been
disabled during suspend or resume, since hardware is in eee mode and LPI
interrupt enabled.

Interrupts from MAC, MTL and DMA level are enabled and never been disabled
when system suspend, so postpone clocks management from suspend stage to
noirq suspend stage should be more safe.

Fixes: 5f585913 ("net: stmmac: delete the eee_ctrl_timer after napi disabled")
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

276aae37

net: ipa: initialize all filter table slots · b5c10223

Alex Elder authored Sep 07, 2021

There is an off-by-one problem in ipa_table_init_add(), when
initializing filter tables.

In that function, the number of filter table entries is determined
based on the number of set bits in the filter map.  However that
count does *not* include the extra "slot" in the filter table that
holds the filter map itself.  Meanwhile, ipa_table_addr() *does*
include the filter map in the memory it returns, but because the
count it's provided doesn't include it, it includes one too few
table entries.

Fix this by including the extra slot for the filter map in the count
computed in ipa_table_init_add().

Note: ipa_filter_reset_table() does not have this problem; it resets
filter table entries one by one, but does not overwrite the filter
bitmap.

Fixes: 2b9feef2 ("soc: qcom: ipa: filter and routing tables")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b5c10223

net: phylink: Update SFP selected interface on advertising changes · ea269a6f

Nathan Rossi authored Sep 02, 2021

Currently changes to the advertising state via ethtool do not cause any
reselection of the configured interface mode after the SFP is already
inserted and initially configured.

While it is not typical to change the advertised link modes for an
interface using an SFP in certain use cases it is desirable. In the case
of a SFP port that is capable of handling both SFP and SFP+ modules it
will automatically select between 1G and 10G modes depending on the
supported mode of the SFP. However if the SFP module is capable of
working in multiple modes (e.g. a SFP+ DAC that can operate at 1G or
10G), one end of the cable may be attached to a SFP 1000base-x port thus
the SFP+ end must be manually configured to the 1000base-x mode in order
for the link to be established.

This change causes the ethtool setting of advertised mode changes to
reselect the interface mode so that the link can be established.
Additionally when a module is inserted the advertising mode is reset to
match the supported modes of the module.
Signed-off-by: Nathan Rossi <nathan.rossi@digi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ea269a6f

ne2000: fix unused function warning · d7e203ff

Arnd Bergmann authored Sep 07, 2021

Geert noticed a warning on MIPS TX49xx, Atari and presuambly other
platforms when the driver is built-in but NETDEV_LEGACY_INIT is
disabled:

drivers/net/ethernet/8390/ne.c:909:20: warning: ‘ne_add_devices’ defined but not used [-Wunused-function]

Merge the two module init functions into a single one with an
IS_ENABLED() check to replace the incorrect #ifdef.

Fixes: 4228c394 ("make legacy ISA probe optional")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d7e203ff

Merge tag 'mlx5-fixes-2021-09-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · c324f023

David S. Miller authored Sep 08, 2021

Saeed Mahameed says:

====================
mlx5 fixes 2021-09-07

This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.

Included here, a patch which solves a build warning reported on
linux-kernel mailing list [1]:
Fix commit ("net/mlx5: Bridge, fix uninitialized variable usage")

I hope this series can make it to rc1.

[1] https://www.spinics.net/lists/netdev/msg765481.html
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c324f023

ibmvnic: check failover_pending in login response · d437f5aa

Sukadev Bhattiprolu authored Sep 07, 2021

If a failover occurs before a login response is received, the login
response buffer maybe undefined. Check that there was no failover
before accessing the login response buffer.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d437f5aa

mctp: perform route destruction under RCU read lock · 581edcd0

Jeremy Kerr authored Sep 08, 2021

The kernel test robot reports:

  [  843.509974][  T345] =============================
  [  843.524220][  T345] WARNING: suspicious RCU usage
  [  843.538791][  T345] 5.14.0-rc2-00606-g889b7da2 #1 Not tainted
  [  843.553617][  T345] -----------------------------
  [  843.567412][  T345] net/mctp/route.c:310 RCU-list traversed in non-reader section!!

- we're missing the rcu read lock acquire around the destruction path.

This change adds the acquire/release - the path is already atomic, and
we're using the _rcu list iterators.
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

581edcd0

dccp: don't duplicate ccid when cloning dccp sock · d9ea761f

Lin, Zhenpeng authored Sep 08, 2021

Commit 2677d206 ("dccp: don't free ccid2_hc_tx_sock ...") fixed
a UAF but reintroduced CVE-2017-6074.

When the sock is cloned, two dccps_hc_tx_ccid will reference to the
same ccid. So one can free the ccid object twice from two socks after
cloning.

This issue was found by "Hadar Manor" as well and assigned with
CVE-2020-16119, which was fixed in Ubuntu's kernel. So here I port
the patch from Ubuntu to fix it.

The patch prevents cloned socks from referencing the same ccid.

Fixes: 2677d206 ("dccp: don't free ccid2_hc_tx_sock ...")
Signed-off-by: Zhenpeng Lin <zplin@psu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

d9ea761f

dt-bindings: net: sun8i-emac: Add compatible for D1 · 0f31ab21

Samuel Holland authored Sep 07, 2021

The D1 SoC contains EMAC hardware which is compatible with the A64 EMAC.
Add the new compatible string, with the A64 as a fallback.
Signed-off-by: Samuel Holland <samuel@sholland.org>
Acked-by: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: David S. Miller <davem@davemloft.net>

0f31ab21

07 Sep, 2021 23 commits

net/mlx5e: Fix condition when retrieving PTP-rqn · 8db6a54f

Aya Levin authored Aug 29, 2021

When activating the PTP-RQ, redirect the RQT from drop-RQ to PTP-RQ.
Use mlx5e_channels_get_ptp_rqn to retrieve the rqn. This helper returns
a boolean (not status), hence caller should consider return value 0 as a
fail. Change the caller interpretation of the return value.

Fixes: 43ec0f41 ("net/mlx5e: Hide all implementation details of mlx5e_rx_res")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

8db6a54f

net/mlx5e: Fix mutual exclusion between CQE compression and HW TS · c91c1da7

Aya Levin authored Jul 15, 2021

Some profiles of the driver don't support a dedicated PTP-RQ, hence can't
support HW TS and CQE compression simultaneously. When HW TS is enabled
the COE compression is disabled, and should be restored when the HW TS
is turned off. Add rx_filter as an input to modifying CQE compression to
enforce this restriction.

Fixes: 256f79d1 ("net/mlx5e: Fix HW TS with CQE compression according to profile")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

c91c1da7

net/mlx5: Fix potential sleeping in atomic context · ee27e330

Maor Gottlieb authored Sep 01, 2021

Fixes the below flow of sleeping in atomic context by releasing
the RCU lock before calling to free_match_list.

build_match_list() <- disables preempt
-> free_match_list()
   -> tree_put_node()
      -> down_write_ref_node() <- take write lock

Fixes: 693c6883 ("net/mlx5: Add hash table for flow groups in flow table")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

ee27e330

net/mlx5: FWTrace, cancel work on alloc pd error flow · dfe6fd72

Saeed Mahameed authored Aug 18, 2021

Handle error flow on mlx5_core_alloc_pd() failure,
read_fw_strings_work must be canceled.

Fixes: c71ad41c ("net/mlx5: FW tracer, events handling")
Reported-by: Pavel Machek (CIP) <pavel@denx.de>
Suggested-by: Pavel Machek (CIP) <pavel@denx.de>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>

dfe6fd72

net/mlx5: Lag, don't update lag if lag isn't supported · da8252d5

Mark Bloch authored Aug 11, 2021

In NICs that don't support LAG, the LAG control structure won't be
allocated. If it wasn't allocated it means LAG doesn't exists and can be
skipped.

Fixes: cac1eb2c ("net/mlx5: Lag, properly lock eswitch if needed")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

da8252d5

net/mlx5: Fix rdma aux device on devlink reload · 897ae4b4

Parav Pandit authored Aug 17, 2021

RDMA auxdev parameter registration was skipped for eswitch manager PCI PF.
Due to this when devlink parameter is read, it reads as false in below
code flow.

$ devlink dev reload pci/0000:06:00.0
  devlink_reload()
    mlx5_load_one()
      mlx5_attach_device()
        is_ib_enabled()
          devlink_param_driverinit_value_get()

Due to this, is_ib_enabled() returns false for the RDMA auxiliary
device. This results into a skipping RDMA auxiliary device creation on
reload.

There is no need to check for eswitch manager capability to support RDMA
auxiliary device. Hence, fix it by skipping eswitch manager capability.

Fixes: 87158ced ("net/mlx5: Support enable_rdma devlink dev param")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

897ae4b4

net/mlx5: Bridge, fix uninitialized variable usage · 8343268e

Vlad Buslov authored Aug 18, 2021

In some conditions variable 'err' is not assigned with value in
mlx5_esw_bridge_port_obj_attr_set() and mlx5_esw_bridge_port_changeupper()
functions after recent changes to support LAG. Initialize the variable with
zero value in both cases.
Reported-by: Colin King <colin.king@canonical.com>
Reported-by: Tim Gardner <tim.gardner@canonical.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
CC: linux-kernel@vger.kernel.org
Fixes: ff9b7521 ("net/mlx5: Bridge, support LAG")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>

8343268e

Merge tag 'net-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 626bf91a

Linus Torvalds authored Sep 07, 2021

Pull networking fixes and stragglers from Jakub Kicinski:
 "Networking stragglers and fixes, including changes from netfilter,
  wireless and can.

  Current release - regressions:

   - qrtr: revert check in qrtr_endpoint_post(), fixes audio and wifi

   - ip_gre: validate csum_start only on pull

   - bnxt_en: fix 64-bit doorbell operation on 32-bit kernels

   - ionic: fix double use of queue-lock, fix a sleeping in atomic

   - can: c_can: fix null-ptr-deref on ioctl()

   - cs89x0: disable compile testing on powerpc

  Current release - new code bugs:

   - bridge: mcast: fix vlan port router deadlock, consistently disable
     BH

  Previous releases - regressions:

   - dsa: tag_rtl4_a: fix egress tags, only port 0 was working

   - mptcp: fix possible divide by zero

   - netfilter: nft_ct: protect nft_ct_pcpu_template_refcnt with mutex

   - netfilter: socket: icmp6: fix use-after-scope

   - stmmac: fix MAC not working when system resume back with WoL active

  Previous releases - always broken:

   - ip/ip6_gre: use the same logic as SIT interfaces when computing
     v6LL address

   - seg6: set fc_nlinfo in nh_create_ipv4, nh_create_ipv6

   - mptcp: only send extra TCP acks in eligible socket states

   - dsa: lantiq_gswip: fix maximum frame length

   - stmmac: fix overall budget calculation for rxtx_napi

   - bnxt_en: fix firmware version reporting via devlink

   - renesas: sh_eth: add missing barrier to fix freeing wrong tx
     descriptor

  Stragglers:

   - netfilter: conntrack: switch to siphash

   - netfilter: refuse insertion if chain has grown too large

   - ncsi: add get MAC address command to get Intel i210 MAC address"

* tag 'net-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits)
  ieee802154: Remove redundant initialization of variable ret
  net: stmmac: fix MAC not working when system resume back with WoL active
  net: phylink: add suspend/resume support
  net: renesas: sh_eth: Fix freeing wrong tx descriptor
  bonding: 3ad: pass parameter bond_params by reference
  cxgb3: fix oops on module removal
  can: c_can: fix null-ptr-deref on ioctl()
  can: rcar_canfd: add __maybe_unused annotation to silence warning
  net: wwan: iosm: Unify IO accessors used in the driver
  net: wwan: iosm: Replace io.*64_lo_hi() with regular accessors
  net: qcom/emac: Replace strlcpy with strscpy
  ip6_gre: Revert "ip6_gre: add validation for csum_start"
  net: hns3: make hclgevf_cmd_caps_bit_map0 and hclge_cmd_caps_bit_map0 static
  selftests/bpf: Test XDP bonding nest and unwind
  bonding: Fix negative jump label count on nested bonding
  MAINTAINERS: add VM SOCKETS (AF_VSOCK) entry
  stmmac: dwmac-loongson:Fix missing return value
  iwlwifi: fix printk format warnings in uefi.c
  net: create netdev->dev_addr assignment helpers
  bnxt_en: Fix possible unintended driver initiated error recovery
  ...

626bf91a

Merge tag 'linux-watchdog-5.15-rc1' of git://www.linux-watchdog.org/linux-watchdog · 4c00e1e2

Linus Torvalds authored Sep 07, 2021

Pull watchdog updates from Wim Van Sebroeck:

 - add Mediatek MT7986 & MT8195 wdt support

 - add Maxim MAX63xx

 - drop bd70528 support

 - rewrite ixp4xx to watchdog framework

 - constify static struct watchdog_ops for sl28cpld_wdt, mpc8xxx_wdt and
   tqmx86

 - introduce watchdog_dev_suspend/resume

 - several fixes and improvements

* tag 'linux-watchdog-5.15-rc1' of git://www.linux-watchdog.org/linux-watchdog:
  dt-bindings: watchdog: Add compatible for Mediatek MT7986
  watchdog: ixp4xx: Rewrite driver to use core
  watchdog: Start watchdog in watchdog_set_last_hw_keepalive only if appropriate
  watchdog: max63xx_wdt: Add device tree probing
  dt-bindings: watchdog: Add Maxim MAX63xx bindings
  watchdog: mediatek: mt8195: add wdt support
  dt-bindings: reset: mt8195: add toprgu reset-controller header file
  watchdog: tqmx86: Constify static struct watchdog_ops
  watchdog: mpc8xxx_wdt: Constify static struct watchdog_ops
  watchdog: sl28cpld_wdt: Constify static struct watchdog_ops
  watchdog: iTCO_wdt: Fix detection of SMI-off case
  watchdog: bcm2835_wdt: consider system-power-controller property
  watchdog: imx2_wdg: notify wdog core to stop ping worker on suspend
  watchdog: introduce watchdog_dev_suspend/resume
  watchdog: Fix NULL pointer dereference when releasing cdev
  watchdog: only run driver set_pretimeout op if device supports it
  watchdog: bd70528 drop bd70528 support

4c00e1e2

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 192ad3c2

Linus Torvalds authored Sep 07, 2021

Pull KVM updates from Paolo Bonzini:
 "ARM:
   - Page ownership tracking between host EL1 and EL2
   - Rely on userspace page tables to create large stage-2 mappings
   - Fix incompatibility between pKVM and kmemleak
   - Fix the PMU reset state, and improve the performance of the virtual
     PMU
   - Move over to the generic KVM entry code
   - Address PSCI reset issues w.r.t. save/restore
   - Preliminary rework for the upcoming pKVM fixed feature
   - A bunch of MM cleanups
   - a vGIC fix for timer spurious interrupts
   - Various cleanups

  s390:
   - enable interpretation of specification exceptions
   - fix a vcpu_idx vs vcpu_id mixup

  x86:
   - fast (lockless) page fault support for the new MMU
   - new MMU now the default
   - increased maximum allowed VCPU count
   - allow inhibit IRQs on KVM_RUN while debugging guests
   - let Hyper-V-enabled guests run with virtualized LAPIC as long as
     they do not enable the Hyper-V "AutoEOI" feature
   - fixes and optimizations for the toggling of AMD AVIC (virtualized
     LAPIC)
   - tuning for the case when two-dimensional paging (EPT/NPT) is
     disabled
   - bugfixes and cleanups, especially with respect to vCPU reset and
     choosing a paging mode based on CR0/CR4/EFER
   - support for 5-level page table on AMD processors

  Generic:
   - MMU notifier invalidation callbacks do not take mmu_lock unless
     necessary
   - improved caching of LRU kvm_memory_slot
   - support for histogram statistics
   - add statistics for halt polling and remote TLB flush requests"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (210 commits)
  KVM: Drop unused kvm_dirty_gfn_invalid()
  KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted
  KVM: MMU: mark role_regs and role accessors as maybe unused
  KVM: MIPS: Remove a "set but not used" variable
  x86/kvm: Don't enable IRQ when IRQ enabled in kvm_wait
  KVM: stats: Add VM stat for remote tlb flush requests
  KVM: Remove unnecessary export of kvm_{inc,dec}_notifier_count()
  KVM: x86/mmu: Move lpage_disallowed_link further "down" in kvm_mmu_page
  KVM: x86/mmu: Relocate kvm_mmu_page.tdp_mmu_page for better cache locality
  Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()"
  KVM: x86/mmu: Remove unused field mmio_cached in struct kvm_mmu_page
  kvm: x86: Increase KVM_SOFT_MAX_VCPUS to 710
  kvm: x86: Increase MAX_VCPUS to 1024
  kvm: x86: Set KVM_MAX_VCPU_ID to 4*KVM_MAX_VCPUS
  KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulation
  KVM: x86/mmu: Don't freak out if pml5_root is NULL on 4-level host
  KVM: s390: index kvm->arch.idle_mask by vcpu_idx
  KVM: s390: Enable specification exception interpretation
  KVM: arm64: Trim guest debug exception handling
  KVM: SVM: Add 5-level page table support for SVM
  ...

192ad3c2

Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · a2b28235

Linus Torvalds authored Sep 07, 2021

Pull dmi fix from Jean Delvare.

Unbreak some existing udev/hwdb modalias matches due to misplaced
product_sku field.

* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  firmware: dmi: Move product_sku info to the end of the modalias

a2b28235

Merge tag 'ntb-5.15' of git://github.com/jonmason/ntb · 1735715e

Linus Torvalds authored Sep 07, 2021

Pull NTB updates from Jon Mason:
 "Bug fixes and clean-ups for Linux v5.15"

* tag 'ntb-5.15' of git://github.com/jonmason/ntb:
  NTB: switch from 'pci_' to 'dma_' API
  ntb: ntb_pingpong: remove redundant initialization of variables msg_data and spad_data
  NTB: perf: Fix an error code in perf_setup_inbuf()
  NTB: Fix an error code in ntb_msit_probe()
  ntb: intel: remove invalid email address in header comment

1735715e

Merge tag 'rproc-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc · 21f577b0

Linus Torvalds authored Sep 07, 2021

Pull remoteproc updates from Bjorn Andersson:

 - move the crash recovery worker to the freezable work queue to avoid
   interaction with other drivers during suspend & resume

 - fix a couple of typos in comments

 - add support for handling the audio DSP on SDM660

 - fix a race between the Qualcomm wireless subsystem driver and the
   associated driver for the RF chip

* tag 'rproc-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc:
  remoteproc: q6v5_pas: Add sdm660 ADSP PIL compatible
  dt-bindings: remoteproc: qcom: adsp: Add SDM660 ADSP
  remoteproc: use freezable workqueue for crash notifications
  remoteproc: fix kernel doc for struct rproc_ops
  remoteproc: fix an typo in fw_elf_get_class code comments
  remoteproc: qcom: wcnss: Fix race with iris probe

21f577b0

Merge tag 'backlight-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight · 2d7b4cdb

Linus Torvalds authored Sep 07, 2021

Pull backlight updates from Lee Jones:
 "Fix-ups:
   - Improve bootloader/kernel device handover

  Bug Fixes:
   - Stabilise backlight in ktd253 driver"

* tag 'backlight-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: pwm_bl: Improve bootloader/kernel device handover
  backlight: ktd253: Stabilize backlight

2d7b4cdb

Merge tag 'mfd-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · 86406a9e

Linus Torvalds authored Sep 07, 2021

Pull MFD updates from Lee Jones:
 "Core Frameworks:
   - Add support for registering devices via MFD cells to Simple MFD (I2C)

  New Drivers:
   - Add support for Renesas Synchronization Management Unit (SMU)

  New Device Support:
   - Add support for N5010 to Intel M10 BMC
   - Add support for Cannon Lake to Intel LPSS ACPI
   - Add support for Samsung SSG{1,2} to ST-Ericsson's U8500 family
   - Add support for TQMx110EB and TQMxE40x to TQ-Systems PLD TQMx86

  New Functionality:
   - Add support for GPIO to Intel LPC ICH
   - Add support for Reset to Texas Instruments TPS65086

  Fix-ups:
   - Trivial, sorting, whitespace, renaming, etc; mt6360-core, db8500-prcmu-regs, tqmx86
   - Device Tree fiddling; syscon, axp20x, qcom,pm8008, ti,tps65086, brcm,cru
   - Use proper APIs for IRQ map resolution; ab8500-core, stmpe, tc3589x, wm8994-irq
   - Pass 'supplied-from' property through axp288_fuel_gauge via swnode
   - Remove unused file entry; MAINTAINERS
   - Make interrupt line optional; tps65086
   - Rename db8500-cpuidle driver symbol; db8500-prcmu
   - Remove support for unused hardware; tqmx86
   - Provide a standard LPC clock frequency for unknown boards; tqmx86
   - Remove unused code; ti_am335x_tscadc
   - Use of_iomap() instead of ioremap(); syscon

  Bug Fixes:
   - Clear GPIO IRQ resource flags when no IRQ is set; tqmx86
   - Fix incorrect/misleading frequencies; db8500-prcmu
   - Mitigate namespace clash with other GPIOBASE users"

* tag 'mfd-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (31 commits)
  mfd: lpc_sch: Rename GPIOBASE to prevent build error
  mfd: syscon: Use of_iomap() instead of ioremap()
  dt-bindings: mfd: Add Broadcom CRU
  mfd: ti_am335x_tscadc: Delete superfluous error message
  mfd: tqmx86: Assume 24MHz LPC clock for unknown boards
  mfd: tqmx86: Add support for TQ-Systems DMI IDs
  mfd: tqmx86: Add support for TQMx110EB and TQMxE40x
  mfd: tqmx86: Fix typo in "platform"
  mfd: tqmx86: Remove incorrect TQMx90UC board ID
  mfd: tqmx86: Clear GPIO IRQ resource when no IRQ is set
  mfd: simple-mfd-i2c: Add support for registering devices via MFD cells
  mfd/cpuidle: ux500: Rename driver symbol
  mfd: tps65086: Add cell entry for reset driver
  mfd: tps65086: Make interrupt line optional
  dt-bindings: mfd: Convert tps65086.txt to YAML
  MAINTAINERS: Adjust ARM/NOMADIK/Ux500 ARCHITECTURES to file renaming
  mfd: db8500-prcmu: Handle missing FW variant
  mfd: db8500-prcmu: Rename register header
  mfd: axp20x: Add supplied-from property to axp288_fuel_gauge cell
  mfd: Don't use irq_create_mapping() to resolve a mapping
  ...

86406a9e

Merge tag 'gpio-updates-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 5e6a5845

Linus Torvalds authored Sep 07, 2021

Pull gpio updates from Bartosz Golaszewski:
 "We mostly have various improvements and refactoring all over the place
  but also some interesting new features - like the virtio GPIO driver
  that allows guest VMs to use host's GPIOs. We also have a new/old GPIO
  driver for rockchip - this one has been split out of the pinctrl
  driver.

  Summary:

   - new driver: gpio-virtio allowing a guest VM running linux to access
     GPIO lines provided by the host

   - split the GPIO driver out of the rockchip pin control driver

   - add support for a new model to gpio-aspeed-sgpio, refactor the
     driver and use generic device property interfaces, improve property
     sanitization

   - add ACPI support to gpio-tegra186

   - improve the code setting the line names to support multiple GPIO
     banks per device

   - constify a bunch of OF functions in the core GPIO code and make the
     declaration for one of the core OF functions we use consistent
     within its header

   - use software nodes in intel_quark_i2c_gpio

   - add support for the gpio-line-names property in gpio-mt7621

   - use the standard GPIO function for setting the GPIO names in
     gpio-brcmstb

   - fix a bunch of leaks and other bugs in gpio-mpc8xxx

   - use generic pm callbacks in gpio-ml-ioh

   - improve resource management and PM handling in gpio-mlxbf2

   - modernize and improve the gpio-dwapb driver

   - coding style improvements in gpio-rcar

   - documentation fixes and improvements

   - update the MAINTAINERS entry for gpio-zynq

   - minor tweaks in several drivers"

* tag 'gpio-updates-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (35 commits)
  gpio: mpc8xxx: Use 'devm_gpiochip_add_data()' to simplify the code and avoid a leak
  gpio: mpc8xxx: Fix a potential double iounmap call in 'mpc8xxx_probe()'
  gpio: mpc8xxx: Fix a resources leak in the error handling path of 'mpc8xxx_probe()'
  gpio: viperboard: remove platform_set_drvdata() call in probe
  gpio: virtio: Add missing mailings lists in MAINTAINERS entry
  gpio: virtio: Fix sparse warnings
  gpio: remove the obsolete MX35 3DS BOARD MC9S08DZ60 GPIO functions
  gpio: max730x: Use the right include
  gpio: Add virtio-gpio driver
  gpio: mlxbf2: Use DEFINE_RES_MEM_NAMED() helper macro
  gpio: mlxbf2: Use devm_platform_ioremap_resource()
  gpio: mlxbf2: Drop wrong use of ACPI_PTR()
  gpio: mlxbf2: Convert to device PM ops
  gpio: dwapb: Get rid of legacy platform data
  mfd: intel_quark_i2c_gpio: Convert GPIO to use software nodes
  gpio: dwapb: Read GPIO base from gpio-base property
  gpio: dwapb: Unify ACPI enumeration checks in get_irq() and configure_irqs()
  gpiolib: Deduplicate forward declaration in the consumer.h header
  MAINTAINERS: update gpio-zynq.yaml reference
  gpio: tegra186: Add ACPI support
  ...

5e6a5845

Merge tag 'fuse-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 75b96f0e

Linus Torvalds authored Sep 07, 2021

Pull fuse updates from Miklos Szeredi:

 - Allow mounting an active fuse device. Previously the fuse device
   would always be mounted during initialization, and sharing a fuse
   superblock was only possible through mount or namespace cloning

 - Fix data flushing in syncfs (virtiofs only)

 - Fix data flushing in copy_file_range()

 - Fix a possible deadlock in atomic O_TRUNC

 - Misc fixes and cleanups

* tag 'fuse-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: remove unused arg in fuse_write_file_get()
  fuse: wait for writepages in syncfs
  fuse: flush extending writes
  fuse: truncate pagecache on atomic_o_trunc
  fuse: allow sharing existing sb
  fuse: move fget() to fuse_get_tree()
  fuse: move option checking into fuse_fill_super()
  fuse: name fs_context consistently
  fuse: fix use after free in fuse_read_interrupt()

75b96f0e

Merge tag 'kgdb-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux · 996fe061

Linus Torvalds authored Sep 07, 2021

Pull kgdb updates from Daniel Thompson:
 "Changes for kgdb/kdb this cycle are dominated by a change from Sumit
  that removes as small (256K) private heap from kdb. This is change
  I've hoped for ever since I discovered how few users of this heap
  remained in the kernel, so many thanks to Sumit for hunting these
  down.

  The other change is an incremental step towards SPDX headers"

* tag 'kgdb-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
  kernel: debug: Convert to SPDX identifier
  kdb: Rename members of struct kdbtab_t
  kdb: Simplify kdb_defcmd macro logic
  kdb: Get rid of redundant kdb_register_flags()
  kdb: Rename struct defcmd_set to struct kdb_macro
  kdb: Get rid of custom debug heap allocator

996fe061

Revert "memcg: enable accounting for pollfd and select bits arrays" · 0bcfe68b

Linus Torvalds authored Sep 07, 2021

This reverts commit b6558434.

Just like with the memcg lock accounting, the kernel test robot reports
a sizeable performance regression for this commit, and while it clearly
does the rigth thing in theory, we'll need to look at just how to avoid
or minimize the performance overhead of the memcg accounting.

People already have suggestions on how to do that, but it's "future
work".

So revert it for now.

[ Note: the first link below is for this same commit but a different
  commit ID, because it's the kernel test robot ended up noticing it in
  Andrew Morton's patch queue ]

Link: https://lore.kernel.org/lkml/20210905132732.GC15026@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/20210907150757.GE17617@xsang-OptiPlex-9020/Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0bcfe68b

Revert "memcg: enable accounting for file lock caches" · 3754707b

Linus Torvalds authored Sep 07, 2021

This reverts commit 0f12156d.

The kernel test robot reports a sizeable performance regression for this
commit, and while it clearly does the rigth thing in theory, we'll need
to look at just how to avoid or minimize the performance overhead of the
memcg accounting.

People already have suggestions on how to do that, but it's "future
work".

So revert it for now.

Link: https://lore.kernel.org/lkml/20210907150757.GE17617@xsang-OptiPlex-9020/Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3754707b

Revert "mm/gup: remove try_get_page(), call try_get_compound_head() directly" · cd1adf1b

Linus Torvalds authored Sep 07, 2021

This reverts commit 9857a17f.

That commit was completely broken, and I should have caught on to it
earlier.  But happily, the kernel test robot noticed the breakage fairly
quickly.

The breakage is because "try_get_page()" is about avoiding the page
reference count overflow case, but is otherwise the exact same as a
plain "get_page()".

In contrast, "try_get_compound_head()" is an entirely different beast,
and uses __page_cache_add_speculative() because it's not just about the
page reference count, but also about possibly racing with the underlying
page going away.

So all the commentary about how

 "try_get_page() has fallen a little behind in terms of maintenance,
  try_get_compound_head() handles speculative page references more
  thoroughly"

was just completely wrong: yes, try_get_compound_head() handles
speculative page references, but the point is that try_get_page() does
not, and must not.

So there's no lack of maintainance - there are fundamentally different
semantics.

A speculative page reference would be entirely wrong in "get_page()",
and it's entirely wrong in "try_get_page()".  It's not about
speculation, it's purely about "uhhuh, you can't get this page because
you've tried to increment the reference count too much already".

The reason the kernel test robot noticed this bug was that it hit the
VM_BUG_ON() in __page_cache_add_speculative(), which is all about
verifying that the context of any speculative page access is correct.
But since that isn't what try_get_page() is all about, the VM_BUG_ON()
tests things that are not correct to test for try_get_page().
Reported-by: kernel test robot <oliver.sang@intel.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cd1adf1b

ieee802154: Remove redundant initialization of variable ret · 0f77f2de

Colin Ian King authored Sep 07, 2021

The variable ret is being initialized with a value that is never read, it
is being updated later on. The assignment is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0f77f2de

Merge branch 'stmmac-wol-fix' · d1bf7338

David S. Miller authored Sep 07, 2021

Joakim Zhang says:

====================
net: stmmac: fix WoL issue

This patch set fixes stmmac not working after system resume back with WoL
active. Thanks a lot for Russell King keeps looking into this issue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d1bf7338