Commits · 77cbf790e5b482256662e14c8b6ef4fecb07d06d · Kirill Smelkov / linux

16 Mar, 2021 6 commits

nl80211: fix locking for wireless device netns change · 77cbf790

Johannes Berg authored Mar 10, 2021

We have all the network interfaces marked as netns-local
since the only reasonable thing to do right now is to set
a whole device, including all netdevs, into a different
network namespace. For this reason, we also have our own
way of changing the network namespace.

Unfortunately, the RTNL locking changes broke this, and
it now results in many RTNL assertions. The trivial fix
for those (just hold RTNL for the changes) however leads
to deadlocks in the cfg80211 netdev notifier.

Since we only need the wiphy, and that's still protected
by the RTNL, add a new NL80211_FLAG_NO_WIPHY_MTX flag to
the nl80211 ops and use it to _not_ take the wiphy mutex
but only the RTNL. This way, the notifier does all the
work necessary during unregistration/registration of the
netdevs from the old and in the new namespace.
Reported-by: Sid Hayn <sidhayn@gmail.com>
Fixes: a05829a7 ("cfg80211: avoid holding the RTNL when calling the driver")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20210310215839.eadf7c43781b.I5fc6cf6676f800ab8008e03bbea9c3349b02d804@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

77cbf790

mac80211: Check crypto_aead_encrypt for errors · 58d25626

Daniel Phan authored Mar 09, 2021

crypto_aead_encrypt returns <0 on error, so if these calls are not checked,
execution may continue with failed encrypts. It also seems that these two
crypto_aead_encrypt calls are the only instances in the codebase that are
not checked for errors.
Signed-off-by: Daniel Phan <daniel.phan36@gmail.com>
Link: https://lore.kernel.org/r/20210309204137.823268-1-daniel.phan36@gmail.comSigned-off-by: Johannes Berg <johannes.berg@intel.com>

58d25626

mac80211: Allow HE operation to be longer than expected. · 0f7e90fa

Brian Norris authored Feb 23, 2021

We observed some Cisco APs sending the following HE Operation IE in
associate response:

  ff 0a 24 f4 3f 00 01 fc ff 00 00 00

Its HE operation parameter is 0x003ff4, so the expected total length is
7 which does not match the actual length = 10. This causes association
failing with "HE AP is missing HE Capability/operation."

According to P802.11ax_D4 Table9-94, HE operation is extensible, and
according to 802.11-2016 10.27.8, STA should discard the part beyond
the maximum length and parse the truncated element.

Allow HE operation element to be longer than expected to handle this
case and future extensions.

Fixes: e4d005b8 ("mac80211: refactor extended element parsing")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Yen-lin Lai <yenlinlai@chromium.org>
Link: https://lore.kernel.org/r/20210223051926.2653301-1-yenlinlai@chromium.orgSigned-off-by: Johannes Berg <johannes.berg@intel.com>

0f7e90fa

mac80211: minstrel_ht: remove unused variable 'mg' · 29175be0

Johannes Berg authored Feb 19, 2021

This probably came in through some refactoring and what is
now a call to minstrel_ht_group_min_rate_offset(), remove
the unused variable.
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210219105744.f2538a80f6cf.I3d53554c158d5b896ac07ea546bceac67372ec28@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

29175be0

mac80211: fix double free in ibss_leave · 3bd801b1

Markus Theil authored Feb 13, 2021

Clear beacon ie pointer and ie length after free
in order to prevent double free.

==================================================================
BUG: KASAN: double-free or invalid-free \
in ieee80211_ibss_leave+0x83/0xe0 net/mac80211/ibss.c:1876

CPU: 0 PID: 8472 Comm: syz-executor100 Not tainted 5.11.0-rc6-syzkaller #0
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 print_address_description.constprop.0.cold+0x5b/0x2c6 mm/kasan/report.c:230
 kasan_report_invalid_free+0x51/0x80 mm/kasan/report.c:355
 ____kasan_slab_free+0xcc/0xe0 mm/kasan/common.c:341
 kasan_slab_free include/linux/kasan.h:192 [inline]
 __cache_free mm/slab.c:3424 [inline]
 kfree+0xed/0x270 mm/slab.c:3760
 ieee80211_ibss_leave+0x83/0xe0 net/mac80211/ibss.c:1876
 rdev_leave_ibss net/wireless/rdev-ops.h:545 [inline]
 __cfg80211_leave_ibss+0x19a/0x4c0 net/wireless/ibss.c:212
 __cfg80211_leave+0x327/0x430 net/wireless/core.c:1172
 cfg80211_leave net/wireless/core.c:1221 [inline]
 cfg80211_netdev_notifier_call+0x9e8/0x12c0 net/wireless/core.c:1335
 notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2040
 call_netdevice_notifiers_extack net/core/dev.c:2052 [inline]
 call_netdevice_notifiers net/core/dev.c:2066 [inline]
 __dev_close_many+0xee/0x2e0 net/core/dev.c:1586
 __dev_close net/core/dev.c:1624 [inline]
 __dev_change_flags+0x2cb/0x730 net/core/dev.c:8476
 dev_change_flags+0x8a/0x160 net/core/dev.c:8549
 dev_ifsioc+0x210/0xa70 net/core/dev_ioctl.c:265
 dev_ioctl+0x1b1/0xc40 net/core/dev_ioctl.c:511
 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060
 sock_ioctl+0x477/0x6a0 net/socket.c:1177
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl fs/ioctl.c:739 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: syzbot+93976391bf299d425f44@syzkaller.appspotmail.com
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Link: https://lore.kernel.org/r/20210213133653.367130-1-markus.theil@tu-ilmenau.deSigned-off-by: Johannes Berg <johannes.berg@intel.com>

3bd801b1

mac80211: fix rate mask reset · 1944015f

Johannes Berg authored Feb 12, 2021

Coverity reported the strange "if (~...)" condition that's
always true. It suggested that ! was intended instead of ~,
but upon further analysis I'm convinced that what really was
intended was a comparison to 0xff/0xffff (in HT/VHT cases
respectively), since this indicates that all of the rates
are enabled.

Change the comparison accordingly.

I'm guessing this never really mattered because a reset to
not having a rate mask is basically equivalent to having a
mask that enables all rates.
Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: 2ffbe6d3 ("mac80211: fix and optimize MCS mask handling")
Fixes: b119ad6e ("mac80211: add rate mask logic for vht rates")
Reviewed-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210212112213.36b38078f569.I8546a20c80bc1669058eb453e213630b846e107b@changeidSigned-off-by: Johannes Berg <johannes.berg@intel.com>

1944015f

15 Mar, 2021 5 commits

mptcp: fix ADD_ADDR HMAC in case port is specified · 13832ae2

Davide Caratti authored Mar 15, 2021

Currently, Linux computes the HMAC contained in ADD_ADDR sub-option using
the Address Id and the IP Address, and hardcodes a destination port equal
to zero. This is not ok for ADD_ADDR with port: ensure to account for the
endpoint port when computing the HMAC, in compliance with RFC8684 §3.4.1.

Fixes: 22fb85ff ("mptcp: add port support for ADD_ADDR suboption writing")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Acked-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

13832ae2

tcp: relookup sock for RST+ACK packets handled by obsolete req sock · 7233da86

Alexander Ovechkin authored Mar 15, 2021

Currently tcp_check_req can be called with obsolete req socket for which big
socket have been already created (because of CPU race or early demux
assigning req socket to multiple packets in gro batch).

Commit e0f9759f ("tcp: try to keep packet if SYN_RCV race
is lost") added retry in case when tcp_check_req is called for PSH|ACK packet.
But if client sends RST+ACK immediatly after connection being
established (it is performing healthcheck, for example) retry does not
occur. In that case tcp_check_req tries to close req socket,
leaving big socket active.

Fixes: e0f9759f ("tcp: try to keep packet if SYN_RCV race is lost")
Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>
Reported-by: Oleg Senin <olegsenin@yandex-team.ru>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7233da86

tipc: better validate user input in tipc_nl_retrieve_key() · 0217ed28

Eric Dumazet authored Mar 15, 2021

Before calling tipc_aead_key_size(ptr), we need to ensure
we have enough data to dereference ptr->keylen.

We probably also want to make sure tipc_aead_key_size()
wont overflow with malicious ptr->keylen values.

Syzbot reported:

BUG: KMSAN: uninit-value in __tipc_nl_node_set_key net/tipc/node.c:2971 [inline]
BUG: KMSAN: uninit-value in tipc_nl_node_set_key+0x9bf/0x13b0 net/tipc/node.c:3023
CPU: 0 PID: 21060 Comm: syz-executor.5 Not tainted 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x21c/0x280 lib/dump_stack.c:120
 kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118
 __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197
 __tipc_nl_node_set_key net/tipc/node.c:2971 [inline]
 tipc_nl_node_set_key+0x9bf/0x13b0 net/tipc/node.c:3023
 genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
 genl_rcv_msg+0x1319/0x1610 net/netlink/genetlink.c:800
 netlink_rcv_skb+0x6fa/0x810 net/netlink/af_netlink.c:2494
 genl_rcv+0x63/0x80 net/netlink/genetlink.c:811
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x11d6/0x14a0 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x1740/0x1840 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg net/socket.c:672 [inline]
 ____sys_sendmsg+0xcfc/0x12f0 net/socket.c:2345
 ___sys_sendmsg net/socket.c:2399 [inline]
 __sys_sendmsg+0x714/0x830 net/socket.c:2432
 __compat_sys_sendmsg net/compat.c:347 [inline]
 __do_compat_sys_sendmsg net/compat.c:354 [inline]
 __se_compat_sys_sendmsg+0xa7/0xc0 net/compat.c:351
 __ia32_compat_sys_sendmsg+0x4a/0x70 net/compat.c:351
 do_syscall_32_irqs_on arch/x86/entry/common.c:79 [inline]
 __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:141
 do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:166
 do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:209
 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
RIP: 0023:0xf7f60549
Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
RSP: 002b:00000000f555a5fc EFLAGS: 00000296 ORIG_RAX: 0000000000000172
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020000200
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline]
 kmsan_internal_poison_shadow+0x5c/0xf0 mm/kmsan/kmsan.c:104
 kmsan_slab_alloc+0x8d/0xe0 mm/kmsan/kmsan_hooks.c:76
 slab_alloc_node mm/slub.c:2907 [inline]
 __kmalloc_node_track_caller+0xa37/0x1430 mm/slub.c:4527
 __kmalloc_reserve net/core/skbuff.c:142 [inline]
 __alloc_skb+0x2f8/0xb30 net/core/skbuff.c:210
 alloc_skb include/linux/skbuff.h:1099 [inline]
 netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline]
 netlink_sendmsg+0xdbc/0x1840 net/netlink/af_netlink.c:1894
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg net/socket.c:672 [inline]
 ____sys_sendmsg+0xcfc/0x12f0 net/socket.c:2345
 ___sys_sendmsg net/socket.c:2399 [inline]
 __sys_sendmsg+0x714/0x830 net/socket.c:2432
 __compat_sys_sendmsg net/compat.c:347 [inline]
 __do_compat_sys_sendmsg net/compat.c:354 [inline]
 __se_compat_sys_sendmsg+0xa7/0xc0 net/compat.c:351
 __ia32_compat_sys_sendmsg+0x4a/0x70 net/compat.c:351
 do_syscall_32_irqs_on arch/x86/entry/common.c:79 [inline]
 __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:141
 do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:166
 do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:209
 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c

Fixes: e1f32190 ("tipc: add support for AEAD key setting via netlink")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tuong Lien <tuong.t.lien@dektech.com.au>
Cc: Jon Maloy <jmaloy@redhat.com>
Cc: Ying Xue <ying.xue@windriver.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0217ed28

net: phylink: Fix phylink_err() function name error in phylink_major_config · d82c6c1a

Ong Boon Leong authored Mar 15, 2021

if pl->mac_ops->mac_finish() failed, phylink_err should use
"mac_finish" instead of "mac_prepare".

Fixes: b7ad14c2 ("net: phylink: re-implement interface configuration with PCS")
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d82c6c1a

net: hdlc_x25: Prevent racing between "x25_close" and "x25_xmit"/"x25_rx" · bf0ffea3

Xie He authored Mar 14, 2021

"x25_close" is called by "hdlc_close" in "hdlc.c", which is called by
hardware drivers' "ndo_stop" function.
"x25_xmit" is called by "hdlc_start_xmit" in "hdlc.c", which is hardware
drivers' "ndo_start_xmit" function.
"x25_rx" is called by "hdlc_rcv" in "hdlc.c", which receives HDLC frames
from "net/core/dev.c".

"x25_close" races with "x25_xmit" and "x25_rx" because their callers race.

However, we need to ensure that the LAPB APIs called in "x25_xmit" and
"x25_rx" are called before "lapb_unregister" is called in "x25_close".

This patch adds locking to ensure when "x25_xmit" and "x25_rx" are doing
their work, "lapb_unregister" is not yet called in "x25_close".

Reasons for not solving the racing between "x25_close" and "x25_xmit" by
calling "netif_tx_disable" in "x25_close":
1. We still need to solve the racing between "x25_close" and "x25_rx";
2. The design of the HDLC subsystem assumes the HDLC hardware drivers
have full control over the TX queue, and the HDLC protocol drivers (like
this driver) have no control. Controlling the queue here in the protocol
driver may interfere with hardware drivers' control of the queue.

Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bf0ffea3

14 Mar, 2021 4 commits

flow_dissector: fix byteorder of dissected ICMP ID · a25f8222

Alexander Lobakin authored Mar 12, 2021

flow_dissector_key_icmp::id is of type u16 (CPU byteorder),
ICMP header has its ID field in network byteorder obviously.
Sparse says:

net/core/flow_dissector.c:178:43: warning: restricted __be16 degrades to integer

Convert ID value to CPU byteorder when storing it into
flow_dissector_key_icmp.

Fixes: 5dec597e ("flow_dissector: extract more ICMP information")
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

a25f8222

net: qrtr: fix a kernel-infoleak in qrtr_recvmsg() · 50535249

Eric Dumazet authored Mar 12, 2021

struct sockaddr_qrtr has a 2-byte hole, and qrtr_recvmsg() currently
does not clear it before copying kernel data to user space.

It might be too late to name the hole since sockaddr_qrtr structure is uapi.

BUG: KMSAN: kernel-infoleak in kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249
CPU: 0 PID: 29705 Comm: syz-executor.3 Not tainted 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x21c/0x280 lib/dump_stack.c:120
 kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118
 kmsan_internal_check_memory+0x202/0x520 mm/kmsan/kmsan.c:402
 kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249
 instrument_copy_to_user include/linux/instrumented.h:121 [inline]
 _copy_to_user+0x1ac/0x270 lib/usercopy.c:33
 copy_to_user include/linux/uaccess.h:209 [inline]
 move_addr_to_user+0x3a2/0x640 net/socket.c:237
 ____sys_recvmsg+0x696/0xd50 net/socket.c:2575
 ___sys_recvmsg net/socket.c:2610 [inline]
 do_recvmmsg+0xa97/0x22d0 net/socket.c:2710
 __sys_recvmmsg net/socket.c:2789 [inline]
 __do_sys_recvmmsg net/socket.c:2812 [inline]
 __se_sys_recvmmsg+0x24a/0x410 net/socket.c:2805
 __x64_sys_recvmmsg+0x62/0x80 net/socket.c:2805
 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x465f69
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f43659d6188 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465f69
RDX: 0000000000000008 RSI: 0000000020003e40 RDI: 0000000000000003
RBP: 00000000004bfa8f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000010060 R11: 0000000000000246 R12: 000000000056bf60
R13: 0000000000a9fb1f R14: 00007f43659d6300 R15: 0000000000022000

Local variable ----addr@____sys_recvmsg created at:
 ____sys_recvmsg+0x168/0xd50 net/socket.c:2550
 ____sys_recvmsg+0x168/0xd50 net/socket.c:2550

Bytes 2-3 of 12 are uninitialized
Memory access of size 12 starts at ffff88817c627b40
Data copied to user address 0000000020000140

Fixes: bdabad3e ("net: Add Qualcomm IPC router")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Courtney Cavin <courtney.cavin@sonymobile.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

50535249

net: arcnet: com20020 fix error handling · 6577b9a5

Tong Zhang authored Mar 14, 2021

There are two issues when handling error case in com20020pci_probe()

1. priv might be not initialized yet when calling com20020pci_remove()
from com20020pci_probe(), since the priv is set at the very last but it
can jump to error handling in the middle and priv remains NULL.
2. memory leak - the net device is allocated in alloc_arcdev but not
properly released if error happens in the middle of the big for loop

[    1.529110] BUG: kernel NULL pointer dereference, address: 0000000000000008
[    1.531447] RIP: 0010:com20020pci_remove+0x15/0x60 [com20020_pci]
[    1.536805] Call Trace:
[    1.536939]  com20020pci_probe+0x3f2/0x48c [com20020_pci]
[    1.537226]  local_pci_probe+0x48/0x80
[    1.539918]  com20020pci_init+0x3f/0x1000 [com20020_pci]
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6577b9a5

devlink: fix typo in documentation · ad236ccd

Eva Dengler authored Mar 13, 2021

This commit fixes three spelling typos in devlink-dpipe.rst and
devlink-port.rst.
Signed-off-by: Eva Dengler <eva.dengler@fau.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ad236ccd

13 Mar, 2021 9 commits

net: ipa: terminate message handler arrays · 3a9ef3e1

Alex Elder authored Mar 12, 2021

When a QMI handle is initialized, an array of message handler
structures is provided, defining how any received message should
be handled based on its type and message ID.  The QMI core code
traverses this array when a message arrives and calls the function
associated with the (type, msg_id) found in the array.

The array is supposed to be terminated with an empty (all zero)
entry though.  Without it, an unsupported message will cause
the QMI core code to go past the end of the array.

Fix this bug, by properly terminating the message handler arrays
provided when QMI handles are set up by the IPA driver.

Fixes: 530f9216 ("soc: qcom: ipa: AP/modem communications")
Reported-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

3a9ef3e1

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git · 47251a36

David S. Miller authored Mar 12, 2021

/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-03-12

This series contains updates to ice, i40e, ixgbe and igb drivers.

Magnus adjusts the return value for xsk allocation for ice. This fixes
reporting of napi work done and matches the behavior of other Intel NIC
drivers for xsk allocations.

Maciej moves storing of the rx_offset value to after the build_skb flag
is set as this flag affects the offset value for ice, i40e, and ixgbe.

Li RongQing resolves an issue where an Rx buffer can be reused
prematurely with XDP redirect for igb.
====================

47251a36

ibmvnic: update MAINTAINERS · 6afa455e

Lijun Pan authored Mar 12, 2021

Tom wrote most of the driver code and his experience is valuable to us.
Add him as a Reviewer so that patches will be Cc'ed and reviewed by him.
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6afa455e

selftests: mptcp: Restore packet capture option in join tests · a673321a

Mat Martineau authored Mar 12, 2021

The join self tests previously used the '-c' command line option to
enable creation of pcap files for the tests that run, but the change to
allow running a subset of the join tests made overlapping use of that
option.

Restore the capture functionality with '-c' and move the syncookie test
option to '-k'.

Fixes: 1002b89f ("selftests: mptcp: add command line arguments for mptcp_join.sh")
Acked-and-tested-by: Geliang Tang <geliangtang@gmail.com>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a673321a

net: dsa: mt7530: setup core clock even in TRGMII mode · c3b8e079

Ilya Lipnitskiy authored Mar 12, 2021

A recent change to MIPS ralink reset logic made it so mt7530 actually
resets the switch on platforms such as mt7621 (where bit 2 is the reset
line for the switch). That exposed an issue where the switch would not
function properly in TRGMII mode after a reset.

Reconfigure core clock in TRGMII mode to fix the issue.

Tested on Ubiquiti ER-X (MT7621) with TRGMII mode enabled.

Fixes: 3f9ef778 ("MIPS: ralink: manage low reset lines")
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3b8e079

mptcp: fix bit MPTCP_PUSH_PENDING tests · 2e5de7e0

Dan Carpenter authored Mar 12, 2021

The MPTCP_PUSH_PENDING define is 6 and these tests should be testing if
BIT(6) is set.

Fixes: c2e6048f ("mptcp: fix race in release_cb")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

2e5de7e0

net: phy: broadcom: Fix RGMII delays for BCM50160 and BCM50610M · b1dd9bf6

Florian Fainelli authored Mar 11, 2021

The PHY driver entry for BCM50160 and BCM50610M calls
bcm54xx_config_init() but does not call bcm54xx_config_clock_delay() in
order to configuration appropriate clock delays on the PHY, fix that.

Fixes: 73333626 ("net: phy: Allow BCM5481x PHYs to setup internal TX/RX clock delay")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1dd9bf6

ftgmac100: Restart MAC HW once · 68970873

Dylan Hung authored Mar 12, 2021

The interrupt handler may set the flag to reset the mac in the future,
but that flag is not cleared once the reset has occurred.

Fixes: 10cbd640 ("ftgmac100: Rework NAPI & interrupts handling")
Signed-off-by: Dylan Hung <dylan_hung@aspeedtech.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

68970873

net: axienet: Fix probe error cleanup · 59cd4f19

Robert Hancock authored Mar 11, 2021

The driver did not always clean up all allocated resources when probe
failed. Fix the probe cleanup path to clean up everything that was
allocated.

Fixes: 57baf8cc ("net: axienet: Handle deferred probe on clock properly")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

59cd4f19

12 Mar, 2021 15 commits

net: correct sk_acceptq_is_full() · f211ac15

liuyacan authored Mar 12, 2021

The "backlog" argument in listen() specifies
the maximom length of pending connections,
so the accept queue should be considered full
if there are exactly "backlog" elements.
Signed-off-by: liuyacan <yacanliu@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f211ac15

Revert "net: bonding: fix error return code of bond_neigh_init()" · 080bfa1e

David S. Miller authored Mar 12, 2021

This reverts commit 2055a99d.

This change rejects legitimate configurations.

A slave doesn't need to exist nor implement ndo_slave_setup.
Signed-off-by: David S. Miller <davem@davemloft.net>

080bfa1e

igb: avoid premature Rx buffer reuse · 98dfb02a

Li RongQing authored Jan 21, 2021

Igb needs a similar fix as commit 75aab4e1 ("i40e: avoid
premature Rx buffer reuse")

The page recycle code, incorrectly, relied on that a page fragment
could not be freed inside xdp_do_redirect(). This assumption leads to
that page fragments that are used by the stack/XDP redirect can be
reused and overwritten.

To avoid this, store the page count prior invoking xdp_do_redirect().

Longer explanation:

Intel NICs have a recycle mechanism. The main idea is that a page is
split into two parts. One part is owned by the driver, one part might
be owned by someone else, such as the stack.

t0: Page is allocated, and put on the Rx ring
              +---------------
used by NIC ->| upper buffer
(rx_buffer)   +---------------
              | lower buffer
              +---------------
  page count  == USHRT_MAX
  rx_buffer->pagecnt_bias == USHRT_MAX

t1: Buffer is received, and passed to the stack (e.g.)
              +---------------
              | upper buff (skb)
              +---------------
used by NIC ->| lower buffer
(rx_buffer)   +---------------
  page count  == USHRT_MAX
  rx_buffer->pagecnt_bias == USHRT_MAX - 1

t2: Buffer is received, and redirected
              +---------------
              | upper buff (skb)
              +---------------
used by NIC ->| lower buffer
(rx_buffer)   +---------------

Now, prior calling xdp_do_redirect():
  page count  == USHRT_MAX
  rx_buffer->pagecnt_bias == USHRT_MAX - 2

This means that buffer *cannot* be flipped/reused, because the skb is
still using it.

The problem arises when xdp_do_redirect() actually frees the
segment. Then we get:
  page count  == USHRT_MAX - 1
  rx_buffer->pagecnt_bias == USHRT_MAX - 2

From a recycle perspective, the buffer can be flipped and reused,
which means that the skb data area is passed to the Rx HW ring!

To work around this, the page count is stored prior calling
xdp_do_redirect().

Fixes: 9cbc948b ("igb: add XDP support")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Tested-by: Vishakha Jambekar <vishakha.jambekar@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

98dfb02a

ixgbe: move headroom initialization to ixgbe_configure_rx_ring · 76064573

Maciej Fijalkowski authored Mar 03, 2021

ixgbe_rx_offset(), that is supposed to initialize the Rx buffer headroom,
relies on __IXGBE_RX_BUILD_SKB_ENABLED flag.

Currently, the callsite of mentioned function is placed incorrectly
within ixgbe_setup_rx_resources() where Rx ring's build skb flag is not
set yet. This causes the XDP_REDIRECT to be partially broken due to
inability to create xdp_frame in the headroom space, as the headroom is
0.

Fix this by moving ixgbe_rx_offset() to ixgbe_configure_rx_ring() after
the flag setting, which happens to be set in ixgbe_set_rx_buffer_len.

Fixes: c0d4e9d2 ("ixgbe: store the result of ixgbe_rx_offset() onto ixgbe_ring")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Vishakha Jambekar <vishakha.jambekar@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

76064573

ice: move headroom initialization to ice_setup_rx_ctx · 89861c48

Maciej Fijalkowski authored Mar 03, 2021

ice_rx_offset(), that is supposed to initialize the Rx buffer headroom,
relies on ICE_RX_FLAGS_RING_BUILD_SKB flag as well as XDP prog presence.

Currently, the callsite of mentioned function is placed incorrectly
within ice_setup_rx_ring() where Rx ring's build skb flag is not
set yet. This causes the XDP_REDIRECT to be partially broken due to
inability to create xdp_frame in the headroom space, as the headroom is
0.

Fix this by moving ice_rx_offset() to ice_setup_rx_ctx() after the flag
setting.

Fixes: f1b1f409 ("ice: store the result of ice_rx_offset() onto ice_ring")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

89861c48

i40e: move headroom initialization to i40e_configure_rx_ring · a8660626

Maciej Fijalkowski authored Mar 03, 2021

i40e_rx_offset(), that is supposed to initialize the Rx buffer headroom,
relies on I40E_RXR_FLAGS_BUILD_SKB_ENABLED flag.

Currently, the callsite of mentioned function is placed incorrectly
within i40e_setup_rx_descriptors() where Rx ring's build skb flag is not
set yet. This causes the XDP_REDIRECT to be partially broken due to
inability to create xdp_frame in the headroom space, as the headroom is
0.

For the record, below is the call graph:

i40e_vsi_open
 i40e_vsi_setup_rx_resources
  i40e_setup_rx_descriptors
   i40e_rx_offset() <-- sets offset to 0 as build_skb flag is set below

 i40e_vsi_configure_rx
  i40e_configure_rx_ring
   set_ring_build_skb_enabled(ring) <-- set build_skb flag

Fix this by moving i40e_rx_offset() to i40e_configure_rx_ring() after
the flag setting.

Fixes: f7bb0d71 ("i40e: store the result of i40e_rx_offset() onto i40e_ring")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

a8660626

ice: fix napi work done reporting in xsk path · ed0907e3

Magnus Karlsson authored Feb 05, 2021

Fix the wrong napi work done reporting in the xsk path of the ice
driver. The code in the main Rx processing loop was written to assume
that the buffer allocation code returns true if all allocations where
successful and false if not. In contrast with all other Intel NIC xsk
drivers, the ice_alloc_rx_bufs_zc() has the inverted logic messing up
the work done reporting in the napi loop.

This can be fixed either by inverting the return value from
ice_alloc_rx_bufs_zc() in the function that uses this in an incorrect
way, or by changing the return value of ice_alloc_rx_bufs_zc(). We
chose the latter as it makes all the xsk allocation functions for
Intel NICs behave in the same way. My guess is that it was this
unexpected discrepancy that gave rise to this bug in the first place.

Fixes: 5bb0c4b5 ("ice, xsk: Move Rx allocation out of while-loop")
Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ed0907e3

Merge branch 'htb-fixes' · 451b2596

David S. Miller authored Mar 11, 2021

Maxim Mikityanskiy says:

====================
Bugfixes for HTB

The HTB offload feature introduced a few bugs in HTB. One affects the
non-offload mode, preventing attaching qdiscs to HTB classes, and the
other affects the error flow, when the netdev doesn't support the
offload, but it was requested. This short series fixes them.
====================
Acked-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

451b2596

sch_htb: Fix offload cleanup in htb_destroy on htb_init failure · fb3a3e37

Maxim Mikityanskiy authored Mar 11, 2021

htb_init may fail to do the offload if it's not supported or if a
runtime error happens when allocating direct qdiscs. In those cases
TC_HTB_CREATE command is not sent to the driver, however, htb_destroy
gets called anyway and attempts to send TC_HTB_DESTROY.

It shouldn't happen, because the driver didn't receive TC_HTB_CREATE,
and also because the driver may not support ndo_setup_tc at all, while
q->offload is true, and htb_destroy mistakenly thinks the offload is
supported. Trying to call ndo_setup_tc in the latter case will lead to a
NULL pointer dereference.

This commit fixes the issues with htb_destroy by deferring assignment of
q->offload until after the TC_HTB_CREATE command. The necessary cleanup
of the offload entities is already done in htb_init.

Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com
Fixes: d03b195b ("sch_htb: Hierarchical QoS hardware offload")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fb3a3e37

sch_htb: Fix select_queue for non-offload mode · 93bde210

Maxim Mikityanskiy authored Mar 11, 2021

htb_select_queue assumes it's always the offload mode, and it ends up in
calling ndo_setup_tc without any checks. It may lead to a NULL pointer
dereference if ndo_setup_tc is not implemented, or to an error returned
from the driver, which will prevent attaching qdiscs to HTB classes in
the non-offload mode.

This commit fixes the bug by adding the missing check to
htb_select_queue. In the non-offload mode it will return sch->dev_queue,
mimicking tc_modify_qdisc's behavior for the case where select_queue is
not implemented.

Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com
Fixes: d03b195b ("sch_htb: Hierarchical QoS hardware offload")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

93bde210

net: phy: broadcom: Add power down exit reset state delay · 7a1468ba

Florian Fainelli authored Mar 10, 2021

Per the datasheet, when we clear the power down bit, the PHY remains in
an internal reset state for 40us and then resume normal operation.
Account for that delay to avoid any issues in the future if
genphy_resume() changes.

Fixes: fe26821f ("net: phy: broadcom: Wire suspend/resume for BCM54810")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7a1468ba

mISDN: fix crash in fritzpci · a9f81244

Tong Zhang authored Mar 10, 2021

setup_fritz() in avmfritz.c might fail with -EIO and in this case the
isac.type and isac.write_reg is not initialized and remains 0(NULL).
A subsequent call to isac_release() will dereference isac->write_reg and
crash.

[    1.737444] BUG: kernel NULL pointer dereference, address: 0000000000000000
[    1.737809] #PF: supervisor instruction fetch in kernel mode
[    1.738106] #PF: error_code(0x0010) - not-present page
[    1.738378] PGD 0 P4D 0
[    1.738515] Oops: 0010 [#1] SMP NOPTI
[    1.738711] CPU: 0 PID: 180 Comm: systemd-udevd Not tainted 5.12.0-rc2+ #78
[    1.739077] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-p
rebuilt.qemu.org 04/01/2014
[    1.739664] RIP: 0010:0x0
[    1.739807] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[    1.740200] RSP: 0018:ffffc9000027ba10 EFLAGS: 00010202
[    1.740478] RAX: 0000000000000000 RBX: ffff888102f41840 RCX: 0000000000000027
[    1.740853] RDX: 00000000000000ff RSI: 0000000000000020 RDI: ffff888102f41800
[    1.741226] RBP: ffffc9000027ba20 R08: ffff88817bc18440 R09: ffffc9000027b808
[    1.741600] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888102f41840
[    1.741976] R13: 00000000fffffffb R14: ffff888102f41800 R15: ffff8881008b0000
[    1.742351] FS:  00007fda3a38a8c0(0000) GS:ffff88817bc00000(0000) knlGS:0000000000000000
[    1.742774] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.743076] CR2: ffffffffffffffd6 CR3: 00000001021ec000 CR4: 00000000000006f0
[    1.743452] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.743828] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    1.744206] Call Trace:
[    1.744339]  isac_release+0xcc/0xe0 [mISDNipac]
[    1.744582]  fritzpci_probe.cold+0x282/0x739 [avmfritz]
[    1.744861]  local_pci_probe+0x48/0x80
[    1.745063]  pci_device_probe+0x10f/0x1c0
[    1.745278]  really_probe+0xfb/0x420
[    1.745471]  driver_probe_device+0xe9/0x160
[    1.745693]  device_driver_attach+0x5d/0x70
[    1.745917]  __driver_attach+0x8f/0x150
[    1.746123]  ? device_driver_attach+0x70/0x70
[    1.746354]  bus_for_each_dev+0x7e/0xc0
[    1.746560]  driver_attach+0x1e/0x20
[    1.746751]  bus_add_driver+0x152/0x1f0
[    1.746957]  driver_register+0x74/0xd0
[    1.747157]  ? 0xffffffffc00d8000
[    1.747334]  __pci_register_driver+0x54/0x60
[    1.747562]  AVM_init+0x36/0x1000 [avmfritz]
[    1.747791]  do_one_initcall+0x48/0x1d0
[    1.747997]  ? __cond_resched+0x19/0x30
[    1.748206]  ? kmem_cache_alloc_trace+0x390/0x440
[    1.748458]  ? do_init_module+0x28/0x250
[    1.748669]  do_init_module+0x62/0x250
[    1.748870]  load_module+0x23ee/0x26a0
[    1.749073]  __do_sys_finit_module+0xc2/0x120
[    1.749307]  ? __do_sys_finit_module+0xc2/0x120
[    1.749549]  __x64_sys_finit_module+0x1a/0x20
[    1.749782]  do_syscall_64+0x38/0x90
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a9f81244

net/qlcnic: Fix a use after free in qlcnic_83xx_get_minidump_template · db74623a

Lv Yunlong authored Mar 10, 2021

In qlcnic_83xx_get_minidump_template, fw_dump->tmpl_hdr was freed by
vfree(). But unfortunately, it is used when extended is true.

Fixes: 7061b2bd ("qlogic: Deletion of unnecessary checks before two function calls")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

db74623a

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · ce6c13e4

David S. Miller authored Mar 11, 2021

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-03-11

This series contains updates to igc and e1000e drivers.

Sasha adds locking to reset task to prevent race condition for igc.

Muhammad fixes reporting of supported pause frame as well as advertised
pause frame for Tx/Rx off for igc.

Andre fixes timestamp retrieval from the wrong timer for igc.

Vitaly adds locking to reset task to prevent race condition for e1000e.

Dinghao Liu adds a missed check to return on error in
e1000_set_d0_lplu_state_82571.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ce6c13e4

net: sock: simplify tw proto registration · b80350f3

Tonghao Zhang authored Mar 11, 2021

Introduce the new function tw_prot_init (inspired by
req_prot_init) to simplify "proto_register" function.

tw_prot_cleanup will take care of a partially initialized
timewait_sock_ops.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b80350f3

11 Mar, 2021 1 commit

e1000e: Fix error handling in e1000_set_d0_lplu_state_82571 · b52912b8

Dinghao Liu authored Feb 28, 2021

There is one e1e_wphy() call in e1000_set_d0_lplu_state_82571
that we have caught its return value but lack further handling.
Check and terminate the execution flow just like other e1e_wphy()
in this function.

Fixes: bc7f75fa ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

b52912b8