- 08 Dec, 2022 18 commits
-
-
Po-Hao Huang authored
To support multiple vifs, fw need more information of each role. Send this info to make things work as expected. Signed-off-by: Po-Hao Huang <phhuang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221202061527.505668-5-pkshih@realtek.com
-
Po-Hao Huang authored
Remove according vifs from list if we couldn't set this interface up. Otherwise the rtwvif_list could contain unreferenced objects. Signed-off-by: Po-Hao Huang <phhuang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221202061527.505668-4-pkshih@realtek.com
-
Po-Hao Huang authored
Disable hardware beacon related functions when ap stops. So hardware won't transmit beacons while interface is already removed. Signed-off-by: Po-Hao Huang <phhuang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221202061527.505668-3-pkshih@realtek.com
-
Po-Hao Huang authored
If the interface is in AP/P2P GO mode, we adjust the TSF with random offset to avoid TBTT of different vifs to overlap and collide. For every new interface added, we adjust the value and resync for all interfaces. Signed-off-by: Po-Hao Huang <phhuang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221202061527.505668-2-pkshih@realtek.com
-
Zong-Zhe Yang authored
Under some condition, we now have to do early request full firmware when rtw89_early_fw_feature_recognize(). In this case, we can avoid requesting full firmware twice during probing driver. So, we pass out full firmware from rtw89_early_fw_feature_recognize() if it's requested successfully. And then, if firmware is settled, we have no need to request full firmware again during normal initizating flow. Setting firmware flow is updated to be as the following. platform | early recognizing | normally initizating ----------------------------------------------------------------------- deny reading | request full FW | (no more FW requesting) partial file | | (obtain FW from early pahse) ----------------------------------------------------------------------- able to read | request partial FW | async request full FW partial file | (quite small chunk) | Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221202060521.501512-3-pkshih@realtek.com
-
Zong-Zhe Yang authored
Kernel logs on platform enabling SECURITY_LOADPIN_ENFORCE ------ ``` LoadPin: firmware old-api-denied obj=<unknown> pid=810 cmdline="modprobe -q -- rtw89_8852ce" rtw89_8852ce 0000:01:00.0: loading /lib/firmware/rtw89/rtw8852c_fw.bin failed with error -1 rtw89_8852ce 0000:01:00.0: Direct firmware load for rtw89/rtw8852c_fw.bin failed with error -1 rtw89_8852ce 0000:01:00.0: failed to early request firmware: -1 ``` Trace ------ ``` request_partial_firmware_into_buf() > _request_firmware() >> fw_get_filesystem_firmware() >>> kernel_read_file_from_path_initns() >>>> kernel_read_file() >>>>> security_kernel_read_file() // It will iterate enabled LSMs' hooks for kernel_read_file. // With loadpin, it hooks loadpin_read_file. ``` If SECURITY_LOADPIN_ENFORCE is enabled, doing kernel_read_file() on partial files will be denied and return -EPERM (-1). Then, the outer API based on it, e.g. request_partial_firmware_into_buf(), will get the error. In the case, we cannot get the firmware stuffs right, even though there might be no error other than a permission issue on reading a partial file. So we have to request full firmware if SECURITY_LOADPIN_ENFORCE is enabled. It makes us still have a chance to do early firmware work on this kind of platforms. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221202060521.501512-2-pkshih@realtek.com
-
Wang Yufen authored
Fix to return a negative error code instead of 0 when brcmf_chip_set_active() fails. In addition, change the return value for brcmf_pcie_exit_download_state() to keep consistent. Fixes: d380ebc9 ("brcmfmac: rename chip download functions") Signed-off-by: Wang Yufen <wangyufen@huawei.com> Reviewed-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/1669959342-27144-1-git-send-email-wangyufen@huawei.com
-
Bitterblue Smith authored
The ra_report struct is used for reporting the TX rate via sta_statistics. The code which fills it out is duplicated in two places, and the RTL8188EU will need it in a third place. Move this code into a new function rtl8xxxu_update_ra_report. Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Reviewed-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/0777ad35-fe03-473c-2e02-e3390bef5dd0@gmail.com
-
Bitterblue Smith authored
The gen 2 chips RTL8192EU and RTL8188FU periodically send the driver reports about the TX rate, and the driver passes these reports to sta_statistics. The reports from RTL8192EU may or may not include the channel width. The reports from RTL8188FU do not include it. Only access the c2h->ra_report.bw field if the report (skb) is big enough. The other problem fixed here is that the code was actually never changing the channel width initially reported by rtl8xxxu_bss_info_changed because the value of RATE_INFO_BW_20 is 0. Fixes: 0985d3a4 ("rtl8xxxu: Feed current txrate information for mac80211") Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Reviewed-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/5b41f1ae-72e7-6b7a-2459-b736399a1c40@gmail.com
-
Bitterblue Smith authored
This struct is used to access a sequence of bytes received from the wifi chip. It must not have any padding bytes between the members. This doesn't change anything on my system, possibly because currently none of the members need more than byte alignment. Fixes: b2b43b78 ("rtl8xxxu: Initial functionality to handle C2H events for 8723bu") Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Reviewed-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/1a270918-da22-ff5f-29fc-7855f740c5ba@gmail.com
-
Arend van Spriel authored
Using a namespace variant to make clear it is only intended to be used by the vendor-specific modules. The symbol will only truly export the symbols when the driver and consequently the vendor-specific part are built as kernel modules. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129135446.151065-8-arend.vanspriel@broadcom.com
-
Arend van Spriel authored
Upon probe the driver determines the vendor supporting the device. Expose this information in the revinfo debugfs file. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129135446.151065-7-arend.vanspriel@broadcom.com
-
Arend van Spriel authored
Broadcom BCA division develops its own firmware api and as such will likely diverge over time (or already has). Add support for handling this. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129135446.151065-6-arend.vanspriel@broadcom.com
-
Arend van Spriel authored
Cypress uses the brcmfmac driver and releases firmware which will likely diverge over time (or already has). So adding support for handling that. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129135446.151065-5-arend.vanspriel@broadcom.com
-
Arend van Spriel authored
The driver is being used by multiple vendors who develop the firmware api independently. So far the firmware api as used by the driver has not diverged (yet). This change adds framework for supporting multiple firmware apis. The vendor-specific support code has to provide a number of callback operations. Right now it is only attach and detach callbacks so no real functionality as the api is still common. This code only adds WCC variant anyway, which is selected for all devices right now. The vendor-specific part will be built in a separate module when the driver is configured to be built as a module through Kconfig, ie. when CONFIG_BRCMFMAC=m. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129135446.151065-4-arend.vanspriel@broadcom.com
-
Arend van Spriel authored
In order to determine the vendor that released a firmware image for a specific device, the device table now sets the vendor identifier in driver info and it is stored in struct brcmf_bus::fwvid during probe. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129135446.151065-3-arend.vanspriel@broadcom.com
-
Arend van Spriel authored
Introduce a new bus callback .remove() which will unbind the device from the driver. This allows the common driver layer to stop handling a device. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129135446.151065-2-arend.vanspriel@broadcom.com
-
Jiapeng Chong authored
Functions write_nic_auto_inc_address() and write_nic_dword_auto_inc() are defined in the ipw2100.c file, but not called elsewhere, so remove these unused functions. drivers/net/wireless/intel/ipw2x00/ipw2100.c:427:20: warning: unused function 'write_nic_dword_auto_inc'. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3285Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129062407.83157-1-jiapeng.chong@linux.alibaba.com
-
- 03 Dec, 2022 3 commits
-
-
Lorenzo Bianconi authored
In order to fix the following sleep while atomic bug always alloc pages with GFP_ATOMIC in mtk_wed_wo_queue_refill since page_frag_alloc runs in spin_lock critical section. [ 9.049719] Hardware name: MediaTek MT7986a RFB (DT) [ 9.054665] Call trace: [ 9.057096] dump_backtrace+0x0/0x154 [ 9.060751] show_stack+0x14/0x1c [ 9.064052] dump_stack_lvl+0x64/0x7c [ 9.067702] dump_stack+0x14/0x2c [ 9.071001] ___might_sleep+0xec/0x120 [ 9.074736] __might_sleep+0x4c/0x9c [ 9.078296] __alloc_pages+0x184/0x2e4 [ 9.082030] page_frag_alloc_align+0x98/0x1ac [ 9.086369] mtk_wed_wo_queue_refill+0x134/0x234 [ 9.090974] mtk_wed_wo_init+0x174/0x2c0 [ 9.094881] mtk_wed_attach+0x7c8/0x7e0 [ 9.098701] mt7915_mmio_wed_init+0x1f0/0x3a0 [mt7915e] [ 9.103940] mt7915_pci_probe+0xec/0x3bc [mt7915e] [ 9.108727] pci_device_probe+0xac/0x13c [ 9.112638] really_probe.part.0+0x98/0x2f4 [ 9.116807] __driver_probe_device+0x94/0x13c [ 9.121147] driver_probe_device+0x40/0x114 [ 9.125314] __driver_attach+0x7c/0x180 [ 9.129133] bus_for_each_dev+0x5c/0x90 [ 9.132953] driver_attach+0x20/0x2c [ 9.136513] bus_add_driver+0x104/0x1fc [ 9.140333] driver_register+0x74/0x120 [ 9.144153] __pci_register_driver+0x40/0x50 [ 9.148407] mt7915_init+0x5c/0x1000 [mt7915e] [ 9.152848] do_one_initcall+0x40/0x25c [ 9.156669] do_init_module+0x44/0x230 [ 9.160403] load_module+0x1f30/0x2750 [ 9.164135] __do_sys_init_module+0x150/0x200 [ 9.168475] __arm64_sys_init_module+0x18/0x20 [ 9.172901] invoke_syscall.constprop.0+0x4c/0xe0 [ 9.177589] do_el0_svc+0x48/0xe0 [ 9.180889] el0_svc+0x14/0x50 [ 9.183929] el0t_64_sync_handler+0x9c/0x120 [ 9.188183] el0t_64_sync+0x158/0x15c Fixes: 79968444 ("net: ethernet: mtk_wed: introduce wed wo support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://lore.kernel.org/r/67ca94bdd3d9eaeb86e52b3050fbca0bcf7bb02f.1669908312.git.lorenzo@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
kfree_rcu(1-arg) should be avoided as much as possible, since this is only possible from sleepable contexts, and incurr extra rcu barriers. I wish the 1-arg variant of kfree_rcu() would get a distinct name, like kfree_rcu_slow() to avoid it being abused. Fixes: 459837b5 ("net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Dmitry Safonov <dima@arista.com> Link: https://lore.kernel.org/r/20221202052847.2623997-1-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Merge tag 'wireless-next-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.2 Third set of patches for v6.2. mt76 has a new driver for mt7996 Wi-Fi 7 devices and iwlwifi also got initial Wi-Fi 7 support. Otherwise smaller features and fixes. Major changes: ath10k - store WLAN firmware version in SMEM image table mt76 - mt7996: new driver for MediaTek Wi-Fi 7 (802.11be) devices - mt7986, mt7915: enable Wireless Ethernet Dispatch (WED) offload support - mt7915: add ack signal support - mt7915: enable coredump support - mt7921: remain_on_channel support - mt7921: channel context support iwlwifi - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities - 320 MHz channels support * tag 'wireless-next-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (144 commits) wifi: ath10k: fix QCOM_SMEM dependency wifi: mt76: mt7921e: add pci .shutdown() support wifi: mt76: mt7915: mmio: fix naming convention wifi: mt76: mt7996: add support to configure spatial reuse parameter set wifi: mt76: mt7996: enable ack signal support wifi: mt76: mt7996: enable use_cts_prot support wifi: mt76: mt7915: rely on band_idx of mt76_phy wifi: mt76: mt7915: enable per bandwidth power limit support wifi: mt76: mt7915: introduce mt7915_get_power_bound() mt76: mt7915: Fix PCI device refcount leak in mt7915_pci_init_hif2() wifi: mt76: do not send firmware FW_FEATURE_NON_DL region wifi: mt76: mt7921: Add missing __packed annotation of struct mt7921_clc wifi: mt76: fix coverity overrun-call in mt76_get_txpower() wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices wifi: mt76: mt76x0: remove dead code in mt76x0_phy_get_target_power wifi: mt76: mt7915: fix band_idx usage wifi: mt76: mt7915: enable .sta_set_txpwr support wifi: mt76: mt7915: add basedband Txpower info into debugfs wifi: mt76: mt7915: add support to configure spatial reuse parameter set wifi: mt76: mt7915: add missing MODULE_PARM_DESC ... ==================== Link: https://lore.kernel.org/r/20221202214254.D0D3DC433C1@smtp.kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 02 Dec, 2022 19 commits
-
-
Kalle Valo authored
Nathan noticed that when HWSPINLOCK is disabled there's a Kconfig warning: WARNING: unmet direct dependencies detected for QCOM_SMEM Depends on [n]: (ARCH_QCOM [=y] || COMPILE_TEST [=n]) && HWSPINLOCK [=n] Selected by [m]: - ATH10K_SNOC [=m] && NETDEVICES [=y] && WLAN [=y] && WLAN_VENDOR_ATH [=y] && ATH10K [=m] && (ARCH_QCOM [=y] || COMPILE_TEST [=n]) The problem here is that QCOM_SMEM depends on HWSPINLOCK so we cannot select QCOM_SMEM and instead we neeed to use 'depends on'. Reported-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/all/Y4YsyaIW+CPdHWv3@dev-arch.thelio-3990X/ Fixes: 4d79f6f3 ("wifi: ath10k: Store WLAN firmware version in SMEM image table") Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221202103027.25974-1-kvalo@kernel.org
-
Gerhard Engleder authored
Refill RX queue in batches of descriptors to improve performance. Refill is allowed to fail as long as a minimum number of descriptors is active. Thus, a limited number of failed RX buffer allocations is now allowed for normal operation. Previously every failed allocation resulted in a dropped frame. If the minimum number of active descriptors is reached, then RX buffers are still reused and frames are dropped. This ensures that the RX queue never runs empty and always continues to operate. Prework for future XDP support. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerhard Engleder authored
Without interrupt throttling, iperf server mode generates a CPU load of 100% (A53 1.2GHz). Also the throughput suffers with less than 900Mbit/s on a 1Gbit/s link. The reason is a high interrupt load with interrupts every ~20us. Reduce interrupt load by throttling of interrupts. Interrupt delay default is 64us. For iperf server mode the CPU load is significantly reduced to ~20% and the throughput reaches the maximum of 941MBit/s. Interrupts are generated every ~140us. RX and TX coalesce can be configured with ethtool. RX coalesce has priority over TX coalesce if the same interrupt is used. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerhard Engleder authored
Allow user space to read number of TX and RX queue. This is useful for device dependent qdisc configurations like TAPRIO with hardware offload. Also ethtool::get_per_queue_coalesce / set_per_queue_coalesce requires that interface. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerhard Engleder authored
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jonathan Toppins authored
Correct xmit hash steps for layer3+4 as introduced by commit 49aefd13 ("bonding: do not discard lowest hash bit for non layer3+4 hashing"). Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jonathan Toppins authored
With commit c1f897ce ("bonding: set default miimon value for non-arp modes if not set") the miimon default was changed from zero to 100 if arp_interval is also zero. Document this fact in bonding.rst. Fixes: c1f897ce ("bonding: set default miimon value for non-arp modes if not set") Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andy Shevchenko authored
The main usage of the struct thunderbolt_ip_frame_header is to handle the packets on the media layer. The header is bound to the protocol in which the byte ordering is crucial. However the data type definition doesn't use that and sparse is unhappy, for example (17 altogether): .../thunderbolt.c:718:23: warning: cast to restricted __le32 .../thunderbolt.c:966:42: warning: incorrect type in assignment (different base types) .../thunderbolt.c:966:42: expected unsigned int [usertype] frame_count .../thunderbolt.c:966:42: got restricted __le32 [usertype] Switch to the bitwise types in the struct thunderbolt_ip_frame_header to reduce this, but not completely solving (9 left), because the same data type is used for Rx header handled locally (in CPU byte order). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andy Shevchenko authored
Letting the compiler remove these functions when the kernel is built without CONFIG_PM_SLEEP support is simpler and less heavier for builds than the use of __maybe_unused attributes. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Some devlink instances may contain thousands of ports. Storing them in linked list and looking them up is not scalable. Convert the linked list into xarray. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Sebastian Andrzej Siewior says: ==================== I started playing with HSR and run into a problem. Tested latest upstream -rc and noticed more problems. Now it appears to work. For testing I have a small three node setup with iperf and ping. While iperf doesn't complain ping reports missing packets and duplicates. ==================== Link: https://lore.kernel.org/r/20221129164815.128922-1-bigeasy@linutronix.de/Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sebastian Andrzej Siewior authored
This test adds a basic HSRv0 network with 3 nodes. In its current shape it sends and forwards packets, announcements and so merges nodes based on MAC A/B information. It is able to detect duplicate packets and packetloss should any occur. Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sebastian Andrzej Siewior authored
self_node_db is a list_head with one entry of struct hsr_node. The purpose is to hold the two MAC addresses of the node itself. It is convenient to recycle the structure. However having a list_head and fetching always the first entry is not really optimal. Created a new data strucure contaning the two MAC addresses named hsr_self_node. Access that structure like an RCU protected pointer so it can be replaced on the fly without blocking the reader. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sebastian Andrzej Siewior authored
hsr_register_frame_out() compares new sequence_nr vs the old one recorded in hsr_node::seq_out and if the new sequence_nr is higher then it will be written to hsr_node::seq_out as the new value. This operation isn't locked so it is possible that two frames with the same sequence number arrive (via the two slave devices) and are fed to hsr_register_frame_out() at the same time. Both will pass the check and update the sequence counter later to the same value. As a result the content of the same packet is fed into the stack twice. This was noticed by running ping and observing DUP being reported from time to time. Instead of using the hsr_priv::seqnr_lock for the whole receive path (as it is for sending in the master node) add an additional lock that is only used for sequence number checks and updates. Add a per-node lock that is used during sequence number reads and updates. Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sebastian Andrzej Siewior authored
Sending frames via the hsr (master) device requires a sequence number which is tracked in hsr_priv::sequence_nr and protected by hsr_priv::seqnr_lock. Each time a new frame is sent, it will obtain a new id and then send it via the slave devices. Each time a packet is sent (via hsr_forward_do()) the sequence number is checked via hsr_register_frame_out() to ensure that a frame is not handled twice. This make sense for the receiving side to ensure that the frame is not injected into the stack twice after it has been received from both slave ports. There is no locking to cover the sending path which means the following scenario is possible: CPU0 CPU1 hsr_dev_xmit(skb1) hsr_dev_xmit(skb2) fill_frame_info() fill_frame_info() hsr_fill_frame_info() hsr_fill_frame_info() handle_std_frame() handle_std_frame() skb1's sequence_nr = 1 skb2's sequence_nr = 2 hsr_forward_do() hsr_forward_do() hsr_register_frame_out(, 2) // okay, send) hsr_register_frame_out(, 1) // stop, lower seq duplicate Both skbs (or their struct hsr_frame_info) received an unique id. However since skb2 was sent before skb1, the higher sequence number was recorded in hsr_register_frame_out() and the late arriving skb1 was dropped and never sent. This scenario has been observed in a three node HSR setup, with node1 + node2 having ping and iperf running in parallel. From time to time ping reported a missing packet. Based on tracing that missing ping packet did not leave the system. It might be possible (didn't check) to drop the sequence number check on the sending side. But if the higher sequence number leaves on wire before the lower does and the destination receives them in that order and it will drop the packet with the lower sequence number and never inject into the stack. Therefore it seems the only way is to lock the whole path from obtaining the sequence number and sending via dev_queue_xmit() and assuming the packets leave on wire in the same order (and don't get reordered by the NIC). Cover the whole path for the master interface from obtaining the ID until after it has been forwarded via hsr_forward_skb() to ensure the skbs are sent to the NIC in the order of the assigned sequence numbers. Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sebastian Andrzej Siewior authored
The hsr device is a software device. Its net_device_ops::ndo_start_xmit() routine will process the packet and then pass the resulting skb to dev_queue_xmit(). During processing, hsr acquires a lock with spin_lock_bh() (hsr_add_node()) which needs to be promoted to the _irq() suffix in order to avoid a potential deadlock. Then there are the warnings in dev_queue_xmit() (due to local_bh_disable() with disabled interrupts) left. Instead trying to address those (there is qdisc and…) for netpoll sake, just disable netpoll on hsr. Disable netpoll on hsr and replace the _irqsave() locking with _bh(). Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sebastian Andrzej Siewior authored
Due to the hashed-MAC optimisation one problem become visible: hsr_handle_sup_frame() walks over the list of available nodes and merges two node entries into one if based on the information in the supervision both MAC addresses belong to one node. The list-walk happens on a RCU protected list and delete operation happens under a lock. If the supervision arrives on both slave interfaces at the same time then this delete operation can occur simultaneously on two CPUs. The result is the first-CPU deletes the from the list and the second CPUs BUGs while attempting to dereference a poisoned list-entry. This happens more likely with the optimisation because a new node for the mac_B entry is created once a packet has been received and removed (merged) once the supervision frame has been received. Avoid removing/ cleaning up a hsr_node twice by adding a `removed' field which is set to true after the removal and checked before the removal. Fixes: f266a683 ("net/hsr: Better frame dispatch") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sebastian Andrzej Siewior authored
hsr_forward_skb() a skb and keeps information in an on-stack hsr_frame_info. hsr_get_node() assigns hsr_frame_info::node_src which is from a RCU list. This pointer is used later in hsr_forward_do(). I don't see a reason why this pointer can't vanish midway since there is no guarantee that hsr_forward_skb() is invoked from an RCU read section. Use rcu_read_lock() to protect hsr_frame_info::node_src from its assignment until it is no longer used. Fixes: f266a683 ("net/hsr: Better frame dispatch") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sebastian Andrzej Siewior authored
The hlist optimisation (which not only uses hlist_head instead of list_head but also splits hsr_priv::node_db into an array of 256 slots) does not consider the "node merge": Upon starting the hsr network (with three nodes) a packet that is sent from node1 to node3 will also be sent from node1 to node2 and then forwarded to node3. As a result node3 will receive 2 packets because it is not able to filter out the duplicate. Each packet received will create a new struct hsr_node with macaddress_A only set the MAC address it received from (the two MAC addesses from node1). At some point (early in the process) two supervision frames will be received from node1. They will be processed by hsr_handle_sup_frame() and one frame will leave early ("Node has already been merged") and does nothing. The other frame will be merged as portB and have its MAC address written to macaddress_B and the hsr_node (that was created for it as macaddress_A) will be removed. From now on HSR is able to identify a duplicate because both packets sent from one node will result in the same struct hsr_node because hsr_get_node() will find the MAC address either on macaddress_A or macaddress_B. Things get tricky with the optimisation: If sender's MAC address is saved as macaddress_A then the lookup will work as usual. If the MAC address has been merged into macaddress_B of another hsr_node then the lookup won't work because it is likely that the data structure is in another bucket. This results in creating a new struct hsr_node and not recognising a possible duplicate. A way around it would be to add another hsr_node::mac_list_B and attach it to the other bucket to ensure that this hsr_node will be looked up either via macaddress_A _or_ macaddress_B. I however prefer to revert it because it sounds like an academic problem rather than real life workload plus it adds complexity. I'm not an HSR expert with what is usual size of a network but I would guess 40 to 60 nodes. With 10.000 nodes and assuming 60us for pass-through (from node to node) then it would take almost 600ms for a packet to almost wrap around which sounds a lot. Revert the hash MAC addresses optimisation. Fixes: 4acc45db ("net: hsr: use hlist_head instead of list_head for mac addresses") Cc: Juhee Kang <claudiajkang@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-