- 13 Aug, 2023 7 commits
-
-
Yue Haibing authored
Commit 39de8281 ("RDS: Main header file") declared but never implemented rds_trans_init() and rds_trans_exit(); remove them. Commit d37c9359 ("RDS: Move loop-only function to loop.c") removed the implementation of rds_message_inc_free() but not the declaration. Since commit 55b7ed0b ("RDS: Common RDMA transport code"), rds_rdma_conn_connect() has never been implemented or used. rds_tcp_map_seq() has never been implemented or used since commit 70041088 ("RDS: Add TCP transport to RDS"). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
sk_diag_put_flags(), netlink_setsockopt(), netlink_getsockopt() and others use nlk->flags without correct locking. Use set_bit(), clear_bit(), test_bit(), assign_bit() to remove data-races. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
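A minimal userspace sketch of the idea, assuming C11 atomics as a stand-in for the kernel's set_bit(), clear_bit(), test_bit() and assign_bit(). The struct, the helper names and the flag numbers are hypothetical and are not the netlink code; the point is only that each flag update becomes a single atomic read-modify-write instead of a plain load and store on a shared word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit numbers, for illustration only. */
enum { FLAG_RECV_PKTINFO = 0, FLAG_CAP_ACK = 1 };

/* Hypothetical stand-in for a socket structure holding a flags word. */
struct sock_flags {
    atomic_ulong flags;
};

/* Atomically set bit nr (userspace analogue of set_bit()). */
static inline void flag_set(struct sock_flags *s, unsigned int nr)
{
    atomic_fetch_or(&s->flags, 1UL << nr);
}

/* Atomically clear bit nr (analogue of clear_bit()). */
static inline void flag_clear(struct sock_flags *s, unsigned int nr)
{
    atomic_fetch_and(&s->flags, ~(1UL << nr));
}

/* Read bit nr with an atomic load (analogue of test_bit()). */
static inline bool flag_test(struct sock_flags *s, unsigned int nr)
{
    return (atomic_load(&s->flags) >> nr) & 1UL;
}

/* Set or clear bit nr depending on value (analogue of assign_bit()). */
static inline void flag_assign(struct sock_flags *s, unsigned int nr, bool value)
{
    if (value)
        flag_set(s, nr);
    else
        flag_clear(s, nr);
}
```

With a plain `flags |= bit` two concurrent writers can lose each other's update; the atomic versions above avoid that without taking a lock.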
-
David S. Miller authored
Menglong Dong says: ==================== net: tcp: support probing OOM In this series, we make some small changes so that TCP retransmissions become zero-window probes if the receiver drops the skb because of memory pressure. In the 1st patch, we reply with a zero-window ACK if the skb is dropped because we are out of memory, instead of dropping the skb silently. In the 2nd patch, we allow a zero-window ACK to update the window. In the 3rd patch, we fix the unexpected socket death when snd_wnd is 0 in tcp_retransmit_timer(). In the 4th patch, we refactor the debug message in tcp_retransmit_timer() to make it more correct. After these changes, TCP can probe the OOM of the receiver forever. Changes since v3: - make the timeout "2 * TCP_RTO_MAX" in the 3rd patch - tp->retrans_stamp is not based on jiffies and can't be compared with icsk->icsk_timeout in the 3rd patch. Fix it. - introduce the 4th patch Changes since v2: - refactor the code to avoid code duplication in the 1st patch - use after() instead of max() in tcp_rtx_probe0_timed_out() Changes since v1: - send 0 rwin ACK for the receive queue empty case when necessary in the 1st patch - send the ACK immediately by using the ICSK_ACK_NOW flag in the 1st patch - consider the case of the connection restarting from idle, as Neal commented, in the 3rd patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Menglong Dong authored
The debug message in tcp_retransmit_timer() is slightly wrong, because it could be printed even if we did not receive a new ACK packet from the remote peer. Change it to "Probing zero-window", as this is an expected case now and the old description may not be correct. Add the duration since the last ACK we received and the duration of the retransmission, which are useful for debugging. The message now looks like this: Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 209ms ago, lasting 209ms Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 404ms ago, lasting 408ms Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 812ms ago, lasting 1224ms Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Menglong Dong authored
In tcp_retransmit_timer(), a window-shrunk connection will be regarded as timed out if 'tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX'. This is not always right. The retransmits become zero-window probes in tcp_retransmit_timer() if 'snd_wnd==0'. Therefore, icsk->icsk_rto will reach TCP_RTO_MAX sooner or later. However, the timer can be delayed and be triggered after 122877ms, not TCP_RTO_MAX, as I tested. Therefore, 'tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX' is always true once the RTO reaches TCP_RTO_MAX, and the socket will die. Fix this by replacing 'tcp_jiffies32' with '(u32)icsk->icsk_timeout', which is exactly the timestamp of the timeout. However, "tp->rcv_tstamp" can restart from idle, in which case tp->rcv_tstamp could already be a long time (minutes or hours) in the past even on the first RTO. So we double-check the timeout against the duration of the retransmission. Meanwhile, use "2 * TCP_RTO_MAX" as the timeout to avoid the socket dying too soon. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/netdev/CADxym3YyMiO+zMD4zj03YPM3FBi-1LHi6gSD2XT8pyAMM096pg@mail.gmail.com/ Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
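The reasoning above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel function: the helper names are hypothetical, all values are plain 32-bit millisecond counters, and 120000 ms is used only as an example maximum RTO. The first helper models the old comparison; the second models the described fix, which measures from the timer deadline, uses twice the maximum RTO, and additionally requires the retransmission itself to have lasted that long.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old-style check: "now" minus the last-ACK time against one rto_max. */
static inline bool old_timed_out(uint32_t now, uint32_t last_ack,
                                 uint32_t rto_max)
{
    /* Unsigned subtraction stays correct across counter wraparound. */
    return (uint32_t)(now - last_ack) > rto_max;
}

/*
 * Sketch of the described fix (hypothetical helper):
 *  - measure from the timer deadline instead of the current time,
 *    so a late-firing timer does not inflate the delta;
 *  - allow 2 * rto_max before giving up;
 *  - also require the retransmission to have lasted that long,
 *    because last_ack may be very old after a restart from idle.
 */
static inline bool probe0_timed_out(uint32_t deadline, uint32_t last_ack,
                                    uint32_t rtx_duration, uint32_t rto_max)
{
    uint32_t limit = 2 * rto_max;

    if ((uint32_t)(deadline - last_ack) <= limit)
        return false;
    return rtx_duration > limit;
}
```

For example, with a maximum RTO of 120000 ms and a timer that fires at 122877 ms, `old_timed_out(122877, 0, 120000)` is true, while `probe0_timed_out(120000, 0, 120000, 120000)` is false, so the connection keeps probing.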
-
Menglong Dong authored
For now, an ACK can update the window in the following cases, according to tcp_may_update_window(): 1. the ACK acknowledges new data 2. the ACK carries new data 3. the ACK expands the window and its seq is valid Now, we also allow the ACK to update the window if the window is 0 and its seq/ack is valid. This is for the case where the receiver replies with a zero-window ACK when it is under memory stress and can't queue the new data. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
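The rules above can be modelled in a small userspace sketch. The struct and function names here are hypothetical, the field names only mirror the ones mentioned in the log, and "seq is valid" is interpreted as "the segment's sequence number equals the one recorded at the last window update", which is an assumption of this sketch rather than a statement about the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is after b" for 32-bit sequence numbers. */
static inline bool seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Hypothetical subset of sender state used by the decision. */
struct snd_state {
    uint32_t snd_una;  /* oldest unacknowledged sequence number */
    uint32_t snd_wl1;  /* segment seq used for the last window update */
    uint32_t snd_wnd;  /* current send window */
};

/*
 * Decide whether an incoming ACK may update the send window:
 *  1. it acknowledges new data,
 *  2. it carries new data (its seq is newer than snd_wl1),
 *  3. same seq, and the advertised window either grows or is zero
 *     (the zero case is the newly allowed one).
 */
static inline bool may_update_window(const struct snd_state *st,
                                     uint32_t ack, uint32_t ack_seq,
                                     uint32_t nwin)
{
    return seq_after(ack, st->snd_una) ||
           seq_after(ack_seq, st->snd_wl1) ||
           (ack_seq == st->snd_wl1 && (nwin > st->snd_wnd || nwin == 0));
}
```

In this model a pure duplicate ACK that shrinks the window to a non-zero value is still ignored, while one that advertises a zero window is accepted.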
-
Menglong Dong authored
For now, the skb is dropped when there is no memory, which makes the client keep retransmitting until timeout, and this is not friendly to users. In this patch, we reply with a zero-window ACK in this case to update the snd_wnd of the sender to 0. Therefore, the sender won't time out the connection and will probe the zero window with the retransmits. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 12 Aug, 2023 3 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Jakub Kicinski authored
Tony Nguyen says: ==================== i40e: Replace one-element arrays with flexible-array members Replace one-element arrays with flexible-array members in multiple structures. This results in no differences in binary output. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: i40e: Replace one-element array with flex-array member in struct i40e_profile_aq_section i40e: Replace one-element array with flex-array member in struct i40e_section_table i40e: Replace one-element array with flex-array member in struct i40e_profile_segment i40e: Replace one-element array with flex-array member in struct i40e_package_header ==================== Link: https://lore.kernel.org/r/20230810175302.1964182-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
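As a rough illustration of this kind of conversion, the sketch below contrasts a one-element trailing array with a C99 flexible-array member. The struct names, field names and allocation helper are hypothetical and are not the i40e definitions. The trailing array starts at the same offset in both forms; only `sizeof` changes, so any size arithmetic has to account for the removed element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Old style: a one-element array used as a variable-length tail. */
struct table_old {
    uint32_t count;
    uint32_t offset[1];
};

/* New style: a C99 flexible-array member. */
struct table_new {
    uint32_t count;
    uint32_t offset[];
};

/* Allocate a table with room for count trailing entries. */
static inline struct table_new *table_alloc(uint32_t count)
{
    struct table_new *t;

    /* sizeof(*t) no longer includes a phantom first element. */
    t = malloc(sizeof(*t) + (size_t)count * sizeof(t->offset[0]));
    if (t)
        t->count = count;
    return t;
}
```

A caller would typically do `struct table_new *t = table_alloc(4);`, fill `t->offset[0..3]`, and release it with `free(t)`.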
-
Arnd Bergmann authored
The init function is only referenced locally, so it should be static to avoid this warning: drivers/net/ethernet/amd/atarilance.c:370:28: error: no previous prototype for 'atarilance_probe' [-Werror=missing-prototypes] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20230810122528.1220434-2-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arnd Bergmann authored
The function is exported for no reason and should just be static: drivers/net/ethernet/sun/ldmvsw.c:127:5: error: no previous prototype for 'ldmvsw_open' [-Werror=missing-prototypes] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Link: https://lore.kernel.org/r/20230810122528.1220434-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 11 Aug, 2023 30 commits
-
-
David S. Miller authored
Alexis Lothoré says: ==================== net: dsa: rzn1-a5psw: add support for vlan and .port_bridge_flags this series enables vlan support in Renesas RZN1 internal ethernet switch, and is a follow up to the work initiated by Clement Leger a few months ago, who handed me over the topic. This new revision aims to iron the last few points raised by Vladimir to ensure that the driver is in line with switch drivers expectations, and is based on the lengthy discussion in [1] (thanks Vladimir for the valuable explanations) [1] https://lore.kernel.org/netdev/20230314163651.242259-1-clement.leger@bootlin.com/ ---- V5: - ensure that flooding can be enabled only if port belongs to a bridge - enable learning in a5psw_port_stp_state_set() only if port has learning enabled - toggle vlan tagging on vlan filtering in - removed reviewed-by on second patch since its modified RESEND V4: - Resent due to net-next being closed V4: - Fix missing CPU port bit in a5psw->bridged_ports - Use unsigned int for vlan_res_id parameters - Rename a5psw_get_vlan_res_entry() to a5psw_new_vlan_res_entry() - In a5psw_port_vlan_add(), return -ENOSPC when no VLAN entry is found - In a5psw_port_vlan_filtering(), compute "val" from "mask" V3: - Target net-next tree and correct version... V2: - Fixed a few formatting errors - Add .port_bridge_flags implementation ==================== Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Clément Léger authored
Add support for VLAN operations (add, del, filtering) to the RZN1 driver. The a5psw switch supports up to 32 VLAN IDs with filtering, tagged/untagged VLANs and a PVID for each port. Signed-off-by: Clément Léger <clement.leger@bootlin.com> Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Clément Léger authored
When running the VLAN tests (bridge_vlan_aware/unaware.sh), there were some failures due to the lack of a .port_bridge_flags function to disable port flooding. Implement this operation for BR_LEARNING, BR_FLOOD, BR_MCAST_FLOOD and BR_BCAST_FLOOD. Since .port_bridge_flags affects the bits disabling learning for a port, ensure that any other modification of the same register done by a5psw_port_stp_state_set() is in sync by using the port learning state to enable/disable learning on the port. Signed-off-by: Clément Léger <clement.leger@bootlin.com> Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Clément Léger authored
.port_bridge_flags will be added and will allow the flood mask to be modified independently for each port. Keeping the existing bridged_ports write in a5psw_flooding_set_resolution() could interfere with this. Use a read-modify-write to set that value and move the bridged_ports handling into bridge_port_join/leave. Signed-off-by: Clément Léger <clement.leger@bootlin.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
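The read-modify-write pattern mentioned above can be sketched generically as below. This is not the a5psw driver code: the register is modelled as a plain variable, and the helper name and bit layout are hypothetical. The point is that only the bits selected by the mask change, so flood bits configured for other ports are preserved instead of being overwritten by a full-register write.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Generic read-modify-write on a register value (hypothetical helper):
 * bits inside mask take their value from val, all other bits are kept.
 */
static inline uint32_t reg_rmw(uint32_t old, uint32_t mask, uint32_t val)
{
    return (old & ~mask) | (val & mask);
}

/* Enable or disable flooding for one port bit in a modelled register. */
static inline void flood_set_port(uint32_t *reg, unsigned int port, int enable)
{
    uint32_t mask = 1U << port;

    *reg = reg_rmw(*reg, mask, enable ? mask : 0);
}
```

For example, starting from a register value of 0x0A, enabling port 2 gives 0x0E and then disabling port 1 gives 0x0C; the bits of ports that were not touched stay as they were.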
-
Hariprasad Kelam authored
The current implementation does not allow the user to enable both hw-tc-offload and ntuple features on the interface. These checks are added as TC flower offload and ntuple features internally configures the same hardware resource MCAM. But TC HTB offload configures the transmit scheduler which can be safely enabled on the interface with ntuple feature. This patch adds the same and ensures only TC flower offload and ntuple features are mutually exclusive. Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Zhengchao Shao says: ==================== bonding: do some cleanups in bond driver Do some cleanups in bond driver. --- v2: use IS_ERR instead of NULL check in patch 2/5, update commit information in patch 3/5, remove inline modifier in patch 4/5 ==================== Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zhengchao Shao authored
The free_percpu function also could check whether "rr_tx_counter" parameter is NULL. Therefore, remove NULL check in bond_destructor. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zhengchao Shao authored
In bond_reset_slave_arr(), values are assigned and memory is released only when the variables "usable" and "all" are not NULL. But even if the "usable" and "all" variables are NULL, they can still work, because value will be checked in kfree_rcu. Therefore, use bond_set_slave_arr() and set the input parameters "usable_slaves" and "all_slaves" to NULL to simplify the code in bond_reset_slave_arr(). And the same to bond_uninit(). Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zhengchao Shao authored
Because debugfs_create_dir() returns ERR_PTR, bonding_debug_root will never be NULL. Remove the redundant NULL check for bonding_debug_root in the debugfs functions. The later debugfs_create_dir/debugfs_remove_recursive functions will check the dentry with IS_ERR(). Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zhengchao Shao authored
Because debugfs_create_dir() returns ERR_PTR on failure, IS_ERR() should be used to check whether the directory was created successfully. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
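To see why a NULL check does not catch this kind of failure, here is a small userspace emulation of the error-pointer convention. The lowercase helpers are hypothetical stand-ins written for this sketch, modelled on the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); they are not the kernel macros themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value that is encoded into a pointer in this sketch. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer lies in the top MAX_ERRNO addresses. */
static inline bool is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
static inline long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}
```

An error pointer such as `err_ptr(-ENOMEM)` is not NULL, so `if (!dir)` never fires on failure; `is_err(dir)` is the check that actually detects it.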
-
Zhengchao Shao authored
Some functions are only used in initialization and exit functions, so add the __init/__net_init and __net_exit modifiers to these functions. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Krzysztof Kozlowski authored
'type' is an enum, thus cast of pointer on 64-bit compile test with W=1 causes: mvmdio.c:272:9: error: cast to smaller integer type 'enum orion_mdio_bus_type' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Signed-off-by: David S. Miller <davem@davemloft.net>
-
Krzysztof Kozlowski authored
'enet_id' is an enum, thus cast of pointer on 64-bit compile test with W=1 causes: xgene_enet_main.c:2044:20: error: cast to smaller integer type 'enum xgene_enet_id' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Simon Horman <horms@kernel.org> # build-tested Signed-off-by: David S. Miller <davem@davemloft.net>
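One common way to avoid this class of warning is to go through an integer type that has the same width as a pointer before converting to the enum. The sketch below assumes that approach; the enum and helper names are hypothetical and the code is not taken from either driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum stored in a "const void *" match-data slot. */
enum bus_type {
    BUS_TYPE_A = 1,
    BUS_TYPE_B = 2,
};

/* Store the enum value in a pointer-sized slot. */
static inline const void *bus_type_to_data(enum bus_type type)
{
    return (const void *)(uintptr_t)type;
}

/*
 * Read it back: casting to uintptr_t first keeps the pointer-to-integer
 * conversion at full pointer width, and only then narrows to the enum.
 */
static inline enum bus_type bus_type_from_data(const void *data)
{
    return (enum bus_type)(uintptr_t)data;
}
```

A direct `(enum bus_type)data` cast is what clang flags with -Wvoid-pointer-to-enum-cast on 64-bit builds, because the enum is narrower than the pointer.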
-
Shradha Gupta authored
Extended performance counter stats in 'ethtool -S <interface>' for MANA VF to include GDMA tx LSO packets and bytes count. Tested-on: Ubuntu22 Testcases: 1. LISA testcase: PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-Synthetic 2. LISA testcase: PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-SRIOV 3. Validated the GDMA stat packets and byte counters Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sathesh Edara authored
Implement control plane mailbox versions for host and firmware. Versions are published in the info area of the control mailbox bar4 memory structure. Firmware will publish the minimum and maximum supported versions. Control plane mailbox APIs will check the firmware version before sending any control commands to firmware. Notifications from firmware will similarly be checked for host version compatibility. Signed-off-by: Sathesh Edara <sedara@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ratheesh Kannoth authored
Accept TC offload classifier rule only if SPI field can be extracted by HW. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Antonov authored
If netdev_mc_count() is not zero and not IFF_ALLMULTI, filter incoming multicast packets. The chip has a Multicast Address Hash Table for allowed multicast addresses, so we fill it. Implement .ndo_set_rx_mode and recalculate multicast hash table. Also observe change of IFF_PROMISC and IFF_ALLMULTI netdev flags. Signed-off-by: Sergei Antonov <saproj@gmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tahsin Erdogan authored
When gso.hdr_len is zero and a packet is transmitted via write() or writev(), all payload is treated as header which requires a contiguous memory allocation. This allocation request is harder to satisfy, and may even fail if there is enough fragmentation. Note that sendmsg() code path limits the linear copy length, so this change makes write()/writev() and sendmsg() paths more consistent. Signed-off-by: Tahsin Erdogan <trdgn@amazon.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230809164753.2247594-1-trdgn@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yang Yingliang authored
The driver's init/exit functions don't do anything special, so the driver can use the module_pci_driver() macro to eliminate boilerplate code. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://lore.kernel.org/r/20230810014633.3084355-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yue Haibing authored
Commit 61c9fed4 ("[SCTP]: A better solution to fix the race between sctp_peeloff() and sctp_rcv().") removed the implementation but left declaration in place. Remove it. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Acked-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/20230809142323.9428-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski authored
Merge net again, after pulling in x86/bugs fixes to clang linking errors. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yue Haibing authored
Commit 43e36921 ("caif: Move refcount from service layer to sock and dev.") declared but never implemented this. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230809134943.37844-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Jakub Kicinski authored
Cross merge x86 fixes to fix clang linking errors: ld.lld: error: ./arch/x86/kernel/vmlinux.lds:221: at least one side of the expression must be absolute These will hopefully be downstream by the time we ship the next batch of fixes. * 'x86/bugs' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Move gds_ucode_mitigated() declaration to header x86/speculation: Add cpu_show_gds() prototype driver core: cpu: Make cpu_show_not_affected() static x86/srso: Fix build breakage with the LLVM linker Documentation/srso: Document IBPB aspect and fix formatting driver core: cpu: Unify redundant silly stubs Documentation/hw-vuln: Unify filename specification in index Link: https://lore.kernel.org/all/CAHk-=wj_b+FGTnevQSBAtCWuhCk=0oQ_THvthBW2hzqpOTLFmg@mail.gmail.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Furong Xu authored
Commit abe80fdc ("net: stmmac: RX queue routing configuration") introduced RX queue routing to the DWMAC4 core. This patch extends the support to the XGMAC2 core. Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230809020238.1136732-1-0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Andrew Lunn says: ==================== Support offload LED blinking to PHY. Allow offloading of the LED trigger netdev to PHY drivers and implement it for the Marvell PHY driver. Additionally, correct the handling of the case where the initial state of the LED cannot be represented by the trigger, and so an error is returned. As with ledtrig-timer, disable offload when the trigger is deactivated or replaced by another trigger. ==================== Link: https://lore.kernel.org/r/20230808210436.838995-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Andrew Lunn authored
Ensure that the offloading of blinking is stopped when the trigger is deactivated. Calling led_set_brightness() is documented as stopping offload and setting the LED to a constant brightness. Suggested-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Daniel Golle <daniel@makrotopia.org> Link: https://lore.kernel.org/r/20230808210436.838995-5-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Andrew Lunn authored
Add the code needed to indicate if a given blinking pattern can be offloaded, to offload a pattern and to try to return the current pattern. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Daniel Golle <daniel@makrotopia.org> Link: https://lore.kernel.org/r/20230808210436.838995-4-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Andrew Lunn authored
Linux LEDs can be requested to perform hardware accelerated blinking to indicate link, RX, TX etc. Pass the rules for blinking to the PHY driver, if it implements the ops needed to determine if a given pattern can be offloaded, to offload it, and what the current offload is. Additionally implement the op needed to get what device the LED is for. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Daniel Golle <daniel@makrotopia.org> Link: https://lore.kernel.org/r/20230808210436.838995-3-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Andrew Lunn authored
When the netdev trigger is activated, it tries to determine what device the LED blinks for, and what the current blink mode is. The documentation for hw_control_get() says: * Return 0 on success, a negative error number on failing parsing the * initial mode. Error from this function is NOT FATAL as the device * may be in a not supported initial state by the attached LED trigger. */ For the Marvell PHY and the Armada 370-rd board, the initial LED blink mode is not supported by the trigger, so it returns an error. This resulted in not getting the device the LED is blinking for. As a result, the device is unknown and offloading is never performed. Change the condition to always get the device if offloading is supported, and reduce the scope of testing for an error from hw_control_get() to skip setting trigger internal state if there is an error. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Daniel Golle <daniel@makrotopia.org> Link: https://lore.kernel.org/r/20230808210436.838995-2-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Li Zetao authored
module_mhi_driver() sets driver.owner to THIS_MODULE when registering an mhi_driver, so initializing driver.owner in the driver definition is redundant. Remove it to clean up the code. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230808021238.2975585-1-lizetao1@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-