- 26 Nov, 2021 14 commits
-
-
Julian Wiedmann authored
ethtool_set_coalesce() now uses both the .get_coalesce() and .set_coalesce() callbacks. But the check for their availability is buggy, so changing the coalesce settings on a device where the driver provides only _one_ of the callbacks results in a NULL pointer dereference instead of an -EOPNOTSUPP. Fix the condition so that the availability of both callbacks is ensured. This also matches the netlink code. Note that reproducing this requires some effort - it only affects the legacy ioctl path, and needs a specific combination of driver options: - have .get_coalesce() and .coalesce_supported but no .set_coalesce(), or - have .set_coalesce() but no .get_coalesce(). Here eg. ethtool doesn't cause the crash as it first attempts to call ethtool_get_coalesce() and bails out on error. Fixes: f3ccfda1 ("ethtool: extend coalesce setting uAPI with CQE mode") Cc: Yufeng Mo <moyufeng@huawei.com> Cc: Huazhong Tan <tanhuazhong@huawei.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Link: https://lore.kernel.org/r/20211126175543.28000-1-jwi@linux.ibm.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Thadeu Lima de Souza Cascardo authored
Device permissions is S_IALLUGO, with many unnecessary bits. Remove them and also remove read and write permissions from group and others. Before the change: crwsrwsrwt 1 0 0 10, 125 Nov 25 13:59 /dev/virtual_nci After the change: crw------- 1 0 0 10, 125 Nov 25 14:05 /dev/virtual_nci Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Bongsu Jeon <bongsu.jeon@samsung.com> Link: https://lore.kernel.org/r/20211125141457.716921-1-cascardo@canonical.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Davide Caratti authored
when the number of DRR classes decreases, the round-robin active list can contain elements that have already been freed in ets_qdisc_change(). As a consequence, it's possible to see a NULL dereference crash, caused by the attempt to call cl->qdisc->ops->peek(cl->qdisc) when cl->qdisc is NULL: BUG: kernel NULL pointer dereference, address: 0000000000000018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 910 Comm: mausezahn Not tainted 5.16.0-rc1+ #475 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 RIP: 0010:ets_qdisc_dequeue+0x129/0x2c0 [sch_ets] Code: c5 01 41 39 ad e4 02 00 00 0f 87 18 ff ff ff 49 8b 85 c0 02 00 00 49 39 c4 0f 84 ba 00 00 00 49 8b ad c0 02 00 00 48 8b 7d 10 <48> 8b 47 18 48 8b 40 38 0f ae e8 ff d0 48 89 c3 48 85 c0 0f 84 9d RSP: 0000:ffffbb36c0b5fdd8 EFLAGS: 00010287 RAX: ffff956678efed30 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000002 RSI: ffffffff9b938dc9 RDI: 0000000000000000 RBP: ffff956678efed30 R08: e2f3207fe360129c R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff956678efeac0 R13: ffff956678efe800 R14: ffff956611545000 R15: ffff95667ac8f100 FS: 00007f2aa9120740(0000) GS:ffff95667b800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 000000011070c000 CR4: 0000000000350ee0 Call Trace: <TASK> qdisc_peek_dequeued+0x29/0x70 [sch_ets] tbf_dequeue+0x22/0x260 [sch_tbf] __qdisc_run+0x7f/0x630 net_tx_action+0x290/0x4c0 __do_softirq+0xee/0x4f8 irq_exit_rcu+0xf4/0x130 sysvec_apic_timer_interrupt+0x52/0xc0 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0033:0x7f2aa7fc9ad4 Code: b9 ff ff 48 8b 54 24 18 48 83 c4 08 48 89 ee 48 89 df 5b 5d e9 ed fc ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <53> 48 83 ec 10 48 8b 05 10 64 33 00 48 8b 00 48 85 c0 0f 85 84 00 RSP: 002b:00007ffe5d33fab8 EFLAGS: 00000202 RAX: 0000000000000002 RBX: 0000561f72c31460 RCX: 0000561f72c31720 RDX: 0000000000000002 RSI: 0000561f72c31722 RDI: 0000561f72c31720 RBP: 000000000000002a R08: 00007ffe5d33fa40 R09: 0000000000000014 R10: 0000000000000000 R11: 0000000000000246 R12: 0000561f7187e380 R13: 0000000000000000 R14: 0000000000000000 R15: 0000561f72c31460 </TASK> Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt intel_rapl_msr iTCO_vendor_support intel_rapl_common joydev virtio_balloon lpc_ich i2c_i801 i2c_smbus pcspkr ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel serio_raw libata virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000018 Ensuring that 'alist' was never zeroed [1] was not sufficient, we need to remove from the active list those elements that are no more SP nor DRR. [1] https://lore.kernel.org/netdev/60d274838bf09777f0371253416e8af71360bc08.1633609148.git.dcaratti@redhat.com/ v3: fix race between ets_qdisc_change() and ets_qdisc_dequeue() delisting DRR classes beyond 'nbands' in ets_qdisc_change() with the qdisc lock acquired, thanks to Cong Wang. v2: when a NULL qdisc is found in the DRR active list, try to dequeue skb from the next list item. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Fixes: dcc68b4d ("net: sch_ets: Add a new Qdisc") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/7a5c496eed2d62241620bdbb83eb03fb9d571c99.1637762721.git.dcaratti@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yannick Vignon authored
The Tx queues were not disabled in situations where the driver needed to stop the interface to apply a new configuration. This could result in a kernel panic when doing any of the 3 following actions: * reconfiguring the number of queues (ethtool -L) * reconfiguring the size of the ring buffers (ethtool -G) * installing/removing an XDP program (ip l set dev ethX xdp) Prevent the panic by making sure netif_tx_disable is called when stopping an interface. Without this patch, the following kernel panic can be observed when doing any of the actions above: Unable to handle kernel paging request at virtual address ffff80001238d040 [....] Call trace: dwmac4_set_addr+0x8/0x10 dev_hard_start_xmit+0xe4/0x1ac sch_direct_xmit+0xe8/0x39c __dev_queue_xmit+0x3ec/0xaf0 dev_queue_xmit+0x14/0x20 [...] [ end trace 0000000000000002 ]--- Fixes: 5fabb012 ("net: stmmac: Add initial XDP support") Fixes: aa042f60 ("net: stmmac: Add support to Ethtool get/set ring parameters") Fixes: 0366f7e0 ("net: stmmac: add ethtool support for get/set channels") Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com> Link: https://lore.kernel.org/r/20211124154731.1676949-1-yannick.vignon@oss.nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Jakub Kicinski says: ==================== tls: splice_read fixes As I work my way to unlocked and zero-copy TLS Rx the obvious bugs in the splice_read implementation get harder and harder to ignore. This is to say the fixes here are discovered by code inspection, I'm not aware of anyone actually using splice_read. ==================== Link: https://lore.kernel.org/r/20211124232557.2039757-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Previous patch fixes overriding callbacks incorrectly. Triggering the crash in sendpage_locked would be more spectacular but it's hard to get to, so take the easier path of proving this is broken and call getname. We're currently getting IPv4 socket info on an IPv6 socket. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
We replace proto_ops whenever TLS is configured for RX. But our replacement also overrides sendpage_locked, which will crash unless TX is also configured. Similarly we plug both of those in for TLS_HW (NIC crypto offload) even tho TLS_HW has a completely different implementation for TX. Last but not least we always plug in something based on inet_stream_ops even though a few of the callbacks differ for IPv6 (getname, release, bind). Use a callback building method similar to what we do for struct proto. Fixes: c46234eb ("tls: RX path for ktls") Fixes: d4ffb02d ("net/tls: enable sk_msg redirect to tls socket egress") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Add tests for half-received and peeked records. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
recvmsg() will put peek()ed and partially read records onto the rx_list. splice_read() needs to consult that list otherwise it may miss data. Align with recvmsg() and also put partially-read records onto rx_list. tls_sw_advance_skb() is pretty pointless now and will be removed in net-next. Fixes: 692d7b5d ("tls: Fix recvmsg() to be able to peek across multiple records") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Make sure we correctly reject splicing non-data records. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
We don't support splicing control records. TLS 1.3 changes moved the record type check into the decrypt if(). The skb may already be decrypted and still be an alert. Note that decrypt_skb_update() is idempotent and updates ctx->decrypted so the if() is pointless. Reorder the check for decryption errors with the content type check while touching them. This part is not really a bug, because if decryption failed in TLS 1.3 content type will be DATA, and for TLS 1.2 it will be correct. Nevertheless its strange to touch output before checking if the function has failed. Fixes: fedf201e ("net: tls: Refactor control message handling on recv") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Test broken records. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Add helpers for sending and receiving special record types. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
We have the same code 3 times, about to add a fourth copy. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 25 Nov, 2021 12 commits
-
-
Dylan Hung authored
The issue happened randomly in runtime. The message "Link is Down" is popped but soon it recovered to "Link is Up". The "Link is Down" results from the incorrect read data for reading the PHY register via MDIO bus. The correct sequence for reading the data shall be: 1. fire the command 2. wait for command done (this step was missing) 3. wait for data idle 4. read data from data register Cc: stable@vger.kernel.org Fixes: f160e994 ("net: phy: Add mdio-aspeed") Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Dylan Hung <dylan_hung@aspeedtech.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20211125024432.15809-1-dylan_hung@aspeedtech.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jesse Brandeburg authored
Oleksandr brought a bug report where netpoll causes trace messages in the log on igb. Danielle brought this back up as still occurring, so we'll try again. [22038.710800] ------------[ cut here ]------------ [22038.710801] igb_poll+0x0/0x1440 [igb] exceeded budget in poll [22038.710802] WARNING: CPU: 12 PID: 40362 at net/core/netpoll.c:155 netpoll_poll_dev+0x18a/0x1a0 As Alex suggested, change the driver to return work_done at the exit of napi_poll, which should be safe to do in this driver because it is not polling multiple queues in this single napi context (multiple queues attached to one MSI-X vector). Several other drivers contain the same simple sequence, so I hope this will not create new problems. Fixes: 16eb8815 ("igb: Refactor clean_rx_irq to reduce overhead and improve performance") Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reported-by: Danielle Ratson <danieller@nvidia.com> Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Tested-by: Danielle Ratson <danieller@nvidia.com> Link: https://lore.kernel.org/r/20211123204000.1597971-1-jesse.brandeburg@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Karsten Graul says: ==================== net/smc: fixes 2021-11-24 Patch 1 from DaXing fixes a possible loop in smc_listen(). Patch 2 prevents a NULL pointer dereferencing while iterating over the lower network devices. ==================== Link: https://lore.kernel.org/r/20211124123238.471429-1-kgraul@linux.ibm.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Guo DaXing authored
The kernel_listen function in smc_listen will fail when all the available ports are occupied. At this point smc->clcsock->sk->sk_data_ready has been changed to smc_clcsock_data_ready. When we call smc_listen again, now both smc->clcsock->sk->sk_data_ready and smc->clcsk_data_ready point to the smc_clcsock_data_ready function. The smc_clcsock_data_ready() function calls lsmc->clcsk_data_ready which now points to itself resulting in an infinite loop. This patch restores smc->clcsock->sk->sk_data_ready with the old value. Fixes: a60a2b1e ("net/smc: reduce active tcp_listen workers") Signed-off-by: Guo DaXing <guodaxing@huawei.com> Acked-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Karsten Graul authored
Coverity reports a possible NULL dereferencing problem: in smc_vlan_by_tcpsk(): 6. returned_null: netdev_lower_get_next returns NULL (checked 29 out of 30 times). 7. var_assigned: Assigning: ndev = NULL return value from netdev_lower_get_next. 1623 ndev = (struct net_device *)netdev_lower_get_next(ndev, &lower); CID 1468509 (#1 of 1): Dereference null return value (NULL_RETURNS) 8. dereference: Dereferencing a pointer that might be NULL ndev when calling is_vlan_dev. 1624 if (is_vlan_dev(ndev)) { Remove the manual implementation and use netdev_walk_all_lower_dev() to iterate over the lower devices. While on it remove an obsolete function parameter comment. Fixes: cb9d43f6 ("net/smc: determine vlan_id of stacked net_device") Suggested-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Marek Behún says: ==================== phylink resolve fixes With information from me and my nagging, Russell has produced two fixes for phylink, which add code that triggers another phylink_resolve() from phylink_resolve(), if certain conditions are met: interface is being changed or link is down and previous link was up These are needed because sometimes the PCS callbacks may provide stale values if link / speed / ... ==================== Link: https://lore.kernel.org/r/20211123154403.32051-1-kabel@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Russell King (Oracle) authored
On mv88e6xxx 1G/2.5G PCS, the SerDes register 4.2001.2 has the following description: This register bit indicates when link was lost since the last read. For the current link status, read this register back-to-back. Thus to get current link state, we need to read the register twice. But doing that in the link change interrupt handler would lead to potentially ignoring link down events, which we really want to avoid. Thus this needs to be solved in phylink's resolve, by retriggering another resolve in the event when PCS reports link down and previous link was up, and by re-reading PCS state if the previous link was down. The wrong value is read when phylink requests change from sgmii to 2500base-x mode, and link won't come up. This fixes the bug. Fixes: 9525ae83 ("phylink: add phylink infrastructure") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Russell King (Oracle) authored
On PHY state change the phylink_resolve() function can read stale information from the MAC and report incorrect link speed and duplex to the kernel message log. Example with a Marvell 88X3310 PHY connected to a SerDes port on Marvell 88E6393X switch: - PHY driver triggers state change due to PHY interface mode being changed from 10gbase-r to 2500base-x due to copper change in speed from 10Gbps to 2.5Gbps, but the PHY itself either hasn't yet changed its interface to the host, or the interrupt about loss of SerDes link hadn't arrived yet (there can be a delay of several milliseconds for this), so we still think that the 10gbase-r mode is up - phylink_resolve() - phylink_mac_pcs_get_state() - this fills in speed=10g link=up - interface mode is updated to 2500base-x but speed is left at 10Gbps - phylink_major_config() - interface is changed to 2500base-x - phylink_link_up() - mv88e6xxx_mac_link_up() - .port_set_speed_duplex() - speed is set to 10Gbps - reports "Link is Up - 10Gbps/Full" to dmesg Afterwards when the interrupt finally arrives for mv88e6xxx, another resolve is forced in which we get the correct speed from phylink_mac_pcs_get_state(), but since the interface is not being changed anymore, we don't call phylink_major_config() but only phylink_mac_config(), which does not set speed/duplex anymore. To fix this, we need to force the link down and trigger another resolve on PHY interface change event. Fixes: 9525ae83 ("phylink: add phylink infrastructure") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Heiner Kallweit authored
Usage of phy_ethtool_get_link_ksettings() in the link status change handler isn't needed, and in combination with the referenced change it results in a deadlock. Simply remove the call and replace it with direct access to phydev->speed. The duplex argument of lan743x_phy_update_flowcontrol() isn't used and can be removed. Fixes: c10a485c ("phy: phy_ethtool_ksettings_get: Lock the phy for consistency") Reported-by: Alessandro B Maurici <abmaurici@gmail.com> Tested-by: Alessandro B Maurici <abmaurici@gmail.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/40e27f76-0ba3-dcef-ee32-a78b9df38b0f@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
While testing BIG TCP patch series, I was expecting that TCP_RR workloads with 80KB requests/answers would send one 80KB TSO packet, then being received as a single GRO packet. It turns out this was not happening, and the root cause was that cubic Hystart ACK train was triggering after a few (2 or 3) rounds of RPC. Hystart was wrongly setting CWND/SSTHRESH to 30, while my RPC needed a budget of ~20 segments. Ideally these TCP_RR flows should not exit slow start. Cubic Hystart should reset itself at each round, instead of assuming every TCP flow is a bulk one. Note that even after this patch, Hystart can still trigger, depending on scheduling artifacts, but at a higher CWND/SSTHRESH threshold, keeping optimal TSO packet sizes. Tested: ip link set dev eth0 gro_ipv6_max_size 131072 gso_ipv6_max_size 131072 nstat -n; netperf -H ... -t TCP_RR -l 5 -- -r 80000,80000 -K cubic; nstat|egrep "Ip6InReceives|Hystart|Ip6OutRequests" Before: 8605 Ip6InReceives 87541 0.0 Ip6OutRequests 129496 0.0 TcpExtTCPHystartTrainDetect 1 0.0 TcpExtTCPHystartTrainCwnd 30 0.0 After: 8760 Ip6InReceives 88514 0.0 Ip6OutRequests 87975 0.0 Fixes: ae27e98a ("[TCP] CUBIC v2.3") Co-developed-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Link: https://lore.kernel.org/r/20211123202535.1843771-1-eric.dumazet@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Florian Fainelli authored
Update the B53 Ethernet switch section to contain drivers/net/dsa/bcm_sf2*. Reported-by: Russell King (Oracle) <linux@armlinux.org.uk> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20211123222422.3745485-1-f.fainelli@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Merge tag 'ieee802154-for-net-2021-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan Stefan Schmidt says: ==================== pull-request: ieee802154 for net 2021-11-24 A fix from Alexander which has been brought up various times found by automated checkers. Make sure values are in u32 range. * tag 'ieee802154-for-net-2021-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan: net: ieee802154: handle iftypes as u32 ==================== Link: https://lore.kernel.org/r/20211124150934.3670248-1-stefan@datenfreihafen.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 24 Nov, 2021 4 commits
-
-
Kumar Thangavel authored
Update NC-SI command handler (both standard and OEM) to take into account of payload paddings in allocating skb (in case of payload size is not 32-bit aligned). The checksum field follows payload field, without taking payload padding into account can cause checksum being truncated, leading to dropped packets. Fixes: fb4ee675 ("net/ncsi: Add NCSI OEM command support") Signed-off-by: Kumar Thangavel <thangavel.k@hcl.com> Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
James Prestwood authored
This was previously added in selftests but never added to the Makefile Signed-off-by: James Prestwood <prestwoj@gmail.com> Link: https://lore.kernel.org/r/20211122171806.3529401-1-prestwoj@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jamal Hadi Salim authored
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/20211122144252.25156-1-jhs@emojatatu.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
This file has not been updated for a while. Sync it before BIG TCP patch series. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20211122184810.769159-1-eric.dumazet@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 23 Nov, 2021 10 commits
-
-
Marek Behún authored
Currently mvpp2_xdp_setup won't allow attaching XDP program if mtu > ETH_DATA_LEN (1500). The mvpp2_change_mtu on the other hand checks whether MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE. These two checks are semantically different. Moreover this limit can be increased to MVPP2_MAX_RX_BUF_SIZE, since in mvpp2_rx we have xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM; xdp.frame_sz = PAGE_SIZE; Change the checks to check whether mtu > MVPP2_MAX_RX_BUF_SIZE Fixes: 07dd0a7a ("mvpp2: add basic XDP support") Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Calling ipa_cmd_pipeline_clear() after stopping the channel underlying the AP<-modem RX endpoint can lead to a deadlock. This occurs in the ->runtime_suspend device power operation for the IPA driver. While this callback is in progress, any other requests for power will block until the callback returns. Stopping the AP<-modem RX channel does not prevent the modem from sending another packet to this endpoint. If a packet arrives for an RX channel when the channel is stopped, an SUSPEND IPA interrupt condition will be pending. Handling an IPA interrupt requires power, so ipa_isr_thread() calls pm_runtime_get_sync() first thing. The problem occurs because a "pipeline clear" command will not complete while such a SUSPEND interrupt condition exists. So the SUSPEND IPA interrupt handler won't proceed until it gets power; that won't happen until the ->runtime_suspend callback (and its "pipeline clear" command) completes; and that can't happen while the SUSPEND interrupt condition exists. It turns out that in this case there is no need to use the "pipeline clear" command. There are scenarios in which clearing the pipeline is required while suspending, but those are not (yet) supported upstream. So a simple fix, avoiding the potential deadlock, is to stop calling ipa_cmd_pipeline_clear() in ipa_endpoint_suspend(). This removes the only user of ipa_cmd_pipeline_clear(), so get rid of that function. It can be restored again whenever it's needed. This is basically a manual revert along with an explanation for commit 6cb63ea6 ("net: ipa: introduce ipa_cmd_tag_process()"). Fixes: 6cb63ea6 ("net: ipa: introduce ipa_cmd_tag_process()") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Martyn Welch authored
The smsc95xx driver is dropping phy speed settings and causing a stack trace at device unbind: [ 536.379147] smsc95xx 2-1:1.0 eth1: unregister 'smsc95xx' usb-ci_hdrc.2-1, smsc95xx USB 2.0 Ethernet [ 536.425029] ------------[ cut here ]------------ [ 536.429650] WARNING: CPU: 0 PID: 439 at fs/kernfs/dir.c:1535 kernfs_remove_by_name_ns+0xb8/0xc0 [ 536.438416] kernfs: can not remove 'attached_dev', no directory [ 536.444363] Modules linked in: xts dm_crypt dm_mod atmel_mxt_ts smsc95xx usbnet [ 536.451748] CPU: 0 PID: 439 Comm: sh Tainted: G W 5.15.0 #1 [ 536.458636] Hardware name: Freescale i.MX53 (Device Tree Support) [ 536.464735] Backtrace: [ 536.467190] [<80b1c904>] (dump_backtrace) from [<80b1cb48>] (show_stack+0x20/0x24) [ 536.474787] r7:000005ff r6:8035b294 r5:600f0013 r4:80d8af78 [ 536.480449] [<80b1cb28>] (show_stack) from [<80b1f764>] (dump_stack_lvl+0x48/0x54) [ 536.488035] [<80b1f71c>] (dump_stack_lvl) from [<80b1f788>] (dump_stack+0x18/0x1c) [ 536.495620] r5:00000009 r4:80d9b820 [ 536.499198] [<80b1f770>] (dump_stack) from [<80124fac>] (__warn+0xfc/0x114) [ 536.506187] [<80124eb0>] (__warn) from [<80b1d21c>] (warn_slowpath_fmt+0xa8/0xdc) [ 536.513688] r7:000005ff r6:80d9b820 r5:80d9b8e0 r4:83744000 [ 536.519349] [<80b1d178>] (warn_slowpath_fmt) from [<8035b294>] (kernfs_remove_by_name_ns+0xb8/0xc0) [ 536.528416] r9:00000001 r8:00000000 r7:824926dc r6:00000000 r5:80df6c2c r4:00000000 [ 536.536162] [<8035b1dc>] (kernfs_remove_by_name_ns) from [<80b1f56c>] (sysfs_remove_link+0x4c/0x50) [ 536.545225] r6:7f00f02c r5:80df6c2c r4:83306400 [ 536.549845] [<80b1f520>] (sysfs_remove_link) from [<806f9c8c>] (phy_detach+0xfc/0x11c) [ 536.557780] r5:82492000 r4:83306400 [ 536.561359] [<806f9b90>] (phy_detach) from [<806f9cf8>] (phy_disconnect+0x4c/0x58) [ 536.568943] r7:824926dc r6:7f00f02c r5:82492580 r4:83306400 [ 536.574604] [<806f9cac>] (phy_disconnect) from [<7f00a310>] (smsc95xx_disconnect_phy+0x30/0x38 [smsc95xx]) [ 536.584290] r5:82492580 r4:82492580 [ 536.587868] [<7f00a2e0>] (smsc95xx_disconnect_phy [smsc95xx]) from [<7f001570>] (usbnet_stop+0x70/0x1a0 [usbnet]) [ 536.598161] r5:82492580 r4:82492000 [ 536.601740] [<7f001500>] (usbnet_stop [usbnet]) from [<808baa70>] (__dev_close_many+0xb4/0x12c) [ 536.610466] r8:83744000 r7:00000000 r6:83744000 r5:83745b74 r4:82492000 [ 536.617170] [<808ba9bc>] (__dev_close_many) from [<808bab78>] (dev_close_many+0x90/0x120) [ 536.625365] r7:00000001 r6:83745b74 r5:83745b8c r4:82492000 [ 536.631026] [<808baae8>] (dev_close_many) from [<808bf408>] (unregister_netdevice_many+0x15c/0x704) [ 536.640094] r9:00000001 r8:81130b98 r7:83745b74 r6:83745bc4 r5:83745b8c r4:82492000 [ 536.647840] [<808bf2ac>] (unregister_netdevice_many) from [<808bfa50>] (unregister_netdevice_queue+0xa0/0xe8) [ 536.657775] r10:8112bcc0 r9:83306c00 r8:83306c80 r7:8291e420 r6:83744000 r5:00000000 [ 536.665608] r4:82492000 [ 536.668143] [<808bf9b0>] (unregister_netdevice_queue) from [<808bfac0>] (unregister_netdev+0x28/0x30) [ 536.677381] r6:7f01003c r5:82492000 r4:82492000 [ 536.682000] [<808bfa98>] (unregister_netdev) from [<7f000b40>] (usbnet_disconnect+0x64/0xdc [usbnet]) [ 536.691241] r5:82492000 r4:82492580 [ 536.694819] [<7f000adc>] (usbnet_disconnect [usbnet]) from [<8076b958>] (usb_unbind_interface+0x80/0x248) [ 536.704406] r5:7f01003c r4:83306c80 [ 536.707984] [<8076b8d8>] (usb_unbind_interface) from [<8061765c>] (device_release_driver_internal+0x1c4/0x1cc) [ 536.718005] r10:8112bcc0 r9:80dff1dc r8:83306c80 r7:83744000 r6:7f01003c r5:00000000 [ 536.725838] r4:8291e420 [ 536.728373] [<80617498>] (device_release_driver_internal) from [<80617684>] (device_release_driver+0x20/0x24) [ 536.738302] r7:83744000 r6:810d4f4c r5:8291e420 r4:8176ae30 [ 536.743963] [<80617664>] (device_release_driver) from [<806156cc>] (bus_remove_device+0xf0/0x148) [ 536.752858] [<806155dc>] (bus_remove_device) from [<80610018>] (device_del+0x198/0x41c) [ 536.760880] r7:83744000 r6:8116e2e4 r5:8291e464 r4:8291e420 [ 536.766542] [<8060fe80>] (device_del) from [<80768fe8>] (usb_disable_device+0xcc/0x1e0) [ 536.774576] r10:8112bcc0 r9:80dff1dc r8:00000001 r7:8112bc48 r6:8291e400 r5:00000001 [ 536.782410] r4:83306c00 [ 536.784945] [<80768f1c>] (usb_disable_device) from [<80769c30>] (usb_set_configuration+0x514/0x8dc) [ 536.794011] r10:00000000 r9:00000000 r8:832c3600 r7:00000004 r6:810d5688 r5:00000000 [ 536.801844] r4:83306c00 [ 536.804379] [<8076971c>] (usb_set_configuration) from [<80775fac>] (usb_generic_driver_disconnect+0x34/0x38) [ 536.814236] r10:832c3610 r9:83745ef8 r8:832c3600 r7:00000004 r6:810d5688 r5:83306c00 [ 536.822069] r4:83306c00 [ 536.824605] [<80775f78>] (usb_generic_driver_disconnect) from [<8076b850>] (usb_unbind_device+0x30/0x70) [ 536.834100] r5:83306c00 r4:810d5688 [ 536.837678] [<8076b820>] (usb_unbind_device) from [<8061765c>] (device_release_driver_internal+0x1c4/0x1cc) [ 536.847432] r5:822fb480 r4:83306c80 [ 536.851009] [<80617498>] (device_release_driver_internal) from [<806176a8>] (device_driver_detach+0x20/0x24) [ 536.860853] r7:00000004 r6:810d4f4c r5:810d5688 r4:83306c80 [ 536.866515] [<80617688>] (device_driver_detach) from [<80614d98>] (unbind_store+0x70/0xe4) [ 536.874793] [<80614d28>] (unbind_store) from [<80614118>] (drv_attr_store+0x30/0x3c) [ 536.882554] r7:00000000 r6:00000000 r5:83739200 r4:80614d28 [ 536.888217] [<806140e8>] (drv_attr_store) from [<8035cb68>] (sysfs_kf_write+0x48/0x54) [ 536.896154] r5:83739200 r4:806140e8 [ 536.899732] [<8035cb20>] (sysfs_kf_write) from [<8035be84>] (kernfs_fop_write_iter+0x11c/0x1d4) [ 536.908446] r5:83739200 r4:00000004 [ 536.912024] [<8035bd68>] (kernfs_fop_write_iter) from [<802b87fc>] (vfs_write+0x258/0x3e4) [ 536.920317] r10:00000000 r9:83745f58 r8:83744000 r7:00000000 r6:00000004 r5:00000000 [ 536.928151] r4:82adacc0 [ 536.930687] [<802b85a4>] (vfs_write) from [<802b8b0c>] (ksys_write+0x74/0xf4) [ 536.937842] r10:00000004 r9:007767a0 r8:83744000 r7:00000000 r6:00000000 r5:82adacc0 [ 536.945676] r4:82adacc0 [ 536.948213] [<802b8a98>] (ksys_write) from [<802b8ba4>] (sys_write+0x18/0x1c) [ 536.955367] r10:00000004 r9:83744000 r8:80100244 r7:00000004 r6:76f47b58 r5:76fc0350 [ 536.963200] r4:00000004 [ 536.965735] [<802b8b8c>] (sys_write) from [<80100060>] (ret_fast_syscall+0x0/0x48) [ 536.973320] Exception stack(0x83745fa8 to 0x83745ff0) [ 536.978383] 5fa0: 00000004 76fc0350 00000001 007767a0 00000004 00000000 [ 536.986569] 5fc0: 00000004 76fc0350 76f47b58 00000004 76f47c7c 76f48114 00000000 7e87991c [ 536.994753] 5fe0: 00000498 7e879908 76e6dce8 76eca2e8 [ 536.999922] ---[ end trace 9b835d809816b435 ]--- The driver should not be connecting and disconnecting the PHY when the device is opened and closed, it should be stopping and starting the PHY. The phy should be connected as part of binding and disconnected during unbinding. As this results in the PHY not being reset during open, link speed, etc. settings set prior to the link coming up are now not being lost. It is necessary for phy_stop() to only be called when the phydev still exists (resolving the above stack trace). When unbinding, ".unbind" will be called prior to ".stop", with phy_disconnect() already having called phy_stop() before the phydev becomes inaccessible. Signed-off-by: Martyn Welch <martyn.welch@collabora.com> Cc: Steve Glendinning <steve.glendinning@shawell.net> Cc: UNGLinuxDriver@microchip.com Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: stable@kernel.org # v5.15 Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zheyu Ma authored
During the process of driver probing, probe function should return < 0 for failure, otherwise kernel will treat value == 0 as success. Therefore, we should set err to -EINVAL when adapter->registered_device_map is NULL. Otherwise kernel will assume that driver has been successfully probed and will cause unexpected errors. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
The original changes brakes MAC address assignment on older chip versions (see bug report [0]), and it brakes random MAC assignment. is_valid_ether_addr() requires that its argument is word-aligned. Add the missing alignment to array mac_addr. [0] https://bugzilla.kernel.org/show_bug.cgi?id=215087 Fixes: 1c5d09d5 ("ethernet: r8169: use eth_hw_addr_set()") Reported-by: Richard Herbert <rherbert@sympatico.ca> Tested-by: Richard Herbert <rherbert@sympatico.ca> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queueDavid S. Miller authored
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-11-22 Maciej Fijalkowski says: Here are the two fixes for issues around ethtool's set_channels() callback for ice driver. Both are related to XDP resources. First one corrects the size of vsi->txq_map that is used to track the usage of Tx resources and the second one prevents the wrong refcounting of bpf_prog. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Alex Elder says: ==================== net: ipa: prevent shutdown during setup The setup phase of the IPA driver occurs in one of two ways. Normally, it is done directly by the main driver probe function. But some systems (those having a "modem-init" DTS property) don't start setup until an SMP2P interrupt (sent by the modem) arrives. Because it isn't performed by the probe function, setup on "modem-init" systems could be underway at the time a driver remove (or shutdown) request arrives (or vice-versa). This situation can lead to hardware state not being cleaned up properly. This series addresses this problem by having the driver remove function disable the setup interrupt. A consequence of this is that setup will complete if it is underway when the remove function is called. So now, when removing the driver, setup: - will have already completed; - is underway, and will complete before proceeding; or - will not have begun (and will not occur). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
The IPA setup_complete flag is set at the end of ipa_setup(), when the setup phase of initialization has completed successfully. This occurs as part of driver probe processing, or (if "modem-init" is specified in the DTS file) it is triggered by the "ipa-setup-ready" SMP2P interrupt generated by the modem. In the latter case, it's possible for driver shutdown (or remove) to begin while setup processing is underway, and this can't be allowed. The problem is that the setup_complete flag is not adequate to signal that setup is underway. If setup_complete is set, it will never be un-set, so that case is not a problem. But if setup_complete is false, there's a chance setup is underway. Because setup is triggered by an interrupt on a "modem-init" system, there is a simple way to ensure the value of setup_complete is safe to read. The threaded handler--if it is executing--will complete as part of a request to disable the "ipa-modem-ready" interrupt. This means that ipa_setup() (which is called from the handler) will run to completion if it was underway, or will never be called otherwise. The request to disable the "ipa-setup-ready" interrupt is currently made within ipa_modem_stop(). Instead, disable the interrupt outside that function in the two places it's called. In the case of ipa_remove(), this ensures the setup_complete flag is safe to read before we read it. Rename ipa_smp2p_disable() to be ipa_smp2p_irq_disable_setup(), to be more specific about its effect. Fixes: 530f9216 ("soc: qcom: ipa: AP/modem communications") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
We currently maintain a "disabled" Boolean flag to determine whether the "ipa-setup-ready" SMP2P IRQ handler does anything. That flag must be accessed under protection of a mutex. Instead, disable the SMP2P interrupt when requested, which prevents the interrupt handler from ever being called. More importantly, it synchronizes a thread disabling the interrupt with the completion of the interrupt handler in case they run concurrently. Use the IPA setup_complete flag rather than the disabled flag in the handler to determine whether to ignore any interrupts arriving after the first. Rename the "disabled" flag to be "setup_disabled", to be specific about its purpose. Fixes: 530f9216 ("soc: qcom: ipa: AP/modem communications") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ido Schimmel says: ==================== mlxsw: Two small fixes Patch #1 fixes a recent regression that prevents the driver from loading with old firmware versions. Patch #2 protects the driver from a NULL pointer dereference when working on top of a buggy firmware. This was never observed in an actual system, only on top of an emulator during development. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-