- 17 Dec, 2019 4 commits
-
-
Florian Fainelli authored
There were several issues with 53568438 ("net: dsa: b53: Add support for port_egress_floods callback") that resulted in breaking connectivity for standalone ports: - both user and CPU ports must allow unicast and multicast forwarding by default otherwise this just flat out breaks connectivity for standalone DSA ports - IP multicast is treated similarly as multicast, but has separate control registers - the UC, MC and IPMC lookup failure register offsets were wrong, and instead used bit values that are meaningful for the B53_IP_MULTICAST_CTRL register Fixes: 53568438 ("net: dsa: b53: Add support for port_egress_floods callback") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Stefano Garzarella says: ==================== vsock/virtio: fix null-pointer dereference and related precautions This series mainly solves a possible null-pointer dereference in virtio_transport_recv_listen() introduced with the multi-transport support [PATCH 1]. PATCH 2 adds a WARN_ON check for the same potential issue and a returned error in the virtio_transport_send_pkt_info() function to avoid crashing the kernel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stefano Garzarella authored
virtio_transport_get_ops() and virtio_transport_send_pkt_info() can only be used on connecting/connected sockets, since a socket assigned to a transport is required. This patch adds a WARN_ON() on virtio_transport_get_ops() to check this requirement, a comment and a returned error on virtio_transport_send_pkt_info(), Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stefano Garzarella authored
With multi-transport support, listener sockets are not bound to any transport. So, calling virtio_transport_reset(), when an error occurs, on a listener socket produces the following null-pointer dereference: BUG: kernel NULL pointer dereference, address: 00000000000000e8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 20 Comm: kworker/0:1 Not tainted 5.5.0-rc1-ste-00003-gb4be21f316ac-dirty #56 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 Workqueue: virtio_vsock virtio_transport_rx_work [vmw_vsock_virtio_transport] RIP: 0010:virtio_transport_send_pkt_info+0x20/0x130 [vmw_vsock_virtio_transport_common] Code: 1f 84 00 00 00 00 00 0f 1f 00 55 48 89 e5 41 57 41 56 41 55 49 89 f5 41 54 49 89 fc 53 48 83 ec 10 44 8b 76 20 e8 c0 ba fe ff <48> 8b 80 e8 00 00 00 e8 64 e3 7d c1 45 8b 45 00 41 8b 8c 24 d4 02 RSP: 0018:ffffc900000b7d08 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88807bf12728 RCX: 0000000000000000 RDX: ffff88807bf12700 RSI: ffffc900000b7d50 RDI: ffff888035c84000 RBP: ffffc900000b7d40 R08: ffff888035c84000 R09: ffffc900000b7d08 R10: ffff8880781de800 R11: 0000000000000018 R12: ffff888035c84000 R13: ffffc900000b7d50 R14: 0000000000000000 R15: ffff88807bf12724 FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000e8 CR3: 00000000790f4004 CR4: 0000000000160ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: virtio_transport_reset+0x59/0x70 [vmw_vsock_virtio_transport_common] virtio_transport_recv_pkt+0x5bb/0xe50 [vmw_vsock_virtio_transport_common] ? detach_buf_split+0xf1/0x130 virtio_transport_rx_work+0xba/0x130 [vmw_vsock_virtio_transport] process_one_work+0x1c0/0x300 worker_thread+0x45/0x3c0 kthread+0xfc/0x130 ? current_work+0x40/0x40 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: sunrpc kvm_intel kvm vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common irqbypass vsock virtio_rng rng_core CR2: 00000000000000e8 ---[ end trace e75400e2ea2fa824 ]--- This happens because virtio_transport_reset() calls virtio_transport_send_pkt_info() that can be used only on connecting/connected sockets. This patch fixes the issue, using virtio_transport_reset_no_sock() instead of virtio_transport_reset() when we are handling a listener socket. Fixes: c0cfa2d8 ("vsock: add multi-transports support") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 15 Dec, 2019 12 commits
-
-
Paul Durrant authored
In function xenvif_disconnect_queue(), the value of queue->rx_irq is zeroed *before* queue->task is stopped. Unfortunately that task may call notify_remote_via_irq(queue->rx_irq) and calling that function with a zero value results in a NULL pointer dereference in evtchn_from_irq(). This patch simply re-orders things, stopping all tasks before zero-ing the irq values, thereby avoiding the possibility of the race. Fixes: 2ac061ce ("xen/netback: cleanup init and deinit code") Signed-off-by: Paul Durrant <pdurrant@amazon.com> Acked-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Cristian Birsan authored
Display the return code as decimal integer. Fixes: 55d7de9d ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Vishal Kulkarni authored
The sge_info debugfs collects offload queue info even when offload capability is disabled and leads to panic. [ 144.139871] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 144.139874] CR2: 0000000000000000 CR3: 000000082d456005 CR4: 00000000001606e0 [ 144.139876] Call Trace: [ 144.139887] sge_queue_start+0x12/0x30 [cxgb4] [ 144.139897] seq_read+0x1d4/0x3d0 [ 144.139906] full_proxy_read+0x50/0x70 [ 144.139913] vfs_read+0x89/0x140 [ 144.139916] ksys_read+0x55/0xd0 [ 144.139924] do_syscall_64+0x5b/0x1d0 [ 144.139933] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 144.139936] RIP: 0033:0x7f4b01493990 Fix this crash by skipping the offload queue access in sge_qinfo when offload capability is disabled Signed-off-by: Herat Ramani <herat@chelsio.com> Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Ursula Braun authored
FASTOPEN setsockopt() or sendmsg() may switch the SMC socket to fallback mode. Once fallback mode is active, the native TCP socket functions are called. Nevertheless there is a small race window, when FASTOPEN setsockopt/sendmsg runs in parallel to a connect(), and switch the socket into fallback mode before connect() takes the sock lock. Make sure the SMC-specific connect setup is omitted in this case. This way a syzbot-reported refcount problem is fixed, triggered by different threads running non-blocking connect() and FASTOPEN_KEY setsockopt. Reported-by: syzbot+96d3f9ff6a86d37e44c8@syzkaller.appspotmail.com Fixes: 6d6dd528 ("net/smc: fix refcount non-blocking connect() -part 2") Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Russell King authored
A mismerge between the following two commits: c6787263 ("net: phylink: ensure consistent phy interface mode") 27755ff8 ("net: phylink: Add phylink_mac_link_{up, down} wrapper functions") resulted in the wrong interface being passed to the mac_link_up() function. Fix this up. Fixes: b4b12b0d ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Thadeu Lima de Souza Cascardo authored
This test only works when [1] is applied, which was rejected. Basically, the errors are reported and cleared. In this particular case of tls sockets, following reads will block. The test case was originally submitted with the rejected patch, but, then, was included as part of a different patchset, possibly by mistake. [1] https://lore.kernel.org/netdev/20191007035323.4360-2-jakub.kicinski@netronome.com/#t Thanks Paolo Pisati for pointing out the original patchset where this appeared. Fixes: 65190f77 (selftests/tls: add a test for fragmented messages) Reported-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Jakub Kicinski authored
Taehee Yoo says: ==================== gtp: fix several bugs in gtp module This patchset fixes several bugs in the GTP module. 1. Do not allow adding duplicate TID and ms_addr pdp context. In the current code, duplicate TID and ms_addr pdp context could be added. So, RX and TX path could find correct pdp context. 2. Fix wrong condition in ->dumpit() callback. ->dumpit() callback is re-called if dump packet size is too big. So, before return, it saves last position and then restart from last dump position. TID value is used to find last dump position. GTP module allows adding zero TID value. But ->dumpit() callback ignores zero TID value. So, dump would not work correctly if dump packet size too big. 3. Fix use-after-free in ipv4_pdp_find(). RX and TX patch always uses gtp->tid_hash and gtp->addr_hash. but while packet processing, these hash pointer would be freed. So, use-after-free would occur. 4. Fix panic because of zero size hashtable GTP hashtable size could be set by user-space. If hashsize is set to 0, hashtable will not work and panic will occur. ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Taehee Yoo authored
GTP default hashtable size is 1024 and userspace could set specific hashtable size with IFLA_GTP_PDP_HASHSIZE. If hashtable size is set to 0 from userspace, hashtable will not work and panic will occur. Fixes: 459aa660 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Taehee Yoo authored
ipv4_pdp_find() is called in TX packet path of GTP. ipv4_pdp_find() internally uses gtp->tid_hash to lookup pdp context. In the current code, gtp->tid_hash and gtp->addr_hash are freed by ->dellink(), which is gtp_dellink(). But gtp_dellink() would be called while packets are processing. So, gtp_dellink() should not free gtp->tid_hash and gtp->addr_hash. Instead, dev->priv_destructor() would be used because this callback is called after all packet processing safely. Test commands: ip link add veth1 type veth peer name veth2 ip a a 172.0.0.1/24 dev veth1 ip link set veth1 up ip a a 172.99.0.1/32 dev lo gtp-link add gtp1 & gtp-tunnel add gtp1 v1 200 100 172.99.0.2 172.0.0.2 ip r a 172.99.0.2/32 dev gtp1 ip link set gtp1 mtu 1500 ip netns add ns2 ip link set veth2 netns ns2 ip netns exec ns2 ip a a 172.0.0.2/24 dev veth2 ip netns exec ns2 ip link set veth2 up ip netns exec ns2 ip a a 172.99.0.2/32 dev lo ip netns exec ns2 ip link set lo up ip netns exec ns2 gtp-link add gtp2 & ip netns exec ns2 gtp-tunnel add gtp2 v1 100 200 172.99.0.1 172.0.0.1 ip netns exec ns2 ip r a 172.99.0.1/32 dev gtp2 ip netns exec ns2 ip link set gtp2 mtu 1500 hping3 172.99.0.2 -2 --flood & ip link del gtp1 Splat looks like: [ 72.568081][ T1195] BUG: KASAN: use-after-free in ipv4_pdp_find.isra.12+0x130/0x170 [gtp] [ 72.568916][ T1195] Read of size 8 at addr ffff8880b9a35d28 by task hping3/1195 [ 72.569631][ T1195] [ 72.569861][ T1195] CPU: 2 PID: 1195 Comm: hping3 Not tainted 5.5.0-rc1 #199 [ 72.570547][ T1195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 72.571438][ T1195] Call Trace: [ 72.571764][ T1195] dump_stack+0x96/0xdb [ 72.572171][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp] [ 72.572761][ T1195] print_address_description.constprop.5+0x1be/0x360 [ 72.573400][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp] [ 72.573971][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp] [ 72.574544][ T1195] __kasan_report+0x12a/0x16f [ 72.575014][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp] [ 72.575593][ T1195] kasan_report+0xe/0x20 [ 72.576004][ T1195] ipv4_pdp_find.isra.12+0x130/0x170 [gtp] [ 72.576577][ T1195] gtp_build_skb_ip4+0x199/0x1420 [gtp] [ ... ] [ 72.647671][ T1195] BUG: unable to handle page fault for address: ffff8880b9a35d28 [ 72.648512][ T1195] #PF: supervisor read access in kernel mode [ 72.649158][ T1195] #PF: error_code(0x0000) - not-present page [ 72.649849][ T1195] PGD a6c01067 P4D a6c01067 PUD 11fb07067 PMD 11f939067 PTE 800fffff465ca060 [ 72.652958][ T1195] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 72.653834][ T1195] CPU: 2 PID: 1195 Comm: hping3 Tainted: G B 5.5.0-rc1 #199 [ 72.668062][ T1195] RIP: 0010:ipv4_pdp_find.isra.12+0x86/0x170 [gtp] [ ... ] [ 72.679168][ T1195] Call Trace: [ 72.679603][ T1195] gtp_build_skb_ip4+0x199/0x1420 [gtp] [ 72.681915][ T1195] ? ipv4_pdp_find.isra.12+0x170/0x170 [gtp] [ 72.682513][ T1195] ? lock_acquire+0x164/0x3b0 [ 72.682966][ T1195] ? gtp_dev_xmit+0x35e/0x890 [gtp] [ 72.683481][ T1195] gtp_dev_xmit+0x3c2/0x890 [gtp] [ ... ] Fixes: 459aa660 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Taehee Yoo authored
gtp_genl_dump_pdp() is ->dumpit() callback of GTP module and it is used to dump pdp contexts. it would be re-executed because of dump packet size. If dump packet size is too big, it saves current dump pointer (gtp interface pointer, bucket, TID value) then it restarts dump from last pointer. Current GTP code allows adding zero TID pdp context but dump code ignores zero TID value. So, last dump pointer will not be found. In addition, this patch adds missing rcu_read_lock() in gtp_genl_dump_pdp(). Fixes: 459aa660 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Taehee Yoo authored
GTP RX packet path lookups pdp context with TID. If duplicate TID pdp contexts are existing in the list, it couldn't select correct pdp context. So, TID value should be unique. GTP TX packet path lookups pdp context with ms_addr. If duplicate ms_addr pdp contexts are existing in the list, it couldn't select correct pdp context. So, ms_addr value should be unique. Fixes: 459aa660 ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Mahesh Bandewar authored
After the recent fix in commit 1899bb32 ("bonding: fix state transition issue in link monitoring"), the active-backup mode with miimon initially come-up fine but after a link-failure, both members transition into backup state. Following steps to reproduce the scenario (eth1 and eth2 are the slaves of the bond): ip link set eth1 up ip link set eth2 down sleep 1 ip link set eth2 up ip link set eth1 down cat /sys/class/net/eth1/bonding_slave/state cat /sys/class/net/eth2/bonding_slave/state Fixes: 1899bb32 ("bonding: fix state transition issue in link monitoring") CC: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
- 14 Dec, 2019 15 commits
-
-
Jakub Kicinski authored
Manish Chopra says: ==================== bnx2x: bug fixes This series has two driver changes, one to fix some unexpected hardware behaviour casued during the parity error recovery in presence of SR-IOV VFs and another one related for fixing resource management in the driver among the PFs configured on an engine. Please consider applying it to "net". V1->V2: ======= Fix the compilation errors reported by kbuild test robot on the patch #1 with CONFIG_BNX2X_SRIOV=n ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Manish Chopra authored
Driver doesn't calculate total number of PFs configured on a given engine correctly which messed up resources in the PFs loaded on that engine, leading driver to exceed configuration of resources (like vlan filters etc.) beyond the limit per engine, which ended up with asserts from the firmware. Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Manish Chopra authored
Parity error from the hardware will cause PF to lose the state of their VFs due to PF's internal reload and hardware reset following the parity error. Restrict any configuration request from the VFs after the parity as it could cause unexpected hardware behavior, only way for VFs to recover would be to trigger FLR on VFs and reload them. Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Arnd Bergmann authored
Without the common part of the driver, the new file fails to link: drivers/net/ethernet/ti/cpsw_new.o: In function `cpsw_probe': cpsw_new.c:(.text+0x312c): undefined reference to `ti_cm_get_macid' Use the same Makefile hack as before, and build cpsw-common.o for any driver that needs it. Fixes: ed3525ed ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Arnd Bergmann authored
The new driver misses a dependency: drivers/net/ethernet/ti/cpsw_new.o: In function `cpsw_rx_handler': cpsw_new.c:(.text+0x259c): undefined reference to `__page_pool_put_page' cpsw_new.c:(.text+0x25d0): undefined reference to `page_pool_alloc_pages' drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_fill_rx_channels': cpsw_priv.c:(.text+0x22d8): undefined reference to `page_pool_alloc_pages' cpsw_priv.c:(.text+0x2420): undefined reference to `__page_pool_put_page' drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_create_xdp_rxqs': cpsw_priv.c:(.text+0x2624): undefined reference to `page_pool_create' drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_run_xdp': cpsw_priv.c:(.text+0x2dc8): undefined reference to `__page_pool_put_page' Other drivers use 'select' for PAGE_POOL, so do the same here. Fixes: ed3525ed ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Haiyang Zhang authored
Host can provide send indirection table messages anytime after RSS is enabled by calling rndis_filter_set_rss_param(). So the host provided table values may be overwritten by the initialization in rndis_set_subchannel(). To prevent this problem, move the tx_table initialization before calling rndis_filter_set_rss_param(). Fixes: a6fb6aa3 ("hv_netvsc: Set tx_table to equal weight after subchannels open") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Russell King authored
phylink requires the MAC to report when its link status changes when operating in inband modes. Failure to report link status changes means that phylink has no idea when the link events happen, which results in either the network interface's carrier remaining up or remaining permanently down. For example, with a fiber module, if the interface is brought up and link is initially established, taking the link down at the far end will cut the optical power. The SFP module's LOS asserts, we deactivate the link, and the network interface reports no carrier. When the far end is brought back up, the SFP module's LOS deasserts, but the MAC may be slower to establish link. If this happens (which in my tests is a certainty) then phylink never hears that the MAC has established link with the far end, and the network interface is stuck reporting no carrier. This means the interface is non-functional. Avoiding the link interrupt when we have phylink is basically not an option, so remove the !port->phylink from the test. Fixes: 4bb04326 ("net: mvpp2: phylink support") Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Jakub Kicinski authored
Eric Dumazet says: ==================== tcp: take care of empty skbs in write queue We understood recently that TCP sockets could have an empty skb at the tail of the write queue, leading to various problems. This patch series : 1) Make sure we do not send an empty packet since this was unintended and causing crashes in old kernels. 2) Change tcp_write_queue_empty() to not be fooled by the presence of an empty skb. 3) Fix a bug that could trigger suboptimal epoll() application behavior under memory pressure. ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Eric Dumazet authored
At the time commit ce5ec440 ("tcp: ensure epoll edge trigger wakeup when write queue is empty") was added to the kernel, we still had a single write queue, combining rtx and write queues. Once we moved the rtx queue into a separate rb-tree, testing if sk_write_queue is empty has been suboptimal. Indeed, if we have packets in the rtx queue, we probably want to delay the EPOLLOUT generation at the time incoming packets will free them, making room, but more importantly avoiding flooding application with EPOLLOUT events. Solution is to use tcp_rtx_and_write_queues_empty() helper. Fixes: 75c119af ("tcp: implement rb-tree based retransmit queue") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Eric Dumazet authored
Due to how tcp_sendmsg() is implemented, we can have an empty skb at the tail of the write queue. Most [1] tcp_write_queue_empty() callers want to know if there is anything to send (payload and/or FIN) Instead of checking if the sk_write_queue is empty, we need to test if tp->write_seq == tp->snd_nxt [1] tcp_send_fin() was the only caller that expected to see if an skb was in the write queue, I have changed the code to reuse the tcp_write_queue_tail() result. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Eric Dumazet authored
Backport of commit fdfc5c85 ("tcp: remove empty skb from write queue in error cases") in linux-4.14 stable triggered various bugs. One of them has been fixed in commit ba2ddb43f270 ("tcp: Don't dequeue SYN/FIN-segments from write-queue"), but we still have crashes in some occasions. Root-cause is that when tcp_sendmsg() has allocated a fresh skb and could not append a fragment before being blocked in sk_stream_wait_memory(), tcp_write_xmit() might be called and decide to send this fresh and empty skb. Sending an empty packet is not only silly, it might have caused many issues we had in the past with tp->packets_out being out of sync. Fixes: c65f7f00 ("[TCP]: Simplify SKB data portion allocation with NETIF_F_SG.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Christoph Paasch <cpaasch@apple.com> Acked-by: Neal Cardwell <ncardwell@google.com> Cc: Jason Baron <jbaron@akamai.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Eric Dumazet authored
We got another syzbot report [1] that tells us we must use write_lock_irq()/write_unlock_irq() to avoid possible deadlock. [1] WARNING: inconsistent lock state 5.5.0-rc1-syzkaller #0 Not tainted -------------------------------- inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R} usage. syz-executor826/9605 [HC1[1]:SC0[0]:HE0:SE1] takes: ffffffff8a128718 (disc_data_lock){+-..}, at: sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138 {HARDIRQ-ON-W} state was registered at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485 __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline] _raw_write_lock_bh+0x33/0x50 kernel/locking/spinlock.c:319 sixpack_close+0x1d/0x250 drivers/net/hamradio/6pack.c:657 tty_ldisc_close.isra.0+0x119/0x1a0 drivers/tty/tty_ldisc.c:489 tty_set_ldisc+0x230/0x6b0 drivers/tty/tty_ldisc.c:585 tiocsetd drivers/tty/tty_io.c:2337 [inline] tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2597 vfs_ioctl fs/ioctl.c:47 [inline] file_ioctl fs/ioctl.c:545 [inline] do_vfs_ioctl+0x977/0x14e0 fs/ioctl.c:732 ksys_ioctl+0xab/0xd0 fs/ioctl.c:749 __do_sys_ioctl fs/ioctl.c:756 [inline] __se_sys_ioctl fs/ioctl.c:754 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:754 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe irq event stamp: 3946 hardirqs last enabled at (3945): [<ffffffff87c86e43>] __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline] hardirqs last enabled at (3945): [<ffffffff87c86e43>] _raw_spin_unlock_irq+0x23/0x80 kernel/locking/spinlock.c:199 hardirqs last disabled at (3946): [<ffffffff8100675f>] trace_hardirqs_off_thunk+0x1a/0x1c arch/x86/entry/thunk_64.S:42 softirqs last enabled at (2658): [<ffffffff86a8b4df>] spin_unlock_bh include/linux/spinlock.h:383 [inline] softirqs last enabled at (2658): [<ffffffff86a8b4df>] clusterip_netdev_event+0x46f/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:222 softirqs last disabled at (2656): [<ffffffff86a8b22b>] spin_lock_bh include/linux/spinlock.h:343 [inline] softirqs last disabled at (2656): [<ffffffff86a8b22b>] clusterip_netdev_event+0x1bb/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:196 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(disc_data_lock); <Interrupt> lock(disc_data_lock); *** DEADLOCK *** 5 locks held by syz-executor826/9605: #0: ffff8880a905e198 (&tty->legacy_mutex){+.+.}, at: tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19 #1: ffffffff899a56c0 (rcu_read_lock){....}, at: mutex_spin_on_owner+0x0/0x330 kernel/locking/mutex.c:413 #2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] #2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: serial8250_interrupt+0x2d/0x1a0 drivers/tty/serial/8250/8250_core.c:116 #3: ffffffff8c104048 (&port_lock_key){-.-.}, at: serial8250_handle_irq.part.0+0x24/0x330 drivers/tty/serial/8250/8250_port.c:1823 #4: ffff8880a905e090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref+0x22/0x90 drivers/tty/tty_ldisc.c:288 stack backtrace: CPU: 1 PID: 9605 Comm: syz-executor826 Not tainted 5.5.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_usage_bug.cold+0x327/0x378 kernel/locking/lockdep.c:3101 valid_state kernel/locking/lockdep.c:3112 [inline] mark_lock_irq kernel/locking/lockdep.c:3309 [inline] mark_lock+0xbb4/0x1220 kernel/locking/lockdep.c:3666 mark_usage kernel/locking/lockdep.c:3554 [inline] __lock_acquire+0x1e55/0x4a00 kernel/locking/lockdep.c:3909 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485 __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline] _raw_read_lock+0x32/0x50 kernel/locking/spinlock.c:223 sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138 sixpack_write_wakeup+0x25/0x340 drivers/net/hamradio/6pack.c:402 tty_wakeup+0xe9/0x120 drivers/tty/tty_io.c:536 tty_port_default_wakeup+0x2b/0x40 drivers/tty/tty_port.c:50 tty_port_tty_wakeup+0x57/0x70 drivers/tty/tty_port.c:387 uart_write_wakeup+0x46/0x70 drivers/tty/serial/serial_core.c:104 serial8250_tx_chars+0x495/0xaf0 drivers/tty/serial/8250/8250_port.c:1761 serial8250_handle_irq.part.0+0x2a2/0x330 drivers/tty/serial/8250/8250_port.c:1834 serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1820 [inline] serial8250_default_handle_irq+0xc0/0x150 drivers/tty/serial/8250/8250_port.c:1850 serial8250_interrupt+0xf1/0x1a0 drivers/tty/serial/8250/8250_core.c:126 __handle_irq_event_percpu+0x15d/0x970 kernel/irq/handle.c:149 handle_irq_event_percpu+0x74/0x160 kernel/irq/handle.c:189 handle_irq_event+0xa7/0x134 kernel/irq/handle.c:206 handle_edge_irq+0x25e/0x8d0 kernel/irq/chip.c:830 generic_handle_irq_desc include/linux/irqdesc.h:156 [inline] do_IRQ+0xde/0x280 arch/x86/kernel/irq.c:250 common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:607 </IRQ> RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:685 [inline] RIP: 0010:mutex_spin_on_owner+0x247/0x330 kernel/locking/mutex.c:579 Code: c3 be 08 00 00 00 4c 89 e7 e8 e5 06 59 00 4c 89 e0 48 c1 e8 03 42 80 3c 38 00 0f 85 e1 00 00 00 49 8b 04 24 a8 01 75 96 f3 90 <e9> 2f fe ff ff 0f 0b e8 0d 19 09 00 84 c0 0f 85 ff fd ff ff 48 c7 RSP: 0018:ffffc90001eafa20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd7 RAX: 0000000000000000 RBX: ffff88809fd9e0c0 RCX: 1ffffffff13266dd RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000000 RBP: ffffc90001eafa60 R08: 1ffff11013d22898 R09: ffffed1013d22899 R10: ffffed1013d22898 R11: ffff88809e9144c7 R12: ffff8880a905e138 R13: ffff88809e9144c0 R14: 0000000000000000 R15: dffffc0000000000 mutex_optimistic_spin kernel/locking/mutex.c:673 [inline] __mutex_lock_common kernel/locking/mutex.c:962 [inline] __mutex_lock+0x32b/0x13c0 kernel/locking/mutex.c:1106 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1121 tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19 tty_release+0xb5/0xe90 drivers/tty/tty_io.c:1665 __fput+0x2ff/0x890 fs/file_table.c:280 ____fput+0x16/0x20 fs/file_table.c:313 task_work_run+0x145/0x1c0 kernel/task_work.c:113 exit_task_work include/linux/task_work.h:22 [inline] do_exit+0x8e7/0x2ef0 kernel/exit.c:797 do_group_exit+0x135/0x360 kernel/exit.c:895 __do_sys_exit_group kernel/exit.c:906 [inline] __se_sys_exit_group kernel/exit.c:904 [inline] __x64_sys_exit_group+0x44/0x50 kernel/exit.c:904 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x43fef8 Code: Bad RIP value. RSP: 002b:00007ffdb07d2338 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043fef8 RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000 RBP: 00000000004bf730 R08: 00000000000000e7 R09: ffffffffffffffd0 R10: 00000000004002c8 R11: 0000000000000246 R12: 0000000000000001 R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000 Fixes: 6e4e2f81 ("6pack,mkiss: fix lock inconsistency") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Eric Dumazet authored
Michal Kubecek and Firo Yang did a very nice analysis of crashes happening in __inet_lookup_established(). Since a TCP socket can go from TCP_ESTABLISH to TCP_LISTEN (via a close()/socket()/listen() cycle) without a RCU grace period, I should not have changed listeners linkage in their hash table. They must use the nulls protocol (Documentation/RCU/rculist_nulls.txt), so that a lookup can detect a socket in a hash list was moved in another one. Since we added code in commit d296ba60 ("soreuseport: Resolve merge conflict for v4/v6 ordering fix"), we have to add hlist_nulls_add_tail_rcu() helper. Fixes: 3b24d854 ("tcp/dccp: do not touch listener sk_refcnt under synflood") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Michal Kubecek <mkubecek@suse.cz> Reported-by: Firo Yang <firo.yang@suse.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Link: https://lore.kernel.org/netdev/20191120083919.GH27852@unicorn.suse.cz/Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Thomas Falcon authored
This conditional is missing a bang, with the intent being to break when the retry count reaches zero. Fixes: 476d96ca ("ibmvnic: Bound waits for device queries") Suggested-by: Juliet Kim <julietk@linux.vnet.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Hangbin Liu authored
In commit 4b1373de ("net: ipv6: addr: perform strict checks also for doit handlers") we add strict check for inet6_rtm_getaddr(). But we did the invalid header values check before checking if NETLINK_F_STRICT_CHK is set. This may break backwards compatibility if user already set the ifm->ifa_prefixlen, ifm->ifa_flags, ifm->ifa_scope in their netlink code. I didn't move the nlmsg_len check because I thought it's a valid check. Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: 4b1373de ("net: ipv6: addr: perform strict checks also for doit handlers") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
- 13 Dec, 2019 4 commits
-
-
Arnd Bergmann authored
Without I2C, we get a link failure: drivers/ptp/ptp_clockmatrix.o: In function `idtcm_xfer.isra.3': ptp_clockmatrix.c:(.text+0xcc): undefined reference to `i2c_transfer' drivers/ptp/ptp_clockmatrix.o: In function `idtcm_driver_init': ptp_clockmatrix.c:(.init.text+0x14): undefined reference to `i2c_register_driver' drivers/ptp/ptp_clockmatrix.o: In function `idtcm_driver_exit': ptp_clockmatrix.c:(.exit.text+0x10): undefined reference to `i2c_del_driver' Fixes: 3a6ba7dc ("ptp: Add a ptp clock driver for IDT ClockMatrix.") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Jonathan Lemon authored
After executing "ethtool -C eth0 rx-usecs-irq 0", the box becomes unresponsive, likely due to interrupt livelock. It appears that a minimum clamp value for the irq timer is computed, but is never applied. Fix by applying the corrected clamp value. Fixes: 74706afa ("bnxt_en: Update interrupt coalescing logic.") Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Vivien Didelot authored
I no longer work at Savoir-faire Linux but even though MAINTAINERS is up-to-date, some emails are still sent to my old email address. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
Subash Abhinov Kasiviswanathan authored
Add myself and Sean as maintainers for rmnet driver. Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
-
- 12 Dec, 2019 3 commits
-
-
Manish Chopra authored
Driver doesn't accommodate the configuration for max number of multicast mac addresses, in such particular case it leaves the device with improper/invalid multicast configuration state, causing connectivity issues (in lacp bonding like scenarios). Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cristian Birsan authored
Lan78xx driver accesses the PHY registers through MDIO bus over USB connection. When performing a suspend/resume, the PHY registers can be accessed before the USB connection is resumed. This will generate an error and will prevent the device to resume correctly. This patch adds the dependency between the MDIO bus and USB device to allow correct handling of suspend/resume. Fixes: ce85e13a ("lan78xx: Update to use phylib instead of mii_if_info.") Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller authored
Alexei Starovoitov says: ==================== pull-request: bpf 2019-12-11 The following pull-request contains BPF updates for your *net* tree. We've added 8 non-merge commits during the last 1 day(s) which contain a total of 10 files changed, 126 insertions(+), 18 deletions(-). The main changes are: 1) Make BPF trampoline co-exist with ftrace-based tracers, from Alexei. 2) Fix build in minimal configurations, from Arnd. 3) Fix mips, riscv bpf_tail_call limit, from Paul. 4) Fix bpftool segfault, from Toke. 5) Fix samples/bpf, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 11 Dec, 2019 2 commits
-
-
Daniel T. Lee authored
Currently, open() is called from the user program and it calls the syscall 'sys_openat', not the 'sys_open'. This leads to an error of the program of user side, due to the fact that the counter maps are zero since no function such 'sys_open' is called. This commit adds the kernel bpf program which are attached to the tracepoint 'sys_enter_openat' and 'sys_enter_openat'. Fixes: 1da236b6 ("bpf: add a test case for syscalls/sys_{enter|exit}_* tracepoints") Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
Previously, when this sample is added, commit 1c47910e ("samples/bpf: add perf_event+bpf example"), a symbol 'sys_read' and 'sys_write' has been used without no prefixes. But currently there are no exact symbols with these under kallsyms and this leads to failure. This commit changes exact compare to substring compare to keep compatible with exact symbol or prefixed symbol. Fixes: 1c47910e ("samples/bpf: add perf_event+bpf example") Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191205080114.19766-2-danieltimlee@gmail.com
-