- 14 Oct, 2022 1 commit
-
-
git://git.kernel.dk/linuxLinus Torvalds authored
Pull more io_uring updates from Jens Axboe: "A collection of fixes that ended up either being later than the initial pull, or dependent on multiple branches (6.0-late being one of them) and hence deferred purposely. This contains: - Cleanup fixes for the single submitter late 6.0 change, which we pushed to 6.1 to keep the 6.0 changes small (Dylan, Pavel) - Fix for IORING_OP_CONNECT not handling -EINPROGRESS correctly (me) - Ensure that the zc sendmsg variant gets audited correctly (me) - Regression fix from this merge window where kiocb_end_write() doesn't always gets called, which can cause issues with fs freezing (me) - Registered files SCM handling fix (Pavel) - Regression fix for big sqe dumping in fdinfo (Pavel) - Registered buffers accounting fix (Pavel) - Remove leftover notification structures, we killed them off late in 6.0 (Pavel) - Minor optimizations (Pavel) - Cosmetic variable shadowing fix (Stefan)" * tag 'io_uring-6.1-2022-10-13' of git://git.kernel.dk/linux: io_uring/rw: ensure kiocb_end_write() is always called io_uring: fix fdinfo sqe offsets calculation io_uring: local variable rw shadows outer variable in io_write io_uring/opdef: remove 'audit_skip' from SENDMSG_ZC io_uring: optimise locking for local tw with submit_wait io_uring: remove redundant memory barrier in io_req_local_work_add io_uring/net: handle -EINPROGRESS correct for IORING_OP_CONNECT io_uring: remove notif leftovers io_uring: correct pinned_vm accounting io_uring/af_unix: defer registered files gc to io_uring release io_uring: limit registration w/ SINGLE_ISSUER io_uring: remove io_register_submitter io_uring: simplify __io_uring_add_tctx_node
-
- 13 Oct, 2022 28 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linuxLinus Torvalds authored
Pull devicetree fixes from Rob Herring: - Fixes for Mediatek MT6370 binding - Merge the DT overlay maintainer entry to the main entry as Pantelis is not active and Frank is taking a step back * tag 'devicetree-fixes-for-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: MAINTAINERS: of: collapse overlay entry into main device tree entry dt-bindings: mfd: mt6370: fix the interrupt order of the charger in the example dt-bindings: leds: mt6370: Fix MT6370 LED indicator DT warning
-
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmcLinus Torvalds authored
Pull MMC fixes from Ulf Hansson: "MMC core: - Add SD card quirk for broken discard MMC host: - renesas_sdhi: Fix clock rounding errors - sdhci-sprd: Fix minimum clock limit to detect cards - sdhci-tegra: Use actual clock rate for SW tuning correction" * tag 'mmc-v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-sprd: Fix minimum clock limit mmc: sdhci-tegra: Use actual clock rate for SW tuning correction mmc: renesas_sdhi: Fix rounding errors mmc: core: Add SD card quirk for broken discard
-
git://git.lwn.net/linuxLinus Torvalds authored
Pull documentation fixes from Jonathan Corbet: "A handful of relatively simple documentation fixes, plus a set of patches catching the Chinese translation up with the front-page rework" * tag 'docs-6.1-2' of git://git.lwn.net/linux: Documentation: rtla: Correct command line example docs/zh_CN: add a man-pages link to zh_CN/index.rst docs/zh_CN: Rewrite the Chinese translation front page docs/zh_CN: add zh_CN/arch.rst docs/zh_CN: promote the title of zh_CN/process/index.rst docs/zh_CN: Update the translation of page_owner to 6.0-rc7 docs/zh_CN: Update the translation of ksm to 6.0-rc7 docs/howto: Replace abundoned URL of gmane.org Documentation: ubifs: Fix compression idiom Documentation/mm/page_owner.rst: delete frequently changing experimental data docs/zh_CN: Fix build warning docs: ftrace: Correct access mode
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, and wifi. Current release - regressions: - Revert "net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs", it may cause crashes when the qdisc is reconfigured - inet: ping: fix splat due to packet allocation refactoring in inet - tcp: clean up kernel listener's reqsk in inet_twsk_purge(), fix UAF due to races when per-netns hash table is used Current release - new code bugs: - eth: adin1110: check in netdev_event that netdev belongs to driver - fixes for PTR_ERR() vs NULL bugs in driver code, from Dan and co. Previous releases - regressions: - ipv4: handle attempt to delete multipath route when fib_info contains an nh reference, avoid oob access - wifi: fix handful of bugs in the new Multi-BSSID code - wifi: mt76: fix rate reporting / throughput regression on mt7915 and newer, fix checksum offload - wifi: iwlwifi: mvm: fix double list_add at iwl_mvm_mac_wake_tx_queue (other cases) - wifi: mac80211: do not drop packets smaller than the LLC-SNAP header on fast-rx Previous releases - always broken: - ieee802154: don't warn zero-sized raw_sendmsg() - ipv6: ping: fix wrong checksum for large frames - mctp: prevent double key removal and unref - tcp/udp: fix memory leaks and races around IPV6_ADDRFORM - hv_netvsc: fix race between VF offering and VF association message Misc: - remove -Warray-bounds silencing in the drivers, compilers fixed" * tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits) sunhme: fix an IS_ERR() vs NULL check in probe net: marvell: prestera: fix a couple NULL vs IS_ERR() checks kcm: avoid potential race in kcm_tx_work tcp: Clean up kernel listener's reqsk in inet_twsk_purge() net: phy: micrel: Fixes FIELD_GET assertion openvswitch: add nf_ct_is_confirmed check before assigning the helper tcp: Fix data races around icsk->icsk_af_ops. ipv6: Fix data races around sk->sk_prot. tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM). tcp/udp: Fix memory leak in ipv6_renew_options(). mctp: prevent double key removal and unref selftests: netfilter: Fix nft_fib.sh for all.rp_filter=1 netfilter: rpfilter/fib: Populate flowic_l3mdev field selftests: netfilter: Test reverse path filtering net/mlx5: Make ASO poll CQ usable in atomic context tcp: cdg: allow tcp_cdg_release() to be called multiple times inet: ping: fix recent breakage ipv6: ping: fix wrong checksum for large frames net: ethernet: ti: am65-cpsw: set correct devlink flavour for unused ports ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds authored
Pull virtio fixes from Michael Tsirkin: - Fix a regression in virtio pci on power - Add a reviewer for ifcvf * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa/ifcvf: add reviewer virtio_pci: use irq to detect interrupt support
-
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-traceLinus Torvalds authored
Pull tracing fixes from Steven Rostedt: - Found that the synthetic events were using strlen/strscpy() on values that could have come from userspace, and that is bad. Consolidate the string logic of kprobe and eprobe and extend it to the synthetic events to safely process string addresses. - Clean up content of text dump in ftrace_bug() where the output does not make char reads into signed and sign extending the byte output. - Fix some kernel docs in the ring buffer code. * tag 'trace-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix reading strings from synthetic events tracing: Add "(fault)" name injection to kernel probes tracing: Move duplicate code of trace_kprobe/eprobe.c into header ring-buffer: Fix kernel-doc ftrace: Fix char print issue in print_ip_ins()
-
git://www.linux-watchdog.org/linux-watchdogLinus Torvalds authored
Pull watchdog updates from Wim Van Sebroeck: - new driver for Exar/MaxLinear XR28V38x - support for exynosautov9 SoC - support for Renesas R-Car V5H (R8A779G0) and RZ/V2M (r9a09g011) SoC - support for imx93 - several other fixes and improvements * tag 'linux-watchdog-6.1-rc1' of git://www.linux-watchdog.org/linux-watchdog: (36 commits) watchdog: twl4030_wdt: add missing mod_devicetable.h include dt-bindings: watchdog: migrate mt7621 text bindings to YAML watchdog: sp5100_tco: Add "action" module parameter watchdog: imx93: add watchdog timer on imx93 watchdog: imx7ulp_wdt: init wdog when it was active watchdog: imx7ulp_wdt: Handle wdog reconfigure failure watchdog: imx7ulp_wdt: Fix RCS timeout issue watchdog: imx7ulp_wdt: Check CMD32EN in wdog init watchdog: imx7ulp: Add explict memory barrier for unlock sequence watchdog: imx7ulp: Move suspend/resume to noirq phase watchdog: rti-wdt:using the pm_runtime_resume_and_get to simplify the code dt-bindings: watchdog: rockchip: add rockchip,rk3128-wdt watchdog: s3c2410_wdt: support exynosautov9 watchdog dt-bindings: watchdog: add exynosautov9 compatible watchdog: npcm: Enable clock if provided watchdog: meson: keep running if already active watchdog: dt-bindings: atmel,at91sam9-wdt: convert to json-schema watchdog: armada_37xx_wdt: Fix .set_timeout callback watchdog: sa1100: make variable sa1100dog_driver static watchdog: w83977f_wdt: Fix comment typo ...
-
https://github.com/ceph/ceph-clientLinus Torvalds authored
Pull ceph updates from Ilya Dryomov: "A quiet round this time: several assorted filesystem fixes, the most noteworthy one being some additional wakeups in cap handling code, and a messenger cleanup" * tag 'ceph-for-6.1-rc1' of https://github.com/ceph/ceph-client: ceph: remove Sage's git tree from documentation ceph: fix incorrectly showing the .snap size for stat ceph: fail the open_by_handle_at() if the dentry is being unlinked ceph: increment i_version when doing a setattr with caps ceph: Use kcalloc for allocating multiple elements ceph: no need to wait for transition RDCACHE|RD -> RD ceph: fail the request if the peer MDS doesn't support getvxattr op ceph: wake up the waiters if any new caps comes libceph: drop last_piece flag from ceph_msg_data_cursor
-
git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds authored
Pull NFS client updates from Anna Schumaker: "New Features: - Add NFSv4.2 xattr tracepoints - Replace xprtiod WQ in rpcrdma - Flexfiles cancels I/O on layout recall or revoke Bugfixes and Cleanups: - Directly use ida_alloc() / ida_free() - Don't open-code max_t() - Prefer using strscpy over strlcpy - Remove unused forward declarations - Always return layout states on flexfiles layout return - Have LISTXATTR treat NFS4ERR_NOXATTR as an empty reply instead of error - Allow more xprtrdma memory allocations to fail without triggering a reclaim - Various other xprtrdma clean ups - Fix rpc_killall_tasks() races" * tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits) NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked SUNRPC: Add API to force the client to disconnect SUNRPC: Add a helper to allow pNFS drivers to selectively cancel RPC calls SUNRPC: Fix races with rpc_killall_tasks() xprtrdma: Fix uninitialized variable xprtrdma: Prevent memory allocations from driving a reclaim xprtrdma: Memory allocation should be allowed to fail during connect xprtrdma: MR-related memory allocation should be allowed to fail xprtrdma: Clean up synopsis of rpcrdma_regbuf_alloc() xprtrdma: Clean up synopsis of rpcrdma_req_create() svcrdma: Clean up RPCRDMA_DEF_GFP SUNRPC: Replace the use of the xprtiod WQ in rpcrdma NFSv4.2: Add a tracepoint for listxattr NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattr NFSv4.2: Move TRACE_DEFINE_ENUM(NFS4_CONTENT_*) under CONFIG_NFS_V4_2 NFSv4.2: Add special handling for LISTXATTR receiving NFS4ERR_NOXATTR nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration NFSv4: remove nfs4_renewd_prepare_shutdown() declaration fs/nfs/pnfs_nfs.c: fix spelling typo and syntax error in comment NFSv4/pNFS: Always return layout stats on layout return for flexfiles ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linuxLinus Torvalds authored
Pull orangefs update from Mike Marshall: "Change iterate to iterate_shared" * tag 'for-linus-6.1-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: Orangefs: change iterate to iterate_shared
-
Pierre Gondois authored
The '-t/-T' parameters seem to have been swapped: -t/--trace[=file]: save the stopped trace to [file|timerlat_trace.txt] -T/--thread us: stop trace if the thread latency is higher than the argument in us Swap them back. Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org> Link: https://lore.kernel.org/r/20221006084409.3882542-1-pierre.gondois@arm.comSigned-off-by: Jonathan Corbet <corbet@lwn.net>
-
Dan Carpenter authored
The devm_request_region() function does not return error pointers, it returns NULL on error. Fixes: 914d9b27 ("sunhme: switch to devres") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Link: https://lore.kernel.org/r/Y0bWzJL8JknX8MUf@kiliSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Dan Carpenter authored
The __prestera_nexthop_group_create() function returns NULL on error and the prestera_nexthop_group_get() returns error pointers. Fix these two checks. Fixes: 0a23ae23 ("net: marvell: prestera: Add router nexthops ABI") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/Y0bWq+7DoKK465z8@kiliSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
syzbot found that kcm_tx_work() could crash [1] in: /* Primarily for SOCK_SEQPACKET sockets */ if (likely(sk->sk_socket) && test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) { <<*>> clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags); sk->sk_write_space(sk); } I think the reason is that another thread might concurrently run in kcm_release() and call sock_orphan(sk) while sk is not locked. kcm_tx_work() find sk->sk_socket being NULL. [1] BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:86 [inline] BUG: KASAN: null-ptr-deref in clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline] BUG: KASAN: null-ptr-deref in kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742 Write of size 8 at addr 0000000000000008 by task kworker/u4:3/53 CPU: 0 PID: 53 Comm: kworker/u4:3 Not tainted 5.19.0-rc3-next-20220621-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: kkcmd kcm_tx_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 kasan_report+0xbe/0x1f0 mm/kasan/report.c:495 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189 instrument_atomic_write include/linux/instrumented.h:86 [inline] clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline] kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302 </TASK> Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Link: https://lore.kernel.org/r/20221012133412.519394-1-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
Eric Dumazet reported a use-after-free related to the per-netns ehash series. [0] When we create a TCP socket from userspace, the socket always holds a refcnt of the netns. This guarantees that a reqsk timer is always fired before netns dismantle. Each reqsk has a refcnt of its listener, so the listener is not freed before the reqsk, and the net is not freed before the listener as well. OTOH, when in-kernel users create a TCP socket, it might not hold a refcnt of its netns. Thus, a reqsk timer can be fired after the netns dismantle and access freed per-netns ehash. To avoid the use-after-free, we need to clean up TCP_NEW_SYN_RECV sockets in inet_twsk_purge() if the netns uses a per-netns ehash. [0]: https://lore.kernel.org/netdev/CANn89iLXMup0dRD_Ov79Xt8N9FM0XdhCHEN05sf3eLwxKweM6w@mail.gmail.com/ BUG: KASAN: use-after-free in tcp_or_dccp_get_hashinfo include/net/inet_hashtables.h:181 [inline] BUG: KASAN: use-after-free in reqsk_queue_unlink+0x320/0x350 net/ipv4/inet_connection_sock.c:913 Read of size 8 at addr ffff88807545bd80 by task syz-executor.2/8301 CPU: 1 PID: 8301 Comm: syz-executor.2 Not tainted 6.0.0-syzkaller-02757-gaf7d23f9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:317 [inline] print_report.cold+0x2ba/0x719 mm/kasan/report.c:433 kasan_report+0xb1/0x1e0 mm/kasan/report.c:495 tcp_or_dccp_get_hashinfo include/net/inet_hashtables.h:181 [inline] reqsk_queue_unlink+0x320/0x350 net/ipv4/inet_connection_sock.c:913 inet_csk_reqsk_queue_drop net/ipv4/inet_connection_sock.c:927 [inline] inet_csk_reqsk_queue_drop_and_put net/ipv4/inet_connection_sock.c:939 [inline] reqsk_timer_handler+0x724/0x1160 net/ipv4/inet_connection_sock.c:1053 call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474 expire_timers kernel/time/timer.c:1519 [inline] __run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790 __run_timers kernel/time/timer.c:1768 [inline] run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803 __do_softirq+0x1d0/0x9c8 kernel/softirq.c:571 invoke_softirq kernel/softirq.c:445 [inline] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650 irq_exit_rcu+0x5/0x20 kernel/softirq.c:662 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1107 </IRQ> Fixes: d1e5e640 ("tcp: Introduce optional per-netns ehash.") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Eric Dumazet <edumazet@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221012145036.74960-1-kuniyu@amazon.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Frank Rowand authored
Pantelis has not been active in recent years so no need to maintain a separate entry for device tree overlays. Signed-off-by: Frank Rowand <frank.rowand@sony.com> Link: https://lore.kernel.org/r/20221012220548.4163865-1-frowand.list@gmail.comSigned-off-by: Rob Herring <robh@kernel.org>
-
Michael S. Tsirkin authored
Zhu Lingshan has been writing and reviewing ifcvf patches for a while now, add as reviewer. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Zhu Lingshan <lingshan.zhu@intel.com> Acked-by: Jason Wang <jasowang@redhat.com>
-
Michael S. Tsirkin authored
commit 71491c54 ("virtio_pci: don't try to use intxif pin is zero") breaks virtio_pci on powerpc, when running as a qemu guest. vp_find_vqs() bails out because pci_dev->pin == 0. But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would succeed if we called it - which is what the code used to do. This seems to happen because pci_dev->pin is not populated in pci_assign_irq(). A PCI core bug? Maybe. However Linus said: I really think that that is basically the only time you should use that 'pci_dev->pin' thing: it basically exists not for "does this device have an IRQ", but for "what is the routing of this irq on this device". and The correct way to check for "no irq" doesn't use NO_IRQ at all, it just does if (dev->irq) ... so let's just check irq and be done with it. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Fixes: 71491c54 ("virtio_pci: don't try to use intxif pin is zero") Cc: "Angus Chen" <angus.chen@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221012220312.308522-1-mst@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wirelessPaolo Abeni authored
Johannes Berg says: ==================== More wireless fixes for 6.1 This has only the fixes for the scan parsing issues. * tag 'wireless-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cfg80211: update hidden BSSes to avoid WARN_ON wifi: mac80211: fix crash in beacon protection for P2P-device wifi: mac80211_hwsim: avoid mac80211 warning on bad rate wifi: cfg80211: avoid nontransmitted BSS list corruption wifi: cfg80211: fix BSS refcounting bugs wifi: cfg80211: ensure length byte is present before access wifi: mac80211: fix MBSSID parsing use-after-free wifi: cfg80211/mac80211: reject bad MBSSID elements wifi: cfg80211: fix u8 overflow in cfg80211_update_notlisted_nontrans() ==================== Link: https://lore.kernel.org/r/20221013100522.46346-1-johannes@sipsolutions.netSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Johannes Berg authored
Pull in the fixes for various scan parsing bugs found by Sönke Huster by fuzzing.
-
Divya Koppera authored
FIELD_GET() must only be used with a mask that is a compile-time constant. Mark the functions as __always_inline to avoid the problem. Fixes: 21b688da ("net: phy: micrel: Cable Diag feature for lan8814 phy") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Divya Koppera <Divya.Koppera@microchip.com> Link: https://lore.kernel.org/r/20221011095437.12580-1-Divya.Koppera@microchip.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Xin Long authored
A WARN_ON call trace would be triggered when 'ct(commit, alg=helper)' applies on a confirmed connection: WARNING: CPU: 0 PID: 1251 at net/netfilter/nf_conntrack_extend.c:98 RIP: 0010:nf_ct_ext_add+0x12d/0x150 [nf_conntrack] Call Trace: <TASK> nf_ct_helper_ext_add+0x12/0x60 [nf_conntrack] __nf_ct_try_assign_helper+0xc4/0x160 [nf_conntrack] __ovs_ct_lookup+0x72e/0x780 [openvswitch] ovs_ct_execute+0x1d8/0x920 [openvswitch] do_execute_actions+0x4e6/0xb60 [openvswitch] ovs_execute_actions+0x60/0x140 [openvswitch] ovs_packet_cmd_execute+0x2ad/0x310 [openvswitch] genl_family_rcv_msg_doit.isra.15+0x113/0x150 genl_rcv_msg+0xef/0x1f0 which can be reproduced with these OVS flows: table=0, in_port=veth1,tcp,tcp_dst=2121,ct_state=-trk actions=ct(commit, table=1) table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new actions=ct(commit, alg=ftp),normal The issue was introduced by commit 248d45f1 ("openvswitch: Allow attaching helper in later commit") where it somehow removed the check of nf_ct_is_confirmed before asigning the helper. This patch is to fix it by bringing it back. Fixes: 248d45f1 ("openvswitch: Allow attaching helper in later commit") Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Aaron Conole <aconole@redhat.com> Tested-by: Aaron Conole <aconole@redhat.com> Link: https://lore.kernel.org/r/c5c9092a22a2194650222bffaf786902613deb16.1665085502.git.lucien.xin@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Kuniyuki Iwashima says: ==================== tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM. This series fixes some memory leaks and data races caused in the same scenario where one thread converts an IPv6 socket into IPv4 with IPV6_ADDRFORM and another accesses the socket concurrently. v4: https://lore.kernel.org/netdev/20221004171802.40968-1-kuniyu@amazon.com/ v3 (Resend): https://lore.kernel.org/netdev/20221003154425.49458-1-kuniyu@amazon.com/ v3: https://lore.kernel.org/netdev/20220929012542.55424-1-kuniyu@amazon.com/ v2: https://lore.kernel.org/netdev/20220928002741.64237-1-kuniyu@amazon.com/ v1: https://lore.kernel.org/netdev/20220927161209.32939-1-kuniyu@amazon.com/ ==================== Link: https://lore.kernel.org/r/20221006185349.74777-1-kuniyu@amazon.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
setsockopt(IPV6_ADDRFORM) and tcp_v6_connect() change icsk->icsk_af_ops under lock_sock(), but tcp_(get|set)sockopt() read it locklessly. To avoid load/store tearing, we need to add READ_ONCE() and WRITE_ONCE() for the reads and writes. Thanks to Eric Dumazet for providing the syzbot report: BUG: KCSAN: data-race in tcp_setsockopt / tcp_v6_connect write to 0xffff88813c624518 of 8 bytes by task 23936 on cpu 0: tcp_v6_connect+0x5b3/0xce0 net/ipv6/tcp_ipv6.c:240 __inet_stream_connect+0x159/0x6d0 net/ipv4/af_inet.c:660 inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:724 __sys_connect_file net/socket.c:1976 [inline] __sys_connect+0x197/0x1b0 net/socket.c:1993 __do_sys_connect net/socket.c:2003 [inline] __se_sys_connect net/socket.c:2000 [inline] __x64_sys_connect+0x3d/0x50 net/socket.c:2000 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd read to 0xffff88813c624518 of 8 bytes by task 23937 on cpu 1: tcp_setsockopt+0x147/0x1c80 net/ipv4/tcp.c:3789 sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3585 __sys_setsockopt+0x212/0x2b0 net/socket.c:2252 __do_sys_setsockopt net/socket.c:2263 [inline] __se_sys_setsockopt net/socket.c:2260 [inline] __x64_sys_setsockopt+0x62/0x70 net/socket.c:2260 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0xffffffff8539af68 -> 0xffffffff8539aff8 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 23937 Comm: syz-executor.5 Not tainted 6.0.0-rc4-syzkaller-00331-g4ed9c1e9-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022 Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
Commit 086d4905 ("ipv6: annotate some data-races around sk->sk_prot") fixed some data-races around sk->sk_prot but it was not enough. Some functions in inet6_(stream|dgram)_ops still access sk->sk_prot without lock_sock() or rtnl_lock(), so they need READ_ONCE() to avoid load tearing. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were able to clean them up by calling inet6_destroy_sock() during the IPv6 -> IPv4 conversion by IPV6_ADDRFORM. However, commit 03485f2a ("udpv6: Add lockless sendmsg() support") added a lockless memory allocation path, which could cause a memory leak: setsockopt(IPV6_ADDRFORM) sendmsg() +-----------------------+ +-------+ - do_ipv6_setsockopt(sk, ...) - udpv6_sendmsg(sk, ...) - sockopt_lock_sock(sk) ^._ called via udpv6_prot - lock_sock(sk) before WRITE_ONCE() - WRITE_ONCE(sk->sk_prot, &tcp_prot) - inet6_destroy_sock() - if (!corkreq) - sockopt_release_sock(sk) - ip6_make_skb(sk, ...) - release_sock(sk) ^._ lockless fast path for the non-corking case - __ip6_append_data(sk, ...) - ipv6_local_rxpmtu(sk, ...) - xchg(&np->rxpmtu, skb) ^._ rxpmtu is never freed. - goto out_no_dst; - lock_sock(sk) For now, rxpmtu is only the case, but not to miss the future change and a similar bug fixed in commit e2732600 ("net: ping6: Fix memleak in ipv6_renew_options()."), let's set a new function to IPv6 sk->sk_destruct() and call inet6_cleanup_sock() there. Since the conversion does not change sk->sk_destruct(), we can guarantee that we can clean up IPv6 resources finally. We can now remove all inet6_destroy_sock() calls from IPv6 protocol specific ->destroy() functions, but such changes are invasive to backport. So they can be posted as a follow-up later for net-next. Fixes: 03485f2a ("udpv6: Add lockless sendmsg() support") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
Commit 4b340ae2 ("IPv6: Complete IPV6_DONTFRAG support") forgot to add a change to free inet6_sk(sk)->rxpmtu while converting an IPv6 socket into IPv4 with IPV6_ADDRFORM. After conversion, sk_prot is changed to udp_prot and ->destroy() never cleans it up, resulting in a memory leak. This is due to the discrepancy between inet6_destroy_sock() and IPV6_ADDRFORM, so let's call inet6_destroy_sock() from IPV6_ADDRFORM to remove the difference. However, this is not enough for now because rxpmtu can be changed without lock_sock() after commit 03485f2a ("udpv6: Add lockless sendmsg() support"). We will fix this case in the following patch. Note we will rename inet6_destroy_sock() to inet6_cleanup_sock() and remove unnecessary inet6_destroy_sock() calls in sk_prot->destroy() in the future. Fixes: 4b340ae2 ("IPv6: Complete IPV6_DONTFRAG support") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
syzbot reported a memory leak [0] related to IPV6_ADDRFORM. The scenario is that while one thread is converting an IPv6 socket into IPv4 with IPV6_ADDRFORM, another thread calls do_ipv6_setsockopt() and allocates memory to inet6_sk(sk)->XXX after conversion. Then, the converted sk with (tcp|udp)_prot never frees the IPv6 resources, which inet6_destroy_sock() should have cleaned up. setsockopt(IPV6_ADDRFORM) setsockopt(IPV6_DSTOPTS) +-----------------------+ +----------------------+ - do_ipv6_setsockopt(sk, ...) - sockopt_lock_sock(sk) - do_ipv6_setsockopt(sk, ...) - lock_sock(sk) ^._ called via tcpv6_prot - WRITE_ONCE(sk->sk_prot, &tcp_prot) before WRITE_ONCE() - xchg(&np->opt, NULL) - txopt_put(opt) - sockopt_release_sock(sk) - release_sock(sk) - sockopt_lock_sock(sk) - lock_sock(sk) - ipv6_set_opt_hdr(sk, ...) - ipv6_update_options(sk, opt) - xchg(&inet6_sk(sk)->opt, opt) ^._ opt is never freed. - sockopt_release_sock(sk) - release_sock(sk) Since IPV6_DSTOPTS allocates options under lock_sock(), we can avoid this memory leak by testing whether sk_family is changed by IPV6_ADDRFORM after acquiring the lock. This issue exists from the initial commit between IPV6_ADDRFORM and IPV6_PKTOPTIONS. [0]: BUG: memory leak unreferenced object 0xffff888009ab9f80 (size 96): comm "syz-executor583", pid 328, jiffies 4294916198 (age 13.034s) hex dump (first 32 bytes): 01 00 00 00 48 00 00 00 08 00 00 00 00 00 00 00 ....H........... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000002ee98ae1>] kmalloc include/linux/slab.h:605 [inline] [<000000002ee98ae1>] sock_kmalloc+0xb3/0x100 net/core/sock.c:2566 [<0000000065d7b698>] ipv6_renew_options+0x21e/0x10b0 net/ipv6/exthdrs.c:1318 [<00000000a8c756d7>] ipv6_set_opt_hdr net/ipv6/ipv6_sockglue.c:354 [inline] [<00000000a8c756d7>] do_ipv6_setsockopt.constprop.0+0x28b7/0x4350 net/ipv6/ipv6_sockglue.c:668 [<000000002854d204>] ipv6_setsockopt+0xdf/0x190 net/ipv6/ipv6_sockglue.c:1021 [<00000000e69fdcf8>] tcp_setsockopt+0x13b/0x2620 net/ipv4/tcp.c:3789 [<0000000090da4b9b>] __sys_setsockopt+0x239/0x620 net/socket.c:2252 [<00000000b10d192f>] __do_sys_setsockopt net/socket.c:2263 [inline] [<00000000b10d192f>] __se_sys_setsockopt net/socket.c:2260 [inline] [<00000000b10d192f>] __x64_sys_setsockopt+0xbe/0x160 net/socket.c:2260 [<000000000a80d7aa>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<000000000a80d7aa>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 [<000000004562b5c6>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 12 Oct, 2022 11 commits
-
-
Jens Axboe authored
A previous commit moved the notifications and end-write handling, but it is now missing a few spots where we also want to call both of those. Without that, we can potentially be missing file notifications, and more importantly, have an imbalance in the super_block writers sem accounting. Fixes: b000145e ("io_uring/rw: defer fsnotify calls to task context") Reported-by: Dave Chinner <david@fromorbit.com> Link: https://lore.kernel.org/all/20221010050319.GC2703033@dread.disaster.area/Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Only with the big sqe feature they take 128 bytes per entry, but we unconditionally advance by 128B. Fix it by using sq_shift. Fixes: 3b8fdd1d ("io_uring/fdinfo: fix sqe dumping for IORING_SETUP_SQE128") Reported-and-tested-by: syzbot+e5198737e8a2d23d958c@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8b41287cb75d5efb8fcb5cccde845ddbbadd8372.1665449983.git.asml.silence@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Stefan Roesch authored
This fixes the shadowing of the outer variable rw in the function io_write(). No issue is caused by this, but let's silence the shadowing warning anyway. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Stefan Roesch <shr@devkernel.io> Link: https://lore.kernel.org/r/20221010234330.244244-1-shr@devkernel.ioSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
The msg variants of sending aren't audited separately, so we should not be setting audit_skip for the zerocopy sendmsg variant either. Fixes: 493108d9 ("io_uring/net: zerocopy sendmsg") Reported-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Running local task_work requires taking uring_lock, for submit + wait we can try to run them right after submit while we still hold the lock and save one lock/unlokc pair. The optimisation was implemented in the first local tw patches but got dropped for simplicity. Suggested-by: Dylan Yudaken <dylany@fb.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/281fc79d98b5d91fe4778c5137a17a2ab4693e5c.1665088876.git.asml.silence@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
io_cqring_wake() needs a barrier for the waitqueue_active() check. However, in the case of io_req_local_work_add(), we call llist_add() first, which implies an atomic. Hence we can replace smb_mb() with smp_mb__after_atomic(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/43983bc8bc507172adda7a0f00cab1aff09fd238.1665018309.git.asml.silence@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Jens Axboe authored
We treat EINPROGRESS like EAGAIN, but if we're retrying post getting EINPROGRESS, then we just need to check the socket for errors and terminate the request. This was exposed on a bluetooth connection request which ends up taking a while and hitting EINPROGRESS, and yields a CQE result of -EBADFD because we're retrying a connect on a socket that is now connected. Cc: stable@vger.kernel.org Fixes: 87f80d62 ("io_uring: handle connect -EINPROGRESS like -EAGAIN") Link: https://github.com/axboe/liburing/issues/671Reported-by: Aidan Sun <aidansun05@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Notifications were killed but there is a couple of fields and struct declarations left, remove them. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8df8877d677be5a2b43afd936d600e60105ea960.1664849941.git.asml.silence@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
->mm_account should be released only after we free all registered buffers, otherwise __io_sqe_buffers_unregister() will see a NULL ->mm_account and skip locked_vm accounting. Cc: <Stable@vger.kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6d798f65ed4ab8db3664c4d3397d4af16ca98846.1664849932.git.asml.silence@gmail.comSigned-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Instead of putting io_uring's registered files in unix_gc() we want it to be done by io_uring itself. The trick here is to consider io_uring registered files for cycle detection but not actually putting them down. Because io_uring can't register other ring instances, this will remove all refs to the ring file triggering the ->release path and clean up with io_ring_ctx_free(). Cc: stable@vger.kernel.org Fixes: 6b06314c ("io_uring: add file set registration") Reported-and-tested-by: David Bouman <dbouman03@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> [axboe: add kerneldoc comment to skb, fold in skb leak fix] Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Linus Torvalds authored
Merge tag 'linux-kselftest-kunit-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull more KUnit updates from Shuah Khan: "Features and fixes: - simplify resource use - make kunit_malloc() and kunit_free() allocations and frees consistent. kunit_free() frees only the memory allocated by kunit_malloc() - stop downloading risc-v opensbi binaries using wget - other fixes and improvements to tool and KUnit framework" * tag 'linux-kselftest-kunit-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: Documentation: kunit: Update description of --alltests option kunit: declare kunit_assert structs as const kunit: rename base KUNIT_ASSERTION macro to _KUNIT_FAILED kunit: remove format func from struct kunit_assert, get it to 0 bytes kunit: tool: Don't download risc-v opensbi firmware with wget kunit: make kunit_kfree(NULL) a no-op to match kfree() kunit: make kunit_kfree() not segfault on invalid inputs kunit: make kunit_kfree() only work on pointers from kunit_malloc() and friends kunit: drop test pointer in string_stream_fragment kunit: string-stream: Simplify resource use
-