- 28 Jun, 2021 5 commits
-
-
Kees Cook authored
Instead of depending on "sysctl" being installed, just use "grep -H" for sysctl status reporting. Additionally report kernel version for easier comparisons. Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
When running the seccomp benchmark under a test runner, it wouldn't provide any feedback on progress. Set stdout unbuffered. Suggested-by: Will Drewry <wad@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
Since the open fds might not always start at "4" (especially when running under kselftest, etc), start counting from the first assigned fd, rather than using the more permissive EXPECT_GE(fd, 0). Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20210527032948.3730953-1-keescook@chromium.orgReviewed-by: Rodrigo Campos <rodrigo@kinvolk.io> Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Rodrigo Campos authored
This just adds a test to verify that when using the new introduced flag to ADDFD, a valid fd is added and returned as the syscall result. Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io> Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210517193908.3113-5-sargun@sargun.me
-
Rodrigo Campos authored
Alban Crequy reported a race condition userspace faces when we want to add some fds and make the syscall return them[1] using seccomp notify. The problem is that currently two different ioctl() calls are needed by the process handling the syscalls (agent) for another userspace process (target): SECCOMP_IOCTL_NOTIF_ADDFD to allocate the fd and SECCOMP_IOCTL_NOTIF_SEND to return that value. Therefore, it is possible for the agent to do the first ioctl to add a file descriptor but the target is interrupted (EINTR) before the agent does the second ioctl() call. This patch adds a flag to the ADDFD ioctl() so it adds the fd and returns that value atomically to the target program, as suggested by Kees Cook[2]. This is done by simply allowing seccomp_do_user_notification() to add the fd and return it in this case. Therefore, in this case the target wakes up from the wait in seccomp_do_user_notification() either to interrupt the syscall or to add the fd and return it. This "allocate an fd and return" functionality is useful for syscalls that return a file descriptor only, like connect(2). Other syscalls that return a file descriptor but not as return value (or return more than one fd), like socketpair(), pipe(), recvmsg with SCM_RIGHTs, will not work with this flag. This effectively combines SECCOMP_IOCTL_NOTIF_ADDFD and SECCOMP_IOCTL_NOTIF_SEND into an atomic opteration. The notification's return value, nor error can be set by the user. Upon successful invocation of the SECCOMP_IOCTL_NOTIF_ADDFD ioctl with the SECCOMP_ADDFD_FLAG_SEND flag, the notifying process's errno will be 0, and the return value will be the file descriptor number that was installed. [1]: https://lore.kernel.org/lkml/CADZs7q4sw71iNHmV8EOOXhUKJMORPzF7thraxZYddTZsxta-KQ@mail.gmail.com/ [2]: https://lore.kernel.org/lkml/202012011322.26DCBC64F2@keescook/Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io> Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210517193908.3113-4-sargun@sargun.me
-
- 29 May, 2021 1 commit
-
-
Sargun Dhillon authored
This refactors the user notification code to have a do / while loop around the completion condition. This has a small change in semantic, in that previously we ignored addfd calls upon wakeup if the notification had been responded to, but instead with the new change we check for an outstanding addfd calls prior to returning to userspace. Rodrigo Campos also identified a bug that can result in addfd causing an early return, when the supervisor didn't actually handle the syscall [1]. [1]: https://lore.kernel.org/lkml/20210413160151.3301-1-rodrigo@kinvolk.io/ Fixes: 7cf97b12 ("seccomp: Introduce addfd ioctl to seccomp user notifier") Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Rodrigo Campos <rodrigo@kinvolk.io> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210517193908.3113-3-sargun@sargun.me
-
- 28 May, 2021 1 commit
-
-
Sargun Dhillon authored
The documentation had some previously incorrect information about how userspace notifications (and responses) were handled due to a change from a previously proposed patchset. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org> Fixes: 6a21cc50 ("seccomp: add a return code to trap to userspace") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210517193908.3113-2-sargun@sargun.me
-
- 16 May, 2021 7 commits
-
-
Linus Torvalds authored
-
Linus Torvalds authored
Merge tag 'driver-core-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver fixes for driver core changes that happened in 5.13-rc1. The clk driver fix resolves a many-reported issue with booting some devices, and the USB typec fix resolves the reported problem of USB systems on some embedded boards. Both of these have been in linux-next this week with no reported issues" * tag 'driver-core-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: clk: Skip clk provider registration when np is NULL usb: typec: tcpm: Don't block probing of consumers of "connector" nodes
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/stagingLinus Torvalds authored
Pull staging and IIO driver fixes from Greg KH: "Here are some small IIO driver fixes and one Staging driver fix for 5.13-rc2. Nothing major, just some resolutions for reported problems: - gcc-11 bogus warning fix for rtl8723bs - iio driver tiny fixes All of these have been in linux-next for many days with no reported issues" * tag 'staging-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: tsl2583: Fix division by a zero lux_val iio: core: return ENODEV if ioctl is unknown iio: core: fix ioctl handlers removal iio: gyro: mpu3050: Fix reported temperature value iio: hid-sensors: select IIO_TRIGGERED_BUFFER under HID_SENSOR_IIO_TRIGGER iio: proximity: pulsedlight: Fix rumtime PM imbalance on error iio: light: gp2ap002: Fix rumtime PM imbalance on error staging: rtl8723bs: avoid bogus gcc warning
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds authored
Pull USB fixes from Greg KH: "Here are some small USB fixes for 5.13-rc2. They consist of a number of resolutions for reported issues: - typec fixes for found problems - xhci fixes and quirk additions - dwc3 driver fixes - minor fixes found by Coverity - cdc-wdm fixes for reported problems All of these have been in linux-next for a few days with no reported issues" * tag 'usb-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits) usb: core: hub: fix race condition about TRSMRCY of resume usb: typec: tcpm: Fix SINK_DISCOVERY current limit for Rp-default xhci: Add reset resume quirk for AMD xhci controller. usb: xhci: Increase timeout for HC halt xhci: Do not use GFP_KERNEL in (potentially) atomic context xhci: Fix giving back cancelled URBs even if halted endpoint can't reset xhci-pci: Allow host runtime PM as default for Intel Alder Lake xHCI usb: musb: Fix an error message usb: typec: tcpm: Fix wrong handling for Not_Supported in VDM AMS usb: typec: tcpm: Send DISCOVER_IDENTITY from dedicated work usb: typec: ucsi: Retrieve all the PDOs instead of just the first 4 usb: fotg210-hcd: Fix an error message docs: usb: function: Modify path name usb: dwc3: omap: improve extcon initialization usb: typec: ucsi: Put fwnode in any case during ->probe() usb: typec: tcpm: Fix wrong handling in GET_SINK_CAP usb: dwc2: Remove obsolete MODULE_ constants from platform.c usb: dwc3: imx8mp: fix error return code in dwc3_imx8mp_probe() usb: dwc3: imx8mp: detect dwc3 core node via compatible string usb: dwc3: gadget: Return success always for kick transfer in ep queue ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer fixes from Thomas Gleixner: "Two fixes for timers: - Use the ALARM feature check in the alarmtimer core code insted of the old method of checking for the set_alarm() callback. Drivers can have that callback set but the feature bit cleared. If such a RTC device is selected then alarms wont work. - Use a proper define to let the preprocessor check whether Hyper-V VDSO clocksource should be active. The code used a constant in an enum with #ifdef, which evaluates to always false and disabled the clocksource for VDSO" * tag 'timers-urgent-2021-05-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/drivers/hyper-v: Re-enable VDSO_CLOCKMODE_HVCLOCK on X86 alarmtimer: Check RTC features instead of ops
-
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds authored
Pull xen fixes from Juergen Gross: - two patches for error path fixes - a small series for fixing a regression with swiotlb with Xen on Arm * tag 'for-linus-5.13b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/swiotlb: check if the swiotlb has already been initialized arm64: do not set SWIOTLB_NO_FORCE when swiotlb is required xen/arm: move xen_swiotlb_detect to arm/swiotlb-xen.h xen/unpopulated-alloc: fix error return code in fill_list() xen/gntdev: fix gntdev_mmap() error exit path
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Borislav Petkov: "The three SEV commits are not really urgent material. But we figured since getting them in now will avoid a huge amount of conflicts between future SEV changes touching tip, the kvm and probably other trees, sending them to you now would be best. The idea is that the tip, kvm etc branches for 5.14 will all base ontop of -rc2 and thus everything will be peachy. What is more, those changes are purely mechanical and defines movement so they should be fine to go now (famous last words). Summary: - Enable -Wundef for the compressed kernel build stage - Reorganize SEV code to streamline and simplify future development" * tag 'x86_urgent_for_v5.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/compressed: Enable -Wundef x86/msr: Rename MSR_K8_SYSCFG to MSR_AMD64_SYSCFG x86/sev: Move GHCB MSR protocol and NAE definitions in a common header x86/sev-es: Rename sev-es.{ch} to sev.{ch}
-
- 15 May, 2021 26 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: - Fix a regression in the conversion of the 64-bit BookE interrupt entry to C. - Fix KVM hosts running with the hash MMU since the recent KVM gfn changes. - Fix a deadlock in our paravirt spinlocks when hcall tracing is enabled. - Several fixes for oopses in our runtime code patching for security mitigations. - A couple of minor fixes for the recent conversion of 32-bit interrupt entry/exit to C. - Fix __get_user() causing spurious crashes in sigreturn due to a bad inline asm constraint, spotted with GCC 11. - A fix for the way we track IRQ masking state vs NMI interrupts when using the new scv system call entry path. - A couple more minor fixes. Thanks to Cédric Le Goater, Christian Zigotzky, Christophe Leroy, Naveen N. Rao, Nicholas Piggin Paul Menzel, and Sean Christopherson. * tag 'powerpc-5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64e/interrupt: Fix nvgprs being clobbered powerpc/64s: Make NMI record implicitly soft-masked code as irqs disabled powerpc/64s: Fix stf mitigation patching w/strict RWX & hash powerpc/64s: Fix entry flush patching w/strict RWX & hash powerpc/64s: Fix crashes when toggling entry flush barrier powerpc/64s: Fix crashes when toggling stf barrier KVM: PPC: Book3S HV: Fix kvm_unmap_gfn_range_hv() for Hash MMU powerpc/legacy_serial: Fix UBSAN: array-index-out-of-bounds powerpc/signal: Fix possible build failure with unsafe_copy_fpr_{to/from}_user powerpc/uaccess: Fix __get_user() with CONFIG_CC_HAS_ASM_GOTO_OUTPUT powerpc/pseries: warn if recursing into the hcall tracing code powerpc/pseries: use notrace hcall variant for H_CEDE idle powerpc/pseries: Don't trace hcall tracing wrapper powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks powerpc/syscall: Calling kuap_save_and_lock() is wrong powerpc/interrupts: Fix kuep_unlock() call
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler fixes from Ingo Molnar: "Fix an idle CPU selection bug, and an AMD Ryzen maximum frequency enumeration bug" * tag 'sched-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, sched: Fix the AMD CPPC maximum performance value on certain AMD Ryzen generations sched/fair: Fix clearing of has_idle_cores flag in select_idle_cpu()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull objtool fixes from Ingo Molnar: "Fix a couple of endianness bugs that crept in" * tag 'objtool-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool/x86: Fix elf_add_alternative() endianness objtool: Fix elf_create_undef_symbol() endianness
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq fix from Ingo Molnar: "Fix build warning on SH" * tag 'irq-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sh: Remove unused variable
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 stack randomization fix from Ingo Molnar: "Fix an assembly constraint that affected LLVM up to version 12" * tag 'core-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: stack: Replace "o" output with "r" input constraint
-
Linus Torvalds authored
Merge misc fixes from Andrew Morton: "13 patches. Subsystems affected by this patch series: resource, squashfs, hfsplus, modprobe, and mm (hugetlb, slub, userfaultfd, ksm, pagealloc, kasan, pagemap, and ioremap)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/ioremap: fix iomap_max_page_shift docs: admin-guide: update description for kernel.modprobe sysctl hfsplus: prevent corruption in shrinking truncate mm/filemap: fix readahead return types kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled mm: fix struct page layout on 32-bit systems ksm: revert "use GET_KSM_PAGE_NOLOCK to get ksm page in remove_rmap_item_from_tree()" userfaultfd: release page in error path to avoid BUG_ON squashfs: fix divide error in calculate_skip() kernel/resource: fix return code check in __request_free_mem_region mm, slub: move slub_debug static key enabling outside slab_mutex mm/hugetlb: fix cow where page writtable in child mm/hugetlb: fix F_SEAL_FUTURE_WRITE
-
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arcLinus Torvalds authored
Pull ARC fixes from Vineet Gupta: - PAE fixes - syscall num check off-by-one bug - misc fixes * tag 'arc-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: mm: Use max_high_pfn as a HIGHMEM zone border ARC: mm: PAE: use 40-bit physical page mask ARC: entry: fix off-by-one error in syscall number validation ARC: kgdb: add 'fallthrough' to prevent a warning arc: Fix typos/spellos
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block fixes from Jens Axboe: - Fix for shared tag set exit (Bart) - Correct ioctl range for zoned ioctls (Damien) - Removed dead/unused function (Lin) - Fix perf regression for shared tags (Ming) - Fix out-of-bounds issue with kyber and preemption (Omar) - BFQ merge fix (Paolo) - Two error handling fixes for nbd (Sun) - Fix weight update in blk-iocost (Tejun) - NVMe pull request (Christoph): - correct the check for using the inline bio in nvmet (Chaitanya Kulkarni) - demote unsupported command warnings (Chaitanya Kulkarni) - fix corruption due to double initializing ANA state (me, Hou Pu) - reset ns->file when open fails (Daniel Wagner) - fix a NULL deref when SEND is completed with error in nvmet-rdma (Michal Kalderon) - Fix kernel-doc warning (Bart) * tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block: block/partitions/efi.c: Fix the efi_partition() kernel-doc header blk-mq: Swap two calls in blk_mq_exit_queue() blk-mq: plug request for shared sbitmap nvmet: use new ana_log_size instead the old one nvmet: seset ns->file when open fails nbd: share nbd_put and return by goto put_nbd nbd: Fix NULL pointer in flush_workqueue blkdev.h: remove unused codes blk_account_rq block, bfq: avoid circular stable merges blk-iocost: fix weight updates of inner active iocgs nvmet: demote fabrics cmd parse err msg to debug nvmet: use helper to remove the duplicate code nvmet: demote discovery cmd parse err msg to debug nvmet-rdma: Fix NULL deref when SEND is completed with error nvmet: fix inline bio check for passthru nvmet: fix inline bio check for bdev-ns nvme-multipath: fix double initialization of ANA state kyber: fix out of bounds access when preempted block: uapi: fix comment about block device ioctl
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull io_uring fixes from Jens Axboe: "Just a few minor fixes/changes: - Fix issue with double free race for linked timeout completions - Fix reference issue with timeouts - Remove last few places that make SQPOLL special, since it's just an io thread now. - Bump maximum allowed registered buffers, as we don't allocate as much anymore" * tag 'io_uring-5.13-2021-05-14' of git://git.kernel.dk/linux-block: io_uring: increase max number of reg buffers io_uring: further remove sqpoll limits on opcodes io_uring: fix ltout double free on completion race io_uring: fix link timeout refs
-
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofsLinus Torvalds authored
Pull erofs fixes from Gao Xiang: "This mainly fixes 1 lcluster-sized pclusters for the big pcluster feature, which can be forcely generated by mkfs as a specific on-disk case for per-(sub)file compression strategies but missed to handle in runtime properly. Also, documentation updates are included to fix the broken illustration due to the ReST conversion by accident and complete the big pcluster introduction. Summary: - update documentation to fix the broken illustration due to ReST conversion by accident at that time and complete the big pcluster introduction - fix 1 lcluster-sized pclusters for the big pcluster feature" * tag 'erofs-for-5.13-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix 1 lcluster-sized pcluster for big pcluster erofs: update documentation about data compression erofs: fix broken illustration in documentation
-
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds authored
Pull libnvdimm fixes from Dan Williams: "A regression fix for a bootup crash condition introduced in this merge window and some other minor fixups: - Fix regression in ACPI NFIT table handling leading to crashes and driver load failures. - Move the nvdimm mailing list - Miscellaneous minor fixups" * tag 'libnvdimm-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: ACPI: NFIT: Fix support for variable 'SPA' structure size MAINTAINERS: Move nvdimm mailing list tools/testing/nvdimm: Make symbol '__nfit_test_ioremap' static libnvdimm: Remove duplicate struct declaration
-
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds authored
Pull dax fixes from Dan Williams: "A fix for a hang condition due to missed wakeups in the filesystem-dax core when exercised by virtiofs. This bug has been there from the beginning, but the condition has not triggered on other filesystems since they hold a lock over invalidation events" * tag 'dax-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: Wake up all waiters after invalidating dax entry dax: Add a wakeup mode parameter to put_unlocked_entry() dax: Add an enum for specifying dax wakup mode
-
git://anongit.freedesktop.org/drm/drmLinus Torvalds authored
Pull more drm fixes from Dave Airlie: "Looks like I wasn't the only one not fully switched on this week. The msm pull has a missing tag so I missed it, and i915 team were a bit late. In my defence I did have a day with the roof of my home office removed, so was sitting at my kids desk. msm: - dsi regression fix - dma-buf pinning fix - displayport fixes - llc fix i915: - Fix active callback alignment annotations and subsequent crashes - Retract link training strategy to slow and wide, again - Avoid division by zero on gen2 - Use correct width reads for C0DRB3/C1DRB3 registers - Fix double free in pdp allocation failure path - Fix HDMI 2.1 PCON downstream caps check" * tag 'drm-fixes-2021-05-15' of git://anongit.freedesktop.org/drm/drm: drm/i915: Use correct downstream caps for check Src-Ctl mode for PCON drm/i915/overlay: Fix active retire callback alignment drm/i915: Fix crash in auto_retire drm/i915/gt: Fix a double free in gen8_preallocate_top_level_pdp drm/i915: Read C0DRB3/C1DRB3 as 16 bits again drm/i915: Avoid div-by-zero on gen2 drm/i915/dp: Use slow and wide link training for everything drm/msm/dp: initialize audio_comp when audio starts drm/msm/dp: check sink_count before update is_connected status drm/msm: fix minor version to indicate MSM_PARAM_SUSPENDS support drm/msm/dsi: fix msm_dsi_phy_get_clk_provider return code drm/msm/dsi: dsi_phy_28nm_8960: fix uninitialized variable access drm/msm: fix LLC not being enabled for mmu500 targets drm/msm: Do not unpin/evict exported dma-buf's
-
Tetsuo Handa authored
syzbot is reporting OOB write at vga16fb_imageblit() [1], for resize_screen() from ioctl(VT_RESIZE) returns 0 without checking whether requested rows/columns fit the amount of memory reserved for the graphical screen if current mode is KD_GRAPHICS. ---------- #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <sys/ioctl.h> #include <linux/kd.h> #include <linux/vt.h> int main(int argc, char *argv[]) { const int fd = open("/dev/char/4:1", O_RDWR); struct vt_sizes vt = { 0x4100, 2 }; ioctl(fd, KDSETMODE, KD_GRAPHICS); ioctl(fd, VT_RESIZE, &vt); ioctl(fd, KDSETMODE, KD_TEXT); return 0; } ---------- Allow framebuffer drivers to return -EINVAL, by moving vc->vc_mode != KD_GRAPHICS check from resize_screen() to fbcon_resize(). Link: https://syzkaller.appspot.com/bug?extid=1f29e126cf461c4de3b3 [1] Reported-by: syzbot <syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: syzbot <syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Christophe Leroy authored
iomap_max_page_shift is expected to contain a page shift, so it can't be a 'bool', has to be an 'unsigned int' And fix the default values: P4D_SHIFT is when huge iomap is allowed. However, on some architectures (eg: powerpc book3s/64), P4D_SHIFT is not a constant so it can't be used to initialise a static variable. So, initialise iomap_max_page_shift with a maximum shift supported by the architecture, it is gated by P4D_SHIFT in vmap_try_huge_p4d() anyway. Link: https://lkml.kernel.org/r/ad2d366015794a9f21320dcbdd0a8eb98979e9df.1620898113.git.christophe.leroy@csgroup.eu Fixes: bbc180a5 ("mm: HUGE_VMAP arch support cleanup") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rasmus Villemoes authored
When I added CONFIG_MODPROBE_PATH, I neglected to update Documentation/. It's still true that this defaults to /sbin/modprobe, but now via a level of indirection. So document that the kernel might have been built with something other than /sbin/modprobe as the initial value. Link: https://lkml.kernel.org/r/20210420125324.1246826-1-linux@rasmusvillemoes.dk Fixes: 17652f42 ("modules: add CONFIG_MODPROBE_PATH") Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jouni Roivas authored
I believe there are some issues introduced by commit 31651c60 ("hfsplus: avoid deadlock on file truncation") HFS+ has extent records which always contains 8 extents. In case the first extent record in catalog file gets full, new ones are allocated from extents overflow file. In case shrinking truncate happens to middle of an extent record which locates in extents overflow file, the logic in hfsplus_file_truncate() was changed so that call to hfs_brec_remove() is not guarded any more. Right action would be just freeing the extents that exceed the new size inside extent record by calling hfsplus_free_extents(), and then check if the whole extent record should be removed. However since the guard (blk_cnt > start) is now after the call to hfs_brec_remove(), this has unfortunate effect that the last matching extent record is removed unconditionally. To reproduce this issue, create a file which has at least 10 extents, and then perform shrinking truncate into middle of the last extent record, so that the number of remaining extents is not under or divisible by 8. This causes the last extent record (8 extents) to be removed totally instead of truncating into middle of it. Thus this causes corruption, and lost data. Fix for this is simply checking if the new truncated end is below the start of this extent record, making it safe to remove the full extent record. However call to hfs_brec_remove() can't be moved to it's previous place since we're dropping ->tree_lock and it can cause a race condition and the cached info being invalidated possibly corrupting the node data. Another issue is related to this one. When entering into the block (blk_cnt > start) we are not holding the ->tree_lock. We break out from the loop not holding the lock, but hfs_find_exit() does unlock it. Not sure if it's possible for someone else to take the lock under our feet, but it can cause hard to debug errors and premature unlocking. Even if there's no real risk of it, the locking should still always be kept in balance. Thus taking the lock now just before the check. Link: https://lkml.kernel.org/r/20210429165139.3082828-1-jouni.roivas@tuxera.com Fixes: 31651c60 ("hfsplus: avoid deadlock on file truncation") Signed-off-by: Jouni Roivas <jouni.roivas@tuxera.com> Reviewed-by: Anton Altaparmakov <anton@tuxera.com> Cc: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Cc: Viacheslav Dubeyko <slava@dubeyko.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
A readahead request will not allocate more memory than can be represented by a size_t, even on systems that have HIGHMEM available. Change the length functions from returning an loff_t to a size_t. Link: https://lkml.kernel.org/r/20210510201201.1558972-1-willy@infradead.org Fixes: 32c0a6bc ("btrfs: add and use readahead_batch_length") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Collingbourne authored
These tests deliberately access these arrays out of bounds, which will cause the dynamic local bounds checks inserted by CONFIG_UBSAN_LOCAL_BOUNDS to fail and panic the kernel. To avoid this problem, access the arrays via volatile pointers, which will prevent the compiler from being able to determine the array bounds. These accesses use volatile pointers to char (char *volatile) rather than the more conventional pointers to volatile char (volatile char *) because we want to prevent the compiler from making inferences about the pointer itself (i.e. its array bounds), not the data that it refers to. Link: https://lkml.kernel.org/r/20210507025915.1464056-1-pcc@google.com Link: https://linux-review.googlesource.com/id/I90b1713fbfa1bf68ff895aef099ea77b98a7c3b9Signed-off-by: Peter Collingbourne <pcc@google.com> Tested-by: Alexander Potapenko <glider@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Cc: Peter Collingbourne <pcc@google.com> Cc: George Popescu <georgepope@android.com> Cc: Elena Petrova <lenaptr@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
32-bit architectures which expect 8-byte alignment for 8-byte integers and need 64-bit DMA addresses (arm, mips, ppc) had their struct page inadvertently expanded in 2019. When the dma_addr_t was added, it forced the alignment of the union to 8 bytes, which inserted a 4 byte gap between 'flags' and the union. Fix this by storing the dma_addr_t in one or two adjacent unsigned longs. This restores the alignment to that of an unsigned long. We always store the low bits in the first word to prevent the PageTail bit from being inadvertently set on a big endian platform. If that happened, get_user_pages_fast() racing against a page which was freed and reallocated to the page_pool could dereference a bogus compound_head(), which would be hard to trace back to this cause. Link: https://lkml.kernel.org/r/20210510153211.1504886-1-willy@infradead.org Fixes: c25fff71 ("mm: add dma_addr_t to struct page") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Matteo Croce <mcroce@linux.microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Hugh Dickins authored
This reverts commit 3e96b6a2. General Protection Fault in rmap_walk_ksm() under memory pressure: remove_rmap_item_from_tree() needs to take page lock, of course. Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2105092253500.1127@eggly.anvilsSigned-off-by: Hugh Dickins <hughd@google.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Axel Rasmussen authored
Consider the following sequence of events: 1. Userspace issues a UFFD ioctl, which ends up calling into shmem_mfill_atomic_pte(). We successfully account the blocks, we shmem_alloc_page(), but then the copy_from_user() fails. We return -ENOENT. We don't release the page we allocated. 2. Our caller detects this error code, tries the copy_from_user() after dropping the mmap_lock, and retries, calling back into shmem_mfill_atomic_pte(). 3. Meanwhile, let's say another process filled up the tmpfs being used. 4. So shmem_mfill_atomic_pte() fails to account blocks this time, and immediately returns - without releasing the page. This triggers a BUG_ON in our caller, which asserts that the page should always be consumed, unless -ENOENT is returned. To fix this, detect if we have such a "dangling" page when accounting fails, and if so, release it before returning. Link: https://lkml.kernel.org/r/20210428230858.348400-1-axelrasmussen@google.com Fixes: cb658a45 ("userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY") Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Reported-by: Hugh Dickins <hughd@google.com> Acked-by: Hugh Dickins <hughd@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Phillip Lougher authored
Sysbot has reported a "divide error" which has been identified as being caused by a corrupted file_size value within the file inode. This value has been corrupted to a much larger value than expected. Calculate_skip() is passed i_size_read(inode) >> msblk->block_log. Due to the file_size value corruption this overflows the int argument/variable in that function, leading to the divide error. This patch changes the function to use u64. This will accommodate any unexpectedly large values due to corruption. The value returned from calculate_skip() is clamped to be never more than SQUASHFS_CACHED_BLKS - 1, or 7. So file_size corruption does not lead to an unexpectedly large return result here. Link: https://lkml.kernel.org/r/20210507152618.9447-1-phillip@squashfs.org.ukSigned-off-by: Phillip Lougher <phillip@squashfs.org.uk> Reported-by: <syzbot+e8f781243ce16ac2f962@syzkaller.appspotmail.com> Reported-by: <syzbot+7b98870d4fec9447b951@syzkaller.appspotmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alistair Popple authored
Splitting an earlier version of a patch that allowed calling __request_region() while holding the resource lock into a series of patches required changing the return code for the newly introduced __request_region_locked(). Unfortunately this change was not carried through to a subsequent commit 56fd9491 ("kernel/resource: fix locking in request_free_mem_region") in the series. This resulted in a use-after-free due to freeing the struct resource without properly releasing it. Fix this by correcting the return code check so that the struct is not freed if the request to add it was successful. Link: https://lkml.kernel.org/r/20210512073528.22334-1-apopple@nvidia.com Fixes: 56fd9491 ("kernel/resource: fix locking in request_free_mem_region") Signed-off-by: Alistair Popple <apopple@nvidia.com> Reported-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Muchun Song <smuchun@gmail.com> Cc: Oliver Sang <oliver.sang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
Paul E. McKenney reported [1] that commit 1f0723a4 ("mm, slub: enable slub_debug static key when creating cache with explicit debug flags") results in the lockdep complaint: ====================================================== WARNING: possible circular locking dependency detected 5.12.0+ #15 Not tainted ------------------------------------------------------ rcu_torture_sta/109 is trying to acquire lock: ffffffff96063cd0 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0x9/0x20 but task is already holding lock: ffffffff96173c28 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_create_usercopy+0x2d/0x250 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (slab_mutex){+.+.}-{3:3}: lock_acquire+0xb9/0x3a0 __mutex_lock+0x8d/0x920 slub_cpu_dead+0x15/0xf0 cpuhp_invoke_callback+0x17a/0x7c0 cpuhp_invoke_callback_range+0x3b/0x80 _cpu_down+0xdf/0x2a0 cpu_down+0x2c/0x50 device_offline+0x82/0xb0 remove_cpu+0x1a/0x30 torture_offline+0x80/0x140 torture_onoff+0x147/0x260 kthread+0x10a/0x140 ret_from_fork+0x22/0x30 -> #0 (cpu_hotplug_lock){++++}-{0:0}: check_prev_add+0x8f/0xbf0 __lock_acquire+0x13f0/0x1d80 lock_acquire+0xb9/0x3a0 cpus_read_lock+0x21/0xa0 static_key_enable+0x9/0x20 __kmem_cache_create+0x38d/0x430 kmem_cache_create_usercopy+0x146/0x250 kmem_cache_create+0xd/0x10 rcu_torture_stats+0x79/0x280 kthread+0x10a/0x140 ret_from_fork+0x22/0x30 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(slab_mutex); lock(cpu_hotplug_lock); lock(slab_mutex); lock(cpu_hotplug_lock); *** DEADLOCK *** 1 lock held by rcu_torture_sta/109: #0: ffffffff96173c28 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_create_usercopy+0x2d/0x250 stack backtrace: CPU: 3 PID: 109 Comm: rcu_torture_sta Not tainted 5.12.0+ #15 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: dump_stack+0x6d/0x89 check_noncircular+0xfe/0x110 ? lock_is_held_type+0x98/0x110 check_prev_add+0x8f/0xbf0 __lock_acquire+0x13f0/0x1d80 lock_acquire+0xb9/0x3a0 ? static_key_enable+0x9/0x20 ? mark_held_locks+0x49/0x70 cpus_read_lock+0x21/0xa0 ? static_key_enable+0x9/0x20 static_key_enable+0x9/0x20 __kmem_cache_create+0x38d/0x430 kmem_cache_create_usercopy+0x146/0x250 ? rcu_torture_stats_print+0xd0/0xd0 kmem_cache_create+0xd/0x10 rcu_torture_stats+0x79/0x280 ? rcu_torture_stats_print+0xd0/0xd0 kthread+0x10a/0x140 ? kthread_park+0x80/0x80 ret_from_fork+0x22/0x30 This is because there's one order of locking from the hotplug callbacks: lock(cpu_hotplug_lock); // from hotplug machinery itself lock(slab_mutex); // in e.g. slab_mem_going_offline_callback() And commit 1f0723a4 made the reverse sequence possible: lock(slab_mutex); // in kmem_cache_create_usercopy() lock(cpu_hotplug_lock); // kmem_cache_open() -> static_key_enable() The simplest fix is to move static_key_enable() to a place before slab_mutex is taken. That means kmem_cache_create_usercopy() in mm/slab_common.c which is not ideal for SLUB-specific code, but the #ifdef CONFIG_SLUB_DEBUG makes it at least self-contained and obvious. [1] https://lore.kernel.org/lkml/20210502171827.GA3670492@paulmck-ThinkPad-P17-Gen-1/ Link: https://lkml.kernel.org/r/20210504120019.26791-1-vbabka@suse.cz Fixes: 1f0723a4 ("mm, slub: enable slub_debug static key when creating cache with explicit debug flags") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Xu authored
When rework early cow of pinned hugetlb pages, we moved huge_ptep_get() upper but overlooked a side effect that the huge_ptep_get() will fetch the pte after wr-protection. After moving it upwards, we need explicit wr-protect of child pte or we will keep the write bit set in the child process, which could cause data corrution where the child can write to the original page directly. This issue can also be exposed by "memfd_test hugetlbfs" kselftest. Link: https://lkml.kernel.org/r/20210503234356.9097-3-peterx@redhat.com Fixes: 4eae4efa ("hugetlb: do early cow when page pinned on src mm") Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-