- 05 Dec, 2022 3 commits
-
-
Jason Gunthorpe authored
As with the previous patch EEH is always enabled if SPAPR_TCE_IOMMU, so move this last bit of code into the main module. Now that this function only processes VFIO_EEH_PE_OP remove a level of indenting as well, it is only called by a case statement that already checked VFIO_EEH_PE_OP. This eliminates an unnecessary module and SPAPR code in a global header. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v5-fc5346cacfd4+4c482-vfio_modules_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
The PPC64 kconfig is a bit of a rats nest, but it turns out that if CONFIG_SPAPR_TCE_IOMMU is on then EEH must be too: config SPAPR_TCE_IOMMU bool "sPAPR TCE IOMMU Support" depends on PPC_POWERNV || PPC_PSERIES select IOMMU_API help Enables bits of IOMMU API required by VFIO. The iommu_ops is not implemented as it is not necessary for VFIO. config PPC_POWERNV select FORCE_PCI config PPC_PSERIES select FORCE_PCI config EEH bool depends on (PPC_POWERNV || PPC_PSERIES) && PCI default y So, just open code the call to eeh_enabled() into tce_iommu_ioctl(). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v5-fc5346cacfd4+4c482-vfio_modules_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Jason Gunthorpe authored
The vfio_spapr_pci_eeh_open/release() functions are one line wrappers around an arch function. Just call them directly. This eliminates some weird exported symbols that don't need to exist. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/1-v5-fc5346cacfd4+4c482-vfio_modules_jgg@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 02 Dec, 2022 1 commit
-
-
Joao Martins authored
Commit f38044e5 ("vfio/iova_bitmap: Fix PAGE_SIZE unaligned bitmaps") had fixed the unaligned bitmaps by capping the remaining iterable set at the start of the bitmap. Although, that mistakenly worked around iova_bitmap_set() incorrectly setting bits across page boundary. Fix this by reworking the loop inside iova_bitmap_set() to iterate over a range of bits to set (cur_bit .. last_bit) which may span different pinned pages, thus updating @page_idx and @offset as it sets the bits. The previous cap to the first page is now adjusted to be always accounted rather than when there's only a non-zero pgoff. While at it, make @page_idx , @offset and @nbits to be unsigned int given that it won't be more than 512 and 4096 respectively (even a bigger PAGE_SIZE or a smaller struct page size won't make this bigger than the above 32-bit max). Also, delete the stale kdoc on Return type. Cc: Avihai Horon <avihaih@nvidia.com> Fixes: f38044e5 ("vfio/iova_bitmap: Fix PAGE_SIZE unaligned bitmaps") Co-developed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Avihai Horon <avihaih@nvidia.com> Link: https://lore.kernel.org/r/20221129131235.38880-1-joao.m.martins@oracle.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 14 Nov, 2022 2 commits
-
-
Yishai Hadas authored
Fix a typo in mlx5vf_cmd_load_vhca_state() to use the 'load' memory layout. As in/out sizes are equal for save and load commands there wasn't any functional issue. Fixes: f1d98f34 ("vfio/mlx5: Expose migration commands over mlx5 device") Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20221106174630.25909-3-yishaih@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Yishai Hadas authored
Add an option to get migration data size by introducing a new migration feature named VFIO_DEVICE_FEATURE_MIG_DATA_SIZE. Upon VFIO_DEVICE_FEATURE_GET the estimated data length that will be required to complete STOP_COPY is returned. This option may better enable user space to consider before moving to STOP_COPY whether it can meet the downtime SLA based on the returned data. The patch also includes the implementation for mlx5 and hisi for this new option to make it feature complete for the existing drivers in this area. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Longfang Liu <liulongfang@huawei.com> Link: https://lore.kernel.org/r/20221106174630.25909-2-yishaih@nvidia.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 10 Nov, 2022 7 commits
-
-
Eric Farman authored
With the "mess" sorted out, we should be able to inline the vfio_free_device call introduced by commit cb9ff3f3 ("vfio: Add helpers for unifying vfio_device life cycle") and remove them from driver release callbacks. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> # vfio-ap part Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-8-farman@linux.ibm.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Eric Farman authored
Now that we have a reasonable separation of structs that follow the subchannel and mdev lifecycles, there's no reason we can't call the official vfio_alloc_device routine for our private data, and behave like everyone else. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-7-farman@linux.ibm.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Eric Farman authored
There's enough separation between the parent and private structs now, that it is fine to remove the release completion hack. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-6-farman@linux.ibm.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Eric Farman authored
Now that the mdev parent data is split out into its own struct, it is safe to move the remaining private data to follow the mdev probe/remove lifecycle. The mdev parent data will remain where it is, and follow the subchannel and the css driver interfaces. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-5-farman@linux.ibm.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Eric Farman authored
There's already a device initialization callback that is used to initialize the release completion workaround that was introduced by commit ebb72b76 ("vfio/ccw: Use the new device life cycle helpers"). Move the other elements of the vfio_ccw_private struct that require distinct initialization over to that routine. With that done, the vfio_ccw_alloc_private routine only does a kzalloc, so fold it inline. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-4-farman@linux.ibm.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Eric Farman authored
These places all rely on the ability to jump from a private struct back to the subchannel struct. Rather than keeping a copy in our back pocket, let's use the relationship provided by the vfio_device embedded within the private. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-3-farman@linux.ibm.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Eric Farman authored
Move the stuff associated with the mdev parent (and thus the subchannel struct) into its own struct, and leave the rest in the existing private structure. The subchannel will point to the parent, and the parent will point to the private, for the areas where one or both are needed. Further separation of these structs will follow. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221104142007.1314999-2-farman@linux.ibm.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 09 Nov, 2022 5 commits
-
-
Joao Martins authored
iova_bitmap_set() doesn't consider the end of the page boundary when the first bitmap page offset isn't zero, and wrongly changes the consecutive page right after. Consequently this leads to missing dirty pages from reported by the device as seen from the VMM. The current logic iterates over a given number of base pages and clamps it to the remaining indexes to iterate in the last page. Instead of having to consider extra pages to pin (e.g. first and extra pages), just handle the first page as its own range and let the rest of the bitmap be handled as if it was base page aligned. This is done by changing iova_bitmap_mapped_remaining() to return PAGE_SIZE - pgoff (on the first bitmap page), and leads to pgoff being set to 0 on following iterations. Fixes: 58ccf019 ("vfio: Add an IOVA bitmap support") Reported-by: Avihai Horon <avihaih@nvidia.com> Tested-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20221025193114.58695-3-joao.m.martins@oracle.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Joao Martins authored
kzalloc/kzfree are used so include `slab.h`. While it happens to work without it, due to commit 8b9f3ac5 ("fs: introduce alloc_inode_sb() to allocate filesystems specific inode") which indirectly includes via: . ./include/linux/mm.h .. ./include/linux/huge_mm.h ... ./include/linux/fs.h .... ./include/linux/slab.h Make it explicit should any of its indirect dependencies be dropped/changed for entirely different reasons as it was the cause prior to commit above recently (i.e. <= v5.18). Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20221025193114.58695-2-joao.m.martins@oracle.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Rafael Mendonca authored
The ACPI _RST method has no return value, there's no need to pass a return buffer to acpi_evaluate_object(). Fixes: d30daa33 ("vfio: platform: call _RST method when using ACPI") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20221018152825.891032-1-rafaelmendsr@gmail.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Shang XiaoJing authored
Since pci provides the helper macro module_pci_driver(), we may replace the module_init/exit with it. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220922123507.11222-1-shangxiaojing@huawei.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
git://githubhttps://github.comPalmer Dabbelt authored
Github deprecated the git:// links about a year ago, so let's move to the https:// URLs instead. Reported-by: Conor Dooley <conor.dooley@microchip.com> Link: https://github.blog/2021-09-01-improving-git-protocol-security-github/Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20221013214636.30721-1-palmer@rivosinc.comSigned-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 06 Nov, 2022 17 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds authored
Pull cxl fixes from Dan Williams: "Several fixes for CXL region creation crashes, leaks and failures. This is mainly fallout from the original implementation of dynamic CXL region creation (instantiate new physical memory pools) that arrived in v6.0-rc1. Given the theme of "failures in the presence of pass-through decoders" this also includes new regression test infrastructure for that case. Summary: - Fix region creation crash with pass-through decoders - Fix region creation crash when no decoder allocation fails - Fix region creation crash when scanning regions to enforce the increasing physical address order constraint that CXL mandates - Fix a memory leak for cxl_pmem_region objects, track 1:N instead of 1:1 memory-device-to-region associations. - Fix a memory leak for cxl_region objects when regions with active targets are deleted - Fix assignment of NUMA nodes to CXL regions by CFMWS (CXL Window) emulated proximity domains. - Fix region creation failure for switch attached devices downstream of a single-port host-bridge - Fix false positive memory leak of cxl_region objects by recycling recently used region ids rather than freeing them - Add regression test infrastructure for a pass-through decoder configuration - Fix some mailbox payload handling corner cases" * tag 'cxl-fixes-for-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Recycle region ids cxl/region: Fix 'distance' calculation with passthrough ports tools/testing/cxl: Add a single-port host-bridge regression config tools/testing/cxl: Fix some error exits cxl/pmem: Fix cxl_pmem_region and cxl_memdev leak cxl/region: Fix cxl_region leak, cleanup targets at region delete cxl/region: Fix region HPA ordering validation cxl/pmem: Use size_add() against integer overflow cxl/region: Fix decoder allocation crash ACPI: NUMA: Add CXL CFMWS 'nodes' to the possible nodes set cxl/pmem: Fix failure to account for 8 byte header for writes to the device LSA. cxl/region: Fix null pointer dereference due to pass through decoder commit cxl/mbox: Add a check on input payload size
-
Linus Torvalds authored
Merge tag 'hwmon-for-v6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: "Fix two regressions: - Commit 54cc3dbf ("hwmon: (pmbus) Add regulator supply into macro") resulted in regulator undercount when disabling regulators. Revert it. - The thermal subsystem rework caused the scmi driver to no longer register with the thermal subsystem because index values no longer match. To fix the problem, the scmi driver now directly registers with the thermal subsystem, no longer through the hwmon core" * tag 'hwmon-for-v6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: Revert "hwmon: (pmbus) Add regulator supply into macro" hwmon: (scmi) Register explicitly with Thermal Framework
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fixes from Borislav Petkov: - Add Cooper Lake's stepping to the PEBS guest/host events isolation fixed microcode revisions checking quirk - Update Icelake and Sapphire Rapids events constraints - Use the standard energy unit for Sapphire Rapids in RAPL - Fix the hw_breakpoint test to fail more graciously on !SMP configs * tag 'perf_urgent_for_v6.1_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Add Cooper Lake stepping to isolation_ucodes[] perf/x86/intel: Fix pebs event constraints for SPR perf/x86/intel: Fix pebs event constraints for ICL perf/x86/rapl: Use standard Energy Unit for SPR Dram RAPL domain perf/hw_breakpoint: test: Skip the test if dependencies unmet
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Borislav Petkov: - Add new Intel CPU models - Enforce that TDX guests are successfully loaded only on TDX hardware where virtualization exception (#VE) delivery on kernel memory is disabled because handling those in all possible cases is "essentially impossible" - Add the proper include to the syscall wrappers so that BTF can see the real pt_regs definition and not only the forward declaration * tag 'x86_urgent_for_v6.1_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add several Intel server CPU model numbers x86/tdx: Panic on bad configs that #VE on "private" memory access x86/tdx: Prepare for using "INFO" call for a second purpose x86/syscall: Include asm/ptrace.h in syscall_wrapper header
-
Linus Torvalds authored
Merge tag 'kbuild-fixes-v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Use POSIX-compatible grep options - Document git-related tips for reproducible builds - Fix a typo in the modpost rule - Suppress SIGPIPE error message from gcc-ar and llvm-ar - Fix segmentation fault in the menuconfig search * tag 'kbuild-fixes-v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: fix segmentation fault in menuconfig search kbuild: fix SIGPIPE error message for AR=gcc-ar and AR=llvm-ar kbuild: fix typo in modpost Documentation: kbuild: Add description of git for reproducible builds kbuild: use POSIX-compatible grep option
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull kvm fixes from Paolo Bonzini: "ARM: - Fix the pKVM stage-1 walker erronously using the stage-2 accessor - Correctly convert vcpu->kvm to a hyp pointer when generating an exception in a nVHE+MTE configuration - Check that KVM_CAP_DIRTY_LOG_* are valid before enabling them - Fix SMPRI_EL1/TPIDR2_EL0 trapping on VHE - Document the boot requirements for FGT when entering the kernel at EL1 x86: - Use SRCU to protect zap in __kvm_set_or_clear_apicv_inhibit() - Make argument order consistent for kvcalloc() - Userspace API fixes for DEBUGCTL and LBRs" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix a typo about the usage of kvcalloc() KVM: x86: Use SRCU to protect zap in __kvm_set_or_clear_apicv_inhibit() KVM: VMX: Ignore guest CPUID for host userspace writes to DEBUGCTL KVM: VMX: Fold vmx_supported_debugctl() into vcpu_supported_debugctl() KVM: VMX: Advertise PMU LBRs if and only if perf supports LBRs arm64: booting: Document our requirements for fine grained traps with SME KVM: arm64: Fix SMPRI_EL1/TPIDR2_EL0 trapping on VHE KVM: Check KVM_CAP_DIRTY_LOG_{RING, RING_ACQ_REL} prior to enabling them KVM: arm64: Fix bad dereference on MTE-enabled systems KVM: arm64: Use correct accessor to parse stage-1 PTEs
-
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds authored
Pull xen fixes from Juergen Gross: "One fix for silencing a smatch warning, and a small cleanup patch" * tag 'for-linus-6.1-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: simplify sysenter and syscall setup x86/xen: silence smatch warning in pmu_msr_chk_emulated()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds authored
Pull ext4 fixes from Ted Ts'o: "Fix a number of bugs, including some regressions, the most serious of which was one which would cause online resizes to fail with file systems with metadata checksums enabled. Also fix a warning caused by the newly added fortify string checker, plus some bugs that were found using fuzzed file systems" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix fortify warning in fs/ext4/fast_commit.c:1551 ext4: fix wrong return err in ext4_load_and_init_journal() ext4: fix warning in 'ext4_da_release_space' ext4: fix BUG_ON() when directory entry has invalid rec_len ext4: update the backup superblock's at the end of the online resize
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull cifs fixes from Steve French: "One symlink handling fix and two fixes foir multichannel issues with iterating channels, including for oplock breaks when leases are disabled" * tag '6.1-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix use-after-free on the link name cifs: avoid unnecessary iteration of tcp sessions cifs: always iterate smb sessions using primary channel
-
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-traceLinus Torvalds authored
Pull `lTracing fixes for 6.1-rc3: - Fixed NULL pointer dereference in the ring buffer wait-waiters code for machines that have less CPUs than what nr_cpu_ids returns. The buffer array is of size nr_cpu_ids, but only the online CPUs get initialized. - Fixed use after free call in ftrace_shutdown. - Fix accounting of if a kprobe is enabled - Fix NULL pointer dereference on error path of fprobe rethook_alloc(). - Fix unregistering of fprobe_kprobe_handler - Fix memory leak in kprobe test module * tag 'trace-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: kprobe: Fix memory leak in test_gen_kprobe/kretprobe_cmd() tracing/fprobe: Fix to check whether fprobe is registered correctly fprobe: Check rethook_alloc() return in rethook initialization kprobe: reverse kp->flags when arm_kprobe failed ftrace: Fix use-after-free for dynamic ftrace_ops ring-buffer: Check for NULL cpu_buffer in ring_buffer_wake_waiters()
-
Paolo Bonzini authored
Merge tag 'kvmarm-fixes-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD * Fix the pKVM stage-1 walker erronously using the stage-2 accessor * Correctly convert vcpu->kvm to a hyp pointer when generating an exception in a nVHE+MTE configuration * Check that KVM_CAP_DIRTY_LOG_* are valid before enabling them * Fix SMPRI_EL1/TPIDR2_EL0 trapping on VHE * Document the boot requirements for FGT when entering the kernel at EL1
-
Paolo Bonzini authored
x86: * Use SRCU to protect zap in __kvm_set_or_clear_apicv_inhibit() * Make argument order consistent for kvcalloc() * Userspace API fixes for DEBUGCTL and LBRs
-
Theodore Ts'o authored
With the new fortify string system, rework the memcpy to avoid this warning: memcpy: detected field-spanning write (size 60) of single field "&raw_inode->i_generation" at fs/ext4/fast_commit.c:1551 (size 4) Cc: stable@kernel.org Fixes: 54d9469b ("fortify: Add run-time WARN for cross-field memcpy()") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Jason Yan authored
The return value is wrong in ext4_load_and_init_journal(). The local variable 'err' need to be initialized before goto out. The original code in __ext4_fill_super() is fine because it has two return values 'ret' and 'err' and 'ret' is initialized as -EINVAL. After we factor out ext4_load_and_init_journal(), this code is broken. So fix it by directly returning -EINVAL in the error handler path. Cc: stable@kernel.org Fixes: 9c1dd22d ("ext4: factor out ext4_load_and_init_journal()") Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221025040206.3134773-1-yanaijie@huawei.comSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
Ye Bin authored
Syzkaller report issue as follows: EXT4-fs (loop0): Free/Dirty block details EXT4-fs (loop0): free_blocks=0 EXT4-fs (loop0): dirty_blocks=0 EXT4-fs (loop0): Block reservation details EXT4-fs (loop0): i_reserved_data_blocks=0 EXT4-fs warning (device loop0): ext4_da_release_space:1527: ext4_da_release_space: ino 18, to_free 1 with only 0 reserved data blocks ------------[ cut here ]------------ WARNING: CPU: 0 PID: 92 at fs/ext4/inode.c:1528 ext4_da_release_space+0x25e/0x370 fs/ext4/inode.c:1524 Modules linked in: CPU: 0 PID: 92 Comm: kworker/u4:4 Not tainted 6.0.0-syzkaller-09423-g493ffd66 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022 Workqueue: writeback wb_workfn (flush-7:0) RIP: 0010:ext4_da_release_space+0x25e/0x370 fs/ext4/inode.c:1528 RSP: 0018:ffffc900015f6c90 EFLAGS: 00010296 RAX: 42215896cd52ea00 RBX: 0000000000000000 RCX: 42215896cd52ea00 RDX: 0000000000000000 RSI: 0000000080000001 RDI: 0000000000000000 RBP: 1ffff1100e907d96 R08: ffffffff816aa79d R09: fffff520002bece5 R10: fffff520002bece5 R11: 1ffff920002bece4 R12: ffff888021fd2000 R13: ffff88807483ecb0 R14: 0000000000000001 R15: ffff88807483e740 FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005555569ba628 CR3: 000000000c88e000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ext4_es_remove_extent+0x1ab/0x260 fs/ext4/extents_status.c:1461 mpage_release_unused_pages+0x24d/0xef0 fs/ext4/inode.c:1589 ext4_writepages+0x12eb/0x3be0 fs/ext4/inode.c:2852 do_writepages+0x3c3/0x680 mm/page-writeback.c:2469 __writeback_single_inode+0xd1/0x670 fs/fs-writeback.c:1587 writeback_sb_inodes+0xb3b/0x18f0 fs/fs-writeback.c:1870 wb_writeback+0x41f/0x7b0 fs/fs-writeback.c:2044 wb_do_writeback fs/fs-writeback.c:2187 [inline] wb_workfn+0x3cb/0xef0 fs/fs-writeback.c:2227 process_one_work+0x877/0xdb0 kernel/workqueue.c:2289 worker_thread+0xb14/0x1330 kernel/workqueue.c:2436 kthread+0x266/0x300 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 </TASK> Above issue may happens as follows: ext4_da_write_begin ext4_create_inline_data ext4_clear_inode_flag(inode, EXT4_INODE_EXTENTS); ext4_set_inode_flag(inode, EXT4_INODE_INLINE_DATA); __ext4_ioctl ext4_ext_migrate -> will lead to eh->eh_entries not zero, and set extent flag ext4_da_write_begin ext4_da_convert_inline_data_to_extent ext4_da_write_inline_data_begin ext4_da_map_blocks ext4_insert_delayed_block if (!ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)) if (!ext4_es_scan_clu(inode, &ext4_es_is_mapped, lblk)) ext4_clu_mapped(inode, EXT4_B2C(sbi, lblk)); -> will return 1 allocated = true; ext4_es_insert_delayed_block(inode, lblk, allocated); ext4_writepages mpage_map_and_submit_extent(handle, &mpd, &give_up_on_write); -> return -ENOSPC mpage_release_unused_pages(&mpd, give_up_on_write); -> give_up_on_write == 1 ext4_es_remove_extent ext4_da_release_space(inode, reserved); if (unlikely(to_free > ei->i_reserved_data_blocks)) -> to_free == 1 but ei->i_reserved_data_blocks == 0 -> then trigger warning as above To solve above issue, forbid inode do migrate which has inline data. Cc: stable@kernel.org Reported-by: syzbot+c740bb18df70ad00952e@syzkaller.appspotmail.com Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221018022701.683489-1-yebin10@huawei.comSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
Luís Henriques authored
The rec_len field in the directory entry has to be a multiple of 4. A corrupted filesystem image can be used to hit a BUG() in ext4_rec_len_to_disk(), called from make_indexed_dir(). ------------[ cut here ]------------ kernel BUG at fs/ext4/ext4.h:2413! ... RIP: 0010:make_indexed_dir+0x53f/0x5f0 ... Call Trace: <TASK> ? add_dirent_to_buf+0x1b2/0x200 ext4_add_entry+0x36e/0x480 ext4_add_nondir+0x2b/0xc0 ext4_create+0x163/0x200 path_openat+0x635/0xe90 do_filp_open+0xb4/0x160 ? __create_object.isra.0+0x1de/0x3b0 ? _raw_spin_unlock+0x12/0x30 do_sys_openat2+0x91/0x150 __x64_sys_open+0x6c/0xa0 do_syscall_64+0x3c/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 The fix simply adds a call to ext4_check_dir_entry() to validate the directory entry, returning -EFSCORRUPTED if the entry is invalid. CC: stable@kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=216540Signed-off-by: Luís Henriques <lhenriques@suse.de> Link: https://lore.kernel.org/r/20221012131330.32456-1-lhenriques@suse.deSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
- 05 Nov, 2022 5 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull ACPI fix from Rafael Wysocki: "Add StorageD3Enable quirk for Dell Inspiron 16 5625 (Mario Limonciello)" * tag 'acpi-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: x86: Add another system to quirk list for forcing StorageD3Enable
-
Rafael J. Wysocki authored
* acpi-x86: ACPI: x86: Add another system to quirk list for forcing StorageD3Enable
-
git://git.kernel.dk/linuxLinus Torvalds authored
Pull block fixes from Jens Axboe: - Fixes for the ublk driver (Ming) - Fixes for error handling memory leaks (Chen Jun, Chen Zhongjin) - Explicitly clear the last request in a chain when the plug is flushed, as it may have already been issued (Al) * tag 'block-6.1-2022-11-05' of git://git.kernel.dk/linux: block: blk_add_rq_to_plug(): clear stale 'last' after flush blk-mq: Fix kmemleak in blk_mq_init_allocated_queue block: Fix possible memory leak for rq_wb on add_disk failure ublk_drv: add ublk_queue_cmd() for cleanup ublk_drv: avoid to touch io_uring cmd in blk_mq io path ublk_drv: comment on ublk_driver entry of Kconfig ublk_drv: return flag of UBLK_F_URING_CMD_COMP_IN_TASK in case of module
-
ChenXiaoSong authored
xfstests generic/011 reported use-after-free bug as follows: BUG: KASAN: use-after-free in __d_alloc+0x269/0x859 Read of size 15 at addr ffff8880078933a0 by task dirstress/952 CPU: 1 PID: 952 Comm: dirstress Not tainted 6.1.0-rc3+ #77 Call Trace: __dump_stack+0x23/0x29 dump_stack_lvl+0x51/0x73 print_address_description+0x67/0x27f print_report+0x3e/0x5c kasan_report+0x7b/0xa8 kasan_check_range+0x1b2/0x1c1 memcpy+0x22/0x5d __d_alloc+0x269/0x859 d_alloc+0x45/0x20c d_alloc_parallel+0xb2/0x8b2 lookup_open+0x3b8/0x9f9 open_last_lookups+0x63d/0xc26 path_openat+0x11a/0x261 do_filp_open+0xcc/0x168 do_sys_openat2+0x13b/0x3f7 do_sys_open+0x10f/0x146 __se_sys_creat+0x27/0x2e __x64_sys_creat+0x55/0x6a do_syscall_64+0x40/0x96 entry_SYSCALL_64_after_hwframe+0x63/0xcd Allocated by task 952: kasan_save_stack+0x1f/0x42 kasan_set_track+0x21/0x2a kasan_save_alloc_info+0x17/0x1d __kasan_kmalloc+0x7e/0x87 __kmalloc_node_track_caller+0x59/0x155 kstrndup+0x60/0xe6 parse_mf_symlink+0x215/0x30b check_mf_symlink+0x260/0x36a cifs_get_inode_info+0x14e1/0x1690 cifs_revalidate_dentry_attr+0x70d/0x964 cifs_revalidate_dentry+0x36/0x62 cifs_d_revalidate+0x162/0x446 lookup_open+0x36f/0x9f9 open_last_lookups+0x63d/0xc26 path_openat+0x11a/0x261 do_filp_open+0xcc/0x168 do_sys_openat2+0x13b/0x3f7 do_sys_open+0x10f/0x146 __se_sys_creat+0x27/0x2e __x64_sys_creat+0x55/0x6a do_syscall_64+0x40/0x96 entry_SYSCALL_64_after_hwframe+0x63/0xcd Freed by task 950: kasan_save_stack+0x1f/0x42 kasan_set_track+0x21/0x2a kasan_save_free_info+0x1c/0x34 ____kasan_slab_free+0x1c1/0x1d5 __kasan_slab_free+0xe/0x13 __kmem_cache_free+0x29a/0x387 kfree+0xd3/0x10e cifs_fattr_to_inode+0xb6a/0xc8c cifs_get_inode_info+0x3cb/0x1690 cifs_revalidate_dentry_attr+0x70d/0x964 cifs_revalidate_dentry+0x36/0x62 cifs_d_revalidate+0x162/0x446 lookup_open+0x36f/0x9f9 open_last_lookups+0x63d/0xc26 path_openat+0x11a/0x261 do_filp_open+0xcc/0x168 do_sys_openat2+0x13b/0x3f7 do_sys_open+0x10f/0x146 __se_sys_creat+0x27/0x2e __x64_sys_creat+0x55/0x6a do_syscall_64+0x40/0x96 entry_SYSCALL_64_after_hwframe+0x63/0xcd When opened a symlink, link name is from 'inode->i_link', but it may be reset to a new value when revalidate the dentry. If some processes get the link name on the race scenario, then UAF will happen on link name. Fix this by implementing 'get_link' interface to duplicate the link name. Fixes: 76894f3e ("cifs: improve symlink handling for smb2+") Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
-
Shyam Prasad N authored
In a few places, we do unnecessary iterations of tcp sessions, even when the server struct is provided. The change avoids it and uses the server struct provided. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
-