Commits · 1c7ce115e521060819f6e9b6b6eb26ae0aee6596 · Kirill Smelkov / linux

10 Aug, 2023 11 commits

xfs: reap large AG metadata extents when possible · 1c7ce115

Darrick J. Wong authored Aug 10, 2023

When we're freeing extents that have been set in a bitmap, break the
bitmap extent into multiple sub-extents organized by fate, and reap the
extents.  This enables us to dispose of old resources more efficiently
than doing them block by block.

While we're at it, rename the reaping functions to make it clear that
they're reaping per-AG extents.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

1c7ce115

xfs: allow scanning ranges of the buffer cache for live buffers · 9ed851f6

Darrick J. Wong authored Aug 10, 2023

After an online repair, we need to invalidate buffers representing the
blocks from the old metadata that we're replacing. It's possible that
parts of a tree that were previously cached in memory are no longer
accessible due to media failure or other corruption on interior nodes,
so repair figures out the old blocks from the reverse mapping data and
scans the buffer cache directly.

In other words, online fsck needs to find all the live (i.e. non-stale)
buffers for a range of fsblocks so that it can invalidate them.

Unfortunately, the current buffer cache code triggers asserts if the
rhashtable lookup finds a non-stale buffer of a different length than
the key we searched for. For regular operation this is desirable, but
for this repair procedure, we don't care since we're going to forcibly
stale the buffer anyway. Add an internal lookup flag to avoid the
assert. Skip buffers that are already XBF_STALE.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

9ed851f6

xfs: rearrange xrep_reap_block to make future code flow easier · 77a1396f

Darrick J. Wong authored Aug 10, 2023

Rearrange the logic inside xrep_reap_block to make it more obvious that
crosslinked metadata blocks are handled differently.  Add a couple of
tracepoints so that we can tell what's going on at the end of a btree
rebuild operation.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

77a1396f

xfs: use deferred frees to reap old btree blocks · 5fee784e

Darrick J. Wong authored Aug 10, 2023

Use deferred frees (EFIs) to reap the blocks of a btree that we just
replaced. This helps us to shrink the window in which those old blocks
could be lost due to a system crash, though we try to flush the EFIs
every few hundred blocks so that we don't also overflow the transaction
reservations during and after we commit the new btree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

5fee784e

xfs: only allow reaping of per-AG blocks in xrep_reap_extents · a55e0730

Darrick J. Wong authored Aug 10, 2023

Now that we've refactored btree cursors to require the caller to pass in
a perag structure, there are numerous problems in xrep_reap_extents if
it's being called to reap extents for an inode metadata repair.  We
don't have any repair functions that can do that, so drop the support
for now.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

a55e0730

xfs: only invalidate blocks if we're going to free them · 8e54e06b

Darrick J. Wong authored Aug 10, 2023

When we're discarding old btree blocks after a repair, only invalidate
the buffers for the ones that we're freeing -- if the metadata was
crosslinked with another data structure, we don't want to touch it.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

8e54e06b

xfs: move the post-repair block reaping code to a separate file · e06ef14b

Darrick J. Wong authored Aug 10, 2023

Reaping blocks after a repair is a complicated affair involving a lot of
rmap btree lookups and figuring out if we're going to unmap or free old
metadata blocks that might be crosslinked. Eventually, we will need to
be able to reap per-AG metadata blocks, bmbt blocks from inode forks,
garbage CoW staging extents, and (even later) blocks from btrees rooted
in inodes. This results in a lot of reaping code, so we might as well
split that off while it's easy.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

e06ef14b

xfs: cull repair code that will never get used · 86a46417

Darrick J. Wong authored Aug 10, 2023

These two functions date from the era when I thought that we could
rebuild btrees by creating an alternate root and adding records one by
one.  In other words, they predate the btree bulk loader.  They're not
necessary now, so remove them.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

86a46417

MAINTAINERS: add Chandan Babu as XFS release manager · d6532904

Darrick J. Wong authored Aug 10, 2023

I nominate Chandan Babu to take over release management for the upstream
kernel's XFS code.  He has had sufficient experience merging backports
to the 5.4 LTS tree, testing them, and sending them on to the LTS leads.

NOTE: I am /not/ nominating Chandan to take on any of the other roles I
have just dropped.  Bug triager, testing lead, and community manager are
open positions that need to be filled.  There's also maintainer for
supported LTS releases (4.14, 4.19, 5.10...).

Cc: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Acked-by: Chandan Babu R <chandan.babu@oracle.com>
Reviewed-by: Carlos Maiolino <cem@kernel.org>

d6532904

MAINTAINERS: drop me as XFS maintainer · d554046e

Darrick J. Wong authored Aug 10, 2023

I burned out years ago trying to juggle the roles senior developer,
reviewer, tester, triager (crappily), release manager, and (at times)
manager liaison. There's enough work here in this one subsystem for a
team of 20 FT, but instead we're squeezed to half that. I thought if I
could hold on just a bit longer I could help to maintain the focus on
long term development to improve the experience for users. I was wrong.

Nowadays, people working on XFS seem to spend most of their time on
distro kernel backports and dealing with AI-generated corner case bug
reports that aren't user reports. Reviewing has become a nightmare of
sifting through under-documented kernel code trying to decide if this
new feature won't break all the other features. Getting reviews is an
unpleasant process of negotiating with demands for further cleanups,
trying to figure out if a review comment is based in experience or
unfamiliarity, and wondering if the silence means anything.

For now, I will continue to review patches and will try to get online
fsck, parent pointers, and realtime volume modernisation merged.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>

d554046e

docs: add maintainer entry profile for XFS · 19e13b0a

Darrick J. Wong authored Aug 10, 2023

Create a new document to list what I think are (within the scope of XFS)
our shared goals and community roles.  Since I will be stepping down
shortly, I feel it's important to write down somewhere all the hats that
I have been wearing for the past six years.

Also, document important extra details about how to contribute to XFS.

Cc: corbet@lwn.net
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

19e13b0a

06 Aug, 2023 8 commits

Linux 6.5-rc5 · 52a93d39
Linus Torvalds authored Aug 06, 2023

52a93d39

Merge tag 'v6.5-rc5.vfs.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 0108963f

Linus Torvalds authored Aug 06, 2023

Pull vfs fixes from Christian Brauner:

 - Fix a wrong check for O_TMPFILE during RESOLVE_CACHED lookup

 - Clean up directory iterators and clarify file_needs_f_pos_lock()

* tag 'v6.5-rc5.vfs.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: rely on ->iterate_shared to determine f_pos locking
  vfs: get rid of old '->iterate' directory operation
  proc: fix missing conversion to 'iterate_shared'
  open: make RESOLVE_CACHED correctly test for O_TMPFILE

0108963f

fs: rely on ->iterate_shared to determine f_pos locking · 7d84d1b9

Christian Brauner authored Aug 06, 2023

Now that we removed ->iterate we don't need to check for either
->iterate or ->iterate_shared in file_needs_f_pos_lock(). Simply check
for ->iterate_shared instead. This will tell us whether we need to
unconditionally take the lock. Not just does it allow us to avoid
checking f_inode's mode it also actually clearly shows that we're
locking because of readdir.
Signed-off-by: Christian Brauner <brauner@kernel.org>

7d84d1b9

vfs: get rid of old '->iterate' directory operation · 3e327154

Linus Torvalds authored Aug 05, 2023

All users now just use '->iterate_shared()', which only takes the
directory inode lock for reading.

Filesystems that never got convered to shared mode now instead use a
wrapper that drops the lock, re-takes it in write mode, calls the old
function, and then downgrades the lock back to read mode.

This way the VFS layer and other callers no longer need to care about
filesystems that never got converted to the modern era.

The filesystems that use the new wrapper are ceph, coda, exfat, jfs,
ntfs, ocfs2, overlayfs, and vboxsf.

Honestly, several of them look like they really could just iterate their
directories in shared mode and skip the wrapper entirely, but the point
of this change is to not change semantics or fix filesystems that
haven't been fixed in the last 7+ years, but to finally get rid of the
dual iterators.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>

3e327154

proc: fix missing conversion to 'iterate_shared' · 0a2c2baa

Linus Torvalds authored Aug 05, 2023

I'm looking at the directory handling due to the discussion about f_pos
locking (see commit 79796425: "file: reinstate f_pos locking
optimization for regular files"), and wanting to clean that up.

And one source of ugliness is how we were supposed to move filesystems
over to the '->iterate_shared()' function that only takes the inode lock
for reading many many years ago, but several filesystems still use the
bad old '->iterate()' that takes the inode lock for exclusive access.

See commit 61922694 ("introduce a parallel variant of ->iterate()")
that also added some documentation stating

      Old method is only used if the new one is absent; eventually it will
      be removed.  Switch while you still can; the old one won't stay.

and that was back in April 2016.  Here we are, many years later, and the
old version is still clearly sadly alive and well.

Now, some of those old style iterators are probably just because the
filesystem may end up having per-inode mutable data that it uses for
iterating a directory, but at least one case is just a mistake.

Al switched over most filesystems to use '->iterate_shared()' back when
it was introduced.  In particular, the /proc filesystem was converted as
one of the first ones in commit f50752ea ("switch all procfs
directories ->iterate_shared()").

But then later one new user of '->iterate()' was then re-introduced by
commit 6d9c939d ("procfs: add smack subdir to attrs").

And that's clearly not what we wanted, since that new case just uses the
same 'proc_pident_readdir()' and 'proc_pident_lookup()' helper functions
that other /proc pident directories use, and they are most definitely
safe to use with the inode lock held shared.

So just fix it.

This still leaves a fair number of oddball filesystems using the
old-style directory iterator (ceph, coda, exfat, jfs, ntfs, ocfs2,
overlayfs, and vboxsf), but at least we don't have any remaining in the
core filesystems.

I'm going to add a wrapper function that just drops the read-lock and
takes it as a write lock, so that we can clean up the core vfs layer and
make all the ugly 'this filesystem needs exclusive inode locking' be
just filesystem-internal warts.

I just didn't want to make that conversion when we still had a core user
left.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>

0a2c2baa

open: make RESOLVE_CACHED correctly test for O_TMPFILE · a0fc452a

Aleksa Sarai authored Aug 06, 2023

O_TMPFILE is actually __O_TMPFILE|O_DIRECTORY. This means that the old
fast-path check for RESOLVE_CACHED would reject all users passing
O_DIRECTORY with -EAGAIN, when in fact the intended test was to check
for __O_TMPFILE.

Cc: stable@vger.kernel.org # v5.12+
Fixes: 99668f61 ("fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Message-Id: <20230806-resolve_cached-o_tmpfile-v1-1-7ba16308465e@cyphar.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>

a0fc452a

Merge tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linux · f0ab9f34

Linus Torvalds authored Aug 05, 2023

Pull rust fixes from Miguel Ojeda:

 - Allocator: prevent mis-aligned allocation

 - Types: delete 'ForeignOwnable::borrow_mut'. A sound replacement is
   planned for the merge window

 - Build: fix bindgen error with UBSAN_BOUNDS_STRICT

* tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linux:
  rust: fix bindgen build error with UBSAN_BOUNDS_STRICT
  rust: delete `ForeignOwnable::borrow_mut`
  rust: allocator: Prevent mis-aligned allocation

f0ab9f34

Merge tag 'ata-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata · fb0d9199

Linus Torvalds authored Aug 05, 2023

Pull ata fix from Damien Le Moal:

 - Prevent the scsi disk driver from issuing a START STOP UNIT command
   for ATA devices during system resume as this causes various issues
   reported by multiple users.

* tag 'ata-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata,scsi: do not issue START STOP UNIT on resume

fb0d9199

05 Aug, 2023 5 commits

Merge tag '6.5-rc4-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6 · f6a69168

Linus Torvalds authored Aug 05, 2023

Pull smb client fix from Steve French:

 - Fix DFS interlink problem (different namespace)

* tag '6.5-rc4-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: fix dfs link mount against w2k8

f6a69168

Merge tag 'powerpc-6.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 251a94f1

Linus Torvalds authored Aug 05, 2023

Pull powerpc fixes from Michael Ellerman:

 - Fix vmemmap altmap boundary check which could cause memory hotunplug
   failure

 - Create a dummy stackframe to fix ftrace stack unwind

 - Fix secondary thread bringup for Book3E ELFv2 kernels

 - Use early_ioremap/unmap() in via_calibrate_decr()

Thanks to Aneesh Kumar K.V, Benjamin Gray, Christophe Leroy, David
Hildenbrand, and Naveen N Rao.

* tag 'powerpc-6.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powermac: Use early_* IO variants in via_calibrate_decr()
  powerpc/64e: Fix secondary thread bringup for ELFv2 kernels
  powerpc/ftrace: Create a dummy stackframe to fix stack unwind
  powerpc/mm/altmap: Fix altmap boundary check

251a94f1

Merge tag 'parisc-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 947c2a83

Linus Torvalds authored Aug 05, 2023

Pull parisc architecture fixes from Helge Deller:

 - early fixmap preallocation to fix boot failures on kernel >= 6.4

 - remove DMA leftover code in parport_gsc

 - drop old comments and code style fixes

* tag 'parisc-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: unaligned: Add required spaces after ','
  parport: gsc: remove DMA leftover code
  parisc: pci-dma: remove unused and dead EISA code and comment
  parisc/mm: preallocate fixmap page tables at init

947c2a83

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · c9d26d8d

Linus Torvalds authored Aug 04, 2023

Pull clk fixes from Stephen Boyd:
 "A few clk driver fixes for some SoC clk drivers:

   - Change a usleep() to udelay() to avoid scheduling while atomic in
     the Amlogic PLL code

   - Revert a patch to the Mediatek MT8183 driver that caused an
     out-of-bounds write

   - Return the right error value when devm_of_iomap() fails in
     imx93_clocks_probe()

   - Constrain the Kconfig for the fixed mmio clk so that it depends on
     HAS_IOMEM and can't be compiled on architectures such as s390"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: fixed-mmio: make COMMON_CLK_FIXED_MMIO depend on HAS_IOMEM
  clk: imx93: Propagate correct error in imx93_clocks_probe()
  clk: mediatek: mt8183: Add back SSPM related clocks
  clk: meson: change usleep_range() to udelay() for atomic context

c9d26d8d

Merge tag 'hyperv-fixes-signed-20230804' of... · 024ff300

Linus Torvalds authored Aug 04, 2023

Merge tag 'hyperv-fixes-signed-20230804' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

 - Fix a bug in a python script for Hyper-V (Ani Sinha)

 - Workaround a bug in Hyper-V when IBT is enabled (Michael Kelley)

 - Fix an issue parsing MP table when Linux runs in VTL2 (Saurabh
   Sengar)

 - Several cleanup patches (Nischala Yelchuri, Kameron Carr, YueHaibing,
   ZhiHu)

* tag 'hyperv-fixes-signed-20230804' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Remove unused extern declaration vmbus_ontimer()
  x86/hyperv: add noop functions to x86_init mpparse functions
  vmbus_testing: fix wrong python syntax for integer value comparison
  x86/hyperv: fix a warning in mshyperv.h
  x86/hyperv: Disable IBT when hypercall page lacks ENDBR instruction
  x86/hyperv: Improve code for referencing hyperv_pcpu_input_arg
  Drivers: hv: Change hv_free_hyperv_page() to take void * argument

024ff300

04 Aug, 2023 13 commits

Merge tag 'riscv-for-linus-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · e661f98c

Linus Torvalds authored Aug 04, 2023

Pull RISC-V fixes from Palmer Dabbelt:

 - A pair of fixes for build-related failures in the selftests

 - A fix for a sparse warning in acpi_os_ioremap()

 - A fix to restore the kernel PA offset in vmcoreinfo, to fix crash
   handling

* tag 'riscv-for-linus-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  Documentation: kdump: Add va_kernel_pa_offset for RISCV64
  riscv: Export va_kernel_pa_offset in vmcoreinfo
  RISC-V: ACPI: Fix acpi_os_ioremap to return iomem address
  selftests: riscv: Fix compilation error with vstate_exec_nolibc.c
  selftests/riscv: fix potential build failure during the "emit_tests" step

e661f98c

Merge tag 'pm-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · ea4f142f

Linus Torvalds authored Aug 04, 2023

Pull power management fix from Rafael Wysocki:
 "Fix a sparse warning triggered by the TPMI interface recently added to
  the Intel RAPL power capping driver (Zhang Rui)"

* tag 'pm-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  powercap: intel_rapl: Fix a sparse warning in TPMI interface

ea4f142f

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · e6fda526

Linus Torvalds authored Aug 04, 2023

Pull arm64 fixes from Catalin Marinas:
 "More SVE/SME fixes for ptrace() and for the (potentially future) case
  where SME is implemented in hardware without SVE support"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/fpsimd: Sync and zero pad FPSIMD state for streaming SVE
  arm64/fpsimd: Sync FPSIMD state with SVE for SME only systems
  arm64/ptrace: Don't enable SVE when setting streaming SVE
  arm64/ptrace: Flush FP state when setting ZT0
  arm64/fpsimd: Clear SME state in the target task when setting the VL

e6fda526

Merge tag 'mtd/fixes-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · c8273a25

Linus Torvalds authored Aug 04, 2023

Pull mtd fixes from Miquel Raynal:
 "Raw NAND fixes:
   - fsl_upm: Fix an off-by one test in fun_exec_op()
   - Rockchip:
       - Align hwecc vs. raw page helper layouts
       - Fix oobfree offset and description
   - Meson: Fix OOB available bytes for ECC
   - Omap ELM: Fix incorrect type in assignment

  SPI-NOR fix:
   - Avoid holes in struct spi_mem_op

  Hyperbus fix:
   - Add Tudor as reviewer in MAINTAINERS

  SPI-NAND fixes:
   - Winbond and Toshiba: Fix ecc_get_status"

* tag 'mtd/fixes-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: fsl_upm: Fix an off-by one test in fun_exec_op()
  mtd: spi-nor: avoid holes in struct spi_mem_op
  MAINTAINERS: Add myself as reviewer for HYPERBUS
  mtd: rawnand: rockchip: Align hwecc vs. raw page helper layouts
  mtd: rawnand: rockchip: fix oobfree offset and description
  mtd: rawnand: meson: fix OOB available bytes for ECC
  mtd: rawnand: omap_elm: Fix incorrect type in assignment
  mtd: spinand: winbond: Fix ecc_get_status
  mtd: spinand: toshiba: Fix ecc_get_status

c8273a25

Merge tag 'drm-fixes-2023-08-04' of git://anongit.freedesktop.org/drm/drm · 4142fc67

Linus Torvalds authored Aug 04, 2023

Pull drm fixes from Dave Airlie:
 "Small set of fixes this week, i915 and a few misc ones. I didn't see
  an amd pull so maybe next week it'll have a few more on that driver.

  ttm:
   - NULL ptr deref fix

  panel:
   - add missing MODULE_DEVICE_TABLE

  imx/ipuv3:
   - timing fix

  i915:
   - Fix bug in getting msg length in AUX CH registers handler
   - Gen12 AUX invalidation fixes
   - Fix premature release of request's reusable memory"

* tag 'drm-fixes-2023-08-04' of git://anongit.freedesktop.org/drm/drm:
  drm/panel: samsung-s6d7aa0: Add MODULE_DEVICE_TABLE
  drm/i915: Fix premature release of request's reusable memory
  drm/i915/gt: Support aux invalidation on all engines
  drm/i915/gt: Poll aux invalidation register bit on invalidation
  drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control and in the CS
  drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  drm/i915/gt: Ensure memory quiesced before invalidation
  drm/i915: Add the gen12_needs_ccs_aux_inv helper
  drm/i915/gt: Cleanup aux invalidation registers
  drm/i915/gvt: Fix bug in getting msg length in AUX CH registers handler
  drm/imx/ipuv3: Fix front porch adjustment upon hactive aligning
  drm/ttm: check null pointer before accessing when swapping

4142fc67

Merge tag 'ceph-for-6.5-rc5' of https://github.com/ceph/ceph-client · 4593f3c2

Linus Torvalds authored Aug 04, 2023

Pull ceph fixes from Ilya Dryomov:
 "Two patches to improve RBD exclusive lock interaction with
  osd_request_timeout option and another fix to reduce the potential for
  erroneous blocklisting -- this time in CephFS. All going to stable"

* tag 'ceph-for-6.5-rc5' of https://github.com/ceph/ceph-client:
  libceph: fix potential hang in ceph_osdc_notify()
  rbd: prevent busy loop when requesting exclusive lock
  ceph: defer stopping mdsc delayed_work

4593f3c2

file: reinstate f_pos locking optimization for regular files · 79796425

Linus Torvalds authored Aug 03, 2023

In commit 20ea1e7d ("file: always lock position for
FMODE_ATOMIC_POS") we ended up always taking the file pos lock, because
pidfd_getfd() could get a reference to the file even when it didn't have
an elevated file count due to threading of other sharing cases.

But Mateusz Guzik reports that the extra locking is actually measurable,
so let's re-introduce the optimization, and only force the locking for
directory traversal.

Directories need the lock for correctness reasons, while regular files
only need it for "POSIX semantics". Since pidfd_getfd() is about
debuggers etc special things that are _way_ outside of POSIX, we can
relax the rules for that case.
Reported-by: Mateusz Guzik <mjguzik@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/linux-fsdevel/20230803095311.ijpvhx3fyrbkasul@f/Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

79796425

arm64/fpsimd: Sync and zero pad FPSIMD state for streaming SVE · 69af56ae

Mark Brown authored Aug 03, 2023

We have a function sve_sync_from_fpsimd_zeropad() which is used by the
ptrace code to update the SVE state when the user writes to the the
FPSIMD register set. Currently this checks that the task has SVE
enabled but this will miss updates for tasks which have streaming SVE
enabled if SVE has not been enabled for the thread, also do the
conversion if the task has streaming SVE enabled.

Fixes: e12310a0 ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-3-49df214bfb3e@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

69af56ae

arm64/fpsimd: Sync FPSIMD state with SVE for SME only systems · 507ea5dd

Mark Brown authored Aug 03, 2023

Currently we guard FPSIMD/SVE state conversions with a check for the system
supporting SVE but SME only systems may need to sync streaming mode SVE
state so add a check for SME support too. These functions are only used
by the ptrace code.

Fixes: e12310a0 ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-2-49df214bfb3e@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

507ea5dd

arm64/ptrace: Don't enable SVE when setting streaming SVE · 045aecdf

Mark Brown authored Aug 03, 2023

Systems which implement SME without also implementing SVE are
architecturally valid but were not initially supported by the kernel,
unfortunately we missed one issue in the ptrace code.

The SVE register setting code is shared between SVE and streaming mode
SVE. When we set full SVE register state we currently enable TIF_SVE
unconditionally, in the case where streaming SVE is being configured on a
system that supports vanilla SVE this is not an issue since we always
initialise enough state for both vector lengths but on a system which only
support SME it will result in us attempting to restore the SVE vector
length after having set streaming SVE registers.

Fix this by making the enabling of SVE conditional on setting SVE vector
state. If we set streaming SVE state and SVE was not already enabled this
will result in a SVE access trap on next use of normal SVE, this will cause
us to flush our register state but this is fine since the only way to
trigger a SVE access trap would be to exit streaming mode which will cause
the in register state to be flushed anyway.

Fixes: e12310a0 ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-1-49df214bfb3e@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

045aecdf

rust: fix bindgen build error with UBSAN_BOUNDS_STRICT · b0554488

Andrea Righi authored Jul 11, 2023

With commit 2d47c695 ("ubsan: Tighten UBSAN_BOUNDS on GCC") if
CONFIG_UBSAN is enabled and gcc supports -fsanitize=bounds-strict, we
can trigger the following build error due to bindgen lacking support for
this additional build option:

   BINDGEN rust/bindings/bindings_generated.rs
 error: unsupported argument 'bounds-strict' to option '-fsanitize='

Fix by adding -fsanitize=bounds-strict to the list of skipped gcc flags
for bindgen.

Fixes: 2d47c695 ("ubsan: Tighten UBSAN_BOUNDS on GCC")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Link: https://lore.kernel.org/r/20230711071914.133946-1-andrea.righi@canonical.comSigned-off-by: Miguel Ojeda <ojeda@kernel.org>

b0554488

rust: delete `ForeignOwnable::borrow_mut` · 1d24eb2d

Alice Ryhl authored Jul 06, 2023

We discovered that the current design of `borrow_mut` is problematic.
This patch removes it until a better solution can be found.

Specifically, the current design gives you access to a `&mut T`, which
lets you change where the `ForeignOwnable` points (e.g., with
`core::mem::swap`). No upcoming user of this API intended to make that
possible, making all of them unsound.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Fixes: 0fc4424d ("rust: types: introduce `ForeignOwnable`")
Link: https://lore.kernel.org/r/20230706094615.3080784-1-aliceryhl@google.comSigned-off-by: Miguel Ojeda <ojeda@kernel.org>

1d24eb2d

rust: allocator: Prevent mis-aligned allocation · b3d8aa84

Boqun Feng authored Jul 29, 2023

Currently the rust allocator simply passes the size of the type Layout
to krealloc(), and in theory the alignment requirement from the type
Layout may be larger than the guarantee provided by SLAB, which means
the allocated object is mis-aligned.

Fix this by adjusting the allocation size to the nearest power of two,
which SLAB always guarantees a size-aligned allocation. And because Rust
guarantees that the original size must be a multiple of alignment and
the alignment must be a power of two, then the alignment requirement is
satisfied.
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Co-developed-by: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
Signed-off-by: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: stable@vger.kernel.org # v6.1+
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 247b365d ("rust: add `kernel` crate")
Link: https://github.com/Rust-for-Linux/linux/issues/974
Link: https://lore.kernel.org/r/20230730012905.643822-2-boqun.feng@gmail.com
[ Applied rewording of comment as discussed in the mailing list. ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

b3d8aa84

03 Aug, 2023 3 commits

Merge tag 'drm-intel-fixes-2023-08-03' of... · 1958b0f9

Dave Airlie authored Aug 04, 2023

Merge tag 'drm-intel-fixes-2023-08-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix bug in getting msg length in AUX CH registers handler [gvt] (Yan Zhao)
- Gen12 AUX invalidation fixes [gt] (Andi Shyti, Jonathan Cavitt)
- Fix premature release of request's reusable memory (Janusz Krzysztofik)

- Merge tag 'gvt-fixes-2023-08-02' of https://github.com/intel/gvt-linux into drm-intel-fixes (Tvrtko Ursulin)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZMtkxWGuUKpaRMmo@tursulin-desk

1958b0f9

Merge tag 'drm-misc-fixes-2023-08-03' of ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes · 062ff85b

Dave Airlie authored Aug 04, 2023

A NULL pointer dereference fix for TTM, a timings fix for imx/ipuv3 and
the addition of a MODULE_DEVICE_TABLE for the samsung-s6d7aa0 panel.
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ztfogof2dhtlvjwe73mvd2jp5kbldhkkav7k5culuseqblwpti@qfobohwx3c3j

062ff85b

Merge tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of... · c1a515d3

Linus Torvalds authored Aug 03, 2023

Merge tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix segfault in the powerpc specific arch_skip_callchain_idx
   function. The patch doing the reference count init/exit that went
   into 6.5 missed this function.

 - Fix regression reading the arm64 PMU cpu slots in sysfs, a patch
   removing some code duplication ended up duplicating the /sysfs prefix
   for these files.

 - Fix grouping of events related to topdown, addressing a regression on
   the CSV output produced by 'perf stat' noticed on the downstream tool
   toplev.

 - Fix the uprobe_from_different_cu 'perf test' entry, it is failing
   when gcc isn't available, so we need to check that and skip the test
   if it is not installed.

* tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf test parse-events: Test complex name has required event format
  perf pmus: Create placholder regardless of scanning core_only
  perf test uprobe_from_different_cu: Skip if there is no gcc
  perf parse-events: Only move force grouped evsels when sorting
  perf parse-events: When fixing group leaders always set the leader
  perf parse-events: Extra care around force grouped events
  perf callchain powerpc: Fix addr location init during arch_skip_callchain_idx function
  perf pmu arm64: Fix reading the PMU cpu slots in sysfs

c1a515d3