1. 13 Dec, 2022 40 commits
    • Merge tag 'mtd/for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · 1e4fa020
      Linus Torvalds authored
      Pull mtd updates from Miquel Raynal:
       "MTD core changes:
         - Fix refcount error in del_mtd_device()
         - Fix possible resource leak in init_mtd()
         - Set ROOT_DEV for partitions marked as rootfs in DT
         - Describe marking rootfs partitions in the bindings
         - Fix device name leak when register device fails in add_mtd_device()
         - Try to find OF node for every MTD partition
         - Simplify (a bit) the code that finds the partition-matching
           dynamic OF node
      
        MTD driver changes:
         - pxa2xx-flash maps: fix memory leak in probe
         - BCM parser: refer to ARCH_BCMBCA instead of ARCH_BCM4908
         - lpddr2_nvm: Fix possible null-ptr-deref
         - inftlcore: fix repeated words in comments
         - lart: remove driver
         - tplink:
            - Add TP-Link SafeLoader partitions table parser and bindings
            - Describe TP-Link SafeLoader parser
            - Describe TP-Link SafeLoader dynamic subpartitions
         - mtdoops:
            - Panic caused mtdoops to call mtdoops_erase function immediately
            - Add mtdoops_erase function and move mtdoops_inc_counter after it
            - Change printk() to counterpart pr_ functions
      
        MTD binding cleanup:
         - Fixed-partitions: Fix 'sercomm,scpart-id' schema
         - Standardize the style in the examples
         - Drop object types when referencing other files
         - Argue in favor of keeping additionalProperties set to true
         - NVMEM-cells:
            - Inherit from MTD partitions
            - Drop range property from example
         - Partitions:
            - Change qcom,smem-part partition type
            - Constrain the list of parsers
         - Physmap: Reuse the generic definitions
         - SPI-NOR: Drop common properties
         - Sunxi-nand: Add an example to validate the bindings
         - Onenand: Mention the expected node name
         - Ingenic: Mark partitions in the controller node as deprecated
         - NAND:
            - Standardize the child node name
            - Drop common properties already defined in generic files
            - nand-chip.yaml should reference mtd.yaml
         - Remove useless file about partitions
         - Clarify all partition subnodes
      
        SPI NOR core changes:
         - Add support for flash reset using the dt reset-gpios property.
         - Update hwcaps.mask to include 8D-8D-8D read and page program ops
           when xSPI profile 1.0 table is defined.
         - Bypass zero erase size in spi_nor_find_best_erase_type().
         - Fix select_uniform_erase to skip 0 erase size
         - Add generic flash driver. If a flash is not found in the flash_info
           array, fall back to the generic flash driver which is described
           solely by the flash's SFDP tables.
         - Fix the number of bytes for the dummy cycles in
           spi_nor_spimem_check_readop().
         - Introduce SPI_NOR_QUAD_PP flag, as PP_1_1_4 is not SFDP
           discoverable.
      
        SPI NOR manufacturer drivers changes:
         - Spansion:
            - use PARSE_SFDP for s28hs512t,
            - add support for s28hl512t, s28hl01gt, and s28hs01gt.
         - Gigadevice: Replace default_init() with post_bfpt() for gd25q256.
         - Micron - ST: Enable locking for mt25qu256a.
         - Winbond: Add support for W25Q512NW-IQ.
         - ISSI: Use PARSE_SFDP and SPI_NOR_QUAD_PP.
      
        Raw NAND core changes:
         - Drop obsolete dependencies on COMPILE_TEST
         - MAINTAINERS: rectify entry for MESON NAND controller bindings
         - Drop EXPORT_SYMBOL_GPL for nanddev_erase()
      
        Raw NAND driver changes:
         - marvell: Enable NFC/DEVBUS arbiter
         - gpmi: Use pm_runtime_resume_and_get instead of pm_runtime_get_sync
         - mpc5121: Replace NO_IRQ by 0
         - lpc32xx_{slc,mlc}:
            - Switch to using pm_ptr()
            - Switch to using gpiod API
         - lpc32xx_mlc: Switch to using pm_ptr()
         - cadence: Support 64-bit slave dma interface
         - rockchip: Describe rk3128-nfc in the bindings
         - brcmnand: Update interrupts description in the bindings
      
        SPI-NAND driver changes:
         - winbond:
            - Add Winbond W25N02KV flash support
            - Fix flash identification"
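The two zero-erase-size items under "SPI NOR core changes" above can be illustrated with a minimal sketch. This is not the kernel's spi_nor_find_best_erase_type(); the struct, field names and selection rule below are simplified, hypothetical stand-ins, assuming that an unpopulated erase type is represented by a size of 0 and must be skipped before any alignment arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one erase type entry. */
struct erase_type {
    uint32_t size;    /* erase size in bytes; 0 means "not populated" */
    uint8_t opcode;   /* erase command opcode */
};

/*
 * Return the largest erase type that fits in len and whose size divides
 * addr, or NULL if none qualifies. Entries with size 0 are skipped so
 * that the modulo below never divides by zero.
 */
static const struct erase_type *
find_best_erase_type(const struct erase_type *types, size_t n,
                     uint64_t addr, uint64_t len)
{
    const struct erase_type *best = NULL;

    assert(types != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct erase_type *t = &types[i];

        if (t->size == 0)        /* unpopulated entry: skip */
            continue;
        if (t->size > len)       /* would erase past the request */
            continue;
        if (addr % t->size)      /* start address not aligned */
            continue;
        if (!best || t->size > best->size)
            best = t;
    }
    return best;
}
```

With a table holding a zero-size entry, a 4 KiB entry and a 64 KiB entry, a 64 KiB request at address 0 would pick the 64 KiB entry, while the same request at address 4096 would fall back to the 4 KiB entry.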
      
      * tag 'mtd/for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (76 commits)
        mtd: rawnand: Drop obsolete dependencies on COMPILE_TEST
        mtd: maps: pxa2xx-flash: fix memory leak in probe
        mtd: core: Fix refcount error in del_mtd_device()
        mtd: spi-nor: add SFDP fixups for Quad Page Program
        mtd: spi-nor: issi: is25wp256: Init flash based on SFDP
        mtd: spi-nor: winbond: add support for W25Q512NW-IQ
        mtd: spi-nor: micron-st: Enable locking for mt25qu256a
        mtd: spi-nor: Fix the number of bytes for the dummy cycles
        mtd: spi-nor: gigadevice: gd25q256: replace gd25q256_default_init with gd25q256_post_bfpt
        mtd: spi-nor: Fix formatting in spi_nor_read_raw() kerneldoc comment
        mtd: spi-nor: sysfs: print JEDEC ID for generic flash driver
        mtd: spi-nor: add generic flash driver
        mtd: spi-nor: fix select_uniform_erase to skip 0 erase size
        mtd: spi-nor: move function declaration out of sfdp.h
        mtd: spi-nor: remember full JEDEC flash ID
        mtd: spi-nor: sysfs: hide manufacturer if it is not set
        mtd: spi-nor: hide jedec_id sysfs attribute if not present
        mtd: spi-nor: Check for zero erase size in spi_nor_find_best_erase_type()
        mtd: rawnand: marvell: Enable NFC/DEVBUS arbiter
        mtd: parsers: refer to ARCH_BCMBCA instead of ARCH_BCM4908
        ...
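The gpmi item above swaps pm_runtime_get_sync() for pm_runtime_resume_and_get(). The practical difference is in error handling: pm_runtime_get_sync() leaves the usage counter incremented even when the resume fails, whereas pm_runtime_resume_and_get() drops it again on failure. The following is a userspace model of that difference only; struct fake_dev and the fake_* functions are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a device's runtime-PM state. */
struct fake_dev {
    int usage_count;   /* models the runtime-PM usage counter */
    int resume_err;    /* error the simulated resume returns, 0 on success */
};

/* Models pm_runtime_get_sync(): the counter is bumped even if resume fails. */
static int fake_get_sync(struct fake_dev *dev)
{
    assert(dev != NULL);
    dev->usage_count++;
    return dev->resume_err;
}

/* Models pm_runtime_resume_and_get(): the counter is dropped on failure. */
static int fake_resume_and_get(struct fake_dev *dev)
{
    int ret = fake_get_sync(dev);

    if (ret < 0) {
        dev->usage_count--;   /* undo the increment so nothing leaks */
        return ret;
    }
    return 0;
}
```

In this model, a caller of fake_get_sync() that returns early on error leaves usage_count at 1, which is the kind of reference leak the newer helper is meant to avoid.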
    • Merge tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drm · a594533d
      Linus Torvalds authored
      Pull drm updates from Dave Airlie:
       "The biggest highlight is that the accel subsystem framework is merged.
        Hopefully for 6.3 we will be able to line up a driver to use it.
      
        In drivers land, i915 enables DG2 support by default now, and nouveau
        has a big stability refactoring and initial ampere support, AMD
        includes new hw IP support and should build on ARM again. There is
        also an ofdrm driver to take over offb on platforms where it's used.
      
        Stuff outside my tree, the dma-buf patches hit a few places, the vc4
        firmware changes also do, and i915 has some interactions with MEI for
        discrete GPUs. I think all of those should have been acked/reviewed by
        relevant parties.
      
        New driver:
         - ofdrm - replacement for offb
      
        fbdev:
         - add support for nomodeset
      
        fourcc:
         - add Vivante tiled modifier
      
        core:
         - atomic-helpers: CRTC primary plane test fixes, fb access hooks
         - connector: TV API consistency, cmdline parser improvements
         - send connector hotplug on cleanup
         - sort makefile objects
      
        tests:
         - sort kunit tests
         - improve DP-MST tests
         - add kunit helpers to create a device
      
        sched:
         - module param for scheduling policy
         - refcounting fix
      
        buddy:
         - add back random seed log
      
        ttm:
         - convert ttm_resource to size_t
         - optimize pool allocations
      
        edid:
         - HFVSDB parsing support fixes
         - logging/debug improvements
         - DSC quirks
      
        dma-buf:
         - Add unlocked vmap and attachment mapping
         - move drivers to common locking convention
         - locking improvements
      
        firmware:
         - new API for rPI firmware and vc4
      
        xilinx:
         - zynqmp: displayport bridge support
         - dpsub fix
      
        bridge:
         - adv7533: Remove dynamic lane switching
         - it6505: Runtime PM support, sync improvements
         - ps8640: Handle AUX defer messages
         - tc358775: Drop soft-reset over I2C
      
        panel:
         - panel-edp: Add INX N116BGE-EA2 C2 and C4 support.
         - Jadard JD9365DA-H3
         - NewVision NV3051D
      
        amdgpu:
         - DCN support on ARM
         - DCN 2.1 secure display
         - Sienna Cichlid mode2 reset fixes
         - new GC 11.x firmware versions
         - drop AMD specific DSC workarounds in favour of drm code
         - clang warning fixes
         - scheduler rework
         - SR-IOV fixes
         - GPUVM locking fixes
         - fix memory leak in CS IOCTL error path
         - flexible array updates
         - enable new GC/PSP/SMU/NBIO IP
         - GFX preemption support for gfx9
      
        amdkfd:
         - cache size fixes
         - userptr fixes
         - enable cooperative launch on gfx 10.3
         - enable GC 11.0.4 KFD support
      
        radeon:
         - replace kmap with kmap_local_page
         - ACPI ref count fix
         - HDA audio notifier support
      
        i915:
         - DG2 enabled by default
         - MTL enablement work
         - hotplug refactoring
         - VBT improvements
         - Display and watermark refactoring
         - ADL-P workaround
         - temp disable runtime_pm for discrete-
         - fix for A380 as a secondary GPU
         - Wa_18017747507 for DG2
         - CS timestamp support fixes for gen5 and earlier
         - never purge busy TTM objects
         - use i915_sg_dma_sizes for all backends
         - demote GuC kernel contexts to normal priority
         - gvt: refactor for new MDEV interface
         - enable DC power states on eDP ports
         - fix gen 2/3 workarounds
      
        nouveau:
         - fix page fault handling
         - Ampere acceleration support
         - driver stability improvements
         - nva3 backlight support
      
        msm:
         - MSM_INFO_GET_FLAGS support
         - DPU: XR30 and P010 image formats
         - Qualcomm SM6115 support
         - DSI PHY support for QCM2290
         - HDMI: refactored dev init path
         - remove exclusive-fence hack
         - fix speed-bin detection
         - enable clamp to idle on 7c3
         - improved hangcheck detection
      
        vmwgfx:
         - fb and cursor refactoring
         - convert to generic hashtable
         - cursor improvements
      
        etnaviv:
         - hw workarounds
         - softpin MMU fixes
      
        ast:
         - atomic gamma LUT support
         - convert to SHMEM
      
        lcdif:
         - support YUV planes
         - Increase DMA burst size
         - FIFO threshold tuning
      
        meson:
         - fix return type of cvbs mode_valid
      
        mgag200:
         - fix PLL setup on some revisions
      
        sun4i:
         - A100 and D1 support
      
        udl:
         - modesetting improvements
         - hot unplug support
      
        vc4:
         - support PAL-M
         - fix regression preventing 4K @ 60Hz
         - fix NULL ptr deref
      
        v3d:
         - switch to drm managed resources
      
        renesas:
         - RZ/G2L DSI support
         - DU Kconfig cleanup
      
        mediatek:
         - fixup dpi and hdmi
         - MT8188 dpi support
         - MT8195 AFBC support
      
        tegra:
         - NVDEC hardware on Tegra234 SoC
      
        hdlcd:
         - switch to drm managed resources
      
        ingenic:
         - fix registration error path
      
        hisilicon:
         - convert to drm_mode_init
      
        malidp:
         - use managed resources
      
        mtk:
         - use drm_mode_init
      
        rockchip:
         - use drm_mode_copy"
      
      * tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drm: (1397 commits)
        drm/amdgpu: fix mmhub register base coding error
        drm/amdgpu: add tmz support for GC IP v11.0.4
        drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.4
        drm/amdgpu: enable GFX Power Gating for GC IP v11.0.4
        drm/amdgpu: enable GFX IP v11.0.4 CG support
        drm/amdgpu: Make amdgpu_ring_mux functions as static
        drm/amdgpu: generally allow over-commit during BO allocation
        drm/amd/display: fix array index out of bound error in DCN32 DML
        drm/amd/display: 3.2.215
        drm/amd/display: set optimized required for comp buf changes
        drm/amd/display: Add debug option to skip PSR CRTC disable
        drm/amd/display: correct DML calc error of UrgentLatency
        drm/amd/display: correct static_screen_event_mask
        drm/amd/display: Ensure commit_streams returns the DC return code
        drm/amd/display: read invalid ddc pin status cause engine busy
        drm/amd/display: Bypass DET swath fill check for max clocks
        drm/amd/display: Disable uclk pstate for subvp pipes
        drm/amd/display: Fix DCN2.1 default DSC clocks
        drm/amd/display: Enable dp_hdmi21_pcon support
        drm/amd/display: prevent seamless boot on displays that don't have the preferred dig
        ...
    • Merge tag 'media/v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · cdb9d353
      Linus Torvalds authored
      Pull media updates from Mauro Carvalho Chehab:
      
       - DVB core changes to avoid refcount troubles and UAF
      
       - DVB API/core has gained support for DVB-C2 and DVB-S2X
      
       - New sensor drivers: ov08x40, ov4689, st-vgxy61 and tc358746
      
       - Removal of an unused sensor driver: s5k4ecgx
      
       - Move microchip_csi2dc to a new directory, named after the
         manufacturer
      
       - Add media controller support to Microchip drivers
      
       - Old Atmel/Microchip drivers that don't use media controller got moved
         to staging
      
       - New drivers added for Renesas RZ/G2L CRU and MIPI CSI-2 support
      
       - Allwinner A31 camera sensor driver code is now split into a bridge
         and a separate processor driver
      
       - Added a virtual stateless decoder driver in order to test core
         support for stateless drivers and test userspace apps using it
      
       - removed platform-based support for ov9650, as this is not used
         anymore
      
       - atomisp now uses videobuf2 and supports normal mmap mode
      
       - the imx7-media-csi driver got promoted from staging
      
       - rcar-vin driver has gained support for gen3 UDS (Up Down Scaler)
      
       - most i2c drivers now use I2C .probe_new() kAPI
      
       - lots of drivers fixes, cleanups and improvements
      
      * tag 'media/v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (544 commits)
        media: s5c73m3: Switch to GPIO descriptors
        media: i2c: s5k5baf: switch to using gpiod API
        media: i2c: s5k6a3: switch to using gpiod API
        media: imx: remove code for non-existing config IMX_GPT_ICAP
        media: si470x: Fix use-after-free in si470x_int_in_callback()
        media: staging: stkwebcam: Restore MEDIA_{USB,CAMERA}_SUPPORT dependencies
        media: coda: Add check for kmalloc
        media: coda: Add check for dcoda_iram_alloc
        dt-bindings: media: s5c73m3: Fix reset-gpio descriptor
        media: dt-bindings: allwinner: h6-vpu-g2: Add IOMMU reference property
        media: s5k4ecgx: Delete driver
        media: s5k4ecgx: Switch to GPIO descriptors
        media: Switch to use dev_err_probe() helper
        headers: Remove some left-over license text in include/uapi/linux/v4l2-*
        headers: Remove some left-over license text in include/uapi/linux/dvb/
        media: usb: pwc-uncompress: Use flex array destination for memcpy()
        media: s5p-mfc: Fix to handle reference queue during finishing
        media: s5p-mfc: Clear workbit to handle error condition
        media: s5p-mfc: Fix in register read and write for H264
        media: imx: Use get_mbus_config instead of parsing upstream DT endpoints
        ...
    • Merge tag 'sound-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 102f9d3d
      Linus Torvalds authored
      Pull sound updates from Takashi Iwai:
       "This looks like a relatively calm development cycle; there have been
        only a few changes on the ALSA and ASoC core sides while we get lots of
        device-specific fixes and updates as usual. Most of the commits are about
        ASoC, including Intel SOF/AVS and many device tree updates.
      
        Below are some highlights:
      
        Core:
         - Improvement in memalloc helper for fallback allocations
         - More cleanups of ASoC DAPM code
      
        ASoC:
         - Factoring out of mapping hw_params onto SoundWire configuration
         - The ever ongoing overhauls of the Intel DSP code continue,
           including support for loading libraries and probes with IPC4 on
           SOF.
         - Support for more sample formats on JZ4740
         - Lots of device tree conversions and fixups
         - Support for Allwinner D1, a range of AMD and Intel systems,
           Mediatek systems with multiple DMICs, Nuvoton NAU8318, NXP
           fsl_rpmsg and i.MX93, Qualcomm AudioReach Enable, MFC and SAL,
           RealTek RT1318 and Rockchip RK3588
      
        ALSA:
         - Addition of PCM kselftest; still minimalistic but can be extended
           in future
         - Fixes for corner-case XRUNs with USB-audio implicit feedback mode
         - Usual device-specific quirk updates for USB- and HD-audio
         - FireWire DICE updates
      
        This also contains a few cross-tree updates:
         - Some OMAP board file updates for removal of relevant OMAP platforms
         - A new I2C API update for I2C probe API adaption
         - A DRM update for the further hdmi-codec updates"
      
      * tag 'sound-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (417 commits)
        ALSA: mts64: fix possible null-ptr-defer in snd_mts64_interrupt
        ALSA: patch_realtek: Fix Dell Inspiron Plus 16
        ALSA: hda/cirrus: Add extra 10 ms delay to allow PLL settle and lock.
        ASoC: dt-bindings: Correct Alexandre Belloni email
        ASoC: dt-bindings: maxim,max98504: Convert to DT schema
        ASoC: dt-bindings: maxim,max98357a: Convert to DT schema
        ASoC: dt-bindings: Reference common DAI properties
        ASoC: dt-bindings: Extend name-prefix.yaml into common DAI properties
        ASoC: rt715: Make read-only arrays capture_reg_H and capture_reg_L static const
        ASoC: uniphier: aio-core: Make some read-only arrays static const
        ASoC: wcd938x: Make read-only array minCode_param static const
        ASoC: qcom: lpass-sc7280: Add maybe_unused tag for system PM ops
        ASoC : SOF: amd: Add support for IPC and DSP dumps
        ASoC: SOF: amd: Use poll function instead to read ACP_SHA_DSP_FW_QUALIFIER
        ALSA: usb-audio: Workaround for XRUN at prepare
        ALSA: pcm: Handle XRUN at trigger START
        ALSA: pcm: Set missing stop_operating flag at undoing trigger start
        drm: tda99x: Don't advertise non-existent capture support
        ASoC: hdmi-codec: Allow playback and capture to be disabled
        kselftest/alsa: Add more coverage of sample rates and channel counts
        ...
    • Merge tag 'for-6.2/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · 8715c6d3
      Linus Torvalds authored
      
      Pull device mapper updates from Mike Snitzer:
      
       - Fix use-after-free races due to missing resource cleanup during DM
         target destruction in DM targets: thin-pool, cache, integrity and
         clone.
      
       - Fix ABBA deadlocks in DM thin-pool and cache targets due to their use
         of a bufio client (that has a shrinker whose locking can cause the
         incorrect locking order).
      
       - Fix DM cache target to set its needs_check flag after first aborting
         the metadata (so that the reset persistent-data objects are used to
         update the superblock; otherwise the superblock update could be
         dropped due to aborting metadata). This was found by code inspection
         when comparing with the equivalent in DM thinp code.
      
       - Fix DM thin-pool's preresume to continue resuming the device even if
         the pool is in fail mode -- otherwise bios may never be failed up the
         IO stack (which will prevent resetting the thin-pool target via table
         reload)
      
       - Fix DM thin-pool's metadata to use proper btree root (from previous
         transaction) if metadata commit failed.
      
       - Add 'waitfor' module param to DM module (dm_mod) to allow dm-init to
         wait for the specified device before continuing with its DM target
         initialization.
      
      * tag 'for-6.2/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm thin: Use last transaction's pmd->root when commit failed
        dm init: add dm-mod.waitfor to wait for asynchronously probed block devices
        dm ioctl: fix a couple ioctl codes
        dm ioctl: a small code cleanup in list_version_get_info
        dm thin: resume even if in FAIL mode
        dm cache: set needs_check flag after aborting metadata
        dm cache: Fix ABBA deadlock between shrink_slab and dm_cache_metadata_abort
        dm thin: Fix ABBA deadlock between shrink_slab and dm_pool_abort_metadata
        dm integrity: Fix UAF in dm_integrity_dtr()
        dm cache: Fix UAF in destroy()
        dm clone: Fix UAF in clone_dtr()
        dm thin: Fix UAF in run_timer_softirq()
    • Merge tag 'ata-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata · 8ecd28b7
      Linus Torvalds authored
      Pull ata updates from Damien Le Moal:
       "The usual set of driver fixes and improvements as well as several
        patches improving libata core in preparation of the introduction of
        the support for the command duration limits feature. In more detail:
      
         - Define the missing COMPLETED sense key in scsi header (me)
      
         - Several patches to improve libata handling of the status of
           completed commands and the retry and sense data reported to the
           scsi layer for failed commands. In particular, this widens the
           support for NCQ autosense to all drives that support this feature
           instead of restricting this feature use to ZAC drives only (Niklas)
      
         - Cleanup of the pata_mpc52xx and sata_dwc_460ex drivers to remove
           the use of the deprecated NO_IRQ macro (Christophe)
      
         - Fix build dependency on OF vs use of the of_match_ptr() macro to
           avoid build errors with the sata_gemini and pata_ftide010 drivers
           (me)
      
         - Some libata cleanups using the new helper function
           ata_port_is_frozen() (Niklas)
      
         - Improve internal command handling by not retrying commands that
           failed with a timeout (Niklas)
      
         - Remove code for several unused libata helper functions (from
           Niklas)
      
         - Remove the palmchip pata_bk3710 driver. A couple of other driver
           removals should come in through the arm tree pull request (from
           Arnd)
      
         - Remove unused variable and function in the sata_dwc_460ex driver
           and libata-sff code (Colin and Sergey)
      
         - Minor cleanup of the pata_ep93xx driver platform code (from
           Minghao)
      
         - Remove the unnecessary linux/msi.h include from the ahci driver
           (Thomas)
      
         - Changes to libata enum constants definitions to avoid warnings with
           gcc-13 (Arnd)"
      
      * tag 'ata-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (24 commits)
        ata: ahci: fix enum constants for gcc-13
        ata: libata: fix commands incorrectly not getting retried during NCQ error
        ata: ahci: Remove linux/msi.h include
        ata: sata_dwc_460ex: Check !irq instead of irq == NO_IRQ
        ata: pata_ep93xx: use devm_platform_get_and_ioremap_resource()
        ata: libata-sff: kill unused ata_sff_busy_sleep()
        ata: sata_dwc_460ex: remove variable num_processed
        ata: remove palmchip pata_bk3710 driver
        ata: remove unused helper ata_id_flush_ext_enabled()
        ata: remove unused helper ata_id_flush_enabled()
        ata: remove unused helper ata_id_lba48_enabled()
        ata: libata-core: do not retry reading the log on timeout
        scsi: libsas: make use of ata_port_is_frozen() helper
        ata: make use of ata_port_is_frozen() helper
        ata: add ata_port_is_frozen() helper
        ata: pata_ftide010: Remove build dependency on OF
        ata: sata_gemini: Remove dependency on OF for compile tests
        ata: pata_mpc52xx: Replace NO_IRQ with 0
        ata: libahci: read correct status and error field for NCQ commands
        ata: libata: fetch sense data for ATA devices supporting sense reporting
        ...
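The gcc-13 item above concerns enum constants. A minimal sketch of the general issue, assuming the change in question is that a compiler may give every constant of an enum the same (wider) type once one value no longer fits in an int, so that sizeof() checks on individual constants start to fail. The names below are hypothetical and this is not the actual libata change; it only shows one way to keep small flags int-sized by not mixing them with a wide value in the same enum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names, for illustration only. */
enum {
    DEMO_FLAG_A = 1 << 0,
    DEMO_FLAG_B = 1 << 1,
};

/* Kept outside the enum so it cannot widen DEMO_FLAG_A or DEMO_FLAG_B. */
#define DEMO_WIDE_MASK UINT64_C(0xffffffff00000000)

/* Compile-time check in the spirit of a BUILD_BUG_ON() on sizeof(). */
static_assert(sizeof(DEMO_FLAG_A) == sizeof(int),
              "small flags stay int-sized");

/* Return nonzero if flag is set in flags. */
static int demo_has_flag(unsigned int flags, int flag)
{
    return (flags & (unsigned int)flag) != 0;
}
```

Because the enum only holds values representable as int, its constants have type int under every C standard, independent of the compiler version.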
    • Merge tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux · ce8a79d5
      Linus Torvalds authored
      Pull block updates from Jens Axboe:
      
       - NVMe pull requests via Christoph:
            - Support some passthrough commands without CAP_SYS_ADMIN (Kanchan
              Joshi)
            - Refactor PCIe probing and reset (Christoph Hellwig)
            - Various fabrics authentication fixes and improvements (Sagi
              Grimberg)
            - Avoid fallback to sequential scan due to transient issues (Uday
              Shankar)
            - Implement support for the DEAC bit in Write Zeroes (Christoph
              Hellwig)
            - Allow overriding the IEEE OUI and firmware revision in configfs
              for nvmet (Aleksandr Miloserdov)
            - Force reconnect when number of queue changes in nvmet (Daniel
              Wagner)
            - Minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi
              Grimberg, Christoph Hellwig, Christophe JAILLET)
            - Fix and cleanup nvme-fc req allocation (Chaitanya Kulkarni)
            - Use the common tagset helpers in nvme-pci driver (Christoph
              Hellwig)
            - Cleanup the nvme-pci removal path (Christoph Hellwig)
            - Use kstrtobool() instead of strtobool (Christophe JAILLET)
            - Allow unprivileged passthrough of Identify Controller (Joel
              Granados)
            - Support io stats on the mpath device (Sagi Grimberg)
            - Minor nvmet cleanup (Sagi Grimberg)
      
       - MD pull requests via Song:
            - Code cleanups (Christoph)
            - Various fixes
      
       - Floppy pull request from Denis:
            - Fix a memory leak in the init error path (Yuan)
      
       - Series fixing some batch wakeup issues with sbitmap (Gabriel)
      
       - Removal of the pktcdvd driver that was deprecated more than 5 years
         ago, and subsequent removal of the devnode callback in struct
         block_device_operations as no users are now left (Greg)
      
       - Fix for partition read on an exclusively opened bdev (Jan)
      
       - Series of elevator API cleanups (Jinlong, Christoph)
      
       - Series of fixes and cleanups for blk-iocost (Kemeng)
      
       - Series of fixes and cleanups for blk-throttle (Kemeng)
      
       - Series adding concurrent support for sync queues in BFQ (Yu)
      
       - Series bringing drbd a bit closer to the out-of-tree maintained
         version (Christian, Joel, Lars, Philipp)
      
       - Misc drbd fixes (Wang)
      
       - blk-wbt fixes and tweaks for enable/disable (Yu)
      
       - Fixes for mq-deadline for zoned devices (Damien)
      
       - Add support for read-only and offline zones for null_blk
         (Shin'ichiro)
      
       - Series fixing the delayed holder tracking, as used by DM (Yu,
         Christoph)
      
       - Series enabling bio alloc caching for IRQ based IO (Pavel)
      
       - Series enabling userspace peer-to-peer DMA (Logan)
      
       - BFQ waker fixes (Khazhismel)
      
       - Series fixing elevator refcount issues (Christoph, Jinlong)
      
       - Series cleaning up references around queue destruction (Christoph)
      
       - Series doing quiesce by tagset, enabling cleanups in drivers
         (Christoph, Chao)
      
       - Series untangling the queue kobject and queue references (Christoph)
      
       - Misc fixes and cleanups (Bart, David, Dawei, Jinlong, Kemeng, Ye,
         Yang, Waiman, Shin'ichiro, Randy, Pankaj, Christoph)
      
      * tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux: (247 commits)
        blktrace: Fix output non-blktrace event when blk_classic option enabled
        block: sed-opal: Don't include <linux/kernel.h>
        sed-opal: allow using IOC_OPAL_SAVE for locking too
        blk-cgroup: Fix typo in comment
        block: remove bio_set_op_attrs
        nvmet: don't open-code NVME_NS_ATTR_RO enumeration
        nvme-pci: use the tagset alloc/free helpers
        nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set
        nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers
        nvme: consolidate setting the tagset flags
        nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set
        block: bio_copy_data_iter
        nvme-pci: split out a nvme_pci_ctrl_is_dead helper
        nvme-pci: return early on ctrl state mismatch in nvme_reset_work
        nvme-pci: rename nvme_disable_io_queues
        nvme-pci: cleanup nvme_suspend_queue
        nvme-pci: remove nvme_pci_disable
        nvme-pci: remove nvme_disable_admin_queue
        nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrl
        nvme: use nvme_wait_ready in nvme_shutdown_ctrl
        ...
    • Merge tag 'for-6.2/io_uring-next-2022-12-08' of git://git.kernel.dk/linux · 96f7e448
      Linus Torvalds authored
      Pull io_uring updates part two from Jens Axboe:
      
       - Misc fixes (me, Lin)
      
       - Series from Pavel extending the single task exclusive ring mode,
         yielding nice improvements for the common case of having a single
         ring per thread (Pavel)
      
       - Cleanup for MSG_RING, removing our IOPOLL hack (Pavel)
      
       - Further poll cleanups and fixes (Pavel)
      
       - Misc cleanups and fixes (Pavel)
      
      * tag 'for-6.2/io_uring-next-2022-12-08' of git://git.kernel.dk/linux: (22 commits)
        io_uring/msg_ring: flag target ring as having task_work, if needed
        io_uring: skip spinlocking for ->task_complete
        io_uring: do msg_ring in target task via tw
        io_uring: extract a io_msg_install_complete helper
        io_uring: get rid of double locking
        io_uring: never run tw and fallback in parallel
        io_uring: use tw for putting rsrc
        io_uring: force multishot CQEs into task context
        io_uring: complete all requests in task context
        io_uring: don't check overflow flush failures
        io_uring: skip overflow CQE posting for dying ring
        io_uring: improve io_double_lock_ctx fail handling
        io_uring: dont remove file from msg_ring reqs
        io_uring: reshuffle issue_flags
        io_uring: don't reinstall quiesce node for each tw
        io_uring: improve rsrc quiesce refs checks
        io_uring: don't raw spin unlock to match cq_lock
        io_uring: combine poll tw handlers
        io_uring: improve poll warning handling
        io_uring: remove ctx variable in io_poll_check_events
        ...
      96f7e448
    • Merge tag 'for-6.2/io_uring-2022-12-08' of git://git.kernel.dk/linux · 54e60e50
      Linus Torvalds authored
      Pull io_uring updates from Jens Axboe:
      
       - Always ensure proper ordering in case of CQ ring overflow, which then
         means we can remove some work-arounds for that (Dylan)
      
       - Support completion batching for multishot, greatly increasing the
         efficiency for those (Dylan)
      
       - Flag epoll/eventfd wakeups done from io_uring, so that we can easily
         tell if we're recursing into io_uring again.
      
         Previously, this would have resulted in repeated multishot
         notifications if we had a dependency there. That could happen if an
         eventfd was registered as the ring eventfd, and we multishot polled
         for events on it. Or if an io_uring fd was added to epoll, and
         io_uring had a multishot request for the epoll fd.
      
         Test cases here:
      	https://git.kernel.dk/cgit/liburing/commit/?id=919755a7d0096fda08fb6d65ac54ad8d0fe027cd
      
         Previously these got terminated when the CQ ring eventually
         overflowed, now it's handled gracefully (me).
      
       - Tightening of the IOPOLL based completions (Pavel)
      
       - Optimizations of the networking zero-copy paths (Pavel)
      
       - Various tweaks and fixes (Dylan, Pavel)
      
      * tag 'for-6.2/io_uring-2022-12-08' of git://git.kernel.dk/linux: (41 commits)
        io_uring: keep unlock_post inlined in hot path
        io_uring: don't use complete_post in kbuf
        io_uring: spelling fix
        io_uring: remove io_req_complete_post_tw
        io_uring: allow multishot polled reqs to defer completion
        io_uring: remove overflow param from io_post_aux_cqe
        io_uring: add lockdep assertion in io_fill_cqe_aux
        io_uring: make io_fill_cqe_aux static
        io_uring: add io_aux_cqe which allows deferred completion
        io_uring: allow defer completion for aux posted cqes
        io_uring: defer all io_req_complete_failed
        io_uring: always lock in io_apoll_task_func
        io_uring: remove iopoll spinlock
        io_uring: iopoll protect complete_post
        io_uring: inline __io_req_complete_put()
        io_uring: remove io_req_tw_post_queue
        io_uring: use io_req_task_complete() in timeout
        io_uring: hold locks for io_req_complete_failed
        io_uring: add completion locking for iopoll
        io_uring: kill io_cqring_ev_posted() and __io_cq_unlock_post()
        ...
      54e60e50
    • Merge tag 'iomap-6.2-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · d523ec4c
      Linus Torvalds authored
      Pull iomap update from Darrick Wong:
      
       - Minor code cleanup to eliminate unnecessary bit shifting
      
      * tag 'iomap-6.2-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        iomap: directly use logical block size
      d523ec4c
    • Merge tag 'vfs-6.2-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · a45a7db9
      Linus Torvalds authored
      Pull vfs remap_range update from Darrick Wong:
      
       - Make some minor adjustments to the remap range preparation function
         to skip file updates when the request length is adjusted downwards to
         zero.
      
      * tag 'vfs-6.2-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        fs/remap_range: avoid spurious writeback on zero length request
      a45a7db9
    • Merge tag 'fs.xattr.simple.rework.rbtree.rwlock.v6.2' of... · 02bf43c7
      Linus Torvalds authored
      Merge tag 'fs.xattr.simple.rework.rbtree.rwlock.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
      
      Pull simple-xattr updates from Christian Brauner:
       "This ports the simple xattr infrastructure to rely on a simple rbtree
        protected by a read-write lock instead of a linked list protected by a
        spinlock.
      
        A while ago we received reports about scaling issues for filesystems
        using the simple xattr infrastructure that also support setting a
        larger number of xattrs. Specifically, cgroups and tmpfs.
      
        Both cgroupfs and tmpfs can be mounted by unprivileged users in
        unprivileged containers and root in an unprivileged container can set
        an unrestricted number of security.* xattrs and privileged users can
        also set unlimited trusted.* xattrs. A few more words on that
        further below. Other xattrs such as user.* are restricted for
        kernfs-based instances to a fairly limited number.
      
        As there are apparently users that have a fairly large number of
        xattrs we should scale a bit better. Using a simple linked list
        protected by a spinlock used for set, get, and list operations doesn't
        scale well if users use a lot of xattrs even if it's not a crazy
        number.
      
        Let's switch to a simple rbtree protected by a rwlock. It scales way
        better and gets rid of the perf issues some people reported. We
        originally had fancier solutions even using an rcu+seqlock protected
        rbtree but we had concerns about being too clever and also that
        deletion from an rbtree with rcu+seqlock isn't entirely safe.
      
        The rbtree plus rwlock is perfectly fine. By far the most common
        operation is getting an xattr. Setting an xattr is not, and it
        should be comparatively rare. And listxattr() often only happens when
        copying xattrs between files or together with the contents to a new
        file.
      
        Holding a lock across listxattr() is unproblematic because it doesn't
        list the values of xattrs. It can only be used to list the names of
        all xattrs set on a file. And the number of xattr names that can be
        listed with listxattr() is limited to XATTR_LIST_MAX aka 65536 bytes.
        If a larger buffer is passed then vfs_listxattr() caps it to
        XATTR_LIST_MAX and if more xattr names are found it will return
        -E2BIG. In short, the maximum amount of memory that can be retrieved
        via listxattr() is limited and thus listxattr() is bounded.
      
        Of course, the API is broken as documented in xattr(7) already. While
        I have no idea how the xattr api ended up in this state we should
        probably try to come up with something here at some point. An iterator
        pattern similar to readdir() as an alternative to listxattr() or
        something else.
      
        Right now it is extremely strange that users can set millions of xattrs
        but then can't use listxattr() to know which xattrs are actually set.
        And it's really trivial to do:
      
      	for i in {1..1000000}; do setfattr -n security.$i -v $i ./file1; done
      
        And at around 5000 xattrs it's impossible to use listxattr() to figure
        out which xattrs are actually set. So I have suggested that we try to
        limit the number of xattrs for simple xattrs at least. But that's a
        future patch and I don't consider it very urgent.
      
        A bonus of this port to rbtree+rwlock is that we shrink the memory
        consumption for users of the simple xattr infrastructure.
      
        This also adds kernel documentation to all the functions"
      
      * tag 'fs.xattr.simple.rework.rbtree.rwlock.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
        xattr: use rbtree for simple_xattrs
      02bf43c7
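      The "around 5000" figure in the message above can be checked with a
      little arithmetic. The following is a minimal sketch, not kernel code.
      It assumes that listxattr() returns each name followed by one NUL byte,
      that the names are exactly security.1, security.2, ... as in the shell
      loop, and that the file carries no other xattrs. The helper names are
      hypothetical.

```python
XATTR_LIST_MAX = 65536  # bytes, the limit quoted in the message above


def listxattr_bytes(names):
    """Bytes needed to list these names: each name plus a NUL terminator."""
    return sum(len(name.encode()) + 1 for name in names)


def max_listable(prefix="security.", limit=XATTR_LIST_MAX):
    """Largest n such that prefix1 .. prefixn still fit into limit bytes."""
    total = 0
    n = 0
    while True:
        # Size of the next name (prefix + decimal counter) plus its NUL byte.
        needed = len(f"{prefix}{n + 1}".encode()) + 1
        if total + needed > limit:
            return n
        total += needed
        n += 1


if __name__ == "__main__":
    n = max_listable()
    used = listxattr_bytes(f"security.{i}" for i in range(1, n + 1))
    print(f"{n} names fit, using {used} of {XATTR_LIST_MAX} bytes")
```

      Under those assumptions the name list stops fitting a little before
      5000 entries, which agrees with the estimate given in the message.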
    • Merge tag 'lsm-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm · c76ff350
      Linus Torvalds authored
      Pull lsm updates from Paul Moore:
      
       - Improve the error handling in the device cgroup such that memory
         allocation failures when updating the access policy do not
         potentially alter the policy.
      
       - Some minor fixes to reiserfs to ensure that it properly releases
         LSM-related xattr values.
      
       - Update the security_socket_getpeersec_stream() LSM hook to take
         sockptr_t values.
      
         Previously the net/BPF folks updated the getsockopt code in the
         network stack to leverage the sockptr_t type to make it easier to
         pass both kernel and __user pointers, but unfortunately when they did
         so they didn't convert the LSM hook.
      
         While there was/is no immediate risk by not converting the LSM hook,
         it seems like this is a mistake waiting to happen so this patch
         proactively does the LSM hook conversion.
      
       - Convert vfs_getxattr_alloc() to return an int instead of a ssize_t
         and cleanup the callers. Internally the function was never going to
         return anything larger than an int and the callers were doing some
         very odd things casting the return value; this patch fixes all that
         and helps bring a bit of sanity to vfs_getxattr_alloc() and its
         callers.
      
       - More verbose, and helpful, LSM debug output when the system is booted
         with "lsm.debug" on the command line. There are examples in the
         commit description, but the quick summary is that this patch provides
         better information about which LSMs are enabled and the ordering in
         which they are processed.
      
       - General comment and kernel-doc fixes and cleanups.
      
      * tag 'lsm-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
        lsm: Fix description of fs_context_parse_param
        lsm: Add/fix return values in lsm_hooks.h and fix formatting
        lsm: Clarify documentation of vm_enough_memory hook
        reiserfs: Add missing calls to reiserfs_security_free()
        lsm,fs: fix vfs_getxattr_alloc() return type and caller error paths
        device_cgroup: Roll back to original exceptions after copy failure
        LSM: Better reporting of actual LSMs at boot
        lsm: make security_socket_getpeersec_stream() sockptr_t safe
        audit: Fix some kernel-doc warnings
        lsm: remove obsoleted comments for security hooks
        fs: edit a comment made in bad taste
      c76ff350
    • Merge tag 'selinux-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 57888f7b
      Linus Torvalds authored
      Pull selinux updates from Paul Moore:
       "Two SELinux patches: one increases the sleep time on deprecated
        functionality, and one removes the indirect calls in the sidtab
        context conversion code"
      
      * tag 'selinux-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: remove the sidtab context conversion indirect calls
        selinux: increase the deprecation sleep for checkreqprot and runtime disable
      57888f7b
    • Merge tag 'audit-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit · bbdf4d54
      Linus Torvalds authored
      Pull audit updates from Paul Moore:
       "Two performance oriented patches for the audit subsystem: one
        consolidates similar code to gain some caching advantages, while the
        other stores a value in a stack variable to avoid repeated lookups in
        a loop.
      
        The commit descriptions have more information, including some
        before/after performance measurements"
      
      * tag 'audit-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
        audit: unify audit_filter_{uring(), inode_name(), syscall()}
        audit: cache ctx->major in audit_filter_syscall()
      bbdf4d54
    • Merge tag 'landlock-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux · 299e2b19
      Linus Torvalds authored
      Pull landlock updates from Mickaël Salaün:
       "This adds file truncation support to Landlock, contributed by Günther
        Noack. As described by Günther [1], the goal of these patches is to
        work towards a more complete coverage of file system operations that
        are restrictable with Landlock.
      
        The known set of currently unsupported file system operations in
        Landlock is described at [2]. Out of the operations listed there,
        truncate is the only one that modifies file contents, so these patches
        should make it possible to prevent the direct modification of file
        contents with Landlock.
      
        The new LANDLOCK_ACCESS_FS_TRUNCATE access right covers both the
        truncate(2) and ftruncate(2) families of syscalls, as well as open(2)
        with the O_TRUNC flag. This includes usages of creat() in the case
        where existing regular files are overwritten.
      
        Additionally, this introduces a new Landlock security blob associated
        with opened files, to track the available Landlock access rights at
        the time of opening the file. This is in line with Unix's general
        approach of checking the read and write permissions during open(), and
        associating this previously checked authorization with the opened
        file. An ongoing patch documents this use case [3].
      
        In order to treat truncate(2) and ftruncate(2) calls differently in an
        LSM hook, we split apart the existing security_path_truncate hook into
        security_path_truncate (for truncation by path) and
        security_file_truncate (for truncation of previously opened files)"
      
      Link: https://lore.kernel.org/r/20221018182216.301684-1-gnoack3000@gmail.com [1]
      Link: https://www.kernel.org/doc/html/v6.1/userspace-api/landlock.html#filesystem-flags [2]
      Link: https://lore.kernel.org/r/20221209193813.972012-1-mic@digikod.net [3]
      
      * tag 'landlock-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
        samples/landlock: Document best-effort approach for LANDLOCK_ACCESS_FS_REFER
        landlock: Document Landlock's file truncation support
        samples/landlock: Extend sample tool to support LANDLOCK_ACCESS_FS_TRUNCATE
        selftests/landlock: Test ftruncate on FDs created by memfd_create(2)
        selftests/landlock: Test FD passing from restricted to unrestricted processes
        selftests/landlock: Locally define __maybe_unused
        selftests/landlock: Test open() and ftruncate() in multiple scenarios
        selftests/landlock: Test file truncation support
        landlock: Support file truncation
        landlock: Document init_layer_masks() helper
        landlock: Refactor check_access_path_dual() into is_access_to_paths_allowed()
        security: Create file_truncate hook from path_truncate hook
      299e2b19
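      Because LANDLOCK_ACCESS_FS_TRUNCATE only exists on kernels that carry
      this merge, userspace that wants to keep working on older kernels has
      to drop the right when it is not available. The following is a rough
      sketch of that best-effort idea, not the sample tool itself. The
      constant values are taken from the Landlock UAPI header; the syscall
      number 444 is an assumption that holds for x86_64 and the generic
      syscall table, and the two function names are hypothetical.

```python
import ctypes

# Values from the Landlock UAPI header (linux/landlock.h).
LANDLOCK_CREATE_RULESET_VERSION = 1 << 0
LANDLOCK_ACCESS_FS_WRITE_FILE = 1 << 1
LANDLOCK_ACCESS_FS_READ_FILE = 1 << 2
LANDLOCK_ACCESS_FS_REFER = 1 << 13     # needs Landlock ABI 2
LANDLOCK_ACCESS_FS_TRUNCATE = 1 << 14  # needs Landlock ABI 3

# Assumption: syscall number on x86_64 and the generic syscall table.
SYS_LANDLOCK_CREATE_RULESET = 444


def landlock_abi_version():
    """Return the Landlock ABI version, or 0 if it cannot be queried."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        libc.syscall.restype = ctypes.c_long
        # A NULL attribute pointer, size 0 and the VERSION flag ask the
        # kernel for the highest supported ABI version.
        res = libc.syscall(
            ctypes.c_long(SYS_LANDLOCK_CREATE_RULESET),
            None,
            ctypes.c_size_t(0),
            ctypes.c_uint32(LANDLOCK_CREATE_RULESET_VERSION),
        )
    except (OSError, AttributeError):
        return 0
    return res if res > 0 else 0


def handled_fs_rights(abi, wanted):
    """Drop access rights that the given Landlock ABI does not know about."""
    if abi < 1:
        return 0  # no Landlock support: nothing can be handled
    if abi < 2:
        wanted &= ~LANDLOCK_ACCESS_FS_REFER
    if abi < 3:
        wanted &= ~LANDLOCK_ACCESS_FS_TRUNCATE
    return wanted
```

      A caller would pass handled_fs_rights(landlock_abi_version(), wanted)
      as the handled_access_fs mask when creating a ruleset, accepting that
      truncation stays unrestricted on kernels without the new right.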
    • Merge tag 'dma-mapping-6.2-2022-12-13' of git://git.infradead.org/users/hch/dma-mapping · e529d350
      Linus Torvalds authored
      Pull dma-mapping updates from Christoph Hellwig:
      
       - reduce the swiotlb buffer size on allocation failure (Alexey
         Kardashevskiy)
      
       - clean up passing of bogus GFP flags to the dma-coherent allocator
         (Christoph Hellwig)
      
      * tag 'dma-mapping-6.2-2022-12-13' of git://git.infradead.org/users/hch/dma-mapping:
        dma-mapping: reject __GFP_COMP in dma_alloc_attrs
        ALSA: memalloc: don't pass bogus GFP_ flags to dma_alloc_*
        s390/ism: don't pass bogus GFP_ flags to dma_alloc_coherent
        cnic: don't pass bogus GFP_ flags to dma_alloc_coherent
        RDMA/qib: don't pass bogus GFP_ flags to dma_alloc_coherent
        RDMA/hfi1: don't pass bogus GFP_ flags to dma_alloc_coherent
        media: videobuf-dma-contig: use dma_mmap_coherent
        swiotlb: reduce the swiotlb buffer size on allocation failure
      e529d350
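      The swiotlb change above follows a common pattern: when a large
      allocation fails, retry with a smaller size instead of giving up. The
      following is a generic sketch of that pattern only. Halving the size on
      each retry and stopping at a minimum size are assumptions made for the
      illustration, and alloc_with_fallback() is a hypothetical helper, not a
      kernel function.

```python
def alloc_with_fallback(alloc, size, min_size):
    """Try alloc(size); on failure halve the size and retry.

    alloc must return a buffer object, or None on failure. Returns a
    (buffer, size) pair, or (None, 0) if even min_size cannot be satisfied.
    """
    while size >= min_size:
        buf = alloc(size)
        if buf is not None:
            return buf, size
        size //= 2  # assumption: shrink by half on every failed attempt
    return None, 0


if __name__ == "__main__":
    # Pretend allocator that can only satisfy requests of up to 16 units.
    def fake_alloc(size):
        return bytearray(size) if size <= 16 else None

    buf, got = alloc_with_fallback(fake_alloc, 64, 4)
    print(f"asked for 64 units, got {got}")
```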
    • Merge tag 'configfs-6.2-2022-12-13' of git://git.infradead.org/users/hch/configfs · 6a24711d
      Linus Torvalds authored
      Pull configfs updates from Christoph Hellwig:
      
       - fix a memory leak in configfs_create_dir (Chen Zhongjin)
      
       - remove mentions of committable items that were never implemented
         (Bartosz Golaszewski)
      
      * tag 'configfs-6.2-2022-12-13' of git://git.infradead.org/users/hch/configfs:
        configfs: remove mentions of committable items
        configfs: fix possible memory leak in configfs_create_dir()
      6a24711d
    • Merge tag 'nfs-for-6.2-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · a044dab5
      Linus Torvalds authored
      Pull NFS client updates from Trond Myklebust:
       "Bugfixes:
         - Fix NULL pointer dereference in the mount parser
         - Fix memory stomp in decode_attr_security_label
         - Fix credential leak in _nfs4_discover_trunking()
         - Fix buffer leak in rpcrdma_req_create()
         - Fix leaked socket in rpc_sockname()
         - Fix deadlock between nfs4_open_recover_helper() and delegreturn
         - Fix an Oops in nfs_d_automount()
         - Fix potential race in nfs_call_unlink()
         - Multiple fixes for the open context mode
         - NFSv4.2 READ_PLUS fixes
         - Fix a regression in which small rsize/wsize values are being
           forbidden
         - Fail client initialisation if the NFSv4.x state manager thread
           can't run
         - Avoid spurious warning of lost lock that is being unlocked
         - Ensure the initialisation of struct nfs4_label
      
        Features and cleanups:
         - Trigger the "ls -l" readdir heuristic sooner
         - Clear the file access cache upon login to ensure supplementary
           group info is in sync between the client and server
         - pnfs: Fix up the logging of layout stateids
         - NFSv4.2: Change the default KConfig value for READ_PLUS
         - Use sysfs_emit() instead of scnprintf() where appropriate"
      
      * tag 'nfs-for-6.2-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits)
        NFSv4.2: Change the default KConfig value for READ_PLUS
        NFSv4.x: Fail client initialisation if state manager thread can't run
        fs: nfs: sysfs: use sysfs_emit() to instead of scnprintf()
        NFS: use sysfs_emit() to instead of scnprintf()
        NFS: Allow very small rsize & wsize again
        NFSv4.2: Fix up READ_PLUS alignment
        NFSv4.2: Set the correct size scratch buffer for decoding READ_PLUS
        SUNRPC: Fix missing release socket in rpc_sockname()
        xprtrdma: Fix regbuf data not freed in rpcrdma_req_create()
        NFS: avoid spurious warning of lost lock that is being unlocked.
        nfs: fix possible null-ptr-deref when parsing param
        NFSv4: check FMODE_EXEC from open context mode in nfs4_opendata_access()
        NFS: make sure open context mode have FMODE_EXEC when file open for exec
        NFS4.x/pnfs: Fix up logging of layout stateids
        NFS: Fix a race in nfs_call_unlink()
        NFS: Fix an Oops in nfs_d_automount()
        NFSv4: Fix a deadlock between nfs4_open_recover_helper() and delegreturn
        NFSv4: Fix a credential leak in _nfs4_discover_trunking()
        NFS: Trigger the "ls -l" readdir heuristic sooner
        NFSv4.2: Fix initialisation of struct nfs4_label
        ...
      a044dab5
    • Merge tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · 76482297
      Linus Torvalds authored
      Pull nfsd updates from Chuck Lever:
       "This release introduces support for the CB_RECALL_ANY operation. NFSD
        can send this operation to request that clients return any delegations
        they choose. The server uses this operation to handle low memory
        scenarios or indicate to a client when that client has reached the
        maximum number of delegations the server supports.
      
        The NFSv4.2 READ_PLUS operation has been simplified temporarily whilst
        support for sparse files in local filesystems and the VFS is improved.
      
        Two major data structure fixes appear in this release:
      
         - The nfs4_file hash table is replaced with a resizable hash table to
           reduce the latency of NFSv4 OPEN operations.
      
         - Reference counting in the NFSD filecache has been hardened against
           races.
      
        In furtherance of removing support for NFSv2 in a subsequent kernel
        release, a new Kconfig option enables server-side support for NFSv2 to
        be left out of a kernel build.
      
        MAINTAINERS has been updated to indicate that changes to fs/exportfs
        should go through the NFSD tree"
      
      * tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (49 commits)
        NFSD: Avoid clashing function prototypes
        SUNRPC: Fix crasher in unwrap_integ_data()
        SUNRPC: Make the svc_authenticate tracepoint conditional
        NFSD: Use only RQ_DROPME to signal the need to drop a reply
        SUNRPC: Clean up xdr_write_pages()
        SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails
        NFSD: add CB_RECALL_ANY tracepoints
        NFSD: add delegation reaper to react to low memory condition
        NFSD: add support for sending CB_RECALL_ANY
        NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker
        trace: Relocate event helper files
        NFSD: pass range end to vfs_fsync_range() instead of count
        lockd: fix file selection in nlmsvc_cancel_blocked
        lockd: ensure we use the correct file descriptor when unlocking
        lockd: set missing fl_flags field when retrieving args
        NFSD: Use struct_size() helper in alloc_session()
        nfsd: return error if nfs4_setacl fails
        lockd: set other missing fields when unlocking files
        NFSD: Add an nfsd_file_fsync tracepoint
        sunrpc: svc: Remove an unused static function svc_ungetu32()
        ...
      76482297
    • Merge tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 149c51f8
      Linus Torvalds authored
      Pull btrfs updates from David Sterba:
       "This round there are a lot of cleanups and moved code so the diffstat
        looks huge, otherwise there are some nice performance improvements and
        an update to raid56 reliability.
      
        User visible features:
      
         - raid56 reliability vs performance trade off:
            - fix destructive RMW for raid5 data (raid6 still needs work): do
              full checksum verification for all data during RMW cycle, this
              should prevent rewriting potentially corrupted data without
              notice
            - stripes are cached in memory which should reduce the performance
              impact but still can hurt some workloads
            - checksums are verified after repair again
            - this is the last option without introducing additional features
              (write intent bitmap, journal, another tree), the extra checksum
              read/verification was supposed to be avoided by the original
              implementation exactly for performance reasons but that caused
              all the reliability problems
      
         - discard=async by default for devices that support it
      
         - implement emergency flush reserve to avoid almost all unnecessary
           transaction aborts due to ENOSPC in cases where there are too many
           delayed refs or delayed allocation
      
         - skip block group synchronization if there's no change in used
           bytes, can reduce transaction commit count for some workloads
      
        Performance improvements:
      
         - fiemap and lseek:
            - overall speedup due to skipping unnecessary or duplicate
              searches (-40% run time)
            - cache some data structures and sharedness of extents (-30% run
              time)
      
         - send:
            - faster backref resolution when finding clones
            - cached leaf to root mapping for faster backref walking
            - improved clone/sharing detection
            - overall run time improvements (-70%)
      
        Core:
      
         - module initialization converted to a table of function pointers run
           in a sequence
      
         - preparation for fscrypt, extend passing file names across calls,
           dir item can store encryption status
      
         - raid56 updates:
            - more accurate error tracking of sectors within stripe
            - simplify recovery path and remove dedicated endio worker kthread
            - simplify scrub call paths
            - refactoring to support the extra data checksum verification
              during RMW cycle
      
         - tree block parentness checks consolidated and done at metadata read
           time
      
         - improved error handling
      
         - cleanups:
            - move a lot of code for better synchronization between kernel and
              user space sources, split big files
            - enum cleanups
            - GFP flag cleanups
            - header file cleanups, prototypes, dependencies
            - redundant parameter cleanups
            - inline extent handling simplifications
            - inode parameter conversion
            - data structure cleanups, reductions, renames, merges"
      
      * tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (249 commits)
        btrfs: print transaction aborted messages with an error level
        btrfs: sync some cleanups from progs into uapi/btrfs.h
        btrfs: do not BUG_ON() on ENOMEM when dropping extent items for a range
        btrfs: fix extent map use-after-free when handling missing device in read_one_chunk
        btrfs: remove outdated logic from overwrite_item() and add assertion
        btrfs: unify overwrite_item() and do_overwrite_item()
        btrfs: replace strncpy() with strscpy()
        btrfs: fix uninitialized variable in find_first_clear_extent_bit
        btrfs: fix uninitialized parent in insert_state
        btrfs: add might_sleep() annotations
        btrfs: add stack helpers for a few btrfs items
        btrfs: add nr_global_roots to the super block definition
        btrfs: remove BTRFS_LEAF_DATA_OFFSET
        btrfs: add helpers for manipulating leaf items and data
        btrfs: add eb to btrfs_node_key_ptr_offset
        btrfs: pass the extent buffer for the btrfs_item_nr helpers
        btrfs: move the csum helpers into ctree.h
        btrfs: move eb offset helpers into extent_io.h
        btrfs: move file_extent_item helpers into file-item.h
        btrfs: move leaf_data_end into ctree.c
        ...
      149c51f8
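      The "table of function pointers run in a sequence" mentioned under
      Core above is a general initialization pattern: each entry pairs an
      init step with its matching exit step, and a failure part-way through
      unwinds the steps that already succeeded in reverse order. The
      following is a minimal sketch of that pattern, not the btrfs code. All
      names are hypothetical, and the negative return values only imitate
      kernel-style error codes.

```python
def run_init_sequence(steps):
    """Run (init, exit) pairs in order; undo completed steps on failure.

    Each init callable returns 0 on success or a negative error value.
    An exit callable may be None if the step needs no cleanup.
    """
    done = []
    for init, exit_ in steps:
        ret = init()
        if ret < 0:
            # Unwind in reverse order, skipping steps without cleanup.
            for undo in reversed(done):
                if undo is not None:
                    undo()
            return ret
        done.append(exit_)
    return 0


def make_step(name, ret, log):
    """Build a logging (init, exit) pair whose init returns ret."""
    def init():
        log.append(f"init {name}")
        return ret

    def exit_():
        log.append(f"exit {name}")

    return init, exit_


if __name__ == "__main__":
    log = []
    table = [
        make_step("a", 0, log),
        make_step("b", 0, log),
        make_step("c", -12, log),  # third step fails
    ]
    print(run_init_sequence(table), log)
```

      Keeping the steps in one table means the unwind order is derived from
      the table itself rather than from a hand-maintained chain of error
      labels.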
    • Merge tag 'dlm-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm · 97971df8
      Linus Torvalds authored
      Pull dlm updates from David Teigland:
       "These patches include the usual cleanups and minor fixes, the removal
        of code that is no longer needed due to recent improvements, and
        improvements to processing large volumes of messages during heavy
        locking activity.
      
        Summary:
      
         - Misc code cleanup
      
         - Fix a couple of socket handling bugs: a double release on an error
           path and a data-ready race in an accept loop
      
         - Remove code for resending dir-remove messages. This code is no
           longer needed since the midcomms layer now ensures the messages are
           resent if needed
      
         - Add tracepoints for dlm messages
      
         - Improve callback queueing by replacing the fixed array with a list
      
         - Simplify the handling of a remove message followed by a lookup
           message by sending both without releasing a spinlock in between
      
         - Improve the concurrency of sending and receiving messages by
           holding locks for a shorter time, and changing how workqueues are
           used
      
         - Remove old code for shutting down sockets, which is no longer
           needed with the reliable connection handling that was recently
           added"
      
      * tag 'dlm-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: (37 commits)
        fs: dlm: fix building without lockdep
        fs: dlm: parallelize lowcomms socket handling
        fs: dlm: don't init error value
        fs: dlm: use saved sk_error_report()
        fs: dlm: use sock2con without checking null
        fs: dlm: remove dlm_node_addrs lookup list
        fs: dlm: don't put dlm_local_addrs on heap
        fs: dlm: cleanup listen sock handling
        fs: dlm: remove socket shutdown handling
        fs: dlm: use listen sock as dlm running indicator
        fs: dlm: use list_first_entry_or_null
        fs: dlm: remove twice INIT_WORK
        fs: dlm: add midcomms init/start functions
        fs: dlm: add dst nodeid for msg tracing
        fs: dlm: rename seq to h_seq for msg tracing
        fs: dlm: rename DLM_IFL_NEED_SCHED to DLM_IFL_CB_PENDING
        fs: dlm: ast do WARN_ON_ONCE() on hotpath
        fs: dlm: drop lkb ref in bug case
        fs: dlm: avoid false-positive checker warning
        fs: dlm: use WARN_ON_ONCE() instead of WARN_ON()
        ...
      97971df8
    • Merge tag 'jfs-6.2' of https://github.com/kleikamp/linux-shaggy · 56c003e4
      Linus Torvalds authored
      Pull jfs updates from David Kleikamp:
       "Assorted JFS fixes for 6.2"
      
      * tag 'jfs-6.2' of https://github.com/kleikamp/linux-shaggy:
        jfs: makes diUnmount/diMount in jfs_mount_rw atomic
        jfs: Fix a typo in function jfs_umount
        fs: jfs: fix shift-out-of-bounds in dbDiscardAG
        jfs: Fix fortify moan in symlink
        jfs: remove redundant assignments to ipaimap and ipaimap2
        jfs: remove unused declarations for jfs
        fs/jfs/jfs_xattr.h: Fix spelling typo in comment
        MAINTAINERS: git://github -> https://github.com for kleikamp
        fs/jfs: replace ternary operator with min_t()
        fs: jfs: fix shift-out-of-bounds in dbAllocAG
      56c003e4
    • Merge tag 'fixes_for_v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · cda6a60a
      Linus Torvalds authored
      Pull udf and ext2 fixes from Jan Kara:
      
       - a couple of smaller cleanups and fixes for ext2
      
       - fixes for data corruption issues in udf when handling holes and
         preallocation extents
      
       - fixes and cleanups of several smaller issues in udf
      
       - add maintainer entry for isofs
      
      * tag 'fixes_for_v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        udf: Fix extending file within last block
        udf: Discard preallocation before extending file with a hole
        udf: Do not bother looking for prealloc extents if i_lenExtents matches i_size
        udf: Fix preallocation discarding at indirect extent boundary
        udf: Increase UDF_MAX_READ_VERSION to 0x0260
        fs/ext2: Fix code indentation
        ext2: unbugger ext2_empty_dir()
        udf: remove ->writepage
        ext2: remove ->writepage
        ext2: Don't flush page immediately for DIRSYNC directories
        ext2: Fix some kernel-doc warnings
        maintainers: Add ISOFS entry
        udf: Avoid double brelse() in udf_rename()
        fs: udf: Optimize udf_free_in_core_inode and udf_find_fileset function
    • Merge tag 'fs.xattr.simple.noaudit.v6.2' of... · 07d7a4d6
      Linus Torvalds authored
      Merge tag 'fs.xattr.simple.noaudit.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
      
      Pull xattr audit fix from Seth Forshee:
       "This is a single patch to remove auditing of the capability check in
        simple_xattr_list().
      
        This check is done to determine whether trusted xattrs should be included
        by listxattr(2). SELinux will normally log a denial when capable() is
        called and the task's SELinux context doesn't have the corresponding
        capability permission allowed, which can end up spamming the log.
      
        Since a failed check here cannot be used to infer malicious intent,
        auditing is of no real value, and it makes sense to stop auditing the
        capability check"
      
      * tag 'fs.xattr.simple.noaudit.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
        fs: don't audit the capability check in simple_xattr_list()
    • Merge tag 'fs.idmapped.squashfs.v6.2' of... · 6e8948a0
      Linus Torvalds authored
      Merge tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
      
      Pull squashfs update from Seth Forshee:
       "This is a simple patch to enable idmapped mounts for squashfs.
      
        All functionality squashfs needs to support idmapped mounts is already
        implemented in generic VFS code, so all that is needed is to set
        FS_ALLOW_IDMAP in fs_flags"
      
      * tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
        squashfs: enable idmapped mounts
    • Merge tag 'fuse-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 043930b1
      Linus Torvalds authored
      Pull fuse update from Miklos Szeredi:
      
       - Allow some write requests to proceed in parallel
      
       - Fix a performance problem with allow_sys_admin_access
      
       - Add a special kind of invalidation that doesn't immediately purge
         submounts
      
       - On revalidation treat the target of rename(RENAME_NOREPLACE) the same
         as open(O_EXCL)
      
       - Use type safe helpers for some mnt_userns transformations
      
      * tag 'fuse-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: Rearrange fuse_allow_current_process checks
        fuse: allow non-extending parallel direct writes on the same file
        fuse: remove the unneeded result variable
        fuse: port to vfs{g,u}id_t and associated helpers
        fuse: Remove user_ns check for FUSE_DEV_IOC_CLONE
        fuse: always revalidate rename target dentry
        fuse: add "expire only" mode to FUSE_NOTIFY_INVAL_ENTRY
        fs/fuse: Replace kmap() with kmap_local_page()
    • Merge tag 'ovl-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 6df7cc22
      Linus Torvalds authored
      Pull overlayfs update from Miklos Szeredi:
      
       - Fix a couple of bugs found by syzbot
      
       - Don't ignore some open flags set by fcntl(F_SETFL)
      
       - Fix failure to create a hard link in certain cases
      
       - Use type safe helpers for some mnt_userns transformations
      
       - Improve performance of mount
      
       - Misc cleanups
      
      * tag 'ovl-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: Kconfig: Fix spelling mistake "undelying" -> "underlying"
        ovl: use inode instead of dentry where possible
        ovl: Add comment on upperredirect reassignment
        ovl: use plain list filler in indexdir and workdir cleanup
        ovl: do not reconnect upper index records in ovl_indexdir_cleanup()
        ovl: fix comment typos
        ovl: port to vfs{g,u}id_t and associated helpers
        ovl: Use ovl mounter's fsuid and fsgid in ovl_link()
        ovl: Use "buf" flexible array for memcpy() destination
        ovl: update ->f_iocb_flags when ovl_change_flags() modifies ->f_flags
        ovl: fix use inode directly in rcu-walk mode
    • Merge tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 4a6bff11
      Linus Torvalds authored
      Pull erofs updates from Gao Xiang:
       "In this cycle, large folios are now enabled in the iomap/fscache mode
        for uncompressed files first. In order to do that, we've also cleaned
        up better interfaces between erofs and fscache, which are acked by
        fscache/netfs folks and included in this pull request.
      
        Other than that, there are random fixes around erofs over fscache and
        crafted images by syzbot, minor cleanups and documentation updates.
      
        Summary:
      
         - Enable large folios for iomap/fscache mode
      
         - Avoid sysfs warning due to mounting twice with the same fsid and
           domain_id in fscache mode
      
         - Refine fscache interface among erofs, fscache, and cachefiles
      
         - Use kmap_local_page() only for metabuf
      
         - Fixes around crafted images found by syzbot
      
         - Minor cleanups and documentation updates"
      
      * tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: validate the extent length for uncompressed pclusters
        erofs: fix missing unmap if z_erofs_get_extent_compressedlen() fails
        erofs: Fix pcluster memleak when its block address is zero
        erofs: use kmap_local_page() only for erofs_bread()
        erofs: enable large folios for fscache mode
        erofs: support large folios for fscache mode
        erofs: switch to prepare_ondemand_read() in fscache mode
        fscache,cachefiles: add prepare_ondemand_read() callback
        erofs: clean up cached I/O strategies
        erofs: update documentation
        erofs: check the uniqueness of fsid in shared domain in advance
        erofs: enable large folios for iomap mode
    • Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt · ad0d9da1
      Linus Torvalds authored
      Pull fsverity updates from Eric Biggers:
       "The main change this cycle is to stop using the PG_error flag to track
        verity failures, and instead just track failures at the bio level.
        This follows a similar fscrypt change that went into 6.1, and it is a
        step towards freeing up PG_error for other uses.
      
        There's also one other small cleanup"
      
      * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
        fsverity: simplify fsverity_get_digest()
        fsverity: stop using PG_error to track error status
    • Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt · 8129bac6
      Linus Torvalds authored
      Pull fscrypt updates from Eric Biggers:
       "This release adds SM4 encryption support, contributed by Tianjia
        Zhang. SM4 is a Chinese block cipher that is an alternative to AES.
      
        I recommend against using SM4, but (according to Tianjia) some people
        are being required to use it. Since SM4 has been turning up in many
        other places (crypto API, wireless, TLS, OpenSSL, ARMv8 CPUs, etc.),
        it hasn't been very controversial, and some people have to use it, I
        don't think it would be fair for me to reject this optional feature.
      
        Besides the above, there are a couple cleanups"
      
      * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
        fscrypt: add additional documentation for SM4 support
        fscrypt: remove unused Speck definitions
        fscrypt: Add SM4 XTS/CTS symmetric algorithm support
        blk-crypto: Add support for SM4-XTS blk crypto mode
        fscrypt: add comment for fscrypt_valid_enc_modes_v1()
        fscrypt: pass super_block to fscrypt_put_master_key_activeref()
    • Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · deb9acc1
      Linus Torvalds authored
      Pull ext4 updates from Ted Ts'o:
       "A large number of cleanups and bug fixes, with many of the bug fixes
        found by Syzbot and fuzzing. (Many of the bug fixes involve less-used
        ext4 features such as fast_commit, inline_data and bigalloc)
      
        In addition, remove the writepage function for ext4, since the
        medium-term plan is to remove ->writepage() entirely. (The VM doesn't
        need or want writepage() for writeback, since it is fine with
        ->writepages() so long as ->migrate_folio() is implemented)"
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits)
        ext4: fix reserved cluster accounting in __es_remove_extent()
        ext4: fix inode leak in ext4_xattr_inode_create() on an error path
        ext4: allocate extended attribute value in vmalloc area
        ext4: avoid unaccounted block allocation when expanding inode
        ext4: initialize quota before expanding inode in setproject ioctl
        ext4: stop providing .writepage hook
        mm: export buffer_migrate_folio_norefs()
        ext4: switch to using write_cache_pages() for data=journal writeout
        jbd2: switch jbd2_submit_inode_data() to use fs-provided hook for data writeout
        ext4: switch to using ext4_do_writepages() for ordered data writeout
        ext4: move percpu_rwsem protection into ext4_writepages()
        ext4: provide ext4_do_writepages()
        ext4: add support for writepages calls that cannot map blocks
        ext4: drop pointless IO submission from ext4_bio_write_page()
        ext4: remove nr_submitted from ext4_bio_write_page()
        ext4: move keep_towrite handling to ext4_bio_write_page()
        ext4: handle redirtying in ext4_bio_write_page()
        ext4: fix kernel BUG in 'ext4_write_inline_data_end()'
        ext4: make ext4_mb_initialize_context return void
        ext4: fix deadlock due to mbcache entry corruption
        ...
    • Merge tag 'fs.idmapped.mnt_idmap.v6.2' of... · 9b93f506
      Linus Torvalds authored
      Merge tag 'fs.idmapped.mnt_idmap.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
      
      Pull idmapping updates from Christian Brauner:
       "Last cycle we've already made the interaction with idmapped mounts
        more robust and type safe by introducing the vfs{g,u}id_t type. This
        cycle we concluded the conversion and removed the legacy helpers.
      
        Currently we still pass around the plain namespace that was attached
        to a mount. This is in general pretty convenient but it makes it easy
        to conflate namespaces that are relevant on the filesystem level with
        namespaces that are relevant on the mount level. Especially for
        filesystem developers without detailed knowledge in this area this can
        be a potential source for bugs.
      
        Instead of passing the plain namespace we introduce a dedicated type
        struct mnt_idmap and replace the pointer with a pointer to a struct
        mnt_idmap. There are no semantic or size changes for the mount struct
        caused by this.
      
        We then start converting all places aware of idmapped mounts to rely
        on struct mnt_idmap. Once the conversion is done all helpers down to
        the really low-level make_vfs{g,u}id() and from_vfs{g,u}id() will take
        a struct mnt_idmap argument instead of two namespace arguments. This
        way it becomes impossible to conflate the two, which eliminates this
        class of bugs. Fwiw, I fixed some issues in that area a while ago in
        ntfs3 and ksmbd. Afterwards only low-level code can ultimately use
        the associated namespace for any permission checks. Even most of the
        vfs can be completely oblivious to this ultimately and filesystems
        will never interact with it in any form in the future.
      
        A struct mnt_idmap currently encompasses a simple refcount and pointer
        to the relevant namespace the mount is idmapped to. If a mount isn't
        idmapped then it will point to a static nop_mnt_idmap; if it points
        anywhere else then it is idmapped. As usual there are no allocations or
        anything happening for non-idmapped mounts. Everything is carefully
        written to be a nop for non-idmapped mounts as has always been the
        case.
      
        If an idmapped mount is created a struct mnt_idmap is allocated and a
        reference taken on the relevant namespace. Each mount that gets
        idmapped or inherits the idmap simply bumps the reference count on
        struct mnt_idmap. Just a reminder that we only allow a mount to change
        its idmapping a single time and only if it hasn't already been
        attached to the filesystems and has no active writers.
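
        The lifetime rules in the two paragraphs above can be sketched as a
        small userspace model. This is not the kernel implementation: every
        name ending in _model and the idmap_*() helpers are hypothetical
        stand-ins, and plain integers replace the kernel's refcount type.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a user namespace with a reference count. */
struct user_ns_model {
    int refs;
};

/* Model of the described type: a refcount plus a namespace pointer. */
struct mnt_idmap_model {
    struct user_ns_model *owner;
    int count;
};

/* Shared static object used by every mount that is not idmapped. */
static struct user_ns_model init_ns_model = { .refs = 1 };
static struct mnt_idmap_model nop_idmap_model = {
    .owner = &init_ns_model,
    .count = 1,
};

/* A mount is idmapped exactly when it does not point at the static nop object. */
static int is_idmapped(const struct mnt_idmap_model *idmap)
{
    return idmap != &nop_idmap_model;
}

/* Creating an idmapped mount: allocate and take a reference on the namespace. */
static struct mnt_idmap_model *idmap_alloc(struct user_ns_model *ns)
{
    struct mnt_idmap_model *idmap = malloc(sizeof(*idmap));

    if (!idmap)
        return NULL;
    ns->refs++;
    idmap->owner = ns;
    idmap->count = 1;
    return idmap;
}

/* Inheriting the idmap only bumps the count; the nop object is never counted. */
static struct mnt_idmap_model *idmap_get(struct mnt_idmap_model *idmap)
{
    if (is_idmapped(idmap))
        idmap->count++;
    return idmap;
}

/* Dropping the last reference releases the namespace and frees the object. */
static void idmap_put(struct mnt_idmap_model *idmap)
{
    if (!is_idmapped(idmap))
        return;
    if (--idmap->count == 0) {
        idmap->owner->refs--;
        free(idmap);
    }
}
```

        In this model the non-idmapped path never allocates or touches a
        counter, which mirrors the "nop for non-idmapped mounts" property
        described above.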
      
        The actual changes are fairly straightforward but this will have huge
        benefits for maintenance and security in the long run even if it
        causes some churn.
      
        Note that this also makes it possible to extend struct mnt_idmap in
        the future. For example, it would be possible to place the namespace
        pointer in an anonymous union together with an idmapping struct. This
        would allow us to expose an api to userspace that would let it specify
        idmappings directly instead of having to go through the detour of
        setting up namespaces at all"
      
      * tag 'fs.idmapped.mnt_idmap.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
        acl: conver higher-level helpers to rely on mnt_idmap
        fs: introduce dedicated idmap type for mounts
    • Merge tag 'fs.vfsuid.conversion.v6.2' of... · e1212e9b
      Linus Torvalds authored
      Merge tag 'fs.vfsuid.conversion.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
      
      Pull vfsuid updates from Christian Brauner:
       "Last cycle we introduced the vfs{g,u}id_t types and associated helpers
        to gain type safety when dealing with idmapped mounts. That initial
        work already converted a lot of places over but there were still some
        left.
      
        This converts all remaining places that still make use of non-type
        safe idmapping helpers to rely on the new type safe vfs{g,u}id based
        helpers.
      
        Afterwards it removes all the old non-type safe helpers"
      
      * tag 'fs.vfsuid.conversion.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
        fs: remove unused idmapping helpers
        ovl: port to vfs{g,u}id_t and associated helpers
        fuse: port to vfs{g,u}id_t and associated helpers
        ima: use type safe idmapping helpers
        apparmor: use type safe idmapping helpers
        caps: use type safe idmapping helpers
        fs: use type safe idmapping helpers
        mnt_idmapping: add missing helpers
    • Merge tag 'fs.ovl.setgid.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping · cf619f89
      Linus Torvalds authored
      Pull setgid inheritance updates from Christian Brauner:
       "This contains the work to make setgid inheritance consistent between
        modifying a file and when changing ownership or mode as this has been
        a repeated source of very subtle bugs. The gist is that we perform the
        same permission checks in the write path as we do in the ownership and
        mode changing paths after this series where we're currently doing
        different things.
      
        We've already made setgid inheritance a lot more consistent and
        reliable in the last releases by moving setgid stripping from the
        individual filesystems up into the vfs. This aims to make the logic
        even more consistent and easier to understand and also to fix
        long-standing overlayfs setgid inheritance bugs. Miklos was nice
        enough to just let me carry the trivial overlayfs patches from Amir
        too.
      
        Below is a more detailed explanation of how the current difference in
        setgid handling leads to very subtle bugs exemplified via overlayfs
        which is a victim of the current rules. I hope this explains why I
        think taking the regression risk here is worth it.
      
        A long while ago I found a few setgid inheritance bugs in overlayfs in
        the write path in certain conditions. Amir recently picked this back
        up in [1] and I jumped on board to fix this more generally.
      
        On the surface all that overlayfs would need to fix setgid inheritance
        would be to call file_remove_privs() or file_modified() but actually
        that isn't enough because the setgid inheritance api is wildly
        inconsistent in that area.
      
        Before this pr setgid stripping in file_remove_privs()'s old
        should_remove_suid() helper was inconsistent with other parts of the
        vfs. Specifically, it only raises ATTR_KILL_SGID if the inode is
        S_ISGID and S_IXGRP but not if the inode isn't in the caller's groups
        and the caller isn't privileged over the inode although we require
        this already in setattr_prepare() and setattr_copy() and so all
        filesystems implement this requirement implicitly because they have to
        use setattr_{prepare,copy}() anyway.
      
        But the inconsistency shows up in setgid stripping bugs for overlayfs
        in xfstests (e.g., generic/673, generic/683, generic/685, generic/686,
        generic/687). For example, we test whether suid and setgid stripping
        works correctly when performing various write-like operations as an
        unprivileged user (fallocate, reflink, write, etc.):
      
            echo "Test 1 - qa_user, non-exec file $verb"
            setup_testfile
            chmod a+rws $junk_file
            commit_and_check "$qa_user" "$verb" 64k 64k
      
        The test basically creates a file with 6666 permissions. While the
        file has the S_ISUID and S_ISGID bits set it does not have the S_IXGRP
        bit set.
      
        On a regular filesystem like xfs what will happen is:
      
            sys_fallocate()
            -> vfs_fallocate()
               -> xfs_file_fallocate()
                  -> file_modified()
                     -> __file_remove_privs()
                        -> dentry_needs_remove_privs()
                           -> should_remove_suid()
                        -> __remove_privs()
                           newattrs.ia_valid = ATTR_FORCE | kill;
                           -> notify_change()
                              -> setattr_copy()
      
        In should_remove_suid() we can see that ATTR_KILL_SUID is raised
        unconditionally because the file in the test has S_ISUID set.
      
        But we also see that ATTR_KILL_SGID won't be set because while the
        file is S_ISGID it is not S_IXGRP (see above) which is a condition for
        ATTR_KILL_SGID being raised.
      
        So by the time we call notify_change() we have attr->ia_valid set to
        ATTR_KILL_SUID | ATTR_FORCE.
      
        Now notify_change() sees that ATTR_KILL_SUID is set and does:
      
            ia_valid      = attr->ia_valid |= ATTR_MODE;
            attr->ia_mode = (inode->i_mode & ~S_ISUID);
      
        which means that when we call setattr_copy() later we will definitely
        update inode->i_mode. Note that attr->ia_mode still contains S_ISGID.
      
        Now we call into the filesystem's ->setattr() inode operation which
        will end up calling setattr_copy(). Since ATTR_MODE is set we will
        hit:
      
            if (ia_valid & ATTR_MODE) {
                    umode_t mode = attr->ia_mode;
                    vfsgid_t vfsgid = i_gid_into_vfsgid(mnt_userns, inode);
                    if (!vfsgid_in_group_p(vfsgid) &&
                        !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
                            mode &= ~S_ISGID;
                    inode->i_mode = mode;
            }
      
        and since the caller in the test is neither capable nor in the group
        of the inode the S_ISGID bit is stripped.
      
        But assume the file isn't suid: then ATTR_KILL_SUID won't be raised,
        which has the consequence that neither the setgid nor the suid bits
        are stripped even though the setgid bit should be stripped because
        the inode isn't in the caller's groups and the caller isn't
        privileged over the inode.
      
        If overlayfs is in the mix things become a bit more complicated and
        the bug shows up more clearly.
      
        When e.g., ovl_setattr() is hit from ovl_fallocate()'s call to
        file_remove_privs() then ATTR_KILL_SUID and ATTR_KILL_SGID might be
        raised but because the check in notify_change() is questioning the
        ATTR_KILL_SGID flag again by requiring S_IXGRP for it to be stripped
        the S_ISGID bit isn't removed even though it should be stripped:
      
            sys_fallocate()
            -> vfs_fallocate()
               -> ovl_fallocate()
                  -> file_remove_privs()
                     -> dentry_needs_remove_privs()
                        -> should_remove_suid()
                     -> __remove_privs()
                        newattrs.ia_valid = ATTR_FORCE | kill;
                        -> notify_change()
                           -> ovl_setattr()
                              /* TAKE ON MOUNTER'S CREDS */
                              -> ovl_do_notify_change()
                                 -> notify_change()
                              /* GIVE UP MOUNTER'S CREDS */
                 /* TAKE ON MOUNTER'S CREDS */
                 -> vfs_fallocate()
                    -> xfs_file_fallocate()
                       -> file_modified()
                          -> __file_remove_privs()
                             -> dentry_needs_remove_privs()
                                -> should_remove_suid()
                             -> __remove_privs()
                                newattrs.ia_valid = ATTR_FORCE | kill;
                                -> notify_change()
      
        The fix for all of this is to make file_remove_privs()'s
        should_remove_suid() helper perform the same checks as we already
        require in setattr_prepare() and setattr_copy() and have
        notify_change() not pointlessly require S_IXGRP again. It doesn't
        make any sense in the first place because the caller must calculate
        the flags via should_remove_suid() anyway which would raise
        ATTR_KILL_SGID.
      
        Note that some xfstests will now fail as these patches will cause the
        setgid bit to be lost in certain conditions for unprivileged users
        modifying a setgid file when they would've been kept otherwise. I
        think this risk is worth taking and I explained and mentioned this
        multiple times on the list [2].
      
        Enforcing the rules consistently across write operations and
        chmod/chown will lead to losing the setgid bit in cases where it
        might've been retained before.
      
        I've mentioned this a few times but it's worth repeating just to
        make sure that this is understood. For the sake of maintainability,
        consistency, and security this is a risk worth taking.
      
        If we really see regressions for workloads the fix is to have special
        setgid handling in the write path again with different semantics from
        chmod/chown and possibly additional duct tape for overlayfs. I'll
        update the relevant xfstests if you should decide to merge this
        second setgid cleanup.
      
        Before that people should be aware that there might be failures for
        fstests where unprivileged users modify a setgid file"
      
      Link: https://lore.kernel.org/linux-fsdevel/20221003123040.900827-1-amir73il@gmail.com [1]
      Link: https://lore.kernel.org/linux-fsdevel/20221122142010.zchf2jz2oymx55qi@wittgenstein [2]
      
      * tag 'fs.ovl.setgid.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
        fs: use consistent setgid checks in is_sxid()
        ovl: remove privs in ovl_fallocate()
        ovl: remove privs in ovl_copyfile()
        attr: use consistent sgid stripping checks
        attr: add setattr_should_drop_sgid()
        fs: move should_remove_suid()
        attr: add in_group_or_capable()
    • Merge tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping · 6a518afc
      Linus Torvalds authored
      Pull VFS acl updates from Christian Brauner:
       "This contains the work that builds a dedicated vfs posix acl api.
      
        The origins of this work trace back to v5.19 but it took quite a while
        to understand the various filesystem specific implementations in
        sufficient detail and also come up with an acceptable solution.
      
        As we have discussed and seen multiple times the current state of how posix
        acls are handled isn't nice and comes with a lot of problems: The
        current way of handling posix acls via the generic xattr api is error
        prone, hard to maintain, and type unsafe for the vfs until we call
        into the filesystem's dedicated get and set inode operations.
      
        It is already the case that posix acls are special-cased to death all
        the way through the vfs. There are an uncounted number of hacks that
        operate on the uapi posix acl struct instead of the dedicated vfs
        struct posix_acl. And the vfs must be involved in order to interpret
        and fixup posix acls before storing them to the backing store, caching
        them, reporting them to userspace, or for permission checking.
      
        Currently a range of hacks and duct tape exist to make this work. As
        with most things this is really no one's fault; it's just something that
        happened over time. But the code is hard to understand and difficult
        to maintain and one is constantly at risk of introducing bugs and
        regressions when having to touch it.
      
        Instead of continuing to hack posix acls through the xattr handlers
        this series builds a dedicated posix acl api solely around the get and
        set inode operations.
      
        Going forward, the vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl()
        helpers must be used in order to interact with posix acls. They
        operate directly on the vfs internal struct posix_acl instead of
        abusing the uapi posix acl struct as we currently do. In the end this
        removes all of the hackiness, makes the codepaths easier to maintain,
        and gets us type safety.
      
        This series passes the LTP and xfstests suites without any
        regressions. For xfstests the following combinations were tested:
         - xfs
         - ext4
         - btrfs
         - overlayfs
         - overlayfs on top of idmapped mounts
         - orangefs
         - (limited) cifs
      
        There are more simplifications for posix acls that we can make in the
        future if the basic api has made it.
      
        A few implementation details:
      
         - The series makes sure to retain exactly the same security and
           integrity module permission checks. Especially for the integrity
           modules this api is a win because right now they convert the uapi
           posix acl struct passed to them via a void pointer into the vfs
           struct posix_acl format to perform permission checking on the mode.
      
           There's a new dedicated security hook for setting posix acls which
           passes the vfs struct posix_acl not a void pointer. Basing checking
           on the posix acl stored in the uapi format is really unreliable.
           The vfs currently hacks around directly in the uapi struct storing
           values that frankly the security and integrity modules can't
           correctly interpret as evidenced by bugs we reported and fixed in
           this area. It's not necessarily even their fault; it's just that the
           format we provide to them is suboptimal.
      
         - Some filesystems like 9p and cifs need access to the dentry in
           order to get and set posix acls which is why they either only
           partially or not even at all implement get and set inode
           operations. For example, cifs allows setxattr() and getxattr()
           operations but doesn't allow permission checking based on posix
           acls because it can't implement a get acl inode operation.
      
           Thus, this patch series updates the set acl inode operation to take
           a dentry instead of an inode argument. However, for the get acl
           inode operation we can't do this as the old get acl method is
           called in e.g., generic_permission() and inode_permission(). These
           helpers in turn are called in various filesystem's permission inode
           operation. So passing a dentry argument to the old get acl inode
           operation would amount to passing a dentry to the permission inode
           operation which we shouldn't and probably can't do.
      
           So instead of extending the existing inode operation Christoph
           suggested to add a new one. He also requested to ensure that the
           get and set acl inode operation taking a dentry are consistently
           named. So for this version the old get acl operation is renamed to
           ->get_inode_acl() and a new ->get_acl() inode operation taking a
           dentry is added. With this we can give both 9p and cifs get and set
           acl inode operations and in turn remove their complex custom posix
           xattr handlers.
      
           In the future I hope to get rid of the inode method duplication but
           it isn't like we have never had this situation. Readdir is just one
           example. And frankly, the overall gain in type safety and the more
           pleasant api are simply too big of a benefit to not accept this
           duplication for a while.
      
         - We've done a full audit of every codepath using a variant of the
           current generic xattr api to get and set posix acls and
           surprisingly it isn't that many places. There's of course always a
           chance that we might have missed some and if so I'm sure we'll find
           them soon enough.
      
           The crucial codepaths to be converted are obviously stacking
           filesystems such as ecryptfs and overlayfs.
      
           For a list of all callers currently using generic xattr api helpers
           see [2] including comments whether they support posix acls or not.
      
         - The old vfs generic posix acl infrastructure doesn't obey the
           create and replace semantics promised on the setxattr(2) manpage.
           This patch series doesn't address this. It really is something we
           should revisit later though.
      
        The patches are roughly organized as follows:
      
         (1) Change existing set acl inode operation to take a dentry
             argument (Intended to be a non-functional change)
      
         (2) Rename existing get acl method (Intended to be a non-functional
             change)
      
         (3) Implement get and set acl inode operations for filesystems that
             couldn't implement one before because of the missing dentry.
             That's mostly 9p and cifs (Intended to be a non-functional
             change)
      
         (4) Build posix acl api, i.e., add vfs_get_acl(), vfs_remove_acl(),
             and vfs_set_acl() including security and integrity hooks
             (Intended to be a non-functional change)
      
         (5) Implement get and set acl inode operations for stacking
             filesystems (Intended to be a non-functional change)
      
         (6) Switch posix acl handling in stacking filesystems to new posix
             acl api now that all filesystems it can stack upon support it.
      
         (7) Switch vfs to new posix acl api (semantic change)
      
         (8) Remove all now unused helpers
      
         (9) Additional regression fixes reported after we merged this into
             linux-next
      
        Thanks to Seth for a lot of good discussion around this, and for the
        encouragement and input from Christoph"
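
As a rough illustration of the split described in the message above (the old operation renamed to ->get_inode_acl(), plus a new ->get_acl() that takes a dentry), here is a minimal userspace sketch in C. Every type and function in it (the `sketch_` names, the stub structs, the two example filesystems) is a hypothetical stand-in rather than a kernel definition, and the real kernel operations have different argument lists. The fallback order in `sketch_get_acl()` is likewise an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are the kernel's real types. */
struct sketch_acl { int entries; };
struct sketch_inode;
struct sketch_dentry;

struct sketch_inode_ops {
    /* Renamed old operation: usable where only an inode is available,
     * e.g. from permission-checking helpers. */
    struct sketch_acl *(*get_inode_acl)(struct sketch_inode *inode, int type);
    /* New operation: for filesystems that need the dentry to fetch the ACL. */
    struct sketch_acl *(*get_acl)(struct sketch_dentry *dentry, int type);
};

struct sketch_inode {
    const struct sketch_inode_ops *ops;
    struct sketch_acl *stored_acl;
};

struct sketch_dentry { struct sketch_inode *inode; };

/* Example "local" filesystem: the ACL is reachable from the inode alone. */
struct sketch_acl *local_get_inode_acl(struct sketch_inode *inode, int type)
{
    (void)type;
    return inode->stored_acl;
}

/* Example "network" filesystem: it must have a dentry to look the ACL up. */
struct sketch_acl *remote_get_acl(struct sketch_dentry *dentry, int type)
{
    static struct sketch_acl fetched = { .entries = 1 };
    (void)type;
    assert(dentry != NULL);
    return &fetched;
}

/* Dentry-based lookup: prefer ->get_acl(), else fall back to ->get_inode_acl(). */
struct sketch_acl *sketch_get_acl(struct sketch_dentry *dentry, int type)
{
    struct sketch_inode *inode = dentry->inode;

    if (inode->ops->get_acl)
        return inode->ops->get_acl(dentry, type);
    if (inode->ops->get_inode_acl)
        return inode->ops->get_inode_acl(inode, type);
    return NULL; /* filesystem provides no ACL operation */
}
```

A caller holding a dentry goes through `sketch_get_acl()`, while a permission-style helper that only has an inode can keep calling `->get_inode_acl()` directly. That is the reason the message gives for keeping both operations for now.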
      
      * tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (36 commits)
        posix_acl: Fix the type of sentinel in get_acl
        orangefs: fix mode handling
        ovl: call posix_acl_release() after error checking
        evm: remove dead code in evm_inode_set_acl()
        cifs: check whether acl is valid early
        acl: make vfs_posix_acl_to_xattr() static
        acl: remove a slew of now unused helpers
        9p: use stub posix acl handlers
        cifs: use stub posix acl handlers
        ovl: use stub posix acl handlers
        ecryptfs: use stub posix acl handlers
        evm: remove evm_xattr_acl_change()
        xattr: use posix acl api
        ovl: use posix acl api
        ovl: implement set acl method
        ovl: implement get acl method
        ecryptfs: implement set acl method
        ecryptfs: implement get acl method
        ksmbd: use vfs_remove_acl()
        acl: add vfs_remove_acl()
        ...
      6a518afc
    • Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · bd907413
      Linus Torvalds authored
      Pull misc vfs updates from Al Viro:
       "misc pile"
      
      * tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs: sysv: Fix sysv_nblocks() returns wrong value
        get rid of INT_LIMIT, use type_max() instead
        btrfs: replace INT_LIMIT(loff_t) with OFFSET_MAX
        fs: simplify vfs_get_super
        fs: drop useless condition from inode_needs_update_time
    • Merge tag 'pull-namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 13c574fe
      Linus Torvalds authored
      Pull namespace fix from Al Viro:
       "Fix weird corner case in copy_mnt_ns()"
      
      * tag 'pull-namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        copy_mnt_ns(): handle a corner case (overmounted mntns bindings) saner
    • Merge tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 75f4d9af
      Linus Torvalds authored
      Pull iov_iter updates from Al Viro:
       "iov_iter work; most of that is about getting rid of direction
        misannotations and (hopefully) preventing more of the same for the
        future"
      
      * tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        use less confusing names for iov_iter direction initializers
        iov_iter: saner checks for attempt to copy to/from iterator
        [xen] fix "direction" argument of iov_iter_kvec()
        [vhost] fix 'direction' argument of iov_iter_{init,bvec}()
        [target] fix iov_iter_bvec() "direction" argument
        [s390] memcpy_real(): WRITE is "data source", not destination...
        [s390] zcore: WRITE is "data source", not destination...
        [infiniband] READ is "data destination", not source...
        [fsi] WRITE is "data source", not destination...
        [s390] copy_oldmem_kernel() - WRITE is "data source", not destination
        csum_and_copy_to_iter(): handle ITER_DISCARD
        get rid of unlikely() on page_copy_sane() calls
    • Merge tag 'pull-alpha' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 268369b1
      Linus Torvalds authored
      Pull alpha updates from Al Viro:
       "Alpha architecture cleanups and fixes.
      
        One thing *not* included is lazy FPU switching stuff - this pile is
        just the straightforward stuff"
      
      * tag 'pull-alpha' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        alpha: ret_from_fork can go straight to ret_to_user
        alpha: syscall exit cleanup
        alpha: fix handling of a3 on straced syscalls
        alpha: fix syscall entry in !AUDUT_SYSCALL case
        alpha: _TIF_ALLWORK_MASK is unused
        alpha: fix TIF_NOTIFY_SIGNAL handling