1. 06 Jun, 2020 9 commits
    • Merge tag 'dma-mapping-5.8-2' of git://git.infradead.org/users/hch/dma-mapping · 6f2dc3d3
      Linus Torvalds authored
      Pull dma-mapping helpers from Christoph Hellwig:
       "These were in a separate stable branch so that various media and drm
        trees could pull them in for bug fixes, but looking at linux-next that
        hasn't actually happened yet. Still sending the APIs to you in the
        hope that these bug fixes get picked up for 5.8 in one way or another.
      
        Summary:
      
         - add DMA mapping helpers for struct sg_table (Marek Szyprowski)"
      
      * tag 'dma-mapping-5.8-2' of git://git.infradead.org/users/hch/dma-mapping:
        iommu: add generic helper for mapping sgtable objects
        scatterlist: add generic wrappers for iterating over sgtable objects
        dma-mapping: add generic helpers for mapping sgtable objects
    • Merge tag 'dma-mapping-5.8' of git://git.infradead.org/users/hch/dma-mapping · 1ee18de9
      Linus Torvalds authored
      Pull dma-mapping updates from Christoph Hellwig:
      
       - enhance the dma pool to allow atomic allocation on x86 with AMD SEV
         (David Rientjes)
      
       - two small cleanups (Jason Yan and Peter Collingbourne)
      
      * tag 'dma-mapping-5.8' of git://git.infradead.org/users/hch/dma-mapping:
        dma-contiguous: fix comment for dma_release_from_contiguous
        dma-pool: scale the default DMA coherent pool size with memory capacity
        x86/mm: unencrypted non-blocking DMA allocations use coherent pools
        dma-pool: add pool sizes to debugfs
        dma-direct: atomic allocations must come from atomic coherent pools
        dma-pool: dynamically expanding atomic pools
        dma-pool: add additional coherent pools to map to gfp mask
        dma-remap: separate DMA atomic pools from direct remap code
        dma-debug: make __dma_entry_alloc_check_leak() static
    • Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · e542e0dc
      Linus Torvalds authored
      Pull dmi update from Jean Delvare.
      
      * 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
        firmware/dmi: Report DMI Bios & EC firmware release
    • Merge tag 'pci-v5.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 3925c3bb
      Linus Torvalds authored
      Pull PCI updates from Bjorn Helgaas:
       "Enumeration:
      
         - Program MPS for RCiEP devices (Ashok Raj)
      
         - Fix pci_register_host_bridge() device_register() error handling
           (Rob Herring)
      
         - Fix pci_host_bridge struct device release/free handling (Rob
           Herring)
      
        Resource management:
      
         - Allow resizing BARs for devices on root bus (Ard Biesheuvel)
      
        Power management:
      
         - Reduce Thunderbolt resume time by working around devices that don't
           support DLL Link Active reporting (Mika Westerberg)
      
         - Work around a Pericom USB controller OHCI/EHCI PME# defect
           (Kai-Heng Feng)
      
        Virtualization:
      
         - Add ACS quirk for Intel Root Complex Integrated Endpoints (Ashok
           Raj)
      
         - Avoid FLR for AMD Starship USB 3.0 (Kevin Buettner)
      
         - Avoid FLR for AMD Matisse HD Audio & USB 3.0 (Marcos Scriven)
      
        Error handling:
      
         - Use only _OSC (not HEST FIRMWARE_FIRST) to determine AER ownership
           (Alexandru Gagniuc, Kuppuswamy Sathyanarayanan)
      
         - Reduce verbosity by logging only ACPI_NOTIFY_DISCONNECT_RECOVER
           events (Kuppuswamy Sathyanarayanan)
      
         - Don't enable AER by default in Kconfig (Bjorn Helgaas)
      
        Peer-to-peer DMA:
      
         - Add AMD Zen Raven and Renoir Root Ports to whitelist (Alex Deucher)
      
        ASPM:
      
         - Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges (Kai-Heng Feng)
      
        Endpoint framework:
      
         - Fix DMA channel release in test (Kunihiko Hayashi)
      
         - Add page size as argument to pci_epc_mem_init() (Lad Prabhakar)
      
         - Add support to handle multiple base for mapping outbound memory
           (Lad Prabhakar)
      
        Generic host bridge driver:
      
         - Support building as module (Rob Herring)
      
         - Eliminate pci_host_common_probe wrappers (Rob Herring)
      
        Amlogic Meson PCIe controller driver:
      
         - Don't use FAST_LINK_MODE to set up link (Marc Zyngier)
      
        Broadcom STB PCIe controller driver:
      
         - Disable ASPM L0s if 'aspm-no-l0s' in DT (Jim Quinlan)
      
         - Fix clk_put() error (Jim Quinlan)
      
         - Fix window register offset (Jim Quinlan)
      
         - Assert fundamental reset on initialization (Nicolas Saenz Julienne)
      
         - Add notify xHCI reset property (Nicolas Saenz Julienne)
      
         - Add init routine for Raspberry Pi 4 VL805 USB controller (Nicolas
           Saenz Julienne)
      
         - Sync with Raspberry Pi 4 firmware for VL805 initialization (Nicolas
           Saenz Julienne)
      
        Cadence PCIe controller driver:
      
         - Remove "cdns,max-outbound-regions" DT property (replaced by
           "ranges") (Kishon Vijay Abraham I)
      
         - Read 32-bit (not 16-bit) Vendor ID/Device ID property from DT
           (Kishon Vijay Abraham I)
      
        Marvell Aardvark PCIe controller driver:
      
         - Improve link training (Marek Behún)
      
         - Add PHY support (Marek Behún)
      
         - Add "phys", "max-link-speed", "reset-gpios" to dt-binding (Marek
           Behún)
      
         - Train link immediately after enabling training to work around
           detection issues with some cards (Pali Rohár)
      
         - Issue PERST via GPIO to work around detection issues (Pali Rohár)
      
         - Don't blindly enable ASPM L0s (Pali Rohár)
      
         - Replace custom macros by standard linux/pci_regs.h macros (Pali
           Rohár)
      
        Microsoft Hyper-V host bridge driver:
      
         - Fix probe failure path to release resource (Wei Hu)
      
         - Retry PCI bus D0 entry on invalid device state for kdump (Wei Hu)
      
        Renesas R-Car PCIe controller driver:
      
         - Fix incorrect programming of OB windows (Andrew Murray)
      
         - Add suspend/resume (Kazufumi Ikeda)
      
         - Rename pcie-rcar.c to pcie-rcar-host.c (Lad Prabhakar)
      
         - Add endpoint controller driver (Lad Prabhakar)
      
         - Fix PCIEPAMR mask calculation (Lad Prabhakar)
      
         - Add r8a77961 to DT binding (Yoshihiro Shimoda)
      
        Socionext UniPhier Pro5 controller driver:
      
         - Add endpoint controller driver (Kunihiko Hayashi)
      
        Synopsys DesignWare PCIe controller driver:
      
         - Program outbound ATU upper limit register (Alan Mikhak)
      
         - Fix inner MSI IRQ domain registration (Marc Zyngier)
      
        Miscellaneous:
      
         - Check for platform_get_irq() failure consistently (negative return
           means failure) (Aman Sharma)
      
         - Fix several runtime PM get/put imbalances (Dinghao Liu)
      
         - Use flexible-array and struct_size() helpers for code cleanup
           (Gustavo A. R. Silva)
      
         - Update & fix issues in bridge emulation of PCIe registers (Jon
           Derrick)
      
         - Add macros for bridge window names (PCI_BRIDGE_IO_WINDOW, etc)
           (Krzysztof Wilczyński)
      
         - Work around Intel PCH MROMs that have invalid BARs (Xiaochun Lee)"
      
      * tag 'pci-v5.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (100 commits)
        PCI: uniphier: Add Socionext UniPhier Pro5 PCIe endpoint controller driver
        PCI: Add ACS quirk for Intel Root Complex Integrated Endpoints
        PCI/DPC: Print IRQ number used by port
        PCI/AER: Use "aer" variable for capability offset
        PCI/AER: Remove redundant dev->aer_cap checks
        PCI/AER: Remove redundant pci_is_pcie() checks
        PCI/AER: Remove HEST/FIRMWARE_FIRST parsing for AER ownership
        PCI: tegra: Fix runtime PM imbalance on error
        PCI: vmd: Filter resource type bits from shadow register
        PCI: tegra194: Fix runtime PM imbalance on error
        dt-bindings: PCI: Add UniPhier PCIe endpoint controller description
        PCI: hv: Use struct_size() helper
        PCI: Rename _DSM constants to align with spec
        PCI: Avoid FLR for AMD Starship USB 3.0
        PCI: Avoid FLR for AMD Matisse HD Audio & USB 3.0
        x86/PCI: Drop unused xen_register_pirq() gsi_override parameter
        PCI: dwc: Use private data pointer of "struct irq_domain" to get pcie_port
        PCI: amlogic: meson: Don't use FAST_LINK_MODE to set up link
        PCI: dwc: Fix inner MSI IRQ domain registration
        PCI: dwc: pci-dra7xx: Use devm_platform_ioremap_resource_byname()
        ...
    • hpfs: fix warning due to superfluous semicolon · 9fa88c5d
      Zou Wei authored
      Fixes coccicheck warning:
      
        fs/hpfs/buffer.c:56:2-3: Unneeded semicolon
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zou Wei <zou_wei@huawei.com>
      Signed-off-by: Mikulas Patocka <mikulas@twibright.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · fe3bc8a9
      Linus Torvalds authored
      Pull workqueue updates from Tejun Heo:
       "Mostly cleanups and other trivial changes.
      
        The only interesting change is Sebastian's rcuwait conversion for RT"
      
      * 'for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: use BUILD_BUG_ON() for compile time test instead of WARN_ON()
        workqueue: fix a piece of comment about reserved bits for work flags
        workqueue: remove useless unlock() and lock() in series
        workqueue: void unneeded requeuing the pwq in rescuer thread
        workqueue: Convert the pool::lock and wq_mayday_lock to raw_spinlock_t
        workqueue: Use rcuwait for wq_manager_wait
        workqueue: Remove unnecessary kfree() call in rcu_free_wq()
        workqueue: Fix an use after free in init_rescuer()
        workqueue: Use IS_ERR and PTR_ERR instead of PTR_ERR_OR_ZERO.
    • Merge branch 'for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 4a7e89c5
      Linus Torvalds authored
      Pull cgroup updates from Tejun Heo:
       "Just two patches: one to add system-level cpu.stat to the root cgroup
        for convenience and a trivial comment update"
      
      * 'for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: add cpu.stat file to root cgroup
        cgroup: Remove stale comments
    • Merge tag 'integrity-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity · 3c0ad98c
      Linus Torvalds authored
      Pull integrity updates from Mimi Zohar:
       "The main changes are extending the TPM 2.0 PCR banks with bank
        specific file hashes, calculating the "boot_aggregate" based on other
        TPM PCR banks, using the default IMA hash algorithm, instead of SHA1,
        as the basis for the cache hash table key, and preventing the mprotect
        syscall from circumventing an IMA mmap appraise policy rule.
      
         - In preparation for extending TPM 2.0 PCR banks with bank specific
           digests, commit 0b6cf6b9 ("tpm: pass an array of
           tpm_extend_digest structures to tpm_pcr_extend()") modified
           tpm_pcr_extend(). The original SHA1 file digests were
           padded/truncated, before being extended into the other TPM PCR
           banks. This pull request calculates and extends the TPM PCR banks
           with bank specific file hashes completing the above change.
      
         - The "boot_aggregate", the first IMA measurement list record, is the
           "trusted boot" link between the pre-boot environment and the
           running OS. With TPM 2.0, the "boot_aggregate" record is not
           limited to being based on the SHA1 TPM PCR bank, but can be
           calculated based on any enabled bank, assuming the hash algorithm
           is also enabled in the kernel.
      
        Other changes include the following and five other bug fixes/code
        clean up:
      
         - supporting both a SHA1 and a larger "boot_aggregate" digest in a
           custom template format containing both the SHA1 ('d') and
           larger digest ('d-ng') fields.
      
         - Initial hash table key fix, but additional changes would be good"
      
      * tag 'integrity-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
        ima: Directly free *entry in ima_alloc_init_template() if digests is NULL
        ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init()
        ima: Directly assign the ima_default_policy pointer to ima_rules
        ima: verify mprotect change is consistent with mmap policy
        evm: Fix possible memory leak in evm_calc_hmac_or_hash()
        ima: Set again build_ima_appraise variable
        ima: Remove redundant policy rule set in add_rules()
        ima: Fix ima digest hash table key calculation
        ima: Use ima_hash_algo for collision detection in the measurement list
        ima: Calculate and extend PCR with digests in ima_template_entry
        ima: Allocate and initialize tfm for each PCR bank
        ima: Switch to dynamically allocated buffer for template digests
        ima: Store template digest directly in ima_template_entry
        ima: Evaluate error in init_ima()
        ima: Switch to ima_hash_algo for boot aggregate
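The difference between extending a padded SHA1 digest and a bank-specific digest, as described in the integrity pull message above, can be sketched in a few lines. This is a minimal userspace illustration, not the kernel's IMA code: the function names are hypothetical, the measured data is an arbitrary byte string, and the only assumptions are the standard TPM extend rule (new PCR value = H(old value || digest)) and zero padding of the SHA1 digest up to the bank's digest size.

```python
import hashlib


def pcr_extend(pcr_value: bytes, digest: bytes, algo: str) -> bytes:
    """Return the new PCR value for one bank: H(old_value || digest)."""
    h = hashlib.new(algo)
    if len(digest) != h.digest_size or len(pcr_value) != h.digest_size:
        raise ValueError("digest and PCR value must match the bank's digest size")
    h.update(pcr_value + digest)
    return h.digest()


def padded_sha1_digest(data: bytes, algo: str) -> bytes:
    """Old behaviour (assumed): SHA1 digest zero-padded or truncated to the bank size."""
    size = hashlib.new(algo).digest_size
    sha1 = hashlib.sha1(data).digest()
    return sha1[:size].ljust(size, b"\0")


def bank_specific_digest(data: bytes, algo: str) -> bytes:
    """New behaviour: hash the data with the bank's own algorithm."""
    return hashlib.new(algo, data).digest()
```

For a SHA-256 bank, the padded digest carries only 20 bytes of SHA1 output followed by 12 zero bytes, whereas the bank-specific digest uses the full strength of the bank's algorithm.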
    • firmware/dmi: Report DMI Bios & EC firmware release · f5152f4d
      Erwan Velu authored
      Some vendors, like HPE or Dell, encode the release version of their BIOS
      in the "System BIOS {Major|Minor} Release" fields of Type 0.
      
      This information is used to know which bios release actually runs.
      It could be used for some quirks, debugging sessions or inventory tasks.
      
      A typical output for a Dell system running the 65.27 BIOS is:
      	[root@t1700 ~]# cat /sys/devices/virtual/dmi/id/bios_release
      	65.27
      	[root@t1700 ~]#
      
      Servers that have a BMC encode the release version of their firmware in the
       "Embedded Controller Firmware {Major|Minor} Release" fields of Type 0.
      
      This information is used to know which BMC release actually runs.
      It could be used for some quirks, debugging sessions or inventory tasks.
      
      A typical output for a Dell system running the 3.75 BMC release is:
          [root@t1700 ~]# cat /sys/devices/virtual/dmi/id/ec_firmware_release
          3.75
          [root@t1700 ~]#
      Signed-off-by: Erwan Velu <e.velu@criteo.com>
      Signed-off-by: Jean Delvare <jdelvare@suse.de>
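For inventory tooling, the two sysfs attributes shown in the commit message above could be read along these lines. This is a hedged sketch: the paths come from the commit message, but the helper names are hypothetical, and the "major.minor" format is assumed from the sample outputs. The attribute may be absent or unreadable on some systems, so the reader returns None in that case.

```python
from pathlib import Path
from typing import Optional, Tuple

# Directory taken from the commit message above.
DMI_ID_DIR = Path("/sys/devices/virtual/dmi/id")


def parse_release(text: str) -> Tuple[int, int]:
    """Parse a 'major.minor' release string such as '65.27'."""
    major, minor = text.strip().split(".", 1)
    return int(major), int(minor)


def read_release(name: str, base: Path = DMI_ID_DIR) -> Optional[Tuple[int, int]]:
    """Read 'bios_release' or 'ec_firmware_release'; None if missing or malformed."""
    try:
        return parse_release((base / name).read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    for attr in ("bios_release", "ec_firmware_release"):
        print(attr, read_release(attr))
```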
  2. 05 Jun, 2020 31 commits
    • Merge tag 'for-linus-5.8-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux · aaa2faab
      Linus Torvalds authored
      Pull orangefs updates from Mike Marshall:
      
       - John Hubbard's conversion from get_user_pages() to pin_user_pages()
      
       - Colin Ian King's removal of an unneeded variable initialization
      
      * tag 'for-linus-5.8-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
        orangefs: convert get_user_pages() --> pin_user_pages()
        orangefs: remove redundant assignment to variable ret
    • Merge tag 'dlm-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm · e3cea0ca
      Linus Torvalds authored
      Pull dlm updates from David Teigland:
       "This set includes a couple minor cleanups, and dropping the
        interruptible from a wait_event that waits for an event from the
        userspace cluster management"
      
      * tag 'dlm-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
        dlm: remove BUG() before panic()
        dlm: Switch to using wait_event()
        fs:dlm:remove unneeded semicolon in rcom.c
        dlm: user: Replace zero-length array with flexible-array member
        dlm: dlm_internal: Replace zero-length array with flexible-array member
    • Merge tag '5.8-rc-smb3-fixes-part-1' of git://git.samba.org/sfrench/cifs-2.6 · 3803d5e4
      Linus Torvalds authored
      Pull cifs updates from Steve French:
       "22 changesets, 2 for stable.
      
        Includes big performance improvement for large i/o when using
        multichannel, also includes DFS fixes"
      
      * tag '5.8-rc-smb3-fixes-part-1' of git://git.samba.org/sfrench/cifs-2.6: (22 commits)
        cifs: update internal module version number
        cifs: multichannel: try to rebind when reconnecting a channel
        cifs: multichannel: use pointer for binding channel
        smb3: remove static checker warning
        cifs: multichannel: move channel selection above transport layer
        cifs: multichannel: always zero struct cifs_io_parms
        cifs: dump Security Type info in DebugData
        smb3: fix incorrect number of credits when ioctl MaxOutputResponse > 64K
        smb3: default to minimum of two channels when multichannel specified
        cifs: multichannel: move channel selection in function
        cifs: fix minor typos in comments and log messages
        smb3: minor update to compression header definitions
        cifs: minor fix to two debug messages
        cifs: Standardize logging output
        smb3: Add new parm "nodelete"
        cifs: move some variables off the stack in smb2_ioctl_query_info
        cifs: reduce stack use in smb2_compound_op
        cifs: get rid of unused parameter in reconn_setup_dfs_targets()
        cifs: handle hostnames that resolve to same ip in failover
        cifs: set up next DFS target before generic_ip_connect()
        ...
    • Merge tag 'afs-next-20200604' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 9daa0a27
      Linus Torvalds authored
      Pull AFS updates from David Howells:
       "There's some core VFS changes which affect a couple of filesystems:
      
         - Make the inode hash table RCU safe and provide some RCU-safe
           accessor functions. The search can then be done without taking the
           inode_hash_lock. Care must be taken because the object may be being
           deleted and no wait is made.
      
         - Allow iunique() to avoid taking the inode_hash_lock.
      
         - Allow AFS's callback processing to avoid taking the inode_hash_lock
           when using the inode table to find an inode to notify.
      
         - Improve Ext4's time updating. Konstantin Khlebnikov said "For now,
           I've plugged this issue with try-lock in ext4 lazy time update.
           This solution is much better."
      
        Then there's a set of changes to make a number of improvements to the
        AFS driver:
      
         - Improve callback (ie. third party change notification) processing
           by:
      
            (a) Relying more on the fact we're doing this under RCU and by
                using fewer locks. This makes use of the RCU-based inode
                searching outlined above.
      
            (b) Moving to keeping volumes in a tree indexed by volume ID
                rather than a flat list.
      
            (c) Making the server and volume records logically part of the
                cell. This means that a server record now points directly at
                the cell and the tree of volumes is there. This removes an N:M
                mapping table, simplifying things.
      
         - Improve keeping NAT or firewall channels open for the server
           callbacks to reach the client by actively polling the fileserver on
           a timed basis, instead of only doing it when we have an operation
           to process.
      
         - Improve detection of delayed or lost callbacks by including the
           parent directory in the list of file IDs to be queried when doing a
           bulk status fetch from lookup. We can then check to see if our copy
           of the directory has changed under us without us getting notified.
      
         - Determine aliasing of cells (such as a cell that is pointed to by a
           DNS alias). This allows us to avoid having ambiguity due to
           apparently different cells using the same volume and file servers.
      
         - Improve the fileserver rotation to do more probing when it detects
           that all of the addresses to a server are listed as non-responsive.
           It's possible that an address that previously stopped responding
           has become responsive again.
      
        Beyond that, lay some foundations for making some calls asynchronous:
      
         - Turn the fileserver cursor struct into a general operation struct
           and hang the parameters off of that rather than keeping them in
           local variables and hang results off of that rather than the call
           struct.
      
         - Implement some general operation handling code and simplify the
           callers of operations that affect a volume or a volume component
           (such as a file). Most of the operation is now done by core code.
      
         - Operations are supplied with a table of operations to issue
           different variants of RPCs and to manage the completion, where all
           the required data is held in the operation object, thereby allowing
           these to be called from a workqueue.
      
         - Put the standard "if (begin), while(select), call op, end" sequence
           into a canned function that just emulates the current behaviour for
           now.
      
        There are also some fixes interspersed:
      
         - Don't let the EACCES from ICMP6 mapping reach the user as such,
           since it's confusing as to whether it's a filesystem error. Convert
           it to EHOSTUNREACH.
      
         - Don't use the epoch value acquired through probing a server. If we
           have two servers with the same UUID but in different cells, it's
           hard to draw conclusions from them having different epoch values.
      
         - Don't interpret the argument to the CB.ProbeUuid RPC as a
           fileserver UUID and look up a fileserver from it.
      
         - Deal with servers in different cells having the same UUIDs. In the
           event that a CB.InitCallBackState3 RPC is received, we have to
           break the callback promises for every server record matching that
           UUID.
      
         - Don't let afs_statfs return values that go below 0.
      
         - Don't use running fileserver probe state to make server selection
           and address selection decisions on. Only make decisions on final
           state as the running state is cleared at the start of probing"
      
      Acked-by: Al Viro <viro@zeniv.linux.org.uk> (fs/inode.c part)
      
      * tag 'afs-next-20200604' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (27 commits)
        afs: Adjust the fileserver rotation algorithm to reprobe/retry more quickly
        afs: Show more a bit more server state in /proc/net/afs/servers
        afs: Don't use probe running state to make decisions outside probe code
        afs: Fix afs_statfs() to not let the values go below zero
        afs: Fix the by-UUID server tree to allow servers with the same UUID
        afs: Reorganise volume and server trees to be rooted on the cell
        afs: Add a tracepoint to track the lifetime of the afs_volume struct
        afs: Detect cell aliases 3 - YFS Cells with a canonical cell name op
        afs: Detect cell aliases 2 - Cells with no root volumes
        afs: Detect cell aliases 1 - Cells with root volumes
        afs: Implement client support for the YFSVL.GetCellName RPC op
        afs: Retain more of the VLDB record for alias detection
        afs: Fix handling of CB.ProbeUuid cache manager op
        afs: Don't get epoch from a server because it may be ambiguous
        afs: Build an abstraction around an "operation" concept
        afs: Rename struct afs_fs_cursor to afs_operation
        afs: Remove the error argument from afs_protocol_error()
        afs: Set error flag rather than return error from file status decode
        afs: Make callback processing more efficient.
        afs: Show more information in /proc/net/afs/servers
        ...
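The "operation" abstraction described in the AFS pull message above (all parameters and results hung off one object, plus a canned "begin, while(select), call op, end" loop) might look roughly like this in miniature. Everything here is hypothetical and illustrative: the class, field, and function names are invented for the sketch and do not correspond to the AFS driver's structures, and server selection is reduced to trying each listed server once.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Operation:
    """Holds every input and result, so the operation could be run from a worker."""
    servers: List[str]
    issue_rpc: Callable[["Operation", str], bool]   # returns True on success
    on_success: Callable[["Operation"], None]       # completion handler
    result: Optional[str] = None
    error: Optional[str] = None
    tried: List[str] = field(default_factory=list)


def select_server(op: Operation) -> Optional[str]:
    """Pick the next server that has not been tried yet, or None when exhausted."""
    for server in op.servers:
        if server not in op.tried:
            op.tried.append(server)
            return server
    return None


def run_operation(op: Operation) -> bool:
    """Canned sequence: select a server, issue the RPC, finish on first success."""
    server = select_server(op)
    while server is not None:
        if op.issue_rpc(op, server):
            op.error = None
            op.on_success(op)
            return True
        op.error = "no usable response from " + server
        server = select_server(op)
    return False
```

Because the caller only supplies the callbacks and the server list, the rotation logic lives in one place, which is the simplification the pull message attributes to moving most of the work into core code.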
    • Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 0b166a57
      Linus Torvalds authored
      Pull ext4 updates from Ted Ts'o:
       "A lot of bug fixes and cleanups for ext4, including:
      
         - Fix performance problems found in dioread_nolock now that it is the
           default, caused by transaction leaks.
      
         - Clean up fiemap handling in ext4
      
         - Clean up and refactor multiple block allocator (mballoc) code
      
         - Fix a problem with mballoc with smaller file systems running out
           of blocks because they couldn't properly use blocks that had been
           reserved by inode preallocation.
      
         - Fixed a race in ext4_sync_parent() versus rename()
      
         - Simplify the error handling in the extent manipulation code
      
         - Make sure all metadata I/O errors are reflected to
           ext4_ext_dirty()'s and ext4_make_inode_dirty()'s callers.
      
         - Avoid passing an error pointer to brelse in ext4_xattr_set()
      
         - Fix race which could result in freeing an inode on the dirty list
           in data=journal mode.
      
         - Fix refcount handling if ext4_iget() fails
      
         - Fix a crash in generic/019 caused by a corrupted extent node"
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits)
        ext4: avoid unnecessary transaction starts during writeback
        ext4: don't block for O_DIRECT if IOCB_NOWAIT is set
        ext4: remove the access_ok() check in ext4_ioctl_get_es_cache
        fs: remove the access_ok() check in ioctl_fiemap
        fs: handle FIEMAP_FLAG_SYNC in fiemap_prep
        fs: move fiemap range validation into the file systems instances
        iomap: fix the iomap_fiemap prototype
        fs: move the fiemap definitions out of fs.h
        fs: mark __generic_block_fiemap static
        ext4: remove the call to fiemap_check_flags in ext4_fiemap
        ext4: split _ext4_fiemap
        ext4: fix fiemap size checks for bitmap files
        ext4: fix EXT4_MAX_LOGICAL_BLOCK macro
        add comment for ext4_dir_entry_2 file_type member
        jbd2: avoid leaking transaction credits when unreserving handle
        ext4: drop ext4_journal_free_reserved()
        ext4: mballoc: use lock for checking free blocks while retrying
        ext4: mballoc: refactor ext4_mb_good_group()
        ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling
        ext4: mballoc: refactor ext4_mb_discard_preallocations()
        ...
    • Merge tag 'for-5.8/dm-changes' of... · b25c6644
      Linus Torvalds authored
      Merge tag 'for-5.8/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper updates from Mike Snitzer:
      
       - The largest change for this cycle is the DM zoned target's metadata
         version 2 feature that adds support for pairing regular block devices
         with a zoned device to ease the performance impact associated with
         finite random zones of a zoned device.
      
         The changes came in three batches: the first prepared for and then
         added the ability to pair a single regular block device, the second
         was a batch of fixes to improve zoned's reclaim heuristic, and the
         third removed the limitation of only adding a single additional
         regular block device to allow many devices.
      
         Testing has shown linear scaling as more devices are added.
      
       - Add new emulated block size (ebs) target that emulates a smaller
         logical_block_size than a block device supports.
      
         The primary use-case is to emulate "512e" devices that have 512 byte
         logical_block_size and 4KB physical_block_size. This is useful to
         some legacy applications that otherwise wouldn't be able to be used
         on 4K devices because they depend on issuing IO in 512 byte
         granularity.
      
       - Add discard interfaces to DM bufio. First consumer of the interface
         is the dm-ebs target that makes heavy use of dm-bufio.
      
       - Fix DM crypt's block queue_limits stacking to not truncate
         logical_block_size.
      
       - Add Documentation for DM integrity's status line.
      
       - Switch DMDEBUG from a compile time config option to instead use
         dynamic debug via pr_debug.
      
       - Fix DM multipath target's heuristic for how it manages
         "queue_if_no_path" state internally.
      
         DM multipath now avoids disabling "queue_if_no_path" unless it is
         actually needed (e.g. in response to configure timeout or explicit
         "fail_if_no_path" message).
      
         This fixes reports of spurious -EIO being reported back to userspace
         application during fault tolerance testing with an NVMe backend.
         Added various dynamic DMDEBUG messages to assist with debugging
         queue_if_no_path in the future.
      
       - Add a new DM multipath "Historical Service Time" Path Selector.
      
       - Fix DM multipath's dm_blk_ioctl() to switch paths on IO error.
      
       - Improve DM writecache target performance by using explicit cache
         flushing for target's single-threaded usecase and a small cleanup to
         remove unnecessary test in persistent_memory_claim.
      
       - Other small cleanups in DM core, dm-persistent-data, and DM
         integrity.
      
      * tag 'for-5.8/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (62 commits)
        dm crypt: avoid truncating the logical block size
        dm mpath: add DM device name to Failing/Reinstating path log messages
        dm mpath: enhance queue_if_no_path debugging
        dm mpath: restrict queue_if_no_path state machine
        dm mpath: simplify __must_push_back
        dm zoned: check superblock location
        dm zoned: prefer full zones for reclaim
        dm zoned: select reclaim zone based on device index
        dm zoned: allocate zone by device index
        dm zoned: support arbitrary number of devices
        dm zoned: move random and sequential zones into struct dmz_dev
        dm zoned: per-device reclaim
        dm zoned: add metadata pointer to struct dmz_dev
        dm zoned: add device pointer to struct dm_zone
        dm zoned: allocate temporary superblock for tertiary devices
        dm zoned: convert to xarray
        dm zoned: add a 'reserved' zone flag
        dm zoned: improve logging messages for reclaim
        dm zoned: avoid unnecessary device recalulation for secondary superblock
        dm zoned: add debugging message for reading superblocks
        ...
      b25c6644
    • Linus Torvalds's avatar
      Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 818dbde7
      Linus Torvalds authored
      Pull SCSI updates from James Bottomley:
       "This series consists of the usual driver updates (qla2xxx, ufs, zfcp,
        target, scsi_debug, lpfc, qedi, qedf, hisi_sas, mpt3sas) plus a host
        of other minor updates.
      
        There are no major core changes in this series apart from a
        refactoring in scsi_lib.c"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (207 commits)
        scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes
        scsi: cxgb3i: Fix some leaks in init_act_open()
        scsi: ibmvscsi: Make some functions static
        scsi: iscsi: Fix deadlock on recovery path during GFP_IO reclaim
        scsi: ufs: Fix WriteBooster flush during runtime suspend
        scsi: ufs: Fix index of attributes query for WriteBooster feature
        scsi: ufs: Allow WriteBooster on UFS 2.2 devices
        scsi: ufs: Remove unnecessary memset for dev_info
        scsi: ufs-qcom: Fix scheduling while atomic issue
        scsi: mpt3sas: Fix reply queue count in non RDPQ mode
        scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event
        scsi: target: tcmu: Fix a use after free in tcmu_check_expired_queue_cmd()
        scsi: vhost: Notify TCM about the maximum sg entries supported per command
        scsi: qla2xxx: Remove return value from qla_nvme_ls()
        scsi: qla2xxx: Remove an unused function
        scsi: iscsi: Register sysfs for iscsi workqueue
        scsi: scsi_debug: Parser tables and code interaction
        scsi: core: Refactor scsi_mq_setup_tags function
        scsi: core: Fix incorrect usage of shost_for_each_device
        scsi: qla2xxx: Fix endianness annotations in source files
        ...
      818dbde7
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 242b2331
      Linus Torvalds authored
      Pull rdma updates from Jason Gunthorpe:
       "A more active cycle than most of the recent past, with a few large,
        long discussed works this time.
      
        The RNBD block driver has been posted for nearly two years now, and
        is flowing through RDMA because it also introduces a new ULP.
      
        The removal of FMR has been a recurring discussion theme for a long
        time.
      
        And the usual smattering of features and bug fixes.
      
        Summary:
      
         - Various small driver bugs fixes in rxe, mlx5, hfi1, and efa
      
         - Continuing driver cleanups in bnxt_re, hns
      
         - Big cleanup of mlx5 QP creation flows
      
         - More consistent use of src port and flow label when LAG is used and
           a mlx5 implementation
      
         - Additional set of cleanups for IB CM
      
         - 'RNBD' network block driver and target. This is a network block
           RDMA device specific to ionos's cloud environment. It brings strong
           multipath and resiliency capabilities.
      
         - Accelerated IPoIB for HFI1
      
         - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple
           async fds
      
         - Support for exchanging the new IBTA defined ECE data during RDMA CM
           exchanges
      
         - Removal of the very old and insecure FMR interface from all ULPs
           and drivers. FRWR should be preferred for at least a decade now"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (247 commits)
        RDMA/cm: Spurious WARNING triggered in cm_destroy_id()
        RDMA/mlx5: Return ECE DC support
        RDMA/mlx5: Don't rely on FW to set zeros in ECE response
        RDMA/mlx5: Return an error if copy_to_user fails
        IB/hfi1: Use free_netdev() in hfi1_netdev_free()
        RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr()
        RDMA/core: Move and rename trace_cm_id_create()
        IB/hfi1: Fix hfi1_netdev_rx_init() error handling
        RDMA: Remove 'max_map_per_fmr'
        RDMA: Remove 'max_fmr'
        RDMA/core: Remove FMR device ops
        RDMA/rdmavt: Remove FMR memory registration
        RDMA/mthca: Remove FMR support for memory registration
        RDMA/mlx4: Remove FMR support for memory registration
        RDMA/i40iw: Remove FMR leftovers
        RDMA/bnxt_re: Remove FMR leftovers
        RDMA/mlx5: Remove FMR leftovers
        RDMA/core: Remove FMR pool API
        RDMA/rds: Remove FMR support for memory registration
        RDMA/srp: Remove support for FMR memory registration
        ...
      242b2331
    • Linus Torvalds's avatar
      Merge tag 'gpio-v5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 3f7e8237
      Linus Torvalds authored
      Pull GPIO updates from Linus Walleij:
       "This is the bulk of GPIO changes for the v5.8 kernel cycle.
      
        Core changes:
      
         - A new GPIO aggregator driver has been merged: this can join a few
           select GPIO lines into a new aggregated GPIO chip. This can be used
           for security: a process can be granted access to only these lines,
           for example for industrial control. Another way to use this is to
           reexpose certain select lines to a virtual machine or container.
      
         - Warn if the gpio-line-names is too long in the DT parser core.
      
         - GPIO lines can now be looked up by line name in addition to being
           looked up by offset.
      
        New drivers:
      
         - A new generic regmap GPIO driver has been merged. Too many regmap
           drivers are starting to look like each other so we need to create
           some common ground and try to move drivers over to using that.
      
         - The F7188X driver now supports F81865.
      
        Driver improvements:
      
         - Large improvements to the PCA953x expander, get multiple lines and
           several cleanups.
      
         - Large improvements to the DesignWare DWAPB driver, and Sergey Semin
           has volunteered to maintain it.
      
         - PL061 can now be built as a module, this is part of a bigger effort
           to make the ARM platforms more modular"
      
      * tag 'gpio-v5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (77 commits)
        gpio: pca953x: Drop unneeded ACPI_PTR()
        MAINTAINERS: Add gpio regmap section
        gpio: add a reusable generic gpio_chip using regmap
        gpiolib: Introduce gpiochip_irqchip_add_domain()
        gpio: gpiolib: Allow GPIO IRQs to lazy disable
        gpiolib: Separate GPIO_GET_LINEINFO_WATCH_IOCTL conditional
        gpio: rcar: Fix runtime PM imbalance on error
        gpio: pca935x: Allow IRQ support for driver built as a module
        gpio: pxa: Add COMPILE_TEST support
        dt-bindings: gpio: Add renesas,em-gio bindings
        MAINTAINERS: Fix file name for DesignWare GPIO DT schema
        gpio: dwapb: Remove unneeded has_irq member in struct dwapb_port_property
        gpio: dwapb: Don't use IRQ 0 as valid Linux interrupt
        gpio: dwapb: avoid error message for optional IRQ
        gpio: dwapb: Call acpi_gpiochip_free_interrupts() on GPIO chip de-registration
        gpio: max730x: bring gpiochip_add_data after port config
        MAINTAINERS: Add GPIO Aggregator section
        docs: gpio: Add GPIO Aggregator documentation
        gpio: Add GPIO Aggregator
        gpiolib: Add support for GPIO lookup by line name
        ...
      3f7e8237
    • Linus Torvalds's avatar
      Merge tag 'for-linus-5.8-1' of git://github.com/cminyard/linux-ipmi · 1f2dc7f5
      Linus Torvalds authored
      Pull IPMI updates from Corey Minyard:
       "A few small fixes for things, nothing earth shattering"
      
      * tag 'for-linus-5.8-1' of git://github.com/cminyard/linux-ipmi:
        ipmi:ssif: Remove dynamic platform device handing
        Try to load acpi_ipmi when an SSIF ACPI IPMI interface is added
        ipmi_si: Load acpi_ipmi when ACPI IPMI interface added
        ipmi:bt-bmc: Fix error handling and status check
        ipmi: Replace guid_copy() with import_guid() where it makes sense
        ipmi: use vzalloc instead of kmalloc for user creation
        ipmi:bt-bmc: Fix some format issue of the code
        ipmi:bt-bmc: Avoid unnecessary check
      1f2dc7f5
    • Linus Torvalds's avatar
      Merge tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio · 5a36f0f3
      Linus Torvalds authored
      Pull VFIO updates from Alex Williamson:
      
       - Block accesses to disabled MMIO space (Alex Williamson)
      
       - VFIO device migration API (Kirti Wankhede)
      
       - type1 IOMMU dirty bitmap API and implementation (Kirti Wankhede)
      
       - PCI NULL capability masking (Alex Williamson)
      
       - Memory leak fixes (Qian Cai)
      
       - Reference leak fix (Qiushi Wu)
      
      * tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio:
        vfio iommu: typecast corrections
        vfio iommu: Use shift operation for 64-bit integer division
        vfio/mdev: Fix reference count leak in add_mdev_supported_type
        vfio: Selective dirty page tracking if IOMMU backed device pins pages
        vfio iommu: Add migration capability to report supported features
        vfio iommu: Update UNMAP_DMA ioctl to get dirty bitmap before unmap
        vfio iommu: Implementation of ioctl for dirty pages tracking
        vfio iommu: Add ioctl definition for dirty pages tracking
        vfio iommu: Cache pgsize_bitmap in struct vfio_iommu
        vfio iommu: Remove atomicity of ref_count of pinned pages
        vfio: UAPI for migration interface for device state
        vfio/pci: fix memory leaks of eventfd ctx
        vfio/pci: fix memory leaks in alloc_perm_bits()
        vfio-pci: Mask cap zero
        vfio-pci: Invalidate mmaps and block MMIO access on disabled memory
        vfio-pci: Fault mmaps to enable vma tracking
        vfio/type1: Support faulting PFNMAP vmas
      5a36f0f3
    • Linus Torvalds's avatar
      Merge tag 'core_core_updates_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ac7b3421
      Linus Torvalds authored
      Pull READ_IMPLIES_EXEC changes from Borislav Petkov:
       "Split the old READ_IMPLIES_EXEC workaround from executable
        PT_GNU_STACK now that toolchains long support PT_GNU_STACK marking and
        there's no need anymore to force modern programs into having all its
        user mappings executable instead of only the stack and the PROT_EXEC
        ones.
      
        Disable that automatic READ_IMPLIES_EXEC forcing on x86-64 and
        arm64.
      
        Add tables documenting how READ_IMPLIES_EXEC is handled on x86-64, arm
        and arm64.
      
        By Kees Cook"
      
      * tag 'core_core_updates_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        arm64/elf: Disable automatic READ_IMPLIES_EXEC for 64-bit address spaces
        arm32/64/elf: Split READ_IMPLIES_EXEC from executable PT_GNU_STACK
        arm32/64/elf: Add tables to document READ_IMPLIES_EXEC
        x86/elf: Disable automatic READ_IMPLIES_EXEC on 64-bit
        x86/elf: Split READ_IMPLIES_EXEC from executable PT_GNU_STACK
        x86/elf: Add table to document READ_IMPLIES_EXEC
      ac7b3421
    • Linus Torvalds's avatar
      Merge tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 7ae77150
      Linus Torvalds authored
      Pull powerpc updates from Michael Ellerman:
      
       - Support for userspace to send requests directly to the on-chip GZIP
         accelerator on Power9.
      
       - Rework of our lockless page table walking (__find_linux_pte()) to
         make it safe against parallel page table manipulations without
         relying on an IPI for serialisation.
      
       - A series of fixes & enhancements to make our machine check handling
         more robust.
      
       - Lots of plumbing to add support for "prefixed" (64-bit) instructions
         on Power10.
      
       - Support for using huge pages for the linear mapping on 8xx (32-bit).
      
       - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound
         driver.
      
       - Removal of some obsolete 40x platforms and associated cruft.
      
       - Initial support for booting on Power10.
      
       - Lots of other small features, cleanups & fixes.
      
      Thanks to: Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan,
      Andrey Abramov, Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent
      Abali, Cédric Le Goater, Chen Zhou, Christian Zigotzky, Christophe
      JAILLET, Christophe Leroy, Dmitry Torokhov, Emmanuel Nicolet, Erhard F.,
      Gautham R. Shenoy, Geoff Levand, George Spelvin, Greg Kurz, Gustavo A.
      R. Silva, Gustavo Walbon, Haren Myneni, Hari Bathini, Joel Stanley,
      Jordan Niethe, Kajol Jain, Kees Cook, Leonardo Bras, Madhavan
      Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Michal
      Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin,
      Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram Pai,
      Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher
      Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler,
      Wolfram Sang, Xiongfeng Wang.
      
      * tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (299 commits)
        powerpc/pseries: Make vio and ibmebus initcalls pseries specific
        cxl: Remove dead Kconfig options
        powerpc: Add POWER10 architected mode
        powerpc/dt_cpu_ftrs: Add MMA feature
        powerpc/dt_cpu_ftrs: Enable Prefixed Instructions
        powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected
        powerpc: Add support for ISA v3.1
        powerpc: Add new HWCAP bits
        powerpc/64s: Don't set FSCR bits in INIT_THREAD
        powerpc/64s: Save FSCR to init_task.thread.fscr after feature init
        powerpc/64s: Don't let DT CPU features set FSCR_DSCR
        powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()
        powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG
        powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel
        powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations
        powerpc/module_64: Consolidate ftrace code
        powerpc/32: Disable KASAN with pages bigger than 16k
        powerpc/uaccess: Don't set KUEP by default on book3s/32
        powerpc/uaccess: Don't set KUAP by default on book3s/32
        powerpc/8xx: Reduce time spent in allow_user_access() and friends
        ...
      7ae77150
    • Linus Torvalds's avatar
      Merge tag 'modules-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux · 084623e4
      Linus Torvalds authored
      Pull module updates from Jessica Yu:
      
       - Harden CONFIG_STRICT_MODULE_RWX by rejecting any module that has
         SHF_WRITE|SHF_EXECINSTR sections
      
       - Remove and clean up nested #ifdefs, as it makes code hard to read
      
      * tag 'modules-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
        module: Harden STRICT_MODULE_RWX
        module: break nested ARCH_HAS_STRICT_MODULE_RWX and STRICT_MODULE_RWX #ifdefs
      084623e4
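As an illustration of the STRICT_MODULE_RWX hardening described in the module pull above, a loader can refuse any image that contains a section marked both writable and executable. The sketch below is plain userspace C and not the kernel's code: `sketch_sections_respect_wx()` and the `SKETCH_` constants are hypothetical names, and only the flag values (SHF_WRITE = 0x1, SHF_ALLOC = 0x2, SHF_EXECINSTR = 0x4) come from the ELF specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ELF section flag values; the SKETCH_ prefix marks them as local copies. */
#define SKETCH_SHF_WRITE     UINT64_C(0x1)
#define SKETCH_SHF_ALLOC     UINT64_C(0x2)
#define SKETCH_SHF_EXECINSTR UINT64_C(0x4)

/*
 * Hypothetical helper: returns false if any section's flags request
 * both write and execute permission, true otherwise.
 */
bool sketch_sections_respect_wx(const uint64_t *sh_flags, size_t count)
{
	const uint64_t wx = SKETCH_SHF_WRITE | SKETCH_SHF_EXECINSTR;
	size_t i;

	for (i = 0; i < count; i++) {
		if ((sh_flags[i] & wx) == wx)
			return false;
	}
	return true;
}

/* Usage example: a text+data layout passes, a W+X section is rejected. */
void sketch_wx_example(void)
{
	const uint64_t good[] = {
		SKETCH_SHF_ALLOC | SKETCH_SHF_EXECINSTR, /* .text-like */
		SKETCH_SHF_ALLOC | SKETCH_SHF_WRITE,     /* .data-like */
	};
	const uint64_t bad[] = {
		SKETCH_SHF_ALLOC | SKETCH_SHF_WRITE | SKETCH_SHF_EXECINSTR,
	};

	assert(sketch_sections_respect_wx(good, 2));
	assert(!sketch_sections_respect_wx(bad, 1));
}
```

Testing for both bits at once, rather than for either bit, is what keeps ordinary text and data sections acceptable.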
    • Eric Biggers's avatar
      dm crypt: avoid truncating the logical block size · 64611a15
      Eric Biggers authored
      queue_limits::logical_block_size got changed from unsigned short to
      unsigned int, but it was forgotten to update crypt_io_hints() to use the
      new type.  Fix it.
      
      Fixes: ad6bf88a ("block: fix an integer overflow in logical block size")
      Cc: stable@vger.kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      64611a15
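The truncation fixed by the dm crypt commit above can be reproduced outside the kernel. The following sketch assumes the I/O hint takes the larger of the queue's logical block size and the crypt sector size; that shape, the struct, and every `sketch_` name are assumptions for illustration and are not copied from crypt_io_hints(). It shows that a 65536-byte block size wraps to 0 when the comparison is done in a 16-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the one queue_limits field that matters here. */
struct sketch_limits {
	unsigned int logical_block_size;
};

/* Old pattern: the comparison happens in a 16-bit type, so 65536 wraps to 0. */
unsigned int sketch_hint_truncating(const struct sketch_limits *l,
				    unsigned int sector_size)
{
	uint16_t a = (uint16_t)l->logical_block_size;
	uint16_t b = (uint16_t)sector_size;

	return a > b ? a : b;
}

/* Fixed pattern: compare in the same width as the field itself. */
unsigned int sketch_hint_full_width(const struct sketch_limits *l,
				    unsigned int sector_size)
{
	unsigned int a = l->logical_block_size;

	return a > sector_size ? a : sector_size;
}

/* Usage example with a 64 KiB logical block size and 512-byte sectors. */
void sketch_lbs_example(void)
{
	struct sketch_limits l = { .logical_block_size = 65536 };

	assert(sketch_hint_truncating(&l, 512) == 512);    /* block size lost */
	assert(sketch_hint_full_width(&l, 512) == 65536);  /* block size kept */
}
```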
    • Mike Snitzer's avatar
      dm mpath: add DM device name to Failing/Reinstating path log messages · 04867370
      Mike Snitzer authored
      When there are many DM multipath devices it really helps to have
      additional context for which DM device a failed or reinstated path is
      part of.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      04867370
    • Mike Snitzer's avatar
      dm mpath: enhance queue_if_no_path debugging · 4c3f4838
      Mike Snitzer authored
      Add more DMDEBUG that shows arguments passed and caller, and another
      that shows state of related flags at end of queue_if_no_path().
      
      Also add queue_if_no_path DMDEBUG to multipath_resume().
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      4c3f4838
    • Mike Snitzer's avatar
      dm mpath: restrict queue_if_no_path state machine · 553ec94c
      Mike Snitzer authored
      Do not allow saving disabled queue_if_no_path if already saved as
      enabled; implies multiple suspends (which shouldn't ever happen).  Log
      if this unlikely scenario is ever triggered.
      
      Also, only write MPATHF_SAVED_QUEUE_IF_NO_PATH during presuspend or if
      "fail_if_no_path" message.  MPATHF_SAVED_QUEUE_IF_NO_PATH is no longer
      always modified, e.g.: even if queue_if_no_path()'s save_old_value
      argument wasn't set.  This just implies a bit tighter control over
      the management of MPATHF_SAVED_QUEUE_IF_NO_PATH.  Side-effect is
      multipath_resume() doesn't reset MPATHF_QUEUE_IF_NO_PATH unless
      MPATHF_SAVED_QUEUE_IF_NO_PATH was set (during presuspend); and at that
      time the MPATHF_SAVED_QUEUE_IF_NO_PATH bit gets cleared.  So
      MPATHF_SAVED_QUEUE_IF_NO_PATH's use is much more narrow in scope.
      
      Last, but not least, do _not_ disable queue_if_no_path during noflush
      suspend.  There is no need/benefit to saving off queue_if_no_path via
      MPATHF_SAVED_QUEUE_IF_NO_PATH and clearing MPATHF_QUEUE_IF_NO_PATH for
      noflush suspend -- by avoiding this needless queue_if_no_path flag
      churn there is less potential for MPATHF_QUEUE_IF_NO_PATH to get lost.
      Which avoids potential for IOs to be errored back up to userspace
      during DM multipath's handling of path failures.
      
      That said, this last change papers over a reported issue concerning
      request-based dm-multipath's interaction with blk-mq, relative to
      suspend and resume: multipath_endio is being called _before_
      multipath_resume.  This should never happen if DM suspend's
      blk_mq_quiesce_queue() + dm_wait_for_completion() is genuinely waiting
      for all inflight blk-mq requests to complete.  Similarly:
      drivers/md/dm.c:__dm_resume() clearly calls dm_table_resume_targets()
      _before_ dm_start_queue()'s blk_mq_unquiesce_queue() is called.  If
      the queue isn't even restarted until after multipath_resume(); the BIG
      question that still needs answering is: how can multipath_end_io beat
      multipath_resume in a race!?
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      553ec94c
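One reading of the save and restore rules in the queue_if_no_path commit above can be modelled with two booleans. This is a sketch under assumptions, not the dm-mpath implementation: the struct and the `sketch_` functions are hypothetical, the two fields only stand in for MPATHF_QUEUE_IF_NO_PATH and MPATHF_SAVED_QUEUE_IF_NO_PATH, and the special handling of the "fail_if_no_path" message is left out.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_mpath {
	bool queue_if_no_path; /* models MPATHF_QUEUE_IF_NO_PATH */
	bool saved_queue;      /* models MPATHF_SAVED_QUEUE_IF_NO_PATH */
};

/*
 * Set the queueing flag, optionally saving the old value first.
 * Returns false when the save was refused because a disabled value
 * would have overwritten an already saved "enabled" value.
 */
bool sketch_set_queue(struct sketch_mpath *m, bool enable, bool save_old_value)
{
	bool save_ok = true;

	if (save_old_value) {
		if (!m->queue_if_no_path && m->saved_queue)
			save_ok = false; /* keep the earlier saved state */
		else
			m->saved_queue = m->queue_if_no_path;
	}
	m->queue_if_no_path = enable;
	return save_ok;
}

/* Resume restores queueing only if a value was saved, then clears the save. */
void sketch_resume(struct sketch_mpath *m)
{
	if (m->saved_queue) {
		m->queue_if_no_path = true;
		m->saved_queue = false;
	}
}

/* Usage example: a repeated suspend must not lose the saved state. */
void sketch_mpath_example(void)
{
	struct sketch_mpath m = { .queue_if_no_path = true, .saved_queue = false };

	assert(sketch_set_queue(&m, false, true));   /* first presuspend */
	assert(m.saved_queue && !m.queue_if_no_path);
	assert(!sketch_set_queue(&m, false, true));  /* second save refused */
	assert(m.saved_queue);
	sketch_resume(&m);
	assert(m.queue_if_no_path && !m.saved_queue);
}
```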
    • Mike Snitzer's avatar
      dm mpath: simplify __must_push_back · a862e4e2
      Mike Snitzer authored
      Remove micro-optimization that infers device is between presuspend and
      resume (was done purely to avoid call to dm_noflush_suspending, which
      isn't expensive anyway).
      
      Remove flags argument since they are no longer checked.
      
      And remove must_push_back_bio() since it was simply a call to
      __must_push_back().
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      a862e4e2
    • Hannes Reinecke's avatar
      dm zoned: check superblock location · 27d49ac1
      Hannes Reinecke authored
      When specifying several devices the superblock location must be
      checked to ensure the devices are specified in the correct order.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      27d49ac1
    • Hannes Reinecke's avatar
      dm zoned: prefer full zones for reclaim · 2094045f
      Hannes Reinecke authored
      Prefer full zones when selecting the next zone for reclaim.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      2094045f
    • Hannes Reinecke's avatar
      dm zoned: select reclaim zone based on device index · 69875d44
      Hannes Reinecke authored
      Per-device reclaim should select zones on that device only.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      69875d44
    • Hannes Reinecke's avatar
      dm zoned: allocate zone by device index · 22c1ef66
      Hannes Reinecke authored
      When allocating a zone, pass in an indicator on which device the zone
      should be allocated; this increases performance for a multi-device
      setup because reclaim will now allocate zones on the device for which
      reclaim is running.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      22c1ef66
    • Hannes Reinecke's avatar
      dm zoned: support arbitrary number of devices · 4dba1288
      Hannes Reinecke authored
      Remove the hard-coded limit of two devices and support an unlimited
      number of additional zoned devices.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      4dba1288
    • Hannes Reinecke's avatar
      dm zoned: move random and sequential zones into struct dmz_dev · bd82fdab
      Hannes Reinecke authored
      Random and sequential zones should be part of the respective
      device structure to make arbitration between devices possible.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      bd82fdab
    • Hannes Reinecke's avatar
      dm zoned: per-device reclaim · f97809ae
      Hannes Reinecke authored
      Instead of having one reclaim workqueue for the entire set we should
      be allocating a reclaim workqueue per device; doing so will reduce
      contention and should boost performance for a multi-device setup.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      f97809ae
    • Hannes Reinecke's avatar
      dm zoned: add metadata pointer to struct dmz_dev · 18979819
      Hannes Reinecke authored
      Add a metadata pointer within struct dmz_dev and use it as argument
      for blkdev_report_zones() instead of the metadata itself.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      18979819
    • Hannes Reinecke's avatar
      dm zoned: add device pointer to struct dm_zone · 8f22272a
      Hannes Reinecke authored
      Add a pointer, to the containing device, within struct dm_zone and
      kill dmz_zone_to_dev().
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      8f22272a
    • Hannes Reinecke's avatar
      dm zoned: allocate temporary superblock for tertiary devices · 5d2c74f3
      Hannes Reinecke authored
      Checking the tertiary superblock just consists of validating UUIDs,
      crcs, and the generation number; it doesn't have contents which would
      be required during the actual operation.
      
      So allocate a temporary superblock when checking tertiary devices to
      avoid having to store it together with the 'real' superblocks.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      5d2c74f3
    • Hannes Reinecke's avatar
      dm zoned: convert to xarray · a92fbc44
      Hannes Reinecke authored
      The zones array is getting really large, and large arrays tend to
      wreak havoc with the CPU caches.  So convert it to xarray to become
      more cache friendly.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Colin Ian King <colin.king@canonical.com> # fix leak in dmz_insert
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      a92fbc44
    • Hannes Reinecke's avatar
      dm zoned: add a 'reserved' zone flag · aec67b4f
      Hannes Reinecke authored
      Instead of counting the number of reserved zones in dmz_free_zone(),
      mark the zone as 'reserved' during allocation and simplify
      dmz_free_zone().
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      aec67b4f
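The idea in the 'reserved' zone flag commit above, tagging a zone when it is allocated so that freeing it needs no counting logic, can be sketched as follows. All names here are hypothetical and the two counters are a simplification; this is not the layout used by dm-zoned.

```c
#include <assert.h>

#define SKETCH_ZONE_RESERVED (1u << 0)

struct sketch_zone {
	unsigned int flags;
};

/* Free-zone counters for the normal pool and the reserved pool. */
struct sketch_pool {
	unsigned int nr_unmapped;
	unsigned int nr_reserved;
};

/*
 * Take a zone from the reserved or the normal pool and record the choice
 * in the zone's flags. Returns 0 on success, -1 if that pool is empty.
 */
int sketch_alloc_zone(struct sketch_pool *p, struct sketch_zone *z,
		      int use_reserve)
{
	unsigned int *counter = use_reserve ? &p->nr_reserved : &p->nr_unmapped;

	if (*counter == 0)
		return -1;
	(*counter)--;
	z->flags = use_reserve ? SKETCH_ZONE_RESERVED : 0;
	return 0;
}

/* Freeing only inspects the flag to decide which pool gets the zone back. */
void sketch_free_zone(struct sketch_pool *p, const struct sketch_zone *z)
{
	if (z->flags & SKETCH_ZONE_RESERVED)
		p->nr_reserved++;
	else
		p->nr_unmapped++;
}

/* Usage example: both pools return to their starting counts. */
void sketch_zone_example(void)
{
	struct sketch_pool p = { .nr_unmapped = 4, .nr_reserved = 2 };
	struct sketch_zone a, b;

	assert(sketch_alloc_zone(&p, &a, 1) == 0);
	assert(sketch_alloc_zone(&p, &b, 0) == 0);
	assert(p.nr_unmapped == 3 && p.nr_reserved == 1);
	sketch_free_zone(&p, &a);
	sketch_free_zone(&p, &b);
	assert(p.nr_unmapped == 4 && p.nr_reserved == 2);
}
```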