1. 02 Jun, 2022 1 commit
  2. 31 May, 2022 1 commit
    • powerpc/papr_scm: don't requests stats with '0' sized stats buffer · 07bf9431
      Vaibhav Jain authored
      Sachin reported [1] that on a POWER-10 LPAR he is seeing a kernel panic
      with vPMEM when papr_scm probe is called. The panic is of the form below
      and is observed only when the option 'Enable Performance Information
      Collection' is disabled in the HMC profile for the said LPAR:
      
       Kernel attempted to write user page (1c) - exploit attempt? (uid: 0)
       BUG: Kernel NULL pointer dereference on write at 0x0000001c
       Faulting instruction address: 0xc008000001b90844
       Oops: Kernel access of bad area, sig: 11 [#1]
      <snip>
       NIP [c008000001b90844] drc_pmem_query_stats+0x5c/0x270 [papr_scm]
       LR [c008000001b92794] papr_scm_probe+0x2ac/0x6ec [papr_scm]
       Call Trace:
             0xc00000000941bca0 (unreliable)
             papr_scm_probe+0x2ac/0x6ec [papr_scm]
             platform_probe+0x98/0x150
             really_probe+0xfc/0x510
             __driver_probe_device+0x17c/0x230
      <snip>
       ---[ end trace 0000000000000000 ]---
       Kernel panic - not syncing: Fatal exception
      
      On investigation, it looks like this panic was caused by a 'stat_buffer' of
      size==0 being provided to drc_pmem_query_stats() to fetch all performance
      stats-ids of an NVDIMM. However, drc_pmem_query_stats() shouldn't have been
      called, since the vPMEM NVDIMM doesn't support any performance stat-ids.
      This was caused by a missing check for 'p->stat_buffer_len' at the beginning
      of papr_scm_pmu_check_events(); a zero length indicates that the NVDIMM
      doesn't support performance stats.
      
      Fix this by introducing the check for 'p->stat_buffer_len' at the beginning of
      papr_scm_pmu_check_events().
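
      The shape of such an early-return guard can be sketched in user space as
      below. This is a simplified model, not the driver code: struct
      scm_priv_model, query_stats_model() and check_events_model() are
      hypothetical stand-ins for the driver's private data,
      drc_pmem_query_stats() and papr_scm_pmu_check_events(), and the -ENOENT
      return value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-NVDIMM private data. */
struct scm_priv_model {
	size_t stat_buffer_len;   /* 0 means: no performance stats supported */
	int query_calls;          /* counts calls into the stats query */
};

/* Stand-in for drc_pmem_query_stats(): must never see an empty buffer. */
static int query_stats_model(struct scm_priv_model *p)
{
	assert(p->stat_buffer_len != 0);
	p->query_calls++;
	return 0;
}

/* Stand-in for papr_scm_pmu_check_events(), with the guard placed first. */
static int check_events_model(struct scm_priv_model *p)
{
	/* Bail out before touching any stats buffer (error code assumed). */
	if (p->stat_buffer_len == 0)
		return -ENOENT;

	return query_stats_model(p);
}
```

      With the guard first, a device that reports a zero-length stats buffer
      never reaches the query path at all.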
      
      [1] https://lore.kernel.org/all/6B3A522A-6A5F-4CC9-B268-0C63AA6E07D3@linux.ibm.com
      
      Fixes: 0e0946e2 ("powerpc/papr_scm: Fix leaking nvdimm_events_map elements")
      Reported-by: Sachin Sant <sachinp@linux.ibm.com>
      Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
      Tested-by: Sachin Sant <sachinp@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20220524112353.1718454-1-vaibhav@linux.ibm.com
  3. 29 May, 2022 3 commits
    • powerpc: Don't select HAVE_IRQ_EXIT_ON_IRQ_STACK · 1346d00e
      Michael Ellerman authored
      The HAVE_IRQ_EXIT_ON_IRQ_STACK option tells generic code that irq_exit()
      is called while still running on the hard irq stack (hardirq_ctx[] in
      the powerpc code).
      
      Selecting the option means the generic code will *not* switch to the
      softirq stack before running softirqs, because the code is already
      running on the (mostly empty) hard irq stack.
      
      But since commit 1b1b6a6f ("powerpc: handle irq_enter/irq_exit in
      interrupt handler wrappers"), irq_exit() is now called on the regular task
      stack, not the hard irq stack.
      
      That's because previously irq_exit() was called in __do_irq() which is
      run on the hard irq stack, but now it is called in
      interrupt_async_exit_prepare() which is called from do_irq() constructed
      by the wrapper macro, which is after the switch back to the task stack.
      
      So drop HAVE_IRQ_EXIT_ON_IRQ_STACK from the Kconfig. This will mean an
      extra stack switch when processing some interrupts, but should
      significantly reduce the likelihood of stack overflow.
      
      It also means the softirq stack will be used for running softirqs from
      other interrupts that don't use the hard irq stack, eg. timer interrupts.
      
      Fixes: 1b1b6a6f ("powerpc: handle irq_enter/irq_exit in interrupt handler wrappers")
      Cc: stable@vger.kernel.org # v5.12+
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20220525032639.1947280-1-mpe@ellerman.id.au
    • powerpc/kasan: Silence KASAN warnings in __get_wchan() · a1b29ba2
      He Ying authored
      The following KASAN warning was reported in our kernel.
      
        BUG: KASAN: stack-out-of-bounds in get_wchan+0x188/0x250
        Read of size 4 at addr d216f958 by task ps/14437
      
        CPU: 3 PID: 14437 Comm: ps Tainted: G           O      5.10.0 #1
        Call Trace:
        [daa63858] [c0654348] dump_stack+0x9c/0xe4 (unreliable)
        [daa63888] [c035cf0c] print_address_description.constprop.3+0x8c/0x570
        [daa63908] [c035d6bc] kasan_report+0x1ac/0x218
        [daa63948] [c00496e8] get_wchan+0x188/0x250
        [daa63978] [c0461ec8] do_task_stat+0xce8/0xe60
        [daa63b98] [c0455ac8] proc_single_show+0x98/0x170
        [daa63bc8] [c03cab8c] seq_read_iter+0x1ec/0x900
        [daa63c38] [c03cb47c] seq_read+0x1dc/0x290
        [daa63d68] [c037fc94] vfs_read+0x164/0x510
        [daa63ea8] [c03808e4] ksys_read+0x144/0x1d0
        [daa63f38] [c005b1dc] ret_from_syscall+0x0/0x38
        --- interrupt: c00 at 0x8fa8f4
            LR = 0x8fa8cc
      
        The buggy address belongs to the page:
        page:98ebcdd2 refcount:0 mapcount:0 mapping:00000000 index:0x2 pfn:0x1216f
        flags: 0x0()
        raw: 00000000 00000000 01010122 00000000 00000002 00000000 ffffffff 00000000
        raw: 00000000
        page dumped because: kasan: bad access detected
      
        Memory state around the buggy address:
         d216f800: 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00
         d216f880: f2 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        >d216f900: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
                                                  ^
         d216f980: f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 00
         d216fa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      
      After looking into this issue, I found that the buggy address belongs
      to the task stack region, so the KASAN report appears to be a false
      positive. I looked at the x86 implementation of __get_wchan() and
      found that the same issue was resolved there by commit
      f7d27c35 ("x86/mm, kasan: Silence KASAN warnings in get_wchan()").
      The same solution can be applied to powerpc.

      As Andrey Ryabinin said, get_wchan() is racy by design: it may
      access the volatile stack of a running task, and thus a redzone
      in a stack frame, which causes KASAN to warn.
      
      Use READ_ONCE_NOCHECK() to silence these warnings.
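
      The idea of an uninstrumented stack walk can be sketched in user space as
      below. This is only an approximation: the kernel's READ_ONCE_NOCHECK() is
      modelled by a helper marked no_sanitize_address where the compiler
      supports it, and the two-word frame layout, in_sched_range() and
      first_caller_outside_sched() are hypothetical names invented for the
      illustration.

```c
#include <stdint.h>

/* Use the sanitizer opt-out attribute only where the compiler offers it. */
#if defined(__has_attribute)
# if __has_attribute(no_sanitize_address)
#  define NO_ASAN __attribute__((no_sanitize_address))
# endif
#endif
#ifndef NO_ASAN
# define NO_ASAN
#endif

/* Single volatile load that the address sanitizer does not instrument. */
static NO_ASAN uintptr_t read_once_nocheck(const uintptr_t *addr)
{
	return *(const volatile uintptr_t *)addr;
}

/* Hypothetical frame layout: word 0 = back chain, word 1 = return address. */
enum { BACK_CHAIN = 0, SAVED_LR = 1, MAX_DEPTH = 16 };

/* Hypothetical predicate standing in for in_sched_functions(). */
static int in_sched_range(uintptr_t ip)
{
	return ip >= 0x1000 && ip < 0x2000;
}

/* Walk the chain and return the first return address outside the range. */
static uintptr_t first_caller_outside_sched(uintptr_t sp)
{
	for (int depth = 0; depth < MAX_DEPTH && sp != 0; depth++) {
		const uintptr_t *frame = (const uintptr_t *)sp;
		uintptr_t ip = read_once_nocheck(&frame[SAVED_LR]);

		if (!in_sched_range(ip))
			return ip;
		sp = read_once_nocheck(&frame[BACK_CHAIN]);
	}
	return 0;
}
```

      Every load from the walked stack goes through the uninstrumented helper,
      so a racy read that lands in a redzone is not reported.
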
      Reported-by: Wanming Hu <huwanming@huaweil.com>
      Signed-off-by: He Ying <heying24@huawei.com>
      Signed-off-by: Chen Jingwen <chenjingwen6@huawei.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20220121014418.155675-1-heying24@huawei.com
    • powerpc/kasan: Mark more real-mode code as not to be instrumented · 743cdb7b
      Paul Mackerras authored
      This marks more files and functions that can possibly be called in
      real mode as not to be instrumented by KASAN.  Most were found by
      inspection, except for get_pseries_errorlog() which was reported as
      causing a crash in testing.
      Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/YoX1kZPnmUX4RZEK@cleo
  4. 28 May, 2022 6 commits
    • Merge tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 6112bd00
      Linus Torvalds authored
      Pull powerpc updates from Michael Ellerman:
      
       - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT)
      
       - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later)
      
       - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ
      
       - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later
      
       - Drop support for system call instruction emulation
      
       - Many other small features and fixes
      
      Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas
      Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian
      King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank
      Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai,
      Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing
      Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof
      Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes,
      Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas
      Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras,
      Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib
      Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang
      wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing,
      Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng.
      
      * tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits)
        powerpc/64: Include cache.h directly in paca.h
        powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set
        powerpc/xics: Include missing header
        powerpc/powernv/pci: Drop VF MPS fixup
        powerpc/fsl_book3e: Don't set rodata RO too early
        powerpc/microwatt: Add mmu bits to device tree
        powerpc/powernv/flash: Check OPAL flash calls exist before using
        powerpc/powermac: constify device_node in of_irq_parse_oldworld()
        powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
        selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch"
        powerpc: Enable the DAWR on POWER9 DD2.3 and above
        powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask
        powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask
        powerpc: Fix all occurences of "the the"
        selftests/powerpc/pmu/ebb: remove fixed_instruction.S
        powerpc/platforms/83xx: Use of_device_get_match_data()
        powerpc/eeh: Drop redundant spinlock initialization
        powerpc/iommu: Add missing of_node_put in iommu_init_early_dart
        powerpc/pseries/vas: Call misc_deregister if sysfs init fails
        powerpc/papr_scm: Fix leaking nvdimm_events_map elements
        ...
    • Merge tag 'pinctrl-v5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 907bb57a
      Linus Torvalds authored
      Pull pin control updates from Linus Walleij:
       "Pretty big this time. Mostly due to (nice) Renesas refactorings.
      
        Core changes:
      
         - New helpers from Andy such as for_each_gpiochip_node() affecting
           both GPIO and pin control, improving a bunch of drivers in the
           process.
      
         - Pulled in Marc Zyngier's work to make IRQ chips immutable, and
           started to apply fixups on top.
      
        New drivers:
      
         - New driver for Marvell MVEBU 98DX2530.
      
         - New driver for Mediatek MT8195.
      
         - Support Qualcomm PMX65 and PM6125.
      
         - New driver for Qualcomm SC7280 LPASS pin control.
      
         - New driver for Rockchip RK3588.
      
         - New driver for NXP Freescale i.MXRT1170.
      
         - New driver for Mediatek MT6795 Helio X10.
      
        Improvements:
      
         - Several Aspeed G6 cleanups and non-critical fixes.
      
         - Thorough refactoring of some of the ever improving Renesas
           drivers.
      
         - Clean up Mediatek MT8192 bindings a bit.
      
         - PWM output and clock monitoring in the Ocelot LAN966x driver.
      
         - Thorough refactoring and cleanup of the Ralink drivers such as
           RT2880, RT3883, RT305X, MT7620, MT7621, MT7628 splitting these into
           proper sub-drivers"
      
      * tag 'pinctrl-v5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (161 commits)
        pinctrl: apple: Use a raw spinlock for the regmap
        pinctrl: berlin: bg4ct: Use devm_platform_*ioremap_resource() APIs
        pinctrl: intel: Fix kernel doc format, i.e. add return sections
        dt-bindings: pinctrl: qcom: Drop 'maxItems' on 'wakeup-parent'
        pinctrl: starfive: Make the irqchip immutable
        pinctrl: mediatek: Add pinctrl driver for MT6795 Helio X10
        dt-bindings: pinctrl: Add MediaTek MT6795 pinctrl bindings
        pinctrl: freescale: Add i.MXRT1170 pinctrl driver support
        dt-bindings: pinctrl: add i.MXRT1170 pinctrl Documentation
        dt-bindings: pinctrl: rockchip: increase max amount of device functions
        dt-bindings: pinctrl: qcom,pmic-gpio: add 'gpio-reserved-ranges'
        dt-bindings: pinctrl: qcom,pmic-gpio: add 'input-disable'
        dt-bindings: pinctrl: qcom,pmic-gpio: describe gpio-line-names
        dt-bindings: pinctrl: qcom,pmic-gpio: fix matching pin config
        dt-bindings: pinctrl: qcom,pmic-gpio: document PM8150L and PMM8155AU
        pinctrl: qcom: spmi-gpio: Add pm6125 compatible
        dt-bindings: pinctrl: qcom-pmic-gpio: Add pm6125 compatible
        pinctrl: intel: Drop unused irqchip member in struct intel_pinctrl
        pinctrl: intel: make irq_chip immutable
        pinctrl: cherryview: Use GPIO chip pointer in chv_gpio_irq_mask_unmask()
        ...
    • Revert "crypto: poly1305 - cleanup stray CRYPTO_LIB_POLY1305_RSIZE" · ca7984df
      Jason A. Donenfeld authored
      This reverts commit 8bdc2a19.
      
      It got merged a bit prematurely and shortly after the kernel test robot
      and Sudip pointed out build failures:
      
        arm: imx_v6_v7_defconfig and multi_v7_defconfig
        mips: decstation_64_defconfig, decstation_defconfig, decstation_r4k_defconfig
      
        In file included from crypto/chacha20poly1305.c:13:
        include/crypto/poly1305.h:56:46: error: 'CONFIG_CRYPTO_LIB_POLY1305_RSIZE' undeclared here (not in a function); did you mean 'CONFIG_CRYPTO_POLY1305_MODULE'?
           56 |                 struct poly1305_key opaque_r[CONFIG_CRYPTO_LIB_POLY1305_RSIZE];
              |                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      We could attempt to fix this by listing the dependencies piecemeal, but
      it's not as obvious as it looks: drivers like caam use this macro in
      headers even if there's no .o compiled in that makes use of it.  So
      actually fixing this might require a bit more of a comprehensive
      approach, rather than whack-a-mole with hunting down which drivers use
      which headers which use this macro.
      
      Therefore, this commit just reverts the change, and maybe the problem
      can be visited on the next rainy day.
      Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Fixes: 8bdc2a19 ("crypto: poly1305 - cleanup stray CRYPTO_LIB_POLY1305_RSIZE")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl · 9d004b2f
      Linus Torvalds authored
      Pull cxl updates from Dan Williams:
       "Compute Express Link (CXL) updates for this cycle.
      
        The highlight is new driver-core infrastructure and CXL subsystem
        changes for allowing lockdep to validate device_lock() usage. Thanks
        to PeterZ for setting me straight on the current capabilities of the
        lockdep API, and Greg acked it as well.
      
        On the CXL ACPI side this update adds support for CXL _OSC so that
        platform firmware knows that it is safe to still grant Linux native
        control of PCIe hotplug and error handling in the presence of CXL
        devices. A circular dependency problem was discovered between suspend
        and CXL memory for cases where the suspend image might be stored in
        CXL memory where that image also contains the PCI register state to
        restore to re-enable the device. Disable suspend for now until an
        architecture is defined to clarify that conflict.
      
        Lastly a collection of reworks, fixes, and cleanups to the CXL
        subsystem where support for snooping mailbox commands and properly
        handling the "mem_enable" flow are the highlights.
      
        Summary:
      
         - Add driver-core infrastructure for lockdep validation of
           device_lock(), and fixup a deadlock report that was previously
           hidden behind the 'lockdep no validate' policy.
      
         - Add CXL _OSC support for claiming native control of CXL hotplug and
           error handling.
      
         - Disable suspend in the presence of CXL memory unless and until a
           protocol is identified for restoring PCI device context from memory
           hosted on CXL PCI devices.
      
         - Add support for snooping CXL mailbox commands to protect against
           inopportune changes, like set-partition with the 'immediate' flag
           set.
      
         - Rework how the driver detects legacy CXL 1.1 configurations (CXL
           DVSEC / 'mem_enable') before enabling new CXL 2.0 decode
           configurations (CXL HDM Capability).
      
         - Miscellaneous cleanups and fixes from -next exposure"
      
      * tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (47 commits)
        cxl/port: Enable HDM Capability after validating DVSEC Ranges
        cxl/port: Reuse 'struct cxl_hdm' context for hdm init
        cxl/port: Move endpoint HDM Decoder Capability init to port driver
        cxl/pci: Drop @info argument to cxl_hdm_decode_init()
        cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init()
        cxl/mem: Skip range enumeration if mem_enable clear
        cxl/mem: Consolidate CXL DVSEC Range enumeration in the core
        cxl/pci: Move cxl_await_media_ready() to the core
        cxl/mem: Validate port connectivity before dvsec ranges
        cxl/mem: Fix cxl_mem_probe() error exit
        cxl/pci: Drop wait_for_valid() from cxl_await_media_ready()
        cxl/pci: Consolidate wait_for_media() and wait_for_media_ready()
        cxl/mem: Drop mem_enabled check from wait_for_media()
        nvdimm: Fix firmware activation deadlock scenarios
        device-core: Kill the lockdep_mutex
        nvdimm: Drop nd_device_lock()
        ACPI: NFIT: Drop nfit_device_lock()
        nvdimm: Replace lockdep_mutex with local lock classes
        cxl: Drop cxl_device_lock()
        cxl/acpi: Add root device lockdep validation
        ...
    • Merge tag 'clang-format-for-linus-v5.19-rc1' of https://github.com/ojeda/linux · a9f94826
      Linus Torvalds authored
      Pull clang-format updates from Miguel Ojeda:
       "clang-format modernization and cleanups.
      
        A few changes from Brian Norris and Mickaël Salaün to start taking
        advantage of some clang-format 11 features, plus a few cleanups and
        the usual update of the macro list"
      
      * tag 'clang-format-for-linus-v5.19-rc1' of https://github.com/ojeda/linux:
        clang-format: Fix space after for_each macros
        clang-format: Fix goto labels indentation
        clang-format: Update to clang-format >= 6
        clang-format: Extend the for_each list with tools/
        clang-format: Simplify command with `sort -u`
        clang-format: Use POSIX locale for `sort`
        clang-format: Update with v5.18-rc7's `for_each` macro list
    • Merge tag 'v5.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · d075c0c1
      Linus Torvalds authored
      Pull crypto updates from Herbert Xu:
       "API:
      
         - Test in-place en/decryption with two sglists in testmgr
      
         - Fix process vs softirq race in cryptd
      
        Algorithms:
      
         - Add arm64 acceleration for sm4
      
         - Add s390 acceleration for chacha20
      
        Drivers:
      
         - Add polarfire soc hwrng support in mpsf
      
         - Add support for TI SoC AM62x in sa2ul
      
         - Add support for ATSHA204 cryptochip in atmel-sha204a
      
         - Add support for PRNG in caam
      
         - Restore support for storage encryption in qat
      
         - Restore support for storage encryption in hisilicon/sec"
      
      * tag 'v5.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits)
        hwrng: omap3-rom - fix using wrong clk_disable() in omap_rom_rng_runtime_resume()
        crypto: hisilicon/sec - delete the flag CRYPTO_ALG_ALLOCATES_MEMORY
        crypto: qat - add support for 401xx devices
        crypto: qat - re-enable registration of algorithms
        crypto: qat - honor CRYPTO_TFM_REQ_MAY_SLEEP flag
        crypto: qat - add param check for DH
        crypto: qat - add param check for RSA
        crypto: qat - remove dma_free_coherent() for DH
        crypto: qat - remove dma_free_coherent() for RSA
        crypto: qat - fix memory leak in RSA
        crypto: qat - add backlog mechanism
        crypto: qat - refactor submission logic
        crypto: qat - use pre-allocated buffers in datapath
        crypto: qat - set to zero DH parameters before free
        crypto: s390 - add crypto library interface for ChaCha20
        crypto: talitos - Uniform coding style with defined variable
        crypto: octeontx2 - simplify the return expression of otx2_cpt_aead_cbc_aes_sha_setkey()
        crypto: cryptd - Protect per-CPU resource by disabling BH.
        crypto: sun8i-ce - do not fallback if cryptlen is less than sg length
        crypto: sun8i-ce - rework debugging
        ...
  5. 27 May, 2022 29 commits
    • Merge tag '5.19-rc-smb3-client-fixes-updated' of git://git.samba.org/sfrench/cifs-2.6 · bf272460
      Linus Torvalds authored
      Pull cifs client updates from Steve French:
      
       - multichannel fixes to improve reconnect after network failure
      
       - improved caching of root directory contents (extending benefit of
         directory leases)
      
       - two DFS fixes
      
       - three fixes for improved debugging
      
       - an NTLMSSP fix for mounts to older servers
      
       - new mount parm to allow disabling creating sparse files
      
       - various cleanup fixes and minor fixes pointed out by coverity
      
      * tag '5.19-rc-smb3-client-fixes-updated' of git://git.samba.org/sfrench/cifs-2.6: (24 commits)
        smb3: remove unneeded null check in cifs_readdir
        cifs: fix ntlmssp on old servers
        cifs: cache the dirents for entries in a cached directory
        cifs: avoid parallel session setups on same channel
        cifs: use new enum for ses_status
        cifs: do not use tcpStatus after negotiate completes
        smb3: add mount parm nosparse
        smb3: don't set rc when used and unneeded in query_info_compound
        smb3: check for null tcon
        cifs: fix minor compile warning
        Add various fsctl structs
        Add defines for various newer FSCTLs
        smb3: add trace point for oplock not found
        cifs: return the more nuanced writeback error on close()
        smb3: add trace point for lease not found issue
        cifs: smbd: fix typo in comment
        cifs: set the CREATE_NOT_FILE when opening the directory in use_cached_dir()
        cifs: check for smb1 in open_cached_dir()
        cifs: move definition of cifs_fattr earlier in cifsglob.h
        cifs: print TIDs as hex
        ...
    • Merge tag 'jfs-5.19' of https://github.com/kleikamp/linux-shaggy · aef1ff15
      Linus Torvalds authored
      Pull jfs updates from David Kleikamp:
       "One bug fix and some code cleanup"
      
      * tag 'jfs-5.19' of https://github.com/kleikamp/linux-shaggy:
        fs/jfs: Remove dead code
        fs: jfs: fix possible NULL pointer dereference in dbFree()
    • Merge tag 'libnvdimm-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 35cdd865
      Linus Torvalds authored
      Pull libnvdimm and DAX updates from Dan Williams:
       "New support for clearing memory errors when a file is in DAX mode,
        along with some other fixes and cleanups.

        Previously it was only possible to clear these errors using a truncate
        or hole-punch operation to trigger the filesystem to reallocate the
        block; now any page-aligned write can opportunistically clear errors
        as well.
      
        This change spans x86/mm, nvdimm, and fs/dax, and has received the
        appropriate sign-offs. Thanks to Jane for her work on this.
      
        Summary:
      
         - Add support for clearing memory error via pwrite(2) on DAX
      
         - Fix 'security overwrite' support in the presence of media errors
      
         - Miscellaneous cleanups and fixes for nfit_test (nvdimm unit tests)"
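
      From user space, a page-aligned write of the kind described above might
      look like the sketch below. rewrite_whole_page() is a hypothetical helper
      name, the sketch assumes a POSIX system, and whether such a write
      actually clears a memory error depends on the kernel, the filesystem and
      the device; on an ordinary file it is just a plain one-page rewrite.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pwrite() under strict compiler modes */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Rewrite one whole page of `fd` at a page-aligned offset.
 * `buf` must hold at least one page of data.
 * Returns 0 on success or a negative errno value.
 */
static int rewrite_whole_page(int fd, off_t page_index, const void *buf)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t done = 0;

	if (page_size <= 0 || page_index < 0)
		return -EINVAL;

	/* Offset and length are both multiples of the page size. */
	off_t start = page_index * (off_t)page_size;

	while (done < (size_t)page_size) {
		ssize_t n = pwrite(fd, (const char *)buf + done,
				   (size_t)page_size - done,
				   start + (off_t)done);
		if (n < 0) {
			if (errno == EINTR)
				continue;       /* interrupted, retry */
			return -errno;
		}
		if (n == 0)
			return -EIO;            /* no progress, give up */
		done += (size_t)n;
	}
	return 0;
}
```

      A caller would typically refill the page with known-good data from a
      backup or a recomputation before issuing the write.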
      
      * tag 'libnvdimm-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        pmem: implement pmem_recovery_write()
        pmem: refactor pmem_clear_poison()
        dax: add .recovery_write dax_operation
        dax: introduce DAX_RECOVERY_WRITE dax access mode
        mce: fix set_mce_nospec to always unmap the whole page
        x86/mce: relocate set{clear}_mce_nospec() functions
        acpi/nfit: rely on mce->misc to determine poison granularity
        testing: nvdimm: asm/mce.h is not needed in nfit.c
        testing: nvdimm: iomap: make __nfit_test_ioremap a macro
        nvdimm: Allow overwrite in the presence of disabled dimms
        tools/testing/nvdimm: remove unneeded flush_workqueue
    • Merge tag 'mfd-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · ea6c3bc6
      Linus Torvalds authored
      Pull MFD updates from Lee Jones:
       "New Device Support
         - Add support for {Power,Home} Keys to MediaTek MT6359
         - Add support for SC2730 to Spreadtrum SPRD SC27XX SPI
         - Add support for additional Alder Lake-P I2C Controllers to Intel
           LPSS PCI
      
        Fix-ups:
         - Convert GPIO to GPIOD (hi655x-pmic)
         - Only register devices that exist (cros_ec_dev)
         - Remove unused code (syscon, reg-mux)
         - Rework .remove() API to return void (twl-core, rt4831)
         - Trivial - whitespace, spelling, coding style (tps65218,
           sprd-sc27xx-spi, google,cros-ec)
         - DT binding changes (samsung,exynos5433-lpass, rockchip,rk805,
           rockchip,rk808, rockchip,rk809, rockchip,rk817, rockchip,rk818,
           wlf,arizona)
      
        Bug Fixes:
         - Fix error handling bugs (ipaq-micro, davinci_voicecodec)"
      
      * tag 'mfd-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
        dt-bindings: cros-ec: Fix a typo in description
        dt-bindings: mfd: wlf,arizona: Add spi-max-frequency
        mfd: rt4831: Improve error reporting for problems during .remove()
        mfd: davinci_voicecodec: Fix possible null-ptr-deref davinci_vc_probe()
        mfd: intel-lpss: Add support for ADL-P i2c6 and i2c7
        dt-bindings: mfd: rk808: Convert bindings to yaml
        mfd: twl4030: Make twl4030_exit_irq() return void
        mfd: twl6030: Make twl6030_exit_irq() return void
        dt-bindings: mfd: samsung,exynos5433-lpass: Fix 'dma-channels/requests' properties
        mfd: sprd: Jugle {of,spi}_device_id tables into numerical order
        mfd: sprd: Add SC2730 PMIC to SPI device ID table
        dt-bindings: Drop undocumented i.MX iomuxc-gpr bindings in examples
        mfd: cros_ec_dev: Only register PCHG device if present
        mfd: mt6397-core: Add resources for PMIC keys for MT6359
        mfd: mt6359: Add missing defines necessary for mtk-pmic-keys support
        mfd: ipaq-micro: Fix error check return value of platform_get_irq()
        mfd: hi655x-pmic: Replace legacy gpio interface for gpiod interface
        mfd: tps65218: Fix trivial typo in comment
    • Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 6b0e34a0
      Linus Torvalds authored
      Pull clk updates from Stephen Boyd:
       "Mainly driver updates this time around.
      
        There's a single patch to the core clk framework that simplifies a
        runtime PM call. Otherwise the majority of the diff falls to a few SoC
        drivers: Qualcomm, STM32 and MediaTek. Those SoCs gain some new
        hardware support and what comes along with that is quite a few lines
        of data and some clk_ops code.
      
        Beyond the new hardware support we have the usual pile of driver
        updates that add missing clks on already supported SoCs or fix up
        problems like bad clk tree descriptions. It's nice to see that more
        drivers are moving to clk_hw based APIs too.
      
        New Drivers:
         - Add STM32MP13 RCC driver (Reset Clock Controller)
         - MediaTek MT8186 SoC clk support
         - Airoha EN7523 SoC system clocks
         - Clock driver for exynosautov9 SoC
         - Renesas R-Car V4H and RZ/V2M SoCs
         - Renesas RZ/G2UL SoC
         - LPASS clk driver for Qualcomm sc7280 SoC
         - GCC clk driver for Qualcomm SC8280XP SoC
      
        Updates:
         - SDCC uses floor clk ops on Qualcomm MSM8976
         - Add modem reset and fix RPM clks on Qualcomm MSM8976
         - Add the two missing CLKOUT clocks for U8500/DB8500 SoC
         - Mark some clks critical on Ingenic X1000
         - Convert ux500 to clk_hw
         - Move MediaTek driver to clk_hw provider APIs
         - Use i2c driver probe_new to avoid id scans
         - Convert a number of Rockchip dt bindings to YAML
         - Mark hclk_vo critical on Rockchip rk3568
         - Use pm_runtime_resume_and_get to fix pm_runtime_get_sync() usage
         - Various cleanups like memory allocation error checks and plugged
           leaks
         - Allwinner H6 RTC clock support
         - Allwinner H616 32 kHz clock support
         - Add the Universal Flash Storage clock on Renesas R-Car S4-8
         - Add I2C, SSIF-2 (sound), USB, CANFD, OSTM (timer), WDT, SPI Multi
           I/O Bus, RSPI, TSU (thermal), and ADC clocks and resets on Renesas
           RZ/G2UL
         - Add display clock support on Renesas RZ/G2L
         - Add RPC (QSPI/HyperFlash) clocks on Renesas R-Car E3 and D3
         - Add 27 MHz phy PLL ref clock on i.MX
         - Add mcore_booted module parameter to tell kernel M core has already
           booted for i.MX
         - Remove snvs clock on i.MX because it was for secure world only
         - Add dt bindings for i.MX8MN GPT
         - Add DISP2 pixel clock for i.MX8MP
         - Add clkout1/2 for i.MX8MP
         - Fix parent clock of usb_root_clk for i.MX8MP
         - Implement better RCG parking on Qualcomm SoCs using the shared RCG
           clk ops
         - Kerneldoc fixes
         - Switch Tegra BPMP to determine_rate clk op
         - Add a pointer to dt schema for generic clock bindings"
      
      * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (168 commits)
        Revert "clk: qcom: regmap-mux: add pipe clk implementation"
        Revert "clk: qcom: gcc-sc7280: use new clk_regmap_mux_safe_ops for PCIe pipe clocks"
        Revert "clk: qcom: gcc-sm8450: use new clk_regmap_mux_safe_ops for PCIe pipe clocks"
        clk: bcm: rpi: Use correct order for the parameters of devm_kcalloc()
        clk: stm32mp13: add safe mux management
        clk: stm32mp13: add multi mux function
        clk: stm32mp13: add all STM32MP13 kernel clocks
        clk: stm32mp13: add all STM32MP13 peripheral clocks
        clk: stm32mp13: manage secured clocks
        clk: stm32mp13: add composite clock
        clk: stm32mp13: add stm32 divider clock
        clk: stm32mp13: add stm32_gate management
        clk: stm32mp13: add stm32_mux clock management
        clk: stm32: Introduce STM32MP13 RCC drivers (Reset Clock Controller)
        dt-bindings: rcc: stm32: add new compatible for STM32MP13 SoC
        clk: ti: clkctrl: replace usage of found with dedicated list iterator variable
        clk: ti: composite: Prefer kcalloc over open coded arithmetic
        dt-bindings: clock: exynosautov9: correct count of NR_CLK
        clk: mediatek: mt8173: Switch to clk_hw provider APIs
        clk: mediatek: Switch to clk_hw provider APIs
        ...
      6b0e34a0
    • Linus Torvalds's avatar
      Merge tag 'pci-v5.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 3cc30140
      Linus Torvalds authored
      Pull pci updates from Bjorn Helgaas:
       "Resource management:
      
         - Restrict E820 clipping to PCI host bridge windows (Bjorn Helgaas)
      
         - Log E820 clipping better (Bjorn Helgaas)
      
         - Add kernel cmdline options to enable/disable E820 clipping (Hans de
           Goede)
      
         - Disable E820 reserved region clipping for IdeaPads, Yoga, Yoga
           Slim, Acer Spin 5, Clevo Barebone systems where clipping leaves no
           usable address space for touchpads, Thunderbolt devices, etc (Hans
           de Goede)
      
         - Disable E820 clipping by default starting in 2023 (Hans de Goede)
      
        PCI device hotplug:
      
         - Include files to remove implicit dependencies (Christophe Leroy)
      
         - Only put Root Ports in D3 if they can signal and wake from D3 so
           AMD Yellow Carp doesn't miss hotplug events (Mario Limonciello)
      
        Power management:
      
         - Define pci_restore_standard_config() only for CONFIG_PM_SLEEP since
           it's unused otherwise (Krzysztof Kozlowski)
      
         - Power up devices completely, including anything platform firmware
           needs to do, during runtime resume (Rafael J. Wysocki)
      
         - Move pci_resume_bus() to PM callbacks so we observe the required
           bridge power-up delays (Rafael J. Wysocki)
      
         - Drop unneeded runtime_d3cold device flag (Rafael J. Wysocki)
      
         - Split pci_raw_set_power_state() between pci_power_up() and a new
           pci_set_low_power_state() (Rafael J. Wysocki)
      
         - Set current_state to D3cold if config read returns ~0, indicating
           the device is not accessible (Rafael J. Wysocki)
      
         - Do not call pci_update_current_state() from pci_power_up() so BARs
           and ASPM config are restored correctly (Rafael J. Wysocki)
      
         - Write 0 to PMCSR in pci_power_up() in all cases (Rafael J. Wysocki)
      
         - Split pci_power_up() to pci_set_full_power_state() to avoid some
           redundant operations (Rafael J. Wysocki)
      
         - Skip restoring BARs if device is not in D0 (Rafael J. Wysocki)
      
         - Rearrange and clarify pci_set_power_state() (Rafael J. Wysocki)
      
         - Remove redundant BAR restores from pci_pm_thaw_noirq() (Rafael J.
           Wysocki)
      
        Virtualization:
      
         - Acquire device lock before config space access lock to avoid AB/BA
           deadlock with sriov_numvfs_store() (Yicong Yang)
      
        Error handling:
      
         - Clear MULTI_ERR_COR/UNCOR_RCV bits, which a race could previously
           leave permanently set (Kuppuswamy Sathyanarayanan)
      
        Peer-to-peer DMA:
      
         - Whitelist Intel Skylake-E Root Ports regardless of which devfn they
           are (Shlomo Pongratz)
      
        ASPM:
      
         - Override L1 acceptable latency advertised by Intel DG2 so ASPM L1
           can be enabled (Mika Westerberg)
      
        Cadence PCIe controller driver:
      
         - Set up device-specific register to allow PTM Responder to be
           enabled by the normal architected bit (Christian Gmeiner)
      
         - Override advertised FLR support since the controller doesn't
           implement FLR correctly (Parshuram Thombare)
      
        Cadence PCIe endpoint driver:
      
         - Correct bitmap size for the ob_region_map of outbound window usage
           (Dan Carpenter)
      
        Freescale i.MX6 PCIe controller driver:
      
         - Fix PERST# assertion/deassertion so we observe the required delays
           before accessing device (Francesco Dolcini)
      
        Freescale Layerscape PCIe controller driver:
      
         - Add "big-endian" DT property (Hou Zhiqiang)
      
         - Update SCFG DT property (Hou Zhiqiang)
      
         - Add "aer", "pme", "intr" DT properties (Li Yang)
      
         - Add DT compatible strings for ls1028a (Xiaowei Bao)
      
        Intel VMD host bridge driver:
      
         - Assign VMD IRQ domain before enumeration to avoid IOMMU interrupt
           remapping errors when MSI-X remapping is disabled (Nirmal Patel)
      
         - Revert VMD workaround that kept MSI-X remapping enabled when IOMMU
           remapping was enabled (Nirmal Patel)
      
        Marvell MVEBU PCIe controller driver:
      
         - Add of_pci_get_slot_power_limit() to parse the
           'slot-power-limit-milliwatt' DT property (Pali Rohár)
      
         - Add mvebu support for sending Set_Slot_Power_Limit message (Pali
           Rohár)
      
        MediaTek PCIe controller driver:
      
         - Fix refcount leak in mtk_pcie_subsys_powerup() (Miaoqian Lin)
      
        MediaTek PCIe Gen3 controller driver:
      
         - Reset PHY and MAC at probe time (AngeloGioacchino Del Regno)
      
        Microchip PolarFire PCIe controller driver:
      
         - Add chained_irq_enter()/chained_irq_exit() calls to mc_handle_msi()
           and mc_handle_intx() to avoid lost interrupts (Conor Dooley)
      
         - Fix interrupt handling race (Daire McNamara)
      
        NVIDIA Tegra194 PCIe controller driver:
      
         - Drop tegra194 MSI register save/restore, which is unnecessary since
           the DWC core does it (Jisheng Zhang)
      
        Qualcomm PCIe controller driver:
      
         - Add SM8150 SoC DT binding and support (Bhupesh Sharma)
      
         - Fix pipe clock imbalance (Johan Hovold)
      
         - Fix runtime PM imbalance on probe errors (Johan Hovold)
      
         - Fix PHY init imbalance on probe errors (Johan Hovold)
      
         - Convert DT binding to YAML (Dmitry Baryshkov)
      
         - Update DT binding to show that resets aren't required for
           MSM8996/APQ8096 platforms (Dmitry Baryshkov)
      
         - Add explicit register names per chipset in DT binding (Dmitry
           Baryshkov)
      
         - Add sc7280-specific clock and reset definitions to DT binding
           (Dmitry Baryshkov)
      
        Rockchip PCIe controller driver:
      
         - Fix bitmap size when searching for free outbound region (Dan
           Carpenter)
      
        Rockchip DesignWare PCIe controller driver:
      
         - Remove "snps,dw-pcie" from rockchip-dwc DT "compatible" property
           because it's not fully compatible with rockchip (Peter Geis)
      
         - Reset rockchip-dwc controller at probe (Peter Geis)
      
         - Add rockchip-dwc INTx support (Peter Geis)
      
        Synopsys DesignWare PCIe controller driver:
      
         - Return error instead of success if DMA mapping of MSI area fails
           (Jiantao Zhang)
      
        Miscellaneous:
      
         - Change pci_set_dma_mask() documentation references to
           dma_set_mask() (Alex Williamson)"
      
      * tag 'pci-v5.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (64 commits)
        dt-bindings: PCI: qcom: Add schema for sc7280 chipset
        dt-bindings: PCI: qcom: Specify reg-names explicitly
        dt-bindings: PCI: qcom: Do not require resets on msm8996 platforms
        dt-bindings: PCI: qcom: Convert to YAML
        PCI: qcom: Fix unbalanced PHY init on probe errors
        PCI: qcom: Fix runtime PM imbalance on probe errors
        PCI: qcom: Fix pipe clock imbalance
        PCI: qcom: Add SM8150 SoC support
        dt-bindings: pci: qcom: Document PCIe bindings for SM8150 SoC
        x86/PCI: Disable E820 reserved region clipping starting in 2023
        x86/PCI: Disable E820 reserved region clipping via quirks
        x86/PCI: Add kernel cmdline options to use/ignore E820 reserved regions
        PCI: microchip: Fix potential race in interrupt handling
        PCI/AER: Clear MULTI_ERR_COR/UNCOR_RCV bits
        PCI: cadence: Clear FLR in device capabilities register
        PCI: cadence: Allow PTM Responder to be enabled
        PCI: vmd: Revert 2565e5b6 ("PCI: vmd: Do not disable MSI-X remapping if interrupt remapping is enabled by IOMMU.")
        PCI: vmd: Assign VMD IRQ domain before enumeration
        PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store()
        PCI: rockchip-dwc: Add legacy interrupt support
        ...
      3cc30140
    • Linus Torvalds's avatar
      Merge tag 'mm-stable-2022-05-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 8291eaaf
      Linus Torvalds authored
      Pull more MM updates from Andrew Morton:
      
       - Two follow-on fixes for the post-5.19 series "Use pageblock_order for
         cma and alloc_contig_range alignment", from Zi Yan.
      
       - A series of z3fold cleanups and fixes from Miaohe Lin.
      
       - Some memcg selftests work from Michal Koutný <mkoutny@suse.com>
      
       - Some swap fixes and cleanups from Miaohe Lin
      
       - Several individual minor fixups
      
      * tag 'mm-stable-2022-05-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits)
        mm/shmem.c: suppress shift warning
        mm: Kconfig: reorganize misplaced mm options
        mm: kasan: fix input of vmalloc_to_page()
        mm: fix is_pinnable_page against a cma page
        mm: filter out swapin error entry in shmem mapping
        mm/shmem: fix infinite loop when swap in shmem error at swapoff time
        mm/madvise: free hwpoison and swapin error entry in madvise_free_pte_range
        mm/swapfile: fix lost swap bits in unuse_pte()
        mm/swapfile: unuse_pte can map random data if swap read fails
        selftests: memcg: factor out common parts of memory.{low,min} tests
        selftests: memcg: remove protection from top level memcg
        selftests: memcg: adjust expected reclaim values of protected cgroups
        selftests: memcg: expect no low events in unprotected sibling
        selftests: memcg: fix compilation
        mm/z3fold: fix z3fold_page_migrate races with z3fold_map
        mm/z3fold: fix z3fold_reclaim_page races with z3fold_free
        mm/z3fold: always clear PAGE_CLAIMED under z3fold page lock
        mm/z3fold: put z3fold page back into unbuddied list when reclaim or migration fails
        revert "mm/z3fold.c: allow __GFP_HIGHMEM in z3fold_alloc"
        mm/z3fold: throw warning on failure of trylock_page in z3fold_alloc
        ...
      8291eaaf
    • Linus Torvalds's avatar
      Merge tag 'mm-hotfixes-stable-2022-05-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 77fb622d
      Linus Torvalds authored
      Pull hotfixes from Andrew Morton:
       "Six hotfixes.
      
        The page_table_check one from Miaohe Lin is considered a minor thing
        so it isn't marked for -stable. The remainder address pre-5.19 issues
        and are cc:stable"
      
      * tag 'mm-hotfixes-stable-2022-05-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        mm/page_table_check: fix accessing unmapped ptep
        kexec_file: drop weak attribute from arch_kexec_apply_relocations[_add]
        mm/page_alloc: always attempt to allocate at least one page during bulk allocation
        hugetlb: fix huge_pmd_unshare address update
        zsmalloc: fix races between asynchronous zspage free and page migration
        Revert "mm/cma.c: remove redundant cma_mutex lock"
      77fb622d
    • Linus Torvalds's avatar
      Merge tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 6f664045
      Linus Torvalds authored
      Pull misc updates from Andrew Morton:
       "The non-MM patch queue for this merge window.
      
        Not a lot of material this cycle. Many singleton patches against
        various subsystems. Most notably some maintenance work in ocfs2
        and initramfs"
      
      * tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (65 commits)
        kcov: update pos before writing pc in trace function
        ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock
        ocfs2: dlmfs: don't clear USER_LOCK_ATTACHED when destroying lock
        fs/ntfs: remove redundant variable idx
        fat: remove time truncations in vfat_create/vfat_mkdir
        fat: report creation time in statx
        fat: ignore ctime updates, and keep ctime identical to mtime in memory
        fat: split fat_truncate_time() into separate functions
        MAINTAINERS: add Muchun as a memcg reviewer
        proc/sysctl: make protected_* world readable
        ia64: mca: drop redundant spinlock initialization
        tty: fix deadlock caused by calling printk() under tty_port->lock
        relay: remove redundant assignment to pointer buf
        fs/ntfs3: validate BOOT sectors_per_clusters
        lib/string_helpers: fix not adding strarray to device's resource list
        kernel/crash_core.c: remove redundant check of ck_cmdline
        ELF, uapi: fixup ELF_ST_TYPE definition
        ipc/mqueue: use get_tree_nodev() in mqueue_get_tree()
        ipc: update semtimedop() to use hrtimer
        ipc/sem: remove redundant assignments
        ...
      6f664045
    • Jason A. Donenfeld's avatar
      crypto: poly1305 - cleanup stray CRYPTO_LIB_POLY1305_RSIZE · 8bdc2a19
      Jason A. Donenfeld authored
      When CRYPTO_LIB_POLY1305 is unset, CRYPTO_LIB_POLY1305_RSIZE
      is still set in the Kconfig, cluttering things.
      
      Fix this by making CRYPTO_LIB_POLY1305_RSIZE depend on
      CRYPTO_LIB_POLY1305.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8bdc2a19
    • Baolin Wang's avatar
      arm64/hugetlb: Fix building errors in huge_ptep_clear_flush() · e68b823a
      Baolin Wang authored
      Fix the arm64 build error which was caused by commit ae075629 ("mm:
      change huge_ptep_clear_flush() to return the original pte") interacting
      with commit fb396bb4 ("arm64/hugetlb: Drop TLB flush from
      get_clear_flush()"):
      
        arch/arm64/mm/hugetlbpage.c: In function ‘huge_ptep_clear_flush’:
        arch/arm64/mm/hugetlbpage.c:515:9: error: implicit declaration of function ‘get_clear_flush’; did you mean ‘ptep_clear_flush’? [-Werror=implicit-function-declaration]
          515 |  return get_clear_flush(vma->vm_mm, addr, ptep, pgsize, ncontig);
              |         ^~~~~~~~~~~~~~~
              |         ptep_clear_flush
      
      Since the new get_clear_contig() no longer does the TLB flush, add an
      explicit TLB flush in huge_ptep_clear_flush() to keep the original
      semantics when switching to the new get_clear_contig().
      
      Fixes: fb396bb4 ("arm64/hugetlb: Drop TLB flush from get_clear_flush()")
      Fixes: ae075629 ("mm: change huge_ptep_clear_flush() to return the original pte")
      Reported-and-tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
      Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
      Reviewed-by: Gavin Shan <gshan@redhat.com>
      Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e68b823a
    • David Howells's avatar
      pipe: Fix missing lock in pipe_resize_ring() · 189b0ddc
      David Howells authored
      pipe_resize_ring() needs to take the pipe->rd_wait.lock spinlock to
      prevent post_one_notification() from trying to insert into the ring
      whilst the ring is being replaced.
      
      The occupancy check must be done after the lock is taken, and the lock
      must be taken after the new ring is allocated.
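
      The ordering described above (allocate, then lock, then check occupancy)
      can be illustrated with a minimal userspace sketch. This is not the kernel
      code: struct sketch_ring, its helpers, and the pthread mutex standing in
      for pipe->rd_wait.lock are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ring; the mutex plays the role of pipe->rd_wait.lock. */
struct sketch_ring {
    pthread_mutex_t lock;
    int *slots;
    unsigned int nr_slots;
    unsigned int used;      /* occupied slots live at indices 0..used-1 */
};

static inline int sketch_ring_init(struct sketch_ring *r, unsigned int nr_slots)
{
    assert(r != NULL && nr_slots > 0);
    r->slots = calloc(nr_slots, sizeof(*r->slots));
    if (!r->slots)
        return -ENOMEM;
    r->nr_slots = nr_slots;
    r->used = 0;
    return pthread_mutex_init(&r->lock, NULL) ? -EINVAL : 0;
}

/* Producer, analogous in role to post_one_notification(). */
static inline int sketch_ring_post(struct sketch_ring *r, int value)
{
    int ret = -EAGAIN;

    pthread_mutex_lock(&r->lock);
    if (r->used < r->nr_slots) {
        r->slots[r->used++] = value;
        ret = 0;
    }
    pthread_mutex_unlock(&r->lock);
    return ret;
}

static inline int sketch_ring_resize(struct sketch_ring *r, unsigned int nr_slots)
{
    int *new_slots, *old_slots;

    assert(r != NULL && nr_slots > 0);

    /* 1. Allocate the new ring before taking the lock. */
    new_slots = calloc(nr_slots, sizeof(*new_slots));
    if (!new_slots)
        return -ENOMEM;

    /* 2. Take the lock so no producer can insert during the swap. */
    pthread_mutex_lock(&r->lock);

    /* 3. Check occupancy only now, while it cannot change underneath us. */
    if (r->used > nr_slots) {
        pthread_mutex_unlock(&r->lock);
        free(new_slots);
        return -EBUSY;
    }

    memcpy(new_slots, r->slots, r->used * sizeof(*new_slots));
    old_slots = r->slots;
    r->slots = new_slots;
    r->nr_slots = nr_slots;
    pthread_mutex_unlock(&r->lock);

    /* The old ring is freed only after no producer can still reach it. */
    free(old_slots);
    return 0;
}
```

      In this sketch, checking the occupancy before taking the lock would let a
      producer fill a slot between the check and the swap, which is the kind of
      window the commit closes.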
      
      The bug can lead to an oops looking something like:
      
       BUG: KASAN: use-after-free in post_one_notification.isra.0+0x62e/0x840
       Read of size 4 at addr ffff88801cc72a70 by task poc/27196
       ...
       Call Trace:
        post_one_notification.isra.0+0x62e/0x840
        __post_watch_notification+0x3b7/0x650
        key_create_or_update+0xb8b/0xd20
        __do_sys_add_key+0x175/0x340
        __x64_sys_add_key+0xbe/0x140
        do_syscall_64+0x5c/0xc0
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Reported by Selim Enes Karaduman @Enesdex working with Trend Micro Zero
      Day Initiative.
      
      Fixes: c73be61c ("pipe: Add general notification queue support")
      Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-17291
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      189b0ddc
    • Steve French's avatar
      smb3: remove unneeded null check in cifs_readdir · 44a48081
      Steve French authored
      Coverity pointed out an unneeded check.
      
      Addresses-Coverity: 1518030 ("Null pointer dereferences")
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      44a48081
    • Andrew Morton's avatar
      mm/shmem.c: suppress shift warning · fa020a2b
      Andrew Morton authored
      mm/shmem.c:1948 shmem_getpage_gfp() warn: should '(((1) << 12) / 512) << folio_order(folio)' be a 64 bit type?
      
      This was seen on i386, where an unsigned long is 32-bit but i_blocks is a
      64-bit blkcnt_t.
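
      The class of problem the checker warns about can be shown with a small
      standalone sketch. The names below (SKETCH_PAGE_SHIFT, blocks_narrow(),
      blocks_wide()) are hypothetical and not taken from mm/shmem.c, and the
      shift counts used to demonstrate truncation are far larger than any real
      folio order; the point is only that a shift done in a 32-bit type is
      truncated before it is widened.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT      12
/* 512-byte blocks per 4 KiB page: ((1) << 12) / 512 == 8 */
#define SKETCH_BLOCKS_PER_PAGE ((1u << SKETCH_PAGE_SHIFT) / 512)

/* Shift done in 32-bit arithmetic, then widened: high bits are already lost. */
static inline uint64_t blocks_narrow(unsigned int order)
{
    assert(order < 32);     /* shifting by >= the type width is undefined */
    uint32_t blocks = (uint32_t)SKETCH_BLOCKS_PER_PAGE << order;
    return blocks;
}

/* Widen to 64 bits first, then shift: no truncation. */
static inline uint64_t blocks_wide(unsigned int order)
{
    assert(order < 32);
    return (uint64_t)SKETCH_BLOCKS_PER_PAGE << order;
}
```

      For small shift counts the two helpers agree; they diverge only once the
      result no longer fits in 32 bits, which is why a static checker flags the
      expression even though realistic inputs never hit the overflow.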
      Reported-by: kernel test robot <lkp@intel.com>
      Reported-by: Jessica Clarke <jrtc27@jrtc27.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      fa020a2b
    • Vlastimil Babka's avatar
      mm: Kconfig: reorganize misplaced mm options · 0710d012
      Vlastimil Babka authored
      After commits 7b42f104 ("mm: Kconfig: move swap and slab config
      options to the MM section") and 519bcb79 ("mm: Kconfig: group swap,
      slab, hotplug and thp options into submenus") we now have nicely organized
      mm related config options.  I have noticed some that were still misplaced,
      so this moves them from various places into the new structure:
      
      VM_EVENT_COUNTERS, COMPAT_BRK, MMAP_ALLOW_UNINITIALIZED to mm/Kconfig and
      general MM section.
      
      SLUB_STATS to mm/Kconfig and the slab submenu.
      
      DEBUG_SLAB, SLUB_DEBUG, SLUB_DEBUG_ON to mm/Kconfig.debug and the Kernel
      hacking / Memory Debugging submenu.
      
      Link: https://lkml.kernel.org/r/20220525112559.1139-1-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      0710d012
    • Kefeng Wang's avatar
      mm: kasan: fix input of vmalloc_to_page() · fbf4df06
      Kefeng Wang authored
      When printing virtual mapping info for a vmalloc address,
      vmalloc_to_page() should be passed the address, not the page.  Fix it.
      
      Link: https://lkml.kernel.org/r/20220525120804.38155-1-wangkefeng.wang@huawei.com
      Fixes: c056a364 ("kasan: print virtual mapping info in reports")
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      fbf4df06
    • Minchan Kim's avatar
      mm: fix is_pinnable_page against a cma page · 1c563432
      Minchan Kim authored
      Pages in the CMA area could have MIGRATE_ISOLATE as well as MIGRATE_CMA so
      the current is_pinnable_page() could miss CMA pages which have
      MIGRATE_ISOLATE.  It ends up pinning CMA pages as longterm for the
      pin_user_pages() API so CMA allocations keep failing until the pin is
      released.
      
           CPU 0                                   CPU 1 - Task B
      
      cma_alloc
      alloc_contig_range
                                              pin_user_pages_fast(FOLL_LONGTERM)
      change pageblock as MIGRATE_ISOLATE
                                              internal_get_user_pages_fast
                                              lockless_pages_from_mm
                                              gup_pte_range
                                              try_grab_folio
                                              is_pinnable_page
                                                return true;
                                              So, pinned the page successfully.
      page migration failure with pinned page
                                              ..
                                              .. After 30 sec
                                              unpin_user_page(page)
      
      CMA allocation succeeded after 30 sec.
      
      The CMA allocation path protects the migration type change race using
      zone->lock, but all the GUP path needs to know is whether the page is in a
      CMA area, not its exact migration type.  Thus, we don't need zone->lock
      and can just check whether the migration type is either MIGRATE_ISOLATE or
      MIGRATE_CMA.
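
      As a rough model of the check described above, the following sketch
      contrasts a predicate that only looks for MIGRATE_CMA with one that also
      rejects MIGRATE_ISOLATE. It is a simplification under stated assumptions:
      the enum, the in_movable_zone flag, and both function names are
      hypothetical, and details of the real is_pinnable_page() (such as how it
      obtains the migration type) are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's pageblock migration types. */
enum sketch_migratetype {
    SKETCH_MIGRATE_UNMOVABLE,
    SKETCH_MIGRATE_MOVABLE,
    SKETCH_MIGRATE_CMA,
    SKETCH_MIGRATE_ISOLATE,
};

/* Before: a CMA pageblock that is currently isolated slips through. */
static inline bool sketch_is_pinnable_old(enum sketch_migratetype mt,
                                          bool in_movable_zone)
{
    return !(mt == SKETCH_MIGRATE_CMA || in_movable_zone);
}

/* After: treat MIGRATE_ISOLATE like MIGRATE_CMA, without taking any lock. */
static inline bool sketch_is_pinnable_new(enum sketch_migratetype mt,
                                          bool in_movable_zone)
{
    if (mt == SKETCH_MIGRATE_CMA || mt == SKETCH_MIGRATE_ISOLATE)
        return false;
    return !in_movable_zone;
}
```

      The only behavioral difference in this model is the MIGRATE_ISOLATE case,
      which the older predicate reports as pinnable.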
      
      Adding the MIGRATE_ISOLATE check in is_pinnable_page() could cause pinning
      to be rejected for pages on MIGRATE_ISOLATE pageblocks even when they are
      in neither CMA nor the movable zone, if the page is temporarily unmovable.
      However, migration failure caused by an unexpected temporary refcount hold
      is a general issue, not one specific to MIGRATE_ISOLATE, and
      MIGRATE_ISOLATE is also a transient state like other temporarily elevated
      refcount problems.
      
      Link: https://lkml.kernel.org/r/20220524171525.976723-1-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: Paul E. McKenney <paulmck@kernel.org>
      Cc: David Hildenbrand <david@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      1c563432
    • Miaohe Lin's avatar
      mm: filter out swapin error entry in shmem mapping · ba6851b4
      Miaohe Lin authored
      There might be swapin error entries in shmem mapping.  Filter them out to
      avoid "Bad swap file entry" complaint.
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-6-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      ba6851b4
    • Miaohe Lin's avatar
      mm/shmem: fix infinite loop when swap in shmem error at swapoff time · 6cec2b95
      Miaohe Lin authored
      If swapping in a shmem page fails at swapoff time, the while loop in
      shmem_unuse_inode() never terminates.  This is because the swapin error is
      deliberately ignored now, so info->swapped never reaches 0 and we can't
      escape the loop in shmem_unuse().
      
      In order to fix the issue, swapin_error entry is stored in the mapping
      when swapin error occurs.  So the swapcache page can be freed and the user
      won't end up with a permanently mounted swap because a sector is bad.  If
      the page is accessed later, the user process will be killed so that
      corrupted data is never consumed.  On the other hand, if the page is never
      accessed, the user won't even notice it.
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-5-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reported-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      6cec2b95
    • Miaohe Lin's avatar
      mm/madvise: free hwpoison and swapin error entry in madvise_free_pte_range · 7b49514f
      Miaohe Lin authored
      Once the MADV_FREE operation has succeeded, callers can expect they might
      get zero-fill pages if accessing the memory again.  Therefore it should be
      safe to delete the hwpoison entry and swapin error entry.  There is no
      reason to kill the process if it has called MADV_FREE on the range.
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-4-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Suggested-by: Alistair Popple <apopple@nvidia.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      7b49514f
    • Miaohe Lin's avatar
      mm/swapfile: fix lost swap bits in unuse_pte() · 14a762dd
      Miaohe Lin authored
      This was found by code review only; there is no real-world report.
      
      When we turn off swapping we could have lost the bits stored in the swap
      ptes.  The new rmap-exclusive bit is fine since that turned into a page
      flag, but not for soft-dirty and uffd-wp.  Add them.
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-3-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Suggested-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      14a762dd
    • Miaohe Lin's avatar
      mm/swapfile: unuse_pte can map random data if swap read fails · 9f186f9e
      Miaohe Lin authored
      Patch series "A few fixup patches for mm", v4.
      
      This series contains a few patches to avoid mapping random data if swap
      read fails and fix lost swap bits in unuse_pte.  Also we free hwpoison and
      swapin error entry in madvise_free_pte_range and so on.  More details can
      be found in the respective changelogs.  
      
      
      This patch (of 5):
      
      There is a bug in unuse_pte(): when a swap page happens to be unreadable,
      a page filled with random data is mapped into user address space.  To fix
      this, a special swap entry indicating that the swap read failed is set in
      the page table on error.  The swapcache page can then be freed and the
      user won't end up with a permanently mounted swap because a sector is bad.
      If the page is accessed later, the user process will be killed so that
      corrupted data is never consumed.  On the other hand, if the page is never
      accessed, the user won't even notice it.
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-1-linmiaohe@huawei.com
      Link: https://lkml.kernel.org/r/20220519125030.21486-2-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Howells <dhowells@redhat.com>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      9f186f9e
    • Michal Koutný's avatar
      selftests: memcg: factor out common parts of memory.{low,min} tests · f079a020
      Michal Koutný authored
      The memory protection test setup and runtime are almost identical for the
      memory.low and memory.min cases.

      Keeping two copies makes modification of the common parts prone to
      mistakes.  Since the protections are similar not only in setup but also in
      principle, factor the common part out.
      
      Past exceptions between the tests:
      - missing memory.min is fine (kept),
      - test_memcg_low protected orphaned pagecache (adapted like
        test_memcg_min and we keep the processes of protected memory running).
      
      The evaluation in two tests is different (OOM of allocator vs low events
      of protégés), this is kept different.
      
      Link: https://lkml.kernel.org/r/20220518161859.21565-6-mkoutny@suse.com
      Signed-off-by: Michal Koutný <mkoutny@suse.com>
      Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
      CC: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Richard Palethorpe <rpalethorpe@suse.de>
      Cc: David Vernet <void@manifault.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: memcg: remove protection from top level memcg · 6a359190
      Michal Koutný authored
      Reclaim is triggered by the memory limit in a subtree, so the test case
      does not need protection configured against external reclaim.

      Also correct the respective comments.
      
      Link: https://lkml.kernel.org/r/20220518161859.21565-5-mkoutny@suse.com
      Signed-off-by: Michal Koutný <mkoutny@suse.com>
      Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: David Vernet <void@manifault.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Richard Palethorpe <rpalethorpe@suse.de>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: memcg: adjust expected reclaim values of protected cgroups · f10b6e9a
      Michal Koutný authored
      The expected numbers are not easy to derive in a closed form (mere
      protection ratios certainly do not apply), so use a simulation to obtain
      them.
      
      Link: https://lkml.kernel.org/r/20220518161859.21565-4-mkoutny@suse.com
      Signed-off-by: Michal Koutný <mkoutny@suse.com>
      Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: David Vernet <void@manifault.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Richard Palethorpe <rpalethorpe@suse.de>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: memcg: expect no low events in unprotected sibling · 1d09069f
      Michal Koutný authored
      This is effectively a revert of commit cdc69458 ("cgroup: account for
      memory_recursiveprot in test_memcg_low()").  The test_memcg_low case will
      fail with memory_recursiveprot until this is resolved in the reclaim code.

      However, this patch preserves the existing helpers and variables for later
      use.
      
      Link: https://lkml.kernel.org/r/20220518161859.21565-3-mkoutny@suse.com
      Signed-off-by: Michal Koutný <mkoutny@suse.com>
      Reviewed-by: David Vernet <void@manifault.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Richard Palethorpe <rpalethorpe@suse.de>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: memcg: fix compilation · ff3b72a5
      Michal Koutný authored
      Patch series "memcontrol selftests fixups", v2.
      
      Flushing the patches to make the memcontrol selftests check the events
      behavior we had consensus about (test_memcg_low fails).

      (test_memcg_reclaim and test_memcg_swap_max fail for me now, but those
      failures were present even before the refactoring.)

      The two bigger changes are:
      - adjustment of the protected values to make tests succeed with the given
        tolerance,
      - both test_memcg_low and test_memcg_min check protection of memory in
        populated cgroups (actually, as per Documentation/admin-guide/cgroup-v2.rst,
        memory.min should not apply to empty cgroups, which is not the case
        currently. Therefore I unified the tests with the populated case in order to
        bring more broken tests).
      
      
      This patch (of 5):
      
      This fixes mis-applied changes from commit 72b1e03a ("cgroup: account
      for memory_localevents in test_memcg_oom_group_leaf_events()").
      
      Link: https://lkml.kernel.org/r/20220518161859.21565-1-mkoutny@suse.com
      Link: https://lkml.kernel.org/r/20220518161859.21565-2-mkoutny@suse.com
      Signed-off-by: Michal Koutný <mkoutny@suse.com>
      Reviewed-by: David Vernet <void@manifault.com>
      Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Richard Palethorpe <rpalethorpe@suse.de>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/z3fold: fix z3fold_page_migrate races with z3fold_map · 943fb61d
      Miaohe Lin authored
      Consider the following scenario:
      
      CPU1				CPU2
       z3fold_page_migrate		z3fold_map
        z3fold_page_trylock
        ...
        z3fold_page_unlock
        /* slots still points to old zhdr*/
      				 get_z3fold_header
      				  get slots from handle
      				  get old zhdr from slots
      				  z3fold_page_trylock
      				  return *old* zhdr
        encode_handle(new_zhdr, FIRST|LAST|MIDDLE)
        put_page(page) /* zhdr is freed! */
      				 but zhdr is still used by caller!
      
      z3fold_map can map a freed z3fold page, leading to a use-after-free bug.
      To fix it, add PAGE_MIGRATED to indicate that a z3fold page has been
      migrated and is soon to be released, so that get_z3fold_header won't
      return such a page.
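
The idea of marking the old header as migrated can be sketched with a single-threaded user-space model. Everything here is an assumption made for illustration: `model_zhdr`, `model_migrate_finish()` and `model_get_header()` are hypothetical names, a plain boolean stands in for the page lock, and the `migrated` field stands in for the PAGE_MIGRATED flag. The sketch also assumes the flag is set while the old header is still locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a z3fold header; not the kernel structure. */
struct model_zhdr {
	bool locked;	/* stands in for the per-page lock */
	bool migrated;	/* stands in for the PAGE_MIGRATED flag */
};

static bool model_trylock(struct model_zhdr *z)
{
	if (z->locked)
		return false;
	z->locked = true;
	return true;
}

static void model_unlock(struct model_zhdr *z)
{
	z->locked = false;
}

/* Migration side: mark the old header stale before dropping its lock. */
static void model_migrate_finish(struct model_zhdr *old)
{
	assert(old->locked);
	old->migrated = true;
	model_unlock(old);
}

/*
 * Lookup side: returns the locked header, or NULL if it is busy or stale.
 * On NULL the caller is expected to re-read the handle and try again.
 */
static struct model_zhdr *model_get_header(struct model_zhdr *candidate)
{
	if (!model_trylock(candidate))
		return NULL;
	if (candidate->migrated) {
		/* The old header is about to be freed; never hand it out. */
		model_unlock(candidate);
		return NULL;
	}
	return candidate;
}
```

Because the stale check happens only after the lock is taken, a lookup that wins the lock right after migration still sees the flag and backs off.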
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-10-linmiaohe@huawei.com
      Fixes: 1f862989 ("mm/z3fold.c: support page migration")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/z3fold: fix z3fold_reclaim_page races with z3fold_free · 04094226
      Miaohe Lin authored
      Consider the following scenario:
      
      CPU1				CPU2
      z3fold_reclaim_page		z3fold_free
       spin_lock(&pool->lock)		 get_z3fold_header -- hold page_lock
       kref_get_unless_zero
      				 kref_put--zhdr->refcount can be 1 now
       !z3fold_page_trylock
        kref_put -- zhdr->refcount is 0 now
         release_z3fold_page
          WARN_ON(!list_empty(&zhdr->buddy)); -- we're on buddy now!
          spin_lock(&pool->lock); -- deadlock here!
      
      z3fold_reclaim_page might race with z3fold_free, leading to a pool lock
      deadlock and a warning that the zhdr buddy list is non-empty.  To fix
      this, defer getting the refcount until page_lock is held, just as
      __z3fold_alloc does.  Note that this has the side effect that reclaim is
      no longer aborted when it encounters a z3fold page that is about to be
      released.
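
The ordering change can be sketched in a single-threaded user-space model. This is an illustration only: `model_rpage`, `model_get_unless_zero()` and `model_reclaim_claim()` are hypothetical names, a boolean stands in for the page lock, and a plain integer stands in for the kref.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a z3fold page under reclaim; not kernel code. */
struct model_rpage {
	int refcount;	/* stands in for zhdr->refcount */
	bool locked;	/* stands in for the page lock */
};

static bool model_page_trylock(struct model_rpage *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

/* Take a reference only if the object is still alive. */
static bool model_get_unless_zero(int *ref)
{
	if (*ref == 0)
		return false;
	(*ref)++;
	return true;
}

/*
 * Claim a page for reclaim: lock first, then take the reference.
 * Returns true with the page locked and one extra reference held.
 */
static bool model_reclaim_claim(struct model_rpage *p)
{
	if (!model_page_trylock(p))
		return false;	/* no reference taken, so nothing to drop */

	if (!model_get_unless_zero(&p->refcount)) {
		/* Page is already on its way out; skip it. */
		p->locked = false;
		return false;
	}
	return true;
}
```

With this ordering, a failed trylock never leaves the reclaim path holding a reference that it would have to drop, which is the step that led to the final put and the deadlock in the scenario above.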
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-9-linmiaohe@huawei.com
      Fixes: dcf5aedb ("z3fold: stricter locking and more careful reclaim")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>