Commits · 9fbed16c3f4f2bebff610ff1ebb756785dfde0be · Kirill Smelkov / linux

14 Nov, 2022 1 commit

ARM: 9259/1: stacktrace: Convert stacktrace to generic ARCH_STACKWALK · 9fbed16c

Li Huafei authored Oct 18, 2022

Historically architectures have had duplicated code in their stack trace
implementations for filtering what gets traced. In order to avoid this
duplication some generic code has been provided using a new interface
arch_stack_walk(), enabled by selecting ARCH_STACKWALK in Kconfig, which
factors all this out into the generic stack trace code. Convert ARM to
use this common infrastructure.

When initializing the stack frame of the current task, arm64 uses
__builtin_frame_address(1) to initialize the frame pointer, skipping
arch_stack_walk(), see the commit c607ab4f ("arm64: stacktrace:
don't trace arch_stack_walk()"). Since __builtin_frame_address(1) does
not work on ARM, unwind_frame() is used to unwind the stack one layer
forward before calling walk_stackframe().
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

9fbed16c

08 Nov, 2022 4 commits

ARM: 9258/1: stacktrace: Make stack walk callback consistent with generic code · 70ccc7c0

Li Huafei authored Oct 18, 2022

As with the generic arch_stack_walk() code the ARM stack walk code takes
a callback that is called per stack frame. Currently the ARM code always
passes a struct stackframe to the callback and the generic code just
passes the pc, however none of the users ever reference anything in the
struct other than the pc value. The ARM code also uses a return type of
int while the generic code uses a return type of bool though in both
cases the return value is a boolean value and the sense is inverted
between the two.

In order to reduce code duplication when ARM is converted to use
arch_stack_walk() change the signature and return sense of the ARM
specific callback to match that of the generic code.
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Linus Waleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

70ccc7c0

ARM: 9265/1: pass -march= only to compiler · 1d2e9b67

Nick Desaulniers authored Oct 24, 2022

When both -march= and -Wa,-march= are specified for assembler or
assembler-with-cpp sources, GCC and Clang will prefer the -Wa,-march=
value but Clang will warn that -march= is unused.

warning: argument unused during compilation: '-march=armv6k'
[-Wunused-command-line-argument]

This is the top group of warnings we observe when using clang to
assemble the kernel via `ARCH=arm make LLVM=1`.

Split the arch-y make variable into two, so that -march= flags only get
passed to the compiler, not the assembler. -D flags are added to
KBUILD_CPPFLAGS which is used for both C and assembler-with-cpp sources.

Clang is trying to warn that it doesn't support different values for
-march= and -Wa,-march= (like GCC does, but the kernel doesn't need this)
though the value of the preprocessor define __thumb2__ is based on
-march=. Make sure to re-set __thumb2__ via -D flag for assembler
sources now that we're no longer passing -march= to the assembler. Set
it to a different value than the preprocessor would for -march= in case
-march= gets accidentally re-added to KBUILD_AFLAGS in the future.
Thanks to Ard and Nathan for this suggestion.

Link: https://github.com/ClangBuiltLinux/linux/issues/1315
Link: https://github.com/ClangBuiltLinux/linux/issues/1587
Link: https://github.com/llvm/llvm-project/issues/55656Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

1d2e9b67

ARM: 9264/1: only use -mtp=cp15 for the compiler · 26b12e08

Nick Desaulniers authored Oct 24, 2022

Avoids an error from the assembler for CONFIG_THUMB2 kernels:

clang-15: error: hardware TLS register is not supported for the thumbv4t
sub-architecture

This flag only makes sense to pass to the compiler, not the assembler.

Perhaps CFLAGS_ABI can be renamed to CPPFLAGS_ABI to reflect that they
will be passed to both the compiler and assembler for sources that
require pre-processing.
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

26b12e08

ARM: 9263/1: use .arch directives instead of assembler command line flags · a2faac39

Nick Desaulniers authored Oct 24, 2022

Similar to commit a6c30873 ("ARM: 8989/1: use .fpu assembler
directives instead of assembler arguments").

GCC and GNU binutils support setting the "sub arch" via -march=,
-Wa,-march, target function attribute, and .arch assembler directive.

Clang was missing support for -Wa,-march=, but this was implemented in
clang-13.

The behavior of both GCC and Clang is to
prefer -Wa,-march= over -march= for assembler and assembler-with-cpp
sources, but Clang will warn about the -march= being unused.

clang: warning: argument unused during compilation: '-march=armv6k'
[-Wunused-command-line-argument]

Since most assembler is non-conditionally assembled with one sub arch
(modulo arch/arm/delay-loop.S which conditionally is assembled as armv4
based on CONFIG_ARCH_RPC, and arch/arm/mach-at91/pm-suspend.S which is
conditionally assembled as armv7-a based on CONFIG_CPU_V7), prefer the
.arch assembler directive.

Add a few more instances found in compile testing as found by Arnd and
Nathan.

Link: https://github.com/llvm/llvm-project/commit/1d51c699b9e2ebc5bcfdbe85c74cc871426333d4
Link: https://bugs.llvm.org/show_bug.cgi?id=48894
Link: https://github.com/ClangBuiltLinux/linux/issues/1195
Link: https://github.com/ClangBuiltLinux/linux/issues/1315Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

a2faac39

07 Nov, 2022 9 commits

ARM: 9262/1: remove lazy evaluation in Makefile · 5aa4860e

Nick Desaulniers authored Oct 24, 2022

arch-y and tune-y used lazy evaluation since they used to contain

cc-option checks. They don't any longer, so just eagerly evaluate these
command line flags.
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

5aa4860e

ARM: 9261/1: amba: Drop redundant assignments of the system PM callbacks · 3f712c7c

Ulf Hansson authored Oct 24, 2022

If the system PM callbacks haven't been assigned, the PM core falls back to
invoke the corresponding the pm_generic_* helpers for the device. Let's
rely on this behaviour and drop the redundant assignments.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

3f712c7c

ARM: 9260/1: lib/xor: use r10 rather than r7 in xor_arm4regs_{2|3} · 527d0863

Nick Desaulniers authored Oct 18, 2022

kbuild test robot reports:
In file included from crypto/xor.c:17:
./arch/arm/include/asm/xor.h:61:3: error: write to reserved register 'R7'
                GET_BLOCK_4(p1);
                ^
./arch/arm/include/asm/xor.h:20:10: note: expanded from macro 'GET_BLOCK_4'
        __asm__("ldmia  %0, {%1, %2, %3, %4}"
                ^
./arch/arm/include/asm/xor.h:63:3: error: write to reserved register 'R7'
                PUT_BLOCK_4(p1);
                ^
./arch/arm/include/asm/xor.h:42:23: note: expanded from macro 'PUT_BLOCK_4'
        __asm__ __volatile__("stmia     %0!, {%2, %3, %4, %5}"
                             ^
./arch/arm/include/asm/xor.h:83:3: error: write to reserved register 'R7'
                GET_BLOCK_4(p1);
                ^
./arch/arm/include/asm/xor.h:20:10: note: expanded from macro 'GET_BLOCK_4'
        __asm__("ldmia  %0, {%1, %2, %3, %4}"
                ^
./arch/arm/include/asm/xor.h:86:3: error: write to reserved register 'R7'
                PUT_BLOCK_4(p1);
                ^
./arch/arm/include/asm/xor.h:42:23: note: expanded from macro 'PUT_BLOCK_4'
        __asm__ __volatile__("stmia     %0!, {%2, %3, %4, %5}"
                             ^
Thumb2 uses r7 rather than r11 as the frame pointer. Let's use r10
rather than r7 for these temporaries.

Link: https://github.com/ClangBuiltLinux/linux/issues/1732
Link: https://lore.kernel.org/llvm/202210072120.V1O2SuKY-lkp@intel.com/Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

527d0863

ARM: 9257/1: Disable FIQs (but not IRQs) on CPUs shutdown paths · 8fc0b333

Guilherme G. Piccoli authored Oct 17, 2022

Currently the regular CPU shutdown path for ARM disables IRQs/FIQs
in the secondary CPUs - smp_send_stop() calls ipi_cpu_stop(), which
is responsible for that. IRQs are architecturally masked when we
take an interrupt, but FIQs are high priority than IRQs, hence they
aren't masked. With that said, it makes sense to disable FIQs here,
but there's no need for (re-)disabling IRQs.

More than that: there is an alternative path for disabling CPUs,
in the form of function crash_smp_send_stop(), which is used for
kexec/panic path. This function relies on a SMP call that also
triggers a busy-wait loop [at machine_crash_nonpanic_core()], but
without disabling FIQs. This might lead to odd scenarios, like
early interrupts in the boot of kexec'd kernel or even interrupts
in secondary "disabled" CPUs while the main one still works in the
panic path and assumes all secondary CPUs are (really!) off.

So, let's disable FIQs in both paths and *not* disable IRQs a second
time, since they are already masked in both paths by the architecture.
This way, we keep both CPU quiesce paths consistent and safe.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

8fc0b333

ARM: 9256/1: NWFPE: avoid compiler-generated __aeabi_uldivmod · 32200220

Nick Desaulniers authored Oct 11, 2022

clang-15's ability to elide loops completely became more aggressive when
it can deduce how a variable is being updated in a loop. Counting down
one variable by an increment of another can be replaced by a modulo
operation.

For 64b variables on 32b ARM EABI targets, this can result in the
compiler generating calls to __aeabi_uldivmod, which it does for a do
while loop in float64_rem().

For the kernel, we'd generally prefer that developers not open code 64b
division via binary / operators and instead use the more explicit
helpers from div64.h. On arm-linux-gnuabi targets, failure to do so can
result in linkage failures due to undefined references to
__aeabi_uldivmod().

While developers can avoid open coding divisions on 64b variables, the
compiler doesn't know that the Linux kernel has a partial implementation
of a compiler runtime (--rtlib) to enforce this convention.

It's also undecidable for the compiler whether the code in question
would be faster to execute the loop vs elide it and do the 64b division.

While I actively avoid using the internal -mllvm command line flags, I
think we get better code than using barrier() here, which will force
reloads+spills in the loop for all toolchains.

Link: https://github.com/ClangBuiltLinux/linux/issues/1666Reported-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

32200220

ARM: 9255/1: efi/dump UEFI runtime page tables for ARM · e513ffd8

Wang Kefeng authored Oct 11, 2022

UEFI runtime page tables dump only for ARM64 at present,
but ARM support EFI and ARM_PTDUMP_DEBUGFS now. Since
ARM could potentially execute with a 1G/3G user/kernel
split, choosing 1G as the upper limit for UEFI runtime
end, with this, we could enable UEFI runtime page tables
on ARM.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

e513ffd8

ARM: 9254/1: mm: Provide better message when kernel fault · b40b84b1

Wang Kefeng authored Oct 11, 2022

If there is a kernel fault, see do_kernel_fault(), we only print
the generic "paging request" or "NULL pointer dereference" message
which don't show read, write or excute information, let's provide
better fault message for them.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

b40b84b1

ARM: 9253/1: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL · d539fee9

Seung-Woo Kim authored Sep 30, 2022

To enable UBSAN on ARM, this patch enables ARCH_HAS_UBSAN_SANITIZE_ALL
from arm confiuration. Basic kernel bootup test is passed on arm with
CONFIG_UBSAN_SANITIZE_ALL enabled.

[florian: rebased against v6.0-rc7]
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

d539fee9

ARM: 9252/1: module: Teach unwinder about PLTs · 4ab07fd3

Alex Sverdlin authored Sep 27, 2022

"unwind: Index not found eef26358" warnings keep popping up on
CONFIG_ARM_MODULE_PLTS-enabled systems if the PC points to a PLT veneer.
Teach the unwinder how to deal with them, taking into account they don't
change state of the stack or register file except loading PC.

Link: https://lore.kernel.org/linux-arm-kernel/20200402153845.30985-1-kursad.oney@broadcom.com/Tested-by: Kursad Oney <kursad.oney@broadcom.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

4ab07fd3

04 Oct, 2022 6 commits

ARM: 9246/1: dump: show page table level name · e66372ec

Wang Kefeng authored Sep 13, 2022

ARM could have 3 page table level if ARM_LPAE enabled, or only 2 page
table level, let's show the page table level name when dump.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

e66372ec

ARM: 9245/1: dump: show FDT region · afd1efa1

Wang Kefeng authored Sep 13, 2022

Since commit 7a1be318 ("ARM: 9012/1: move device tree mapping out
of linear region"), FDT is placed between the end of the vmalloc region
and the start of the fixmap region, let's show it in dump.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

afd1efa1

ARM: 9242/1: kasan: Only map modules if CONFIG_KASAN_VMALLOC=n · 823f606a

Alex Sverdlin authored Sep 05, 2022

In case CONFIG_KASAN_VMALLOC=y kasan_populate_vmalloc() allocates the
shadow pages dynamically. But even worse is that kasan_release_vmalloc()
releases them, which is not compatible with create_mapping() of
MODULES_VADDR..MODULES_END range:

BUG: Bad page state in process kworker/9:1  pfn:2068b
page:e5e06160 refcount:0 mapcount:0 mapping:00000000 index:0x0
flags: 0x1000(reserved)
raw: 00001000 e5e06164 e5e06164 00000000 00000000 00000000 ffffffff 00000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
bad because of flags: 0x1000(reserved)
Modules linked in: ip_tables
CPU: 9 PID: 154 Comm: kworker/9:1 Not tainted 5.4.188-... #1
Hardware name: LSI Axxia AXM55XX
Workqueue: events do_free_init
unwind_backtrace
show_stack
dump_stack
bad_page
free_pcp_prepare
free_unref_page
kasan_depopulate_vmalloc_pte
__apply_to_page_range
apply_to_existing_page_range
kasan_release_vmalloc
__purge_vmap_area_lazy
_vm_unmap_aliases.part.0
__vunmap
do_free_init
process_one_work
worker_thread
kthread
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

823f606a

ARM: 9240/1: dma-mapping: Pass (void *) to virt_to_page() · 8770b9e5

Linus Walleij authored Aug 31, 2022

Pointers to virtual memory functions are (void *) but the
__dma_update_pte() function is passing an unsigned long.
Fix this up by explicit cast.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

8770b9e5

ARM: 9234/1: stacktrace: Avoid duplicate saving of exception PC value · 752ec621

Li Huafei authored Aug 26, 2022

Because an exception stack frame is not created in the exception entry,
save_trace() does special handling for the exception PC, but this is
only needed when CONFIG_FRAME_POINTER_UNWIND=y. When
CONFIG_ARM_UNWIND=y, unwind annotations have been added to the exception
entry and save_trace() will repeatedly save the exception PC:

    [0x7f000090] hrtimer_hander+0x8/0x10 [hrtimer]
    [0x8019ec50] __hrtimer_run_queues+0x18c/0x394
    [0x8019f760] hrtimer_run_queues+0xbc/0xd0
    [0x8019def0] update_process_times+0x34/0x80
    [0x801ad2a4] tick_periodic+0x48/0xd0
    [0x801ad3dc] tick_handle_periodic+0x1c/0x7c
    [0x8010f2e0] twd_handler+0x30/0x40
    [0x80177620] handle_percpu_devid_irq+0xa0/0x23c
    [0x801718d0] generic_handle_domain_irq+0x24/0x34
    [0x80502d28] gic_handle_irq+0x74/0x88
    [0x8085817c] generic_handle_arch_irq+0x58/0x78
    [0x80100ba8] __irq_svc+0x88/0xc8
    [0x80108114] arch_cpu_idle+0x38/0x3c
    [0x80108114] arch_cpu_idle+0x38/0x3c    <==== duplicate saved exception PC
    [0x80861bf8] default_idle_call+0x38/0x130
    [0x8015d5cc] do_idle+0x150/0x214
    [0x8015d978] cpu_startup_entry+0x18/0x1c
    [0x808589c0] rest_init+0xd8/0xdc
    [0x80c00a44] arch_post_acpi_subsys_init+0x0/0x8

We can move the special handling of the exception PC in save_trace() to
the unwind_frame() of the frame pointer unwinder.
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Linus Waleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

752ec621

ARM: 9233/1: stacktrace: Skip frame pointer boundary check for call_with_stack() · 5854e4d8

Li Huafei authored Aug 26, 2022

When using the frame pointer unwinder, it was found that the stack trace
output of stack_trace_save() is incomplete if the stack contains
call_with_stack():

 [0x7f00002c] dump_stack_task+0x2c/0x90 [hrtimer]
 [0x7f0000a0] hrtimer_hander+0x10/0x18 [hrtimer]
 [0x801a67f0] __hrtimer_run_queues+0x1b0/0x3b4
 [0x801a7350] hrtimer_run_queues+0xc4/0xd8
 [0x801a597c] update_process_times+0x3c/0x88
 [0x801b5a98] tick_periodic+0x50/0xd8
 [0x801b5bf4] tick_handle_periodic+0x24/0x84
 [0x8010ffc4] twd_handler+0x38/0x48
 [0x8017d220] handle_percpu_devid_irq+0xa8/0x244
 [0x80176e9c] generic_handle_domain_irq+0x2c/0x3c
 [0x8052e3a8] gic_handle_irq+0x7c/0x90
 [0x808ab15c] generic_handle_arch_irq+0x60/0x80
 [0x8051191c] call_with_stack+0x1c/0x20

For the frame pointer unwinder, unwind_frame() checks stackframe::fp by
stackframe::sp. Since call_with_stack() switches the SP from one stack
to another, stackframe::fp and stackframe: :sp will point to different
stacks, so we can no longer check stackframe::fp by stackframe::sp. Skip
checking stackframe::fp at this point to avoid this problem.
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Linus Waleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

5854e4d8

22 Sep, 2022 1 commit

ARM: 9224/1: Dump the stack traces based on the parameter 'regs' of show_regs() · 09cffeca

Zhen Lei authored Aug 03, 2022

Function show_regs() is usually called in interrupt handler or exception
handler, it prints the registers specified by the parameter 'regs', then
dump the stack traces. Although not explicitly documented, dump the stack
traces based on'regs' seems to make the most sense. Although dump_stack()
can finally dump the desired content, because 'regs' are saved by the
entry of current interrupt or exception. In the following example we can
see: 1) The backtrace of interrupt or exception handler is not expected,
it causes confusion. 2) Something is printed repeatedly. The line with
the kernel version "CPU: 0 PID: 70 Comm: test0 Not tainted 5.19.0+ #8",
the registers saved in "Exception stack" which 'regs' actually point to.

For example:
rcu: INFO: rcu_sched self-detected stall on CPU
rcu:    0-....: (499 ticks this GP) idle=379/1/0x40000002 softirq=91/91 fqs=249
        (t=500 jiffies g=-911 q=13 ncpus=4)
CPU: 0 PID: 70 Comm: test0 Not tainted 5.19.0+ #8
Hardware name: ARM-Versatile Express
PC is at ktime_get+0x4c/0xe8
LR is at ktime_get+0x4c/0xe8
pc : 8019a474  lr : 8019a474  psr: 60000013
sp : cabd1f28  ip : 00000001  fp : 00000005
r10: 527bf1b8  r9 : 431bde82  r8 : d7b634db
r7 : 0000156e  r6 : 61f234f8  r5 : 00000001  r4 : 80ca86c0
r3 : ffffffff  r2 : fe5bce0b  r1 : 00000000  r0 : 01a431f4
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 6121406a  DAC: 00000051
CPU: 0 PID: 70 Comm: test0 Not tainted 5.19.0+ #8  <-----------start----------
Hardware name: ARM-Versatile Express                                          |
 unwind_backtrace from show_stack+0x10/0x14                                   |
 show_stack from dump_stack_lvl+0x40/0x4c                                     |
 dump_stack_lvl from rcu_dump_cpu_stacks+0x10c/0x134                          |
 rcu_dump_cpu_stacks from rcu_sched_clock_irq+0x780/0xaf4                     |
 rcu_sched_clock_irq from update_process_times+0x54/0x74                      |
 update_process_times from tick_periodic+0x3c/0xd4                            |
 tick_periodic from tick_handle_periodic+0x20/0x80                       worthless
 tick_handle_periodic from twd_handler+0x30/0x40                             or
 twd_handler from handle_percpu_devid_irq+0x8c/0x1c8                    duplicated
 handle_percpu_devid_irq from generic_handle_domain_irq+0x24/0x34             |
 generic_handle_domain_irq from gic_handle_irq+0x74/0x88                      |
 gic_handle_irq from generic_handle_arch_irq+0x34/0x44                        |
 generic_handle_arch_irq from call_with_stack+0x18/0x20                       |
 call_with_stack from __irq_svc+0x98/0xb0                                     |
Exception stack(0xcabd1ed8 to 0xcabd1f20)                                     |
1ec0:                                                       01a431f4 00000000 |
1ee0: fe5bce0b ffffffff 80ca86c0 00000001 61f234f8 0000156e d7b634db 431bde82 |
1f00: 527bf1b8 00000005 00000001 cabd1f28 8019a474 8019a474 60000013 ffffffff |
 __irq_svc from ktime_get+0x4c/0xe8                 <---------end--------------
 ktime_get from test_task+0x44/0x110
 test_task from kthread+0xd8/0xf4
 kthread from ret_from_fork+0x14/0x2c
Exception stack(0xcabd1fb0 to 0xcabd1ff8)
1fa0:                                     00000000 00000000 00000000 00000000
1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
1fe0: 00000000 00000000 00000000 00000000 00000013 00000000

After replacing dump_stack() with dump_backtrace():
rcu: INFO: rcu_sched self-detected stall on CPU
rcu:    0-....: (500 ticks this GP) idle=8f7/1/0x40000002 softirq=129/129 fqs=241
        (t=500 jiffies g=-915 q=13 ncpus=4)
CPU: 0 PID: 69 Comm: test0 Not tainted 5.19.0+ #9
Hardware name: ARM-Versatile Express
PC is at ktime_get+0x4c/0xe8
LR is at ktime_get+0x4c/0xe8
pc : 8019a494  lr : 8019a494  psr: 60000013
sp : cabddf28  ip : 00000001  fp : 00000002
r10: 0779cb48  r9 : 431bde82  r8 : d7b634db
r7 : 00000a66  r6 : e835ab70  r5 : 00000001  r4 : 80ca86c0
r3 : ffffffff  r2 : ff337d39  r1 : 00000000  r0 : 00cc82c6
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 611d006a  DAC: 00000051
 ktime_get from test_task+0x44/0x110
 test_task from kthread+0xd8/0xf4
 kthread from ret_from_fork+0x14/0x2c
Exception stack(0xcabddfb0 to 0xcabddff8)
dfa0:                                     00000000 00000000 00000000 00000000
dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

09cffeca

31 Aug, 2022 3 commits

ARM: 9232/1: Replace this_cpu_* with raw_cpu_* in handle_bad_stack() · 370d51c8

Zhen Lei authored Aug 25, 2022

The hardware automatically disable the IRQ interrupt before jumping to the
interrupt or exception vector. Therefore, the preempt_disable() operation
in this_cpu_read() after macro expansion is unnecessary. In fact, function
this_cpu_read() may trigger scheduling, see pseudocode below.

Pseudocode of this_cpu_read(xx):
preempt_disable_notrace();
raw_cpu_read(xx);
if (unlikely(__preempt_count_dec_and_test()))
	__preempt_schedule_notrace();

Therefore, use raw_cpu_* instead of this_cpu_* to eliminate potential
hazards. At the very least, it reduces a few lines of assembly code.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

370d51c8

ARM: 9228/1: vfp: kill vfp_flush/release_thread() · edd61fc1

Wang Kefeng authored Aug 16, 2022

Those functions are removed since 2006 commit d6551e88
("[ARM] Add thread_notify infrastructure").
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

edd61fc1

ARM: 9226/1: disable FDPIC ABI · 3d47ff25

Ben Wolsieffer authored Aug 12, 2022

When building with an arm-*-uclinuxfdpiceabi toolchain, the FDPIC ABI is
enabled by default but should not be used to build the kernel.
Therefore, pass -mno-fdpic if supported by the compiler.
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

3d47ff25

30 Aug, 2022 1 commit

ARM: 9221/1: traps: print un-hashed user pc on undefined instruction · ee50036b

Baruch Siach authored Jul 29, 2022

When user undefined instruction debug is enabled pc value is hashed like
kernel pointers for security reason. But the security benefit of this
hash is very limited because the code goes on to call __show_regs() that
prints the plain pointer value. pc is a user pointer anyway, so the
kernel does not leak anything. The only result is confusion about the
difference between the pc value on the first printed line, and the value
that __show_regs() prints.

Always print the plain value of pc.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

ee50036b

28 Aug, 2022 15 commits

Linux 6.0-rc3 · b90cb105
Linus Torvalds authored Aug 28, 2022

b90cb105

Merge tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · b467192e

Linus Torvalds authored Aug 28, 2022

Pull more hotfixes from Andrew Morton:
 "Seventeen hotfixes.  Mostly memory management things.

  Ten patches are cc:stable, addressing pre-6.0 issues"

* tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  .mailmap: update Luca Ceresoli's e-mail address
  mm/mprotect: only reference swap pfn page if type match
  squashfs: don't call kmalloc in decompressors
  mm/damon/dbgfs: avoid duplicate context directory creation
  mailmap: update email address for Colin King
  asm-generic: sections: refactor memory_intersects
  bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem
  ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown
  Revert "memcg: cleanup racy sum avoidance code"
  mm/zsmalloc: do not attempt to free IS_ERR handle
  binder_alloc: add missing mmap_lock calls when using the VMA
  mm: re-allow pinning of zero pfns (again)
  vmcoreinfo: add kallsyms_num_syms symbol
  mailmap: update Guilherme G. Piccoli's email addresses
  writeback: avoid use-after-free after removing device
  shmem: update folio if shmem_replace_page() updates the page
  mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte

b467192e

Merge tag 'bitmap-6.0-rc3' of github.com:/norov/linux · 373eff57

Linus Torvalds authored Aug 28, 2022

Pull bitmap fixes from Yury Norov:
 "Fix the reported issues, and implements the suggested improvements,
  for the version of the cpumask tests [1] that was merged with commit
  c41e8866 ("lib/test: introduce cpumask KUnit test suite").

  These changes include fixes for the tests, and better alignment with
  the KUnit style guidelines"

* tag 'bitmap-6.0-rc3' of github.com:/norov/linux:
  lib/cpumask_kunit: add tests file to MAINTAINERS
  lib/cpumask_kunit: log mask contents
  lib/test_cpumask: follow KUnit style guidelines
  lib/test_cpumask: fix cpu_possible_mask last test
  lib/test_cpumask: drop cpu_possible_mask full test

373eff57

.mailmap: update Luca Ceresoli's e-mail address · 0ebafe2e

Luca Ceresoli authored Aug 26, 2022

My Bootlin address is preferred from now on.

Link: https://lkml.kernel.org/r/20220826130515.3011951-1-luca.ceresoli@bootlin.comSigned-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

0ebafe2e

mm/mprotect: only reference swap pfn page if type match · 3d2f78f0

Peter Xu authored Aug 23, 2022

Yu Zhao reported a bug after the commit "mm/swap: Add swp_offset_pfn() to
fetch PFN from swap entry" added a check in swp_offset_pfn() for swap type [1]:

  kernel BUG at include/linux/swapops.h:117!
  CPU: 46 PID: 5245 Comm: EventManager_De Tainted: G S         O L 6.0.0-dbg-DEV #2
  RIP: 0010:pfn_swap_entry_to_page+0x72/0xf0
  Code: c6 48 8b 36 48 83 fe ff 74 53 48 01 d1 48 83 c1 08 48 8b 09 f6
  c1 01 75 7b 66 90 48 89 c1 48 8b 09 f6 c1 01 74 74 5d c3 eb 9e <0f> 0b
  48 ba ff ff ff ff 03 00 00 00 eb ae a9 ff 0f 00 00 75 13 48
  RSP: 0018:ffffa59e73fabb80 EFLAGS: 00010282
  RAX: 00000000ffffffe8 RBX: 0c00000000000000 RCX: ffffcd5440000000
  RDX: 1ffffffffff7a80a RSI: 0000000000000000 RDI: 0c0000000000042b
  RBP: ffffa59e73fabb80 R08: ffff9965ca6e8bb8 R09: 0000000000000000
  R10: ffffffffa5a2f62d R11: 0000030b372e9fff R12: ffff997b79db5738
  R13: 000000000000042b R14: 0c0000000000042b R15: 1ffffffffff7a80a
  FS:  00007f549d1bb700(0000) GS:ffff99d3cf680000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000440d035b3180 CR3: 0000002243176004 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   change_pte_range+0x36e/0x880
   change_p4d_range+0x2e8/0x670
   change_protection_range+0x14e/0x2c0
   mprotect_fixup+0x1ee/0x330
   do_mprotect_pkey+0x34c/0x440
   __x64_sys_mprotect+0x1d/0x30

It triggers because pfn_swap_entry_to_page() could be called upon e.g. a
genuine swap entry.

Fix it by only calling it when it's a write migration entry where the page*
is used.

[1] https://lore.kernel.org/lkml/CAOUHufaVC2Za-p8m0aiHw6YkheDcrO-C3wRGixwDS32VTS+k1w@mail.gmail.com/

Link: https://lkml.kernel.org/r/20220823221138.45602-1-peterx@redhat.com
Fixes: 6c287605 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Yu Zhao <yuzhao@google.com>
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

3d2f78f0

squashfs: don't call kmalloc in decompressors · 1f13dff0

Phillip Lougher authored Aug 22, 2022

The decompressors may be called while in an atomic section.  So move the
kmalloc() out of this path, and into the "page actor" init function.

This fixes a regression introduced by commit
f268eedd ("squashfs: extend "page actor" to handle missing pages")

Link: https://lkml.kernel.org/r/20220822215430.15933-1-phillip@squashfs.org.uk
Fixes: f268eedd ("squashfs: extend "page actor" to handle missing pages")
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

1f13dff0

mm/damon/dbgfs: avoid duplicate context directory creation · d26f6070

Badari Pulavarty authored Aug 21, 2022

When user tries to create a DAMON context via the DAMON debugfs interface
with a name of an already existing context, the context directory creation
fails but a new context is created and added in the internal data
structure, due to absence of the directory creation success check.  As a
result, memory could leak and DAMON cannot be turned on.  An example test
case is as below:

    # cd /sys/kernel/debug/damon/
    # echo "off" >  monitor_on
    # echo paddr > target_ids
    # echo "abc" > mk_context
    # echo "abc" > mk_context
    # echo $$ > abc/target_ids
    # echo "on" > monitor_on  <<< fails

Return value of 'debugfs_create_dir()' is expected to be ignored in
general, but this is an exceptional case as DAMON feature is depending
on the debugfs functionality and it has the potential duplicate name
issue.  This commit therefore fixes the issue by checking the directory
creation failure and immediately return the error in the case.

Link: https://lkml.kernel.org/r/20220821180853.2400-1-sj@kernel.org
Fixes: 75c1c2b5 ("mm/damon/dbgfs: support multiple contexts")
Signed-off-by: Badari Pulavarty <badari.pulavarty@intel.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[ 5.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

d26f6070

mailmap: update email address for Colin King · ac733f65

Colin Ian King authored Aug 17, 2022

Colin King is working on kernel janitorial fixes in his spare time and
using his Intel email is confusing. Use his gmail account as the default
email address.

Link: https://lkml.kernel.org/r/20220817212753.101109-1-colin.i.king@gmail.comSigned-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

ac733f65

asm-generic: sections: refactor memory_intersects · 0c7d7cc2

Quanyang Wang authored Aug 19, 2022

There are two problems with the current code of memory_intersects:

First, it doesn't check whether the region (begin, end) falls inside the
region (virt, vend), that is (virt < begin && vend > end).

The second problem is if vend is equal to begin, it will return true but
this is wrong since vend (virt + size) is not the last address of the
memory region but (virt + size -1) is.  The wrong determination will
trigger the misreporting when the function check_for_illegal_area calls
memory_intersects to check if the dma region intersects with stext region.

The misreporting is as below (stext is at 0x80100000):
 WARNING: CPU: 0 PID: 77 at kernel/dma/debug.c:1073 check_for_illegal_area+0x130/0x168
 DMA-API: chipidea-usb2 e0002000.usb: device driver maps memory from kernel text or rodata [addr=800f0000] [len=65536]
 Modules linked in:
 CPU: 1 PID: 77 Comm: usb-storage Not tainted 5.19.0-yocto-standard #5
 Hardware name: Xilinx Zynq Platform
  unwind_backtrace from show_stack+0x18/0x1c
  show_stack from dump_stack_lvl+0x58/0x70
  dump_stack_lvl from __warn+0xb0/0x198
  __warn from warn_slowpath_fmt+0x80/0xb4
  warn_slowpath_fmt from check_for_illegal_area+0x130/0x168
  check_for_illegal_area from debug_dma_map_sg+0x94/0x368
  debug_dma_map_sg from __dma_map_sg_attrs+0x114/0x128
  __dma_map_sg_attrs from dma_map_sg_attrs+0x18/0x24
  dma_map_sg_attrs from usb_hcd_map_urb_for_dma+0x250/0x3b4
  usb_hcd_map_urb_for_dma from usb_hcd_submit_urb+0x194/0x214
  usb_hcd_submit_urb from usb_sg_wait+0xa4/0x118
  usb_sg_wait from usb_stor_bulk_transfer_sglist+0xa0/0xec
  usb_stor_bulk_transfer_sglist from usb_stor_bulk_srb+0x38/0x70
  usb_stor_bulk_srb from usb_stor_Bulk_transport+0x150/0x360
  usb_stor_Bulk_transport from usb_stor_invoke_transport+0x38/0x440
  usb_stor_invoke_transport from usb_stor_control_thread+0x1e0/0x238
  usb_stor_control_thread from kthread+0xf8/0x104
  kthread from ret_from_fork+0x14/0x2c

Refactor memory_intersects to fix the two problems above.

Before the 1d7db834 ("dma-debug: use memory_intersects()
directly"), memory_intersects is called only by printk_late_init:

printk_late_init -> init_section_intersects ->memory_intersects.

There were few places where memory_intersects was called.

When commit 1d7db834 ("dma-debug: use memory_intersects()
directly") was merged and CONFIG_DMA_API_DEBUG is enabled, the DMA
subsystem uses it to check for an illegal area and the calltrace above
is triggered.

[akpm@linux-foundation.org: fix nearby comment typo]
Link: https://lkml.kernel.org/r/20220819081145.948016-1-quanyang.wang@windriver.com
Fixes: 97955936 ("asm/sections: add helpers to check for section data")
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thierry Reding <treding@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

0c7d7cc2

bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem · dd0ff4d1

Liu Shixin authored Aug 19, 2022

The vmemmap pages is marked by kmemleak when allocated from memblock. 
Remove it from kmemleak when freeing the page.  Otherwise, when we reuse
the page, kmemleak may report such an error and then stop working.

 kmemleak: Cannot insert 0xffff98fb6eab3d40 into the object search tree (overlaps existing)
 kmemleak: Kernel memory leak detector disabled
 kmemleak: Object 0xffff98fb6be00000 (size 335544320):
 kmemleak:   comm "swapper", pid 0, jiffies 4294892296
 kmemleak:   min_count = 0
 kmemleak:   count = 0
 kmemleak:   flags = 0x1
 kmemleak:   checksum = 0
 kmemleak:   backtrace:

Link: https://lkml.kernel.org/r/20220819094005.2928241-1-liushixin2@huawei.com
Fixes: f41f2ed4 (mm: hugetlb: free the vmemmap pages associated with each HugeTLB page)
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

dd0ff4d1

ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown · 550842cc

Heming Zhao authored Aug 15, 2022

After commit 0737e01d ("ocfs2: ocfs2_mount_volume does cleanup job
before return error"), any procedure after ocfs2_dlm_init() fails will
trigger crash when calling ocfs2_dlm_shutdown().

ie: On local mount mode, no dlm resource is initialized.  If
ocfs2_mount_volume() fails in ocfs2_find_slot(), error handling will call
ocfs2_dlm_shutdown(), then does dlm resource cleanup job, which will
trigger kernel crash.

This solution should bypass uninitialized resources in
ocfs2_dlm_shutdown().

Link: https://lkml.kernel.org/r/20220815085754.20417-1-heming.zhao@suse.com
Fixes: 0737e01d ("ocfs2: ocfs2_mount_volume does cleanup job before return error")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

550842cc

Revert "memcg: cleanup racy sum avoidance code" · dbb16df6

Shakeel Butt authored Aug 17, 2022

This reverts commit 96e51ccf.

Recently we started running the kernel with rstat infrastructure on
production traffic and begin to see negative memcg stats values. 
Particularly the 'sock' stat is the one which we observed having negative
value.

$ grep "sock " /mnt/memory/job/memory.stat
sock 253952
total_sock 18446744073708724224

Re-run after couple of seconds

$ grep "sock " /mnt/memory/job/memory.stat
sock 253952
total_sock 53248

For now we are only seeing this issue on large machines (256 CPUs) and
only with 'sock' stat.  I think the networking stack increase the stat on
one cpu and decrease it on another cpu much more often.  So, this negative
sock is due to rstat flusher flushing the stats on the CPU that has seen
the decrement of sock but missed the CPU that has increments.  A typical
race condition.

For easy stable backport, revert is the most simple solution.  For long
term solution, I am thinking of two directions.  First is just reduce the
race window by optimizing the rstat flusher.  Second is if the reader sees
a negative stat value, force flush and restart the stat collection. 
Basically retry but limited.

Link: https://lkml.kernel.org/r/20220817172139.3141101-1-shakeelb@google.com
Fixes: 96e51ccf ("memcg: cleanup racy sum avoidance code")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org>	[5.15]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

dbb16df6

mm/zsmalloc: do not attempt to free IS_ERR handle · a5d21721

Sergey Senozhatsky authored Aug 16, 2022

zsmalloc() now returns ERR_PTR values as handles, which zram accidentally
can pass to zs_free().  Another bad scenario is when zcomp_compress()
fails - handle has default -ENOMEM value, and zs_free() will try to free
that "pointer value".

Add the missing check and make sure that zs_free() bails out when
ERR_PTR() is passed to it.

Link: https://lkml.kernel.org/r/20220816050906.2583956-1-senozhatsky@chromium.org
Fixes: c7e6f17b ("zsmalloc: zs_malloc: return ERR_PTR on failure")
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>,
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

a5d21721

binder_alloc: add missing mmap_lock calls when using the VMA · 44e602b4

Liam Howlett authored Aug 10, 2022

Take the mmap_read_lock() when using the VMA in binder_alloc_print_pages()
and when checking for a VMA in binder_alloc_new_buf_locked().

It is worth noting binder_alloc_new_buf_locked() drops the VMA read lock
after it verifies a VMA exists, but may be taken again deeper in the call
stack, if necessary.

Link: https://lkml.kernel.org/r/20220810160209.1630707-1-Liam.Howlett@oracle.com
Fixes: a43cfc87 (android: binder: stop saving a pointer to the VMA)
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Reported-by: <syzbot+a7b60a176ec13cafb793@syzkaller.appspotmail.com>
Acked-by: Carlos Llamas <cmllamas@google.com>
Tested-by: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Christian Brauner (Microsoft) <brauner@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Martijn Coenen <maco@android.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Todd Kjos <tkjos@android.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: "Arve Hjønnevåg" <arve@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

44e602b4

mm: re-allow pinning of zero pfns (again) · fcab34b4

Alex Williamson authored Aug 10, 2022

The below referenced commit makes the same error as 1c563432 ("mm: fix
is_pinnable_page against a cma page"), re-interpreting the logic to
exclude pinning of the zero page, which breaks device assignment with
vfio.

To avoid further subtle mistakes, split the logic into discrete tests.

[akpm@linux-foundation.org: simplify comment, per John]
Link: https://lkml.kernel.org/r/166015037385.760108.16881097713975517242.stgit@omen
Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
Fixes: f25cbb7a ("mm: add zone device coherent type memory support")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Tested-by: Slawomir Laba <slawomirx.laba@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

fcab34b4