1. 25 Mar, 2021 2 commits
  2. 15 Mar, 2021 2 commits
    • ARM64: enable GENERIC_FIND_FIRST_BIT · 98c5ec77
      Yury Norov authored
      ARM64 doesn't implement find_first_{zero}_bit in arch code and doesn't
      enable the generic version in its config. As a result, find_first_bit()
      falls back to find_next_bit(), which is less efficient:
      
      0000000000000000 <find_first_bit>:
         0:	aa0003e4 	mov	x4, x0
         4:	aa0103e0 	mov	x0, x1
         8:	b4000181 	cbz	x1, 38 <find_first_bit+0x38>
         c:	f9400083 	ldr	x3, [x4]
        10:	d2800802 	mov	x2, #0x40                  	// #64
        14:	91002084 	add	x4, x4, #0x8
        18:	b40000c3 	cbz	x3, 30 <find_first_bit+0x30>
        1c:	14000008 	b	3c <find_first_bit+0x3c>
        20:	f8408483 	ldr	x3, [x4], #8
        24:	91010045 	add	x5, x2, #0x40
        28:	b50000c3 	cbnz	x3, 40 <find_first_bit+0x40>
        2c:	aa0503e2 	mov	x2, x5
        30:	eb02001f 	cmp	x0, x2
        34:	54ffff68 	b.hi	20 <find_first_bit+0x20>  // b.pmore
        38:	d65f03c0 	ret
        3c:	d2800002 	mov	x2, #0x0                   	// #0
        40:	dac00063 	rbit	x3, x3
        44:	dac01063 	clz	x3, x3
        48:	8b020062 	add	x2, x3, x2
        4c:	eb02001f 	cmp	x0, x2
        50:	9a829000 	csel	x0, x0, x2, ls  // ls = plast
        54:	d65f03c0 	ret
      
        ...
      
      0000000000000118 <_find_next_bit.constprop.1>:
       118:	eb02007f 	cmp	x3, x2
       11c:	540002e2 	b.cs	178 <_find_next_bit.constprop.1+0x60>  // b.hs, b.nlast
       120:	d346fc66 	lsr	x6, x3, #6
       124:	f8667805 	ldr	x5, [x0, x6, lsl #3]
       128:	b4000061 	cbz	x1, 134 <_find_next_bit.constprop.1+0x1c>
       12c:	f8667826 	ldr	x6, [x1, x6, lsl #3]
       130:	8a0600a5 	and	x5, x5, x6
       134:	ca0400a6 	eor	x6, x5, x4
       138:	92800005 	mov	x5, #0xffffffffffffffff    	// #-1
       13c:	9ac320a5 	lsl	x5, x5, x3
       140:	927ae463 	and	x3, x3, #0xffffffffffffffc0
       144:	ea0600a5 	ands	x5, x5, x6
       148:	54000120 	b.eq	16c <_find_next_bit.constprop.1+0x54>  // b.none
       14c:	1400000e 	b	184 <_find_next_bit.constprop.1+0x6c>
       150:	d346fc66 	lsr	x6, x3, #6
       154:	f8667805 	ldr	x5, [x0, x6, lsl #3]
       158:	b4000061 	cbz	x1, 164 <_find_next_bit.constprop.1+0x4c>
       15c:	f8667826 	ldr	x6, [x1, x6, lsl #3]
       160:	8a0600a5 	and	x5, x5, x6
       164:	eb05009f 	cmp	x4, x5
       168:	540000c1 	b.ne	180 <_find_next_bit.constprop.1+0x68>  // b.any
       16c:	91010063 	add	x3, x3, #0x40
       170:	eb03005f 	cmp	x2, x3
       174:	54fffee8 	b.hi	150 <_find_next_bit.constprop.1+0x38>  // b.pmore
       178:	aa0203e0 	mov	x0, x2
       17c:	d65f03c0 	ret
       180:	ca050085 	eor	x5, x4, x5
       184:	dac000a5 	rbit	x5, x5
       188:	dac010a5 	clz	x5, x5
       18c:	8b0300a3 	add	x3, x5, x3
       190:	eb03005f 	cmp	x2, x3
       194:	9a839042 	csel	x2, x2, x3, ls  // ls = plast
       198:	aa0203e0 	mov	x0, x2
       19c:	d65f03c0 	ret
      
       ...
      
      0000000000000238 <find_next_bit>:
       238:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
       23c:	aa0203e3 	mov	x3, x2
       240:	d2800004 	mov	x4, #0x0                   	// #0
       244:	aa0103e2 	mov	x2, x1
       248:	910003fd 	mov	x29, sp
       24c:	d2800001 	mov	x1, #0x0                   	// #0
       250:	97ffffb2 	bl	118 <_find_next_bit.constprop.1>
       254:	a8c17bfd 	ldp	x29, x30, [sp], #16
       258:	d65f03c0 	ret
      
      Enabling find_{first,next}_bit() would also benefit for_each_{set,clear}_bit().
      On A-53, find_first_bit() is almost twice as fast as find_next_bit(), according
      to lib/find_bit_benchmark (thanks to Alexey for testing):
      
      GENERIC_FIND_FIRST_BIT=n:
      [7126084.948181] find_first_bit:               47389224 ns,  16357 iterations
      [7126085.032315] find_first_bit:               19048193 ns,    655 iterations
      
      GENERIC_FIND_FIRST_BIT=y:
      [   84.158068] find_first_bit:               27193319 ns,  16406 iterations
      [   84.233005] find_first_bit:               11082437 ns,    656 iterations
      
      GENERIC_FIND_FIRST_BIT=n bloats the kernel even though it disables generation
      of find_{first,next}_bit():
      
              yury:linux$ scripts/bloat-o-meter vmlinux vmlinux.ffb
              add/remove: 4/1 grow/shrink: 19/251 up/down: 564/-1692 (-1128)
              ...
      
      Overall, GENERIC_FIND_FIRST_BIT=n is harmful both in terms of performance and
      code size, and it's better to have GENERIC_FIND_FIRST_BIT enabled.
      Tested-by: Alexey Klimov <aklimov@redhat.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Acked-by: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20210225135700.1381396-2-yury.norov@gmail.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
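      The difference between the two lookups can be sketched in userspace C.
      This is an illustrative approximation under stated assumptions, not the
      kernel's lib/find_bit.c: the sketch_* names are hypothetical, and
      __builtin_ctzl() (a GCC/Clang builtin) stands in for the rbit+clz pair
      seen in the disassembly. The first-bit scan has no start offset, so it
      skips the masking and rounding that the next-bit variant does on entry.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical sketch: scan word by word, no start offset to handle. */
static unsigned long sketch_find_first_bit(const unsigned long *addr,
                                           unsigned long size)
{
    unsigned long idx;

    assert(addr != NULL || size == 0);

    for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
        if (addr[idx]) {
            /* Count trailing zeros to locate the lowest set bit. */
            unsigned long bit = idx * BITS_PER_LONG +
                                (unsigned long)__builtin_ctzl(addr[idx]);
            return bit < size ? bit : size;
        }
    }
    return size; /* no bit set */
}

/* Hypothetical sketch: same search, but starting at bit 'start'. */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long size,
                                          unsigned long start)
{
    unsigned long word;

    if (start >= size)
        return size;

    /* Extra work: mask off the bits below 'start' in the first word. */
    word = addr[start / BITS_PER_LONG] & (~0UL << (start % BITS_PER_LONG));
    start -= start % BITS_PER_LONG; /* round down to a word boundary */

    while (!word) {
        start += BITS_PER_LONG;
        if (start >= size)
            return size;
        word = addr[start / BITS_PER_LONG];
    }

    start += (unsigned long)__builtin_ctzl(word);
    return start < size ? start : size;
}
```

      Calling sketch_find_next_bit(map, size, 0) returns the same index as
      sketch_find_first_bit(map, size), only with the additional setup work.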
    • arm64: defconfig: Use DEBUG_INFO_REDUCED · ed938a4b
      Mark Brown authored
      We've had DEBUG_INFO enabled for arm64 defconfigs since the initial
      commit.  This is probably not that frequently used but substantially
      inflates the size of the build tree and amount of I/O needed during the
      build.  This was causing issues with storage usage in some automated CI
      environments which don't expect defconfigs to be quite this big, and
      generally increases the resource consumption for both them and people
      doing local builds.  The main use case for the debug info is decoding
      things with scripts/faddr2line but that doesn't need the full
      DEBUG_INFO, DEBUG_INFO_REDUCED is enough for it, so enable that by
      default.
      
      Without this patch my build tree is 6.8G, with it the size drops to 2G
      (smaller than the 6.4G for allmodconfig!).
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Guillaume Tucker <guillaume.tucker@collabora.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Acked-by: Kevin Hilman <khilman@baylibre.com>
      Acked-by: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20210304174407.17537-1-broonie@kernel.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  3. 14 Mar, 2021 14 commits
    • Linux 5.12-rc3 · 1e28eed1
      Linus Torvalds authored
    • prctl: fix PR_SET_MM_AUXV kernel stack leak · c995f12a
      Alexey Dobriyan authored
      Doing a
      
      	prctl(PR_SET_MM, PR_SET_MM_AUXV, addr, 1);
      
      will copy 1 byte from userspace to (quite big) on-stack array
      and then stash everything to mm->saved_auxv.
      AT_NULL terminator will be inserted at the very end.
      
      /proc/*/auxv handler will find that AT_NULL terminator
      and copy original stack contents to userspace.
      
      This devious scheme requires CAP_SYS_RESOURCE.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
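      The leak pattern described above can be modelled in userspace. This is a
      simulation under stated assumptions, not kernel code: struct fake_mm,
      AUXV_WORDS, STALE_PATTERN and every function name are hypothetical, the
      scratch buffer stands in for the on-stack array, memcpy() stands in for
      the copy from userspace, and the terminator is assumed to occupy the last
      two words. Zeroing the buffer first is shown as one way to close such a
      leak; the message above does not say how the actual patch does it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AUXV_WORDS    8             /* hypothetical; the real array is larger */
#define STALE_PATTERN 0xdeadbeefUL  /* marks leftover "stack" contents */

struct fake_mm {
    unsigned long saved_auxv[AUXV_WORDS];
};

/* Pretend the scratch buffer holds leftovers from earlier calls. */
static void fill_stale(unsigned long *scratch)
{
    size_t i;

    for (i = 0; i < AUXV_WORDS; i++)
        scratch[i] = STALE_PATTERN;
}

/*
 * Copy 'len' bytes of user data into the scratch buffer, terminate it,
 * then stash the whole buffer. With zero_first == 0 the bytes between
 * the user data and the terminator are never overwritten.
 */
static void stash_auxv(struct fake_mm *mm, unsigned long *scratch,
                       const void *user, size_t len, int zero_first)
{
    assert(len <= AUXV_WORDS * sizeof(unsigned long));

    if (zero_first)
        memset(scratch, 0, AUXV_WORDS * sizeof(unsigned long));

    memcpy(scratch, user, len);        /* stands in for copy_from_user() */
    scratch[AUXV_WORDS - 2] = 0;       /* AT_NULL terminator at the very end */
    scratch[AUXV_WORDS - 1] = 0;
    memcpy(mm->saved_auxv, scratch, sizeof(mm->saved_auxv));
}

/* Count how many stashed words still carry the stale pattern. */
static size_t count_stale_words(const struct fake_mm *mm)
{
    size_t i, n = 0;

    for (i = 0; i < AUXV_WORDS; i++)
        if (mm->saved_auxv[i] == STALE_PATTERN)
            n++;
    return n;
}
```

      With a 1-byte copy and zero_first == 0, every word between the first one
      and the terminator still holds the stale pattern, which is what a reader
      of the stashed vector would then see.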
    • Merge tag 'irq-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 70404fe3
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "A set of irqchip updates:
      
         - Make the GENERIC_IRQ_MULTI_HANDLER configuration correct
      
         - Add a missing DT compatible string for the Ingenic driver
      
         - Remove the pointless debugfs_file pointer from struct irqdomain"
      
      * tag 'irq-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/ingenic: Add support for the JZ4760
        dt-bindings/irq: Add compatible string for the JZ4760B
        irqchip: Do not blindly select CONFIG_GENERIC_IRQ_MULTI_HANDLER
        ARM: ep93xx: Select GENERIC_IRQ_MULTI_HANDLER directly
        irqdomain: Remove debugfs_file from struct irq_domain
    • Merge tag 'timers-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 802b31c0
      Linus Torvalds authored
      Pull timer fix from Thomas Gleixner:
       "A single fix for hrtimers to prevent an interrupt storm caused by
        the lack of reevaluation of the timers which expire in softirq context
        under certain circumstances, e.g. when the clock was set"
      
      * tag 'timers-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()
    • Merge tag 'sched-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c72cbc93
      Linus Torvalds authored
      Pull scheduler fixes from Thomas Gleixner:
       "A set of scheduler updates:
      
         - Prevent a NULL pointer dereference in the migration_cpu_stop()
           mechanism
      
         - Prevent self concurrency of affine_move_task()
      
         - Small fixes and cleanups related to task migration/affinity setting
      
         - Ensure that sync_runqueues_membarrier_state() is invoked on the
           current CPU when it is in the cpu mask"
      
      * tag 'sched-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/membarrier: fix missing local execution of ipi_sync_rq_state()
        sched: Simplify set_affinity_pending refcounts
        sched: Fix affine_move_task() self-concurrency
        sched: Optimize migration_cpu_stop()
        sched: Collate affine_move_task() stoppers
        sched: Simplify migration_cpu_stop()
        sched: Fix migration_cpu_stop() requeueing
    • Merge tag 'objtool-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 19469d2a
      Linus Torvalds authored
      Pull objtool fix from Thomas Gleixner:
       "A single objtool fix to handle the PUSHF/POPF validation correctly for
        the paravirt changes which modified arch_local_irq_restore not to use
        popf"
      
      * tag 'objtool-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool,x86: Fix uaccess PUSHF/POPF validation
    • Merge tag 'locking-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fa509ff8
      Linus Torvalds authored
      Pull locking fixes from Thomas Gleixner:
       "A couple of locking fixes:
      
         - A fix for the static_call mechanism so it handles unaligned
           addresses correctly.
      
         - Make u64_stats_init() a macro so every instance gets a separate
           lockdep key.
      
         - Make seqcount_latch_init() a macro as well to preserve the static
           variable which is used for the lockdep key"
      
      * tag 'locking-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        seqlock,lockdep: Fix seqcount_latch_init()
        u64_stats,lockdep: Fix u64_stats_init() vs lockdep
        static_call: Fix the module key fixup
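      Why turning an init helper into a macro yields a separate key per
      instance can be shown with plain C statics. This is a generic
      illustration, not the kernel's u64_stats or seqcount code: struct
      fake_lock, init_shared_key() and INIT_OWN_KEY() are hypothetical names,
      and the address of a static char stands in for a lockdep key.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: the key is just the address of a static object. */
struct fake_lock {
    const void *key;
};

/*
 * A function owns exactly one copy of its static variable, so every
 * lock initialised through it ends up with the same key.
 */
static void init_shared_key(struct fake_lock *lock)
{
    static char key;

    assert(lock != NULL);
    lock->key = &key;
}

/*
 * A macro is expanded at each call site, so each expansion declares
 * its own static object and therefore its own key.
 */
#define INIT_OWN_KEY(lock)          \
    do {                            \
        static char key_;           \
        (lock)->key = &key_;        \
    } while (0)
```

      Two locks initialised with init_shared_key() compare equal on their key,
      while two locks initialised with INIT_OWN_KEY() at different call sites
      do not.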
    • Merge tag 'perf_urgent_for_v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 75013c6c
      Linus Torvalds authored
      Pull perf fixes from Borislav Petkov:
      
       - Make sure PMU internal buffers are flushed for per-CPU events too and
         properly handle PID/TID for large PEBS.
      
       - Handle the case properly when there's no PMU and therefore return an
         empty list of perf MSRs for VMX to switch instead of reading random
         garbage from the stack.
      
      * tag 'perf_urgent_for_v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/perf: Use RET0 as default for guest_get_msrs to handle "no PMU" case
        perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR
        perf/core: Flush PMU internal buffers for per-CPU events
    • Merge tag 'efi-urgent-for-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 836d7f05
      Linus Torvalds authored
      Pull EFI fix from Ard Biesheuvel via Borislav Petkov:
       "Fix an oversight in the handling of EFI_RT_PROPERTIES_TABLE, which was
        added in v5.10, but failed to take the SetVirtualAddressMap() RT service
        into account"
      
      * tag 'efi-urgent-for-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: stub: omit SetVirtualAddressMap() if marked unsupported in RT_PROP table
    • Merge tag 'x86_urgent_for_v5.12_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0a7c10df
      Linus Torvalds authored
      Pull x86 fixes from Borislav Petkov:
      
       - A couple of SEV-ES fixes and robustifications: verify usermode stack
         pointer in NMI is not coming from the syscall gap, correctly track
         IRQ states in the #VC handler and access user insn bytes atomically
         in same handler as latter cannot sleep.
      
       - Balance 32-bit fast syscall exit path to do the proper work on exit
         and thus not confuse audit and ptrace frameworks.
      
       - Two fixes for the ORC unwinder going "off the rails" into KASAN
         redzones and when ORC data is missing.
      
      * tag 'x86_urgent_for_v5.12_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/sev-es: Use __copy_from_user_inatomic()
        x86/sev-es: Correctly track IRQ states in runtime #VC handler
        x86/sev-es: Check regs->sp is trusted before adjusting #VC IST stack
        x86/sev-es: Introduce ip_within_syscall_gap() helper
        x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls
        x86/unwind/orc: Silence warnings caused by missing ORC data
        x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
    • Merge tag 'powerpc-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · c3c7579f
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Some more powerpc fixes for 5.12:
      
         - Fix wrong instruction encoding for lis in ppc_function_entry(),
           which could potentially lead to missed kprobes.
      
         - Fix SET_FULL_REGS on 32-bit and 64e, which prevented ptrace of
           non-volatile GPRs immediately after exec.
      
         - Clean up a missed SRR specifier in the recent interrupt rework.
      
         - Don't treat unrecoverable_exception() as an interrupt handler, it's
           called from other handlers so shouldn't do the interrupt entry/exit
           accounting itself.
      
         - Fix build errors caused by missing declarations for
           [en/dis]able_kernel_vsx().
      
        Thanks to Christophe Leroy, Daniel Axtens, Geert Uytterhoeven, Jiri
        Olsa, Naveen N. Rao, and Nicholas Piggin"
      
      * tag 'powerpc-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/traps: unrecoverable_exception() is not an interrupt handler
        powerpc: Fix missing declaration of [en/dis]able_kernel_vsx()
        powerpc/64s/exception: Clean up a missed SRR specifier
        powerpc: Fix inverted SET_FULL_REGS bitop
        powerpc/64s: Use symbolic macros for function entry encoding
        powerpc/64s: Fix instruction encoding for lis in ppc_function_entry()
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 9d0c8e79
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "More fixes for ARM and x86"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: LAPIC: Advancing the timer expiration on guest initiated write
        KVM: x86/mmu: Skip !MMU-present SPTEs when removing SP in exclusive mode
        KVM: kvmclock: Fix vCPUs > 64 can't be online/hotpluged
        kvm: x86: annotate RCU pointers
        KVM: arm64: Fix exclusive limit for IPA size
        KVM: arm64: Reject VM creation when the default IPA size is unsupported
        KVM: arm64: Ensure I-cache isolation between vcpus of a same VM
        KVM: arm64: Don't use cbz/adr with external symbols
        KVM: arm64: Fix range alignment when walking page tables
        KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
        KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to __vgic_v3_get_gic_config()
        KVM: arm64: Don't access PMSELR_EL0/PMUSERENR_EL0 when no PMU is available
        KVM: arm64: Turn kvm_arm_support_pmu_v3() into a static key
        KVM: arm64: Fix nVHE hyp panic host context restore
        KVM: arm64: Avoid corrupting vCPU context register in guest exit
        KVM: arm64: nvhe: Save the SPE context early
        kvm: x86: use NULL instead of using plain integer as pointer
        KVM: SVM: Connect 'npt' module param to KVM's internal 'npt_enabled'
        KVM: x86: Ensure deadline timer has truly expired before posting its IRQ
    • Merge branch 'akpm' (patches from Andrew) · 50eb842f
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "28 patches.
      
        Subsystems affected by this series: mm (memblock, pagealloc, hugetlb,
        highmem, kfence, oom-kill, madvise, kasan, userfaultfd, memcg, and
        zram), core-kernel, kconfig, fork, binfmt, MAINTAINERS, kbuild, and
        ia64"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (28 commits)
        zram: fix broken page writeback
        zram: fix return value on writeback_store
        mm/memcg: set memcg when splitting page
        mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument
        ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign
        ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls
        mm/userfaultfd: fix memory corruption due to writeprotect
        kasan: fix KASAN_STACK dependency for HW_TAGS
        kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC
        mm/madvise: replace ptrace attach requirement for process_madvise
        include/linux/sched/mm.h: use rcu_dereference in in_vfork()
        kfence: fix reports if constant function prefixes exist
        kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocations
        kfence: fix printk format for ptrdiff_t
        linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP*
        MAINTAINERS: exclude uapi directories in API/ABI section
        binfmt_misc: fix possible deadlock in bm_register_write
        mm/highmem.c: fix zero_user_segments() with start > end
        hugetlb: do early cow when page pinned on src mm
        mm: use is_cow_mapping() across tree where proper
        ...
    • Merge tag 'irqchip-fixes-5.12-1' of... · b470ebc9
      Thomas Gleixner authored
      Merge tag 'irqchip-fixes-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
      
      Pull irqchip fixes from Marc Zyngier:
      
        - More compatible strings for the Ingenic irqchip (introducing the
          JZ4760B SoC)
        - Select GENERIC_IRQ_MULTI_HANDLER on the ARM ep93xx platform
        - Drop all GENERIC_IRQ_MULTI_HANDLER selections from the irqchip
          Kconfig, now relying on the architecture to get it right
        - Drop the debugfs_file field from struct irq_domain, now that
          debugfs can track things on its own
  4. 13 Mar, 2021 22 commits