1. 16 May, 2023 2 commits
    • x86/show_trace_log_lvl: Ensure stack pointer is aligned, again · 2e4be0d0
      Vernon Lovejoy authored
      The commit e335bb51 ("x86/unwind: Ensure stack pointer is aligned")
      tried to align the stack pointer in show_trace_log_lvl(), otherwise the
      "stack < stack_info.end" check can't guarantee that the last read does
      not go past the end of the stack.
      
      However, the initial value of the stack pointer has the same problem:
      it can also be unaligned. So, without this patch, this trivial kernel
      module
      
      	#include <linux/module.h>
      
      	static int init(void)
      	{
      		asm volatile("sub    $0x4,%rsp");
      		dump_stack();
      		asm volatile("add    $0x4,%rsp");
      
      		return -EAGAIN;
      	}
      
      	module_init(init);
      	MODULE_LICENSE("GPL");
      
      crashes the kernel.
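
      The failure mode can be modeled in user space. The following is a
      minimal sketch, not the kernel code: WORD_SIZE, align_up() and
      walk_overruns() are hypothetical names, and the word size is fixed at
      8 bytes for illustration. It shows that a bounds check of the form
      "stack < end" only keeps every word-sized read inside the stack when
      the starting pointer is word-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed word size for this illustration (sizeof(long) on x86-64). */
#define WORD_SIZE 8u

/* Round addr up to the next multiple of align (align must be a power of two). */
static uintptr_t align_up(uintptr_t addr, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(uintptr_t)(align - 1);
}

/*
 * Model a stack dump loop that reads one word per iteration while
 * "sp < end" holds. Returns 1 if any word-sized read would extend
 * past end, 0 otherwise.
 */
static int walk_overruns(uintptr_t sp, uintptr_t end)
{
	for (; sp < end; sp += WORD_SIZE) {
		if (sp + WORD_SIZE > end)
			return 1;	/* this read crosses the end of the stack */
	}
	return 0;
}
```

      With end = 0x2000 and a start of 0x1ffc (off by 4 bytes, as in the
      module above), walk_overruns() reports an overrun. After align_up()
      the start equals end, so the loop performs no read at all.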
      
      Fixes: e335bb51 ("x86/unwind: Ensure stack pointer is aligned")
      Signed-off-by: Vernon Lovejoy <vlovejoy@redhat.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Link: https://lore.kernel.org/r/20230512104232.GA10227@redhat.com
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
    • vmlinux.lds.h: Discard .note.gnu.property section · f7ba52f3
      Josh Poimboeuf authored
      When tooling reads ELF notes, it assumes each note entry is aligned to
      the value listed in the .note section header's sh_addralign field.
      
      The kernel-created ELF notes in the .note.Linux and .note.Xen sections
      are aligned to 4 bytes.  This causes the toolchain to set those
      sections' sh_addralign values to 4.
      
      On the other hand, the GCC-created .note.gnu.property section has an
      sh_addralign value of 8 for some reason, despite being based on struct
      Elf32_Nhdr which only needs 4-byte alignment.
      
      When the mismatched input sections get linked together into the vmlinux
      .notes output section, the higher alignment "wins", resulting in an
      sh_addralign of 8, which confuses tooling.  For example:
      
        $ readelf -n .tmp_vmlinux.btf
        ...
        readelf: .tmp_vmlinux.btf: Warning: note with invalid namesz and/or descsz found at offset 0x170
        readelf: .tmp_vmlinux.btf: Warning:  type: 0x4, namesize: 0x006e6558, descsize: 0x00008801, alignment: 8
      
      In this case readelf thinks there's alignment padding where there is
      none, so it starts reading an ELF note in the middle.
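
      The effect of the alignment mismatch can be sketched with a little
      arithmetic. This is a simplified model of how a note parser might
      locate the next entry, assuming a 12-byte note header (three 32-bit
      fields) and that both the name and the descriptor are padded to the
      assumed alignment. round_up_pow2() and note_next_offset() are
      hypothetical helper names, not functions from readelf or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the note header: namesz, descsz and type, 4 bytes each. */
#define NOTE_HDR_SIZE 12u

/* Round n up to a multiple of align (align must be a power of two). */
static size_t round_up_pow2(size_t n, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (n + align - 1) & ~(align - 1);
}

/*
 * Offset of the next note, relative to the start of the current one,
 * as seen by a parser that assumes the given alignment.
 */
static size_t note_next_offset(uint32_t namesz, uint32_t descsz, size_t align)
{
	size_t desc_off = round_up_pow2(NOTE_HDR_SIZE + namesz, align);

	return round_up_pow2(desc_off + descsz, align);
}
```

      For a note with a 4-byte name and a 20-byte descriptor, the next
      entry starts at offset 36 under 4-byte alignment but at offset 40
      under 8-byte alignment. A parser that assumes 8 on data laid out
      with 4 therefore lands 4 bytes into the following note and misreads
      its namesz and descsz fields.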
      
      With newer toolchains (e.g., latest Fedora Rawhide), a similar mismatch
      triggers a build failure when combined with CONFIG_X86_KERNEL_IBT:
      
        btf_encoder__encode: btf__dedup failed!
        Failed to encode BTF
        libbpf: failed to find '.BTF' ELF section in vmlinux
        FAILED: load BTF from vmlinux: No data available
        make[1]: *** [scripts/Makefile.vmlinux:35: vmlinux] Error 255
      
      This latter error was caused by pahole crashing when it encountered the
      corrupt .notes section.  This crash has been fixed in dwarves version
      1.25.  As Tianyi Liu describes:
      
        "Pahole reads .notes to look for LINUX_ELFNOTE_BUILD_LTO. When LTO is
         enabled, pahole needs to call cus__merge_and_process_cu to merge
         compile units, at which point there should only be one unspecified
         type (used to represent some compilation information) in the global
         context.
      
         However, when the kernel is compiled without LTO, if pahole calls
         cus__merge_and_process_cu due to alignment issues with notes,
         multiple unspecified types may appear after merging the cus, and
         older versions of pahole only support up to one. This is why pahole
         1.24 crashes, while newer versions support multiple. However, the
         latest version of pahole still does not solve the problem of
         incorrect LTO recognition, so compiling the kernel may be slower
         than normal."
      
      Even with the newer pahole, the note section misalignment issue still
      exists and pahole is misinterpreting the LTO note.  Fix it by discarding
      the .note.gnu.property section.  While GNU properties are important for
      user space (and VDSO), they don't seem to have any use for vmlinux.
      
      (In fact, they're already getting (inadvertently) stripped from vmlinux
      when CONFIG_DEBUG_INFO_BTF is enabled.  The BTF data is extracted from
      vmlinux.o with "objcopy --only-section=.BTF" into .btf.vmlinux.bin.o.
      That file doesn't have .note.gnu.property, so when it gets modified and
      linked back into the main object, the linker automatically strips it
      (see "How GNU properties are merged" in the ld man page).)

      Reported-by: Daniel Xu <dxu@dxuuu.xyz>
      Link: https://lkml.kernel.org/bpf/57830c30-cd77-40cf-9cd1-3bb608aa602e@app.fastmail.com
      Debugged-by: Tianyi Liu <i.pear@outlook.com>
      Suggested-by: Joan Bruguera Micó <joanbrugueram@gmail.com>
      Link: https://lore.kernel.org/r/20230418214925.ay3jpf2zhw75kgmd@treble
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
  2. 14 May, 2023 13 commits
  3. 13 May, 2023 17 commits
  4. 12 May, 2023 8 commits
    • x86/retbleed: Fix return thunk alignment · 9a48d604
      Borislav Petkov (AMD) authored
      SYM_FUNC_START_LOCAL_NOALIGN() adds an endbr leading to this layout
      (leaving only the last 2 bytes of the address):
      
        3bff <zen_untrain_ret>:
        3bff:       f3 0f 1e fa             endbr64
        3c03:       f6                      test   $0xcc,%bl
      
        3c04 <__x86_return_thunk>:
        3c04:       c3                      ret
        3c05:       cc                      int3
        3c06:       0f ae e8                lfence
      
      However, "the RET at __x86_return_thunk must be on a 64 byte boundary,
      for alignment within the BTB."
      
      Use SYM_START instead.
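
      The misalignment is visible from the low bits of the address alone.
      A minimal sketch of the check, using a hypothetical helper
      is_aligned(): only the low 16 bits of the address are known from the
      listing above, which is enough because 64 divides 0x10000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if addr is a multiple of boundary (a power of two). */
static bool is_aligned(uint64_t addr, uint64_t boundary)
{
	assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
	return (addr & (boundary - 1)) == 0;
}
```

      is_aligned(0x3c04, 64) is false: the RET sits 4 bytes past the
      64-byte boundary at 0x3c00, which matches the size of the 4-byte
      endbr64 shown in the listing.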

      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 76c7f887
      Linus Torvalds authored
      Pull more btrfs fixes from David Sterba:
      
       - fix incorrect number of bitmap entries for space cache if loading is
         interrupted by some error
      
       - fix backref walking, this breaks a mode of LOGICAL_INO_V2 ioctl that
         is used in deduplication tools
      
       - zoned mode fixes:
            - properly finish zone reserved for relocation
            - correctly calculate super block zone end on ZNS
            - properly initialize new extent buffer for redirty
      
       - make mount option clear_cache work with block-group-tree, to rebuild
         free-space-tree instead of temporarily disabling it that would lead
         to a forced read-only mount
      
       - fix alignment check for offset when printing extent item
      
      * tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: make clear_cache mount option to rebuild FST without disabling it
        btrfs: zero the buffer before marking it dirty in btrfs_redirty_list_add
        btrfs: zoned: fix full zone super block reading on ZNS
        btrfs: zoned: zone finish data relocation BG with last IO
        btrfs: fix backref walking not returning all inode refs
        btrfs: fix space cache inconsistency after error loading it from disk
        btrfs: print-tree: parent bytenr must be aligned to sector size
    • Merge tag '6.4-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · fd88f147
      Linus Torvalds authored
      Pull cifs client fixes from Steve French:
      
       - fix for copy_file_range bug for very large files that are multiples
         of rsize
      
       - do not ignore "isolated transport" flag if set on share
      
       - set rasize default better
      
       - three fixes related to shutdown and freezing (fixes 4 xfstests, and
         closes deferred handles faster in some places that were missed)
      
      * tag '6.4-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: release leases for deferred close handles when freezing
        smb3: fix problem remounting a share after shutdown
        SMB3: force unmount was failing to close deferred close files
        smb3: improve parallel reads of large files
        do not reuse connection if share marked as isolated
        cifs: fix pcchunk length type in smb2_copychunk_range
    • Merge tag 'vfs/v6.4-rc1/pipe' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs · df8c2d13
      Linus Torvalds authored
      Pull vfs fix from Christian Brauner:
       "During the pipe nonblock rework the check for both O_NONBLOCK and
        IOCB_NOWAIT was dropped. Both checks need to be performed to ensure
        that files without O_NONBLOCK but IOCB_NOWAIT don't block when writing
        to or reading from a pipe.
      
        This just contains the fix adding the check for IOCB_NOWAIT back in"
      
      * tag 'vfs/v6.4-rc1/pipe' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
        pipe: check for IOCB_NOWAIT alongside O_NONBLOCK
    • Merge tag 'io_uring-6.4-2023-05-12' of git://git.kernel.dk/linux · 584dc5db
      Linus Torvalds authored
      Pull io_uring fix from Jens Axboe:
       "Just a single fix making io_uring_sqe_cmd() available regardless of
        CONFIG_IO_URING, fixing a regression introduced during the merge
        window if nvme was selected but io_uring was not"
      
      * tag 'io_uring-6.4-2023-05-12' of git://git.kernel.dk/linux:
        io_uring: make io_uring_sqe_cmd() unconditionally available
    • Merge tag 'riscv-for-linus-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · ed6a75e3
      Linus Torvalds authored
      Pull RISC-V fix from Palmer Dabbelt:
       "Just a single fix this week for a build issue. That'd usually be a
        good sign, but we've started to get some reports of boot failures on
        some hardware/bootloader configurations. Nothing concrete yet, but
        I've got a funny feeling that's where much of the bug hunting is going
        right now.
      
        Nothing's reproducing on my end, though, and this fixes some pretty
        concrete issues so I figured there's no reason to delay it:
      
         - a fix to the linker script to avoid orphaned sections in
           kernel/pi"
      
      * tag 'riscv-for-linus-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: Fix orphan section warnings caused by kernel/pi
    • Documentation/block: drop the request.rst file · 56cdea92
      Randy Dunlap authored
      Documentation/block/request.rst is outdated and should be removed.
      Also delete its entry in the block/index.rst file.

      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-block@vger.kernel.org
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Link: https://lore.kernel.org/r/20230507182606.12647-1-rdunlap@infradead.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • pipe: check for IOCB_NOWAIT alongside O_NONBLOCK · c04fe8e3
      Jens Axboe authored
      Pipe reads or writes need to enable nonblocking attempts if either
      O_NONBLOCK is set on the file or IOCB_NOWAIT is set in the iocb being
      passed in. The latter isn't currently checked, so ensure we check for
      both before waiting on data or space.
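
      The combined condition can be sketched as follows. This is a
      user-space illustration under stated assumptions, not the kernel
      patch: the demo_ structs, the DEMO_ flag values and
      demo_pipe_nonblocking() are hypothetical stand-ins for the kernel's
      file, iocb and flag definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for illustration; not the kernel's definitions. */
#define DEMO_O_NONBLOCK  0x1u	/* per-file flag */
#define DEMO_IOCB_NOWAIT 0x2u	/* per-request flag */

struct demo_file {
	unsigned int f_flags;
};

struct demo_iocb {
	struct demo_file *ki_filp;
	unsigned int ki_flags;
};

/*
 * A request must not block if either the file was opened nonblocking
 * or the individual request asked not to wait.
 */
static bool demo_pipe_nonblocking(const struct demo_iocb *iocb)
{
	assert(iocb != NULL && iocb->ki_filp != NULL);
	return (iocb->ki_filp->f_flags & DEMO_O_NONBLOCK) ||
	       (iocb->ki_flags & DEMO_IOCB_NOWAIT);
}
```

      The case the fix targets is a file without DEMO_O_NONBLOCK whose
      request carries DEMO_IOCB_NOWAIT: checking only the file flag would
      let that request block.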
      
      Fixes: afed6271 ("pipe: set FMODE_NOWAIT on pipes")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Message-Id: <e5946d67-4e5e-b056-ba80-656bab12d9f6@kernel.dk>
      Signed-off-by: Christian Brauner <brauner@kernel.org>