An error occurred fetching the project authors.
- 05 Jun, 2023 1 commit
-
-
Mark Rutland authored
Now that we have raw_atomic*_<op>() definitions, there's no need to use arch_atomic*_<op>() definitions outside of the low-level atomic definitions. Move treewide users of arch_atomic*_<op>() over to the equivalent raw_atomic*_<op>(). There should be no functional change as a result of this patch. Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-19-mark.rutland@arm.com
-
- 31 Jan, 2023 1 commit
-
-
Peter Zijlstra authored
In order to re-write Jcc.d32 instructions text_poke_bp() needs to be taught about them. The biggest hurdle is that the whole machinery is currently made for 5 byte instructions and extending this would grow struct text_poke_loc which is currently a nice 16 bytes and used in an array. However, since text_poke_loc contains a full copy of the (s32) displacement, it is possible to map the Jcc.d32 2 byte opcodes to Jcc.d8 1 byte opcode for the int3 emulation. This then leaves the replacement bytes; fudge that by only storing the last 5 bytes and adding the rule that 'length == 6' instruction will be prefixed with a 0x0f byte. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Reviewed-by:
Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20230123210607.115718513@infradead.org
-
- 05 Jan, 2023 1 commit
-
-
Borislav Petkov (AMD) authored
Add a struct alt_instr.flags field which will contain different flags controlling alternatives patching behavior. The initial idea was to be able to specify it as a separate macro parameter but that would mean touching all possible invocations of the alternatives macros and thus a lot of churn. What is more, as PeterZ suggested, being able to say ALT_NOT(feature) is very readable and explains exactly what is meant. So make the feature field a u32 where the patching flags are the upper u16 part of the dword quantity while the lower u16 word is the feature. The highest feature number currently is 0x26a (i.e., word 19) so there is plenty of space. If that becomes insufficient, the field can be extended to u64 which will then make struct alt_instr of the nice size of 16 bytes (14 bytes currently). There should be no functional changes resulting from this. Suggested-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by:
Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/Y6RCoJEtxxZWwotd@zn.tnic
-
- 15 Dec, 2022 1 commit
-
-
Peter Zijlstra authored
Now that text_poke is available before ftrace, remove the SYSTEM_BOOTING exceptions. Specifically, this cures a W+X case during boot. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221025201057.945960823@infradead.org
-
- 02 Dec, 2022 1 commit
-
-
Miaohe Lin authored
Due to the explicit 'noinline' GCC-7.3 is not able to optimize away the argument setup of: apply_ibt_endbr(__ibt_endbr_seal, __ibt_enbr_seal_end); even when X86_KERNEL_IBT=n and the function is an empty stub, which leads to link errors due to missing __ibt_endbr_seal* symbols: ld: arch/x86/kernel/alternative.o: in function `alternative_instructions': alternative.c:(.init.text+0x15d): undefined reference to `__ibt_endbr_seal_end' ld: alternative.c:(.init.text+0x164): undefined reference to `__ibt_endbr_seal' Remove the explicit 'noinline' to help gcc optimize them away. Signed-off-by:
Miaohe Lin <linmiaohe@huawei.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20221011113803.956808-1-linmiaohe@huawei.com
-
- 08 Nov, 2022 1 commit
-
-
Jiapeng Chong authored
Fix: ./arch/x86/kernel/traps.c: asm/proto.h is included more than once. ./arch/x86/kernel/alternative.c:1610:2-3: Unneeded semicolon. [ bp: Merge into a single patch. ] Reported-by:
Abaci Robot <abaci@linux.alibaba.com> Signed-off-by:
Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/1620902768-53822-1-git-send-email-jiapeng.chong@linux.alibaba.com Link: https://lore.kernel.org/r/20220926054628.116957-1-jiapeng.chong@linux.alibaba.com
-
- 01 Nov, 2022 3 commits
-
-
Peter Zijlstra authored
In order to avoid known hashes (from knowing the boot image), randomize the CFI hashes with a per-boot random seed. Suggested-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221027092842.765195516@infradead.org
-
Peter Zijlstra authored
Add the "cfi=" boot parameter to allow people to select a CFI scheme at boot time. Mostly useful for development / debugging. Requested-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221027092842.699804264@infradead.org
-
Peter Zijlstra authored
Implement an alternative CFI scheme that merges both the fine-grained nature of kCFI but also takes full advantage of the coarse grained hardware CFI as provided by IBT. To contrast: kCFI is a pure software CFI scheme and relies on being able to read text -- specifically the instruction *before* the target symbol, and does the hash validation *before* doing the call (otherwise control flow is compromised already). FineIBT is a software and hardware hybrid scheme; by ensuring every branch target starts with a hash validation it is possible to place the hash validation after the branch. This has several advantages: o the (hash) load is avoided; no memop; no RX requirement. o IBT WAIT-FOR-ENDBR state is a speculation stop; by placing the hash validation in the immediate instruction after the branch target there is a minimal speculation window and the whole is a viable defence against SpectreBHB. o Kees feels obliged to mention it is slightly more vulnerable when the attacker can write code. Obviously this patch relies on kCFI, but additionally it also relies on the padding from the call-depth-tracking patches. It uses this padding to place the hash-validation while the call-sites are re-written to modify the indirect target to be 16 bytes in front of the original target, thus hitting this new preamble. Notably, there is no hardware that needs call-depth-tracking (Skylake) and supports IBT (Tigerlake and onwards). Suggested-by:
Joao Moreira (Intel) <joao@overdrivepizza.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221027092842.634714496@infradead.org
-
- 25 Oct, 2022 1 commit
-
-
Dan Carpenter authored
The first argument of WARN() is a condition, so this will use "addr" as the format string and possibly crash. Fixes: 3b6c1747 ("x86/retpoline: Add SKL retthunk retpolines") Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/Y1gBoUZrRK5N%2FlCB@kili/
-
- 17 Oct, 2022 4 commits
-
-
Peter Zijlstra authored
Ensure that retpolines do the proper call accounting so that the return accounting works correctly. Specifically; retpolines are used to replace both 'jmp *%reg' and 'call *%reg', however these two cases do not have the same accounting requirements. Therefore split things up and provide two different retpoline arrays for SKL. The 'jmp *%reg' case needs no accounting, the __x86_indirect_jump_thunk_array[] covers this. The retpoline is changed to not use the return thunk; it's a simple call;ret construct. [ strictly speaking it should do: andq $(~0x1f), PER_CPU_VAR(__x86_call_depth) but we can argue this can be covered by the fuzz we already have in the accounting depth (12) vs the RSB depth (16) ] The 'call *%reg' case does need accounting, the __x86_indirect_call_thunk_array[] covers this. Again, this retpoline avoids the use of the return-thunk, in this case to avoid double accounting. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111147.996634749@infradead.org
-
Peter Zijlstra authored
In preparation for call depth tracking on Intel SKL CPUs, make it possible to patch in a SKL specific return thunk. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111147.680469665@infradead.org
-
Thomas Gleixner authored
Mitigating the Intel SKL RSB underflow issue in software requires to track the call depth. That is every CALL and every RET need to be intercepted and additional code injected. The existing retbleed mitigations already include means of redirecting RET to __x86_return_thunk; this can be re-purposed and RET can be redirected to another function doing RET accounting. CALL accounting will use the function padding introduced in prior patches. For each CALL instruction, the destination symbol's padding is rewritten to do the accounting and the CALL instruction is adjusted to call into the padding. This ensures only affected CPUs pay the overhead of this accounting. Unaffected CPUs will leave the padding unused and have their 'JMP __x86_return_thunk' replaced with an actual 'RET' instruction. Objtool has been modified to supply a .call_sites section that lists all the 'CALL' instructions. Additionally the paravirt instruction sites are iterated since they will have been patched from an indirect call to direct calls (or direct instructions in which case it'll be ignored). Module handling and the actual thunk code for SKL will be added in subsequent steps. Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111147.470877038@infradead.org
-
Thomas Gleixner authored
The upcoming call thunk patching must hold text_mutex and needs access to text_poke_copy(), which takes text_mutex. Provide a _locked postfixed variant to expose the inner workings. Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111147.159977224@infradead.org
-
- 27 Sep, 2022 1 commit
-
-
Nadav Amit authored
I encountered some occasional crashes of poke_int3_handler() when kprobes are set, while accessing desc->vec. The text poke mechanism claims to have an RCU-like behavior, but it does not appear that there is any quiescent state to ensure that nobody holds reference to desc. As a result, the following race appears to be possible, which can lead to memory corruption. CPU0 CPU1 ---- ---- text_poke_bp_batch() -> smp_store_release(&bp_desc, &desc) [ notice that desc is on the stack ] poke_int3_handler() [ int3 might be kprobe's so sync events are do not help ] -> try_get_desc(descp=&bp_desc) desc = __READ_ONCE(bp_desc) if (!desc) [false, success] WRITE_ONCE(bp_desc, NULL); atomic_dec_and_test(&desc.refs) [ success, desc space on the stack is being reused and might have non-zero value. ] arch_atomic_inc_not_zero(&desc->refs) [ might succeed since desc points to stack memory that was freed and might be reused. ] Fix this issue with small backportable patch. Instead of trying to make RCU-like behavior for bp_desc, just eliminate the unnecessary level of indirection of bp_desc, and hold the whole descriptor as a global. Anyhow, there is only a single descriptor at any given moment. Fixes: 1f676247 ("x86/alternatives: Implement a better poke_int3_handler() completion scheme") Signed-off-by:
Nadav Amit <namit@vmware.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@kernel.org Link: https://lkml.kernel.org/r/20220920224743.3089-1-namit@vmware.com
-
- 15 Sep, 2022 1 commit
-
-
Peter Zijlstra authored
Both AMD and Intel recommend using INT3 after an indirect JMP. Make sure to emit one when rewriting the retpoline JMP irrespective of compiler SLS options or even CONFIG_SLS. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Alexei Starovoitov <alexei.starovoitov@gmail.com> Link: https://lkml.kernel.org/r/Yxm+QkFPOhrVSH6q@hirez.programming.kicks-ass.net
-
- 20 Jul, 2022 1 commit
-
-
Kees Cook authored
Debugging missing return thunks is easier if we can see where they're happening. Suggested-by:
Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/lkml/Ys66hwtFcGbYmoiZ@hirez.programming.kicks-ass.net/
-
- 29 Jun, 2022 1 commit
-
-
Peter Zijlstra authored
Do fine-grained Kconfig for all the various retbleed parts. NOTE: if your compiler doesn't support return thunks this will silently 'upgrade' your mitigation to IBPB, you might not like this. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de>
-
- 27 Jun, 2022 2 commits
-
-
Peter Zijlstra authored
In addition to teaching static_call about the new way to spell 'RET', there is an added complication in that static_call() is allowed to rewrite text before it is known which particular spelling is required. In order to deal with this; have a static_call specific fixup in the apply_return() 'alternative' patching routine that will rewrite the static_call trampoline to match the definite sequence. This in turn creates the problem of uniquely identifying static call trampolines. Currently trampolines are 8 bytes, the first 5 being the jmp.d32/ret sequence and the final 3 a byte sequence that spells out 'SCT'. This sequence is used in __static_call_validate() to ensure it is patching a trampoline and not a random other jmp.d32. That is, false-positives shouldn't be plenty, but aren't a big concern. OTOH the new __static_call_fixup() must not have false-positives, and 'SCT' decodes to the somewhat weird but semi plausible sequence: push %rbx rex.XB push %r12 Additionally, there are SLS concerns with immediate jumps. Combined it seems like a good moment to change the signature to a single 3 byte trap instruction that is unique to this usage and will not ever get generated by accident. As such, change the signature to: '0x0f, 0xb9, 0xcc', which decodes to: ud1 %esp, %ecx Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de>
-
Peter Zijlstra authored
Introduce X86_FEATURE_RETHUNK for those afflicted with needing this. [ bp: Do only INT3 padding - simpler. ] Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de>
-
- 23 May, 2022 1 commit
-
-
Song Liu authored
Introduce a memset like API for text_poke. This will be used to fill the unused RX memory with illegal instructions. Suggested-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/bpf/20220520235758.1858153-3-song@kernel.org
-
- 22 Apr, 2022 1 commit
-
-
Josh Poimboeuf authored
Now that stack validation is an optional feature of objtool, add CONFIG_OBJTOOL and replace most usages of CONFIG_STACK_VALIDATION with it. CONFIG_STACK_VALIDATION can now be considered to be frame-pointer specific. CONFIG_UNWINDER_ORC is already inherently valid for live patching, so no need to "validate" it. Signed-off-by:
Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Miroslav Benes <mbenes@suse.cz> Link: https://lkml.kernel.org/r/939bf3d85604b2a126412bf11af6e3bd3b872bcb.1650300597.git.jpoimboe@redhat.com
-
- 15 Mar, 2022 3 commits
-
-
Peter Zijlstra authored
Objtool's --ibt option generates .ibt_endbr_seal which lists superfluous ENDBR instructions. That is those instructions for which the function is never indirectly called. Overwrite these ENDBR instructions with a NOP4 such that these function can never be indirect called, reducing the number of viable ENDBR targets in the kernel. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154319.822545231@infradead.org
-
Peter Zijlstra authored
Annotate away some of the generic code references. This is things where we take the address of a symbol for exception handling or return addresses (eg. context switch). Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154318.877758523@infradead.org
-
Peter Zijlstra authored
Similar to ibt_selftest_ip, apply the same pattern. Suggested-by:
Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154318.700456643@infradead.org
-
- 21 Feb, 2022 1 commit
-
-
Peter Zijlstra (Intel) authored
The RETPOLINE_AMD name is unfortunate since it isn't necessarily AMD only, in fact Hygon also uses it. Furthermore it will likely be sufficient for some Intel processors. Therefore rename the thing to RETPOLINE_LFENCE to better describe what it is. Add the spectre_v2=retpoline,lfence option as an alias to spectre_v2=retpoline,amd to preserve existing setups. However, the output of /sys/devices/system/cpu/vulnerabilities/spectre_v2 will be changed. [ bp: Fix typos, massage. ] Co-developed-by:
Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by:
Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Thomas Gleixner <tglx@linutronix.de>
-
- 08 Feb, 2022 1 commit
-
-
Song Liu authored
This will be used by BPF jit compiler to dump JITed binary to a RX huge page, and thus allow multiple BPF programs sharing the a huge (2MB) page. Signed-off-by:
Song Liu <song@kernel.org> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/bpf/20220204185742.271030-6-song@kernel.org
-
- 09 Dec, 2021 1 commit
-
-
Peter Zijlstra authored
Currently, text_poke_bp() is very strict to only allow patching a single instruction; however with straight-line-speculation it will be required to patch: ret; int3, which is two instructions. As such, relax the constraints a little to allow int3 padding for all instructions that do not imply the execution of the next instruction, ie: RET, JMP.d8 and JMP.d32. While there, rename the text_poke_loc::rel32 field to ::disp. Note: this fills up the text_poke_loc structure which is now a round 16 bytes big. [ bp: Put comments ontop instead of on the side. ] Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211204134908.082342723@infradead.org
-
- 08 Dec, 2021 1 commit
-
-
Peter Zijlstra authored
Replace all ret/retq instructions with ASM_RET in preparation of making it more than a single instruction. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211204134907.964635458@infradead.org
-
- 28 Oct, 2021 4 commits
-
-
Peter Zijlstra authored
Make sure we can see the text changes when booting with 'debug-alternative'. Example output: [ ] SMP alternatives: retpoline at: __traceiter_initcall_level+0x1f/0x30 (ffffffff8100066f) len: 5 to: __x86_indirect_thunk_rax+0x0/0x20 [ ] SMP alternatives: ffffffff82603e58: [2:5) optimized NOPs: ff d0 0f 1f 00 [ ] SMP alternatives: ffffffff8100066f: orig: e8 cc 30 00 01 [ ] SMP alternatives: ffffffff8100066f: repl: ff d0 0f 1f 00 Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Borislav Petkov <bp@suse.de> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Tested-by:
Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.422273830@infradead.org
-
Peter Zijlstra authored
Try and replace retpoline thunk calls with: LFENCE CALL *%\reg for spectre_v2=retpoline,amd. Specifically, the sequence above is 5 bytes for the low 8 registers, but 6 bytes for the high 8 registers. This means that unless the compilers prefix stuff the call with higher registers this replacement will fail. Luckily GCC strongly favours RAX for the indirect calls and most (95%+ for defconfig-x86_64) will be converted. OTOH clang strongly favours R11 and almost nothing gets converted. Note: it will also generate a correct replacement for the Jcc.d32 case, except unless the compilers start to prefix stuff that, it'll never fit. Specifically: Jncc.d8 1f LFENCE JMP *%\reg 1: is 7-8 bytes long, where the original instruction in unpadded form is only 6 bytes. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Borislav Petkov <bp@suse.de> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Tested-by:
Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.359986601@infradead.org
-
Peter Zijlstra authored
Handle the rare cases where the compiler (clang) does an indirect conditional tail-call using: Jcc __x86_indirect_thunk_\reg For the !RETPOLINE case this can be rewritten to fit the original (6 byte) instruction like: Jncc.d8 1f JMP *%\reg NOP 1: Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Borislav Petkov <bp@suse.de> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Tested-by:
Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.296470217@infradead.org
-
Peter Zijlstra authored
Rewrite retpoline thunk call sites to be indirect calls for spectre_v2=off. This ensures spectre_v2=off is as near to a RETPOLINE=n build as possible. This is the replacement for objtool writing alternative entries to ensure the same and achieves feature-parity with the previous approach. One noteworthy feature is that it relies on the thunks to be in machine order to compute the register index. Specifically, this does not yet address the Jcc __x86_indirect_thunk_* calls generated by clang, a future patch will add this. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Borislav Petkov <bp@suse.de> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Tested-by:
Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.232495794@infradead.org
-
- 03 Jun, 2021 2 commits
-
-
Borislav Petkov authored
Up until now the assumption was that an alternative patching site would have some instructions at the beginning and trailing single-byte NOPs (0x90) padding. Therefore, the patching machinery would go and optimize those single-byte NOPs into longer ones. However, this assumption is broken on 32-bit when code like hv_do_hypercall() in hyperv_init() would use the ratpoline speculation killer CALL_NOSPEC. The 32-bit version of that macro would align certain insns to 16 bytes, leading to the compiler issuing a one or more single-byte NOPs, depending on the holes it needs to fill for alignment. That would lead to the warning in optimize_nops() to fire: ------------[ cut here ]------------ Not a NOP at 0xc27fb598 WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:211 optimize_nops.isra.13 due to that function verifying whether all of the following bytes really are single-byte NOPs. Therefore, carve out the NOP padding into a separate function and call it for each NOP range beginning with a single-byte NOP. Fixes: 23c1ad53 ("x86/alternatives: Optimize optimize_nops()") Reported-by:
Richard Narron <richard@aaazen.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=213301 Link: https://lkml.kernel.org/r/20210601212125.17145-1-bp@alien8.de
-
Borislav Petkov authored
For easier inspection which bytes have changed. For example: feat: 7*32+12, old: (__x86_indirect_thunk_r10+0x0/0x20 (ffffffff81c02480) len: 17), repl: (ffffffff897813aa, len: 17) ffffffff81c02480: old_insn: 41 ff e2 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ffffffff897813aa: rpl_insn: e8 07 00 00 00 f3 90 0f ae e8 eb f9 4c 89 14 24 c3 ffffffff81c02480: final_insn: e8 07 00 00 00 f3 90 0f ae e8 eb f9 4c 89 14 24 c3 No functional changes. Signed-off-by:
Borislav Petkov <bp@suse.de> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210601193713.16190-1-bp@alien8.de
-
- 12 May, 2021 1 commit
-
-
Pavel Skripkin authored
Sparse says: arch/x86/kernel/alternative.c:78:21: warning: symbol 'x86nops' was not declared. Should it be static? Since x86nops[] is not used outside this file, Sparse is right and it can be made static. Signed-off-by:
Pavel Skripkin <paskripkin@gmail.com> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210506190726.15575-1-paskripkin@gmail.com
-
- 02 Apr, 2021 1 commit
-
-
Peter Zijlstra authored
Currently, optimize_nops() scans to see if the alternative starts with NOPs. However, the emit pattern is: 141: \oldinstr 142: .skip (len-(142b-141b)), 0x90 That is, when 'oldinstr' is short, the tail is padded with NOPs. This case never gets optimized. Rewrite optimize_nops() to replace any trailing string of NOPs inside the alternative to larger NOPs. Also run it irrespective of patching, replacing NOPs in both the original and replaced code. A direct consequence is that 'padlen' becomes superfluous, so remove it. [ bp: - Adjust commit message - remove a stale comment about needing to pad - add a comment in optimize_nops() - exit early if the NOP verif. loop catches a mismatch - function should not not add NOPs in that case - fix the "optimized NOPs" offsets output ] Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20210326151259.442992235@infradead.org
-
- 31 Mar, 2021 1 commit
-
-
Peter Zijlstra authored
Add a helper to decode kernel instructions; there's no point in endlessly repeating those last two arguments. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210326151259.379242587@infradead.org
-
- 15 Mar, 2021 2 commits
-
-
Peter Zijlstra authored
This ensures that a NOP is a NOP and not a random other instruction that is also a NOP. It allows simplification of dynamic code patching that wants to verify existing code before writing new instructions (ftrace, jump_label, static_call, etc..). Differentiating on NOPs is not a feature. This pessimises 32bit (DONTCARE) and 32bit on 64bit CPUs (CARELESS). 32bit is not a performance target. Everything x86_64 since AMD K10 (2007) and Intel IvyBridge (2012) is fine with using NOPL (as opposed to prefix NOP). And per FEATURE_NOPL being required for x86_64, all x86_64 CPUs can use NOPL. So stop caring about NOPs, simplify things and get on with life. [ The problem seems to be that some uarchs can only decode NOPL on a single front-end port while others have severe decode penalties for excessive prefixes. All modern uarchs can handle both, except Atom, which has prefix penalties. ] [ Also, much doubt you can actually measure any of this on normal workloads. ] After this, FEATURE_NOPL is unused except for required-features for x86_64. FEATURE_K8 is only used for PTI. [ bp: Kernel build measurements showed ~0.3s slowdown on Sandybridge which is hardly a slowdown. Get rid of X86_FEATURE_K7, while at it. ] Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Acked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> # bpf Acked-by:
Linus Torvalds <torvalds@linuxfoundation.org> Link: https://lkml.kernel.org/r/20210312115749.065275711@infradead.org
-
Borislav Petkov authored
No functional changes, just simplification. Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210304174237.31945-10-bp@alien8.de
-