x86,pm: Force out-of-line memcpy()
GCC fancies inlining memcpy(), and because it cannot prove the
destination is page-aligned (it is) it ends up generating atrocious
code like:

  19e: 48 8b 15 00 00 00 00  mov    0x0(%rip),%rdx   # 1a5 <relocate_restore_code+0x25>   1a1: R_X86_64_PC32 core_restore_code-0x4
  1a5: 48 8d 78 08           lea    0x8(%rax),%rdi
  1a9: 48 89 c1              mov    %rax,%rcx
  1ac: 48 c7 c6 00 00 00 00  mov    $0x0,%rsi   1af: R_X86_64_32S core_restore_code
  1b3: 48 83 e7 f8           and    $0xfffffffffffffff8,%rdi
  1b7: 48 29 f9              sub    %rdi,%rcx
  1ba: 48 89 10              mov    %rdx,(%rax)
  1bd: 48 8b 15 00 00 00 00  mov    0x0(%rip),%rdx   # 1c4 <relocate_restore_code+0x44>   1c0: R_X86_64_PC32 core_restore_code+0xff4
  1c4: 48 29 ce              sub    %rcx,%rsi
  1c7: 81 c1 00 10 00 00     add    $0x1000,%ecx
  1cd: 48 89 90 f8 0f 00 00  mov    %rdx,0xff8(%rax)
  1d4: c1 e9 03              shr    $0x3,%ecx
  1d7: f3 48 a5              rep movsq %ds:(%rsi),%es:(%rdi)

Notably, the alignment code generates a text reference to
core_restore_code+0xff8 (the R_X86_64_PC32 relocation at 1c0; its
addend of 0xff4 includes the usual -4 bias), to which objtool objects:

  vmlinux.o: warning: objtool: relocate_restore_code+0x3d: relocation to !ENDBR: next_arg+0x18

Applying some __assume_aligned(PAGE_SIZE) improves code-gen to:

  19e: 48 89 c7              mov    %rax,%rdi
  1a1: 48 c7 c6 00 00 00 00  mov    $0x0,%rsi   1a4: R_X86_64_32S core_restore_code
  1a8: b9 00 02 00 00        mov    $0x200,%ecx
  1ad: f3 48 a5              rep movsq %ds:(%rsi),%es:(%rdi)

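As a rough userspace illustration of the alignment hint (not the kernel
change itself), the sketch below uses GCC's __builtin_assume_aligned().
The helper name copy_page_aligned() and the SKETCH_PAGE_SIZE constant are
hypothetical and exist only for this example:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size, for illustration only */

/*
 * Hypothetical helper: copy one page while promising the compiler that
 * dst is page-aligned, so an inlined copy needs no alignment fix-up.
 */
static void copy_page_aligned(void *dst, const void *src)
{
	/* The promise must really hold; otherwise behaviour is undefined. */
	assert(((uintptr_t)dst & (SKETCH_PAGE_SIZE - 1)) == 0);

	void *d = __builtin_assume_aligned(dst, SKETCH_PAGE_SIZE);

	memcpy(d, src, SKETCH_PAGE_SIZE);
}
```

Whether the compiler then emits the short rep movsq sequence depends on
the compiler version and optimisation flags; the hint only removes the
need for the head/tail fix-up.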
The alignment annotation resolves the problem; however, none of this is
important code, and a much simpler solution still is to force a memcpy()
call:

  1a1: ba 00 10 00 00        mov    $0x1000,%edx
  1a6: 48 c7 c6 00 00 00 00  mov    $0x0,%rsi   1a9: R_X86_64_32S core_restore_code
  1ad: e8 00 00 00 00        call   1b2 <relocate_restore_code+0x32>   1ae: R_X86_64_PLT32 __memcpy-0x4

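The listing above shows the resulting code calling __memcpy. As a
userspace analogue only, the sketch below gets a similar effect by
calling memcpy() through a volatile function pointer, which the compiler
cannot expand as a builtin. The names memcpy_out_of_line,
copy_page_call() and SKETCH_PAGE_SIZE are hypothetical, and this
produces an indirect call rather than the direct call shown above:

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size, for illustration only */

/*
 * The pointer is volatile, so the compiler has to load it and emit a
 * real call; it cannot replace the copy with inline string instructions.
 */
static void *(*volatile memcpy_out_of_line)(void *, const void *, size_t) = memcpy;

/* Hypothetical helper: copy one page via a real function call. */
static void copy_page_call(void *dst, const void *src)
{
	memcpy_out_of_line(dst, src, SKETCH_PAGE_SIZE);
}
```

With an out-of-line copy the caller no longer needs any alignment
knowledge, which is why no reference to the end of the source page is
generated.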
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>