1. 19 May, 2015 40 commits
    • Ingo Molnar's avatar
      x86/fpu: Rename save_user_xstate() to copy_fpregs_to_sigframe() · 2a52af8b
      Ingo Molnar authored
      Move the naming in line with existing names, so that we now have:
      
        copy_fpregs_to_fpstate()
        copy_fpstate_to_sigframe()
        copy_fpregs_to_sigframe()
      
      ... where each function does what its name suggests.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      2a52af8b
    • Ingo Molnar's avatar
      x86/fpu: Rename save_xstate_sig() to copy_fpstate_to_sigframe() · c8e14041
      Ingo Molnar authored
      Standardize the naming of save_xstate_sig() by renaming it to
      copy_fpstate_to_sigframe(): this tells us at a glance that
      the function copies an FPU fpstate to a signal frame.
      
      This naming also follows the naming of copy_fpregs_to_fpstate().
      
      Don't put 'xstate' into the name: since this is a generic name,
      it's expected that the function is able to handle xstate frames
      as well, beyond legacy frames.
      
      xstate used to be the odd case in the x86 FPU code - now it's the
      common case.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c8e14041
    • Ingo Molnar's avatar
      x86/fpu: Pass 'struct fpu' to fpstate_sanitize_xstate() · 36e49e7f
      Ingo Molnar authored
      Currently fpstate_sanitize_xstate() has a task_struct input parameter,
      but it only uses the fpu structure from it - so pass in a 'struct fpu'
      pointer only and update all call sites.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      36e49e7f
    • Ingo Molnar's avatar
      x86/fpu: Simplify fpstate_sanitize_xstate() calls · 1ac91a76
      Ingo Molnar authored
      Remove the extra layer of __fpstate_sanitize_xstate():
      
      	if (!use_xsaveopt())
      		return;
      	__fpstate_sanitize_xstate(tsk);
      
      and move the check for use_xsaveopt() into fpstate_sanitize_xstate().
      
      In general we optimize for the presence of CPU features, not for
      the absence of them. Furthermore there's little point in this inlining,
      as the call sites are not super hot code paths.
      
      Doing this uninlining shrinks the code a bit:
      
         text    data     bss     dec     hex filename
         14108751        2573624 1634304 18316679        1177d87 vmlinux.before
         14108627        2573624 1634304 18316555        1177d0b vmlinux.after
      
      Also remove a pointless '!fx' check from fpstate_sanitize_xstate().
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1ac91a76
    • Ingo Molnar's avatar
      x86/fpu: Rename sanitize_i387_state() to fpstate_sanitize_xstate() · d0903193
      Ingo Molnar authored
      So the sanitize_i387_state() function has the following purpose:
      on CPUs that support optimized xstate saving instructions, an
      FPU fpstate might end up having partially uninitialized data.
      
      This function initializes that data.
      
      Note that the function name is a misnomer and confusing on two levels,
      not only is it not i387 specific at all, but it is the exact opposite:
      it only matters on xstate CPUs.
      
      So rename sanitize_i387_state() and __sanitize_i387_state() to
      fpstate_sanitize_xstate() and __fpstate_sanitize_xstate(),
      to clearly express the purpose and usage of the function.
      
      We'll further clean up this function in the next patch.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d0903193
    • Ingo Molnar's avatar
      x86/fpu: Move asm/xcr.h to asm/fpu/internal.h · befc61ad
      Ingo Molnar authored
      Now that all FPU internals using drivers are converted to public APIs,
      move xcr.h's definitions into fpu/internal.h and remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      befc61ad
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/sha1_mb: Remove FPU internal headers from sha1_mb.c · 57dd083e
      Ingo Molnar authored
      This file only uses the public FPU APIs, so remove the xcr.h, fpu/xstate.h
      and fpu/internal.h headers and add the fpu/api.h include.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      57dd083e
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/serpent_avx2: Simplify the init() xfeature checks · 534ff06e
      Ingo Molnar authored
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      534ff06e
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/sha1_ssse3: Simplify the sha1_ssse3_mod_init() xfeature checks · d1e50966
      Ingo Molnar authored
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d1e50966
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/cast6_avx: Simplify the cast6_init() xfeature checks · 1debf7db
      Ingo Molnar authored
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1debf7db
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/sha512_ssse3: Simplify the sha512_ssse3_mod_init() xfeature checks · c93b8a39
      Ingo Molnar authored
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c93b8a39
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/cast5_avx: Simplify the cast5_init() xfeature checks · d5d34d98
      Ingo Molnar authored
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d5d34d98
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/serpent_avx: Simplify the serpent_init() xfeature checks · c1c23f7e
      Ingo Molnar authored
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c1c23f7e
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/twofish_avx: Simplify the twofish_init() xfeature checks · 4eecd261
      Ingo Molnar authored
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      4eecd261
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/camellia_aesni_avx2: Simplify the camellia_aesni_init() xfeature checks · 7bc371fa
      Ingo Molnar authored
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7bc371fa
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/sha256_ssse3: Simplify the sha256_ssse3_mod_init() xfeature checks · 70d51eb6
      Ingo Molnar authored
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit.
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      70d51eb6
    • Ingo Molnar's avatar
      x86/fpu, crypto x86/camellia_aesni_avx: Simplify the camellia_aesni_init() xfeature checks · ce4f5f9b
      Ingo Molnar authored
      Use the new 'cpu_has_xfeatures()' function to query AVX CPU support.
      
      This has the following advantages to the driver:
      
       - Decouples the driver from FPU internals: it's now only using <asm/fpu/api.h>.
      
       - Removes detection complexity from the driver, no more raw XGETBV instruction
      
       - Shrinks the code a bit:
      
           text    data     bss     dec     hex filename
           2128    2896       0    5024    13a0 camellia_aesni_avx_glue.o.before
           2067    2896       0    4963    1363 camellia_aesni_avx_glue.o.after
      
       - Standardizes feature name error message printouts across drivers
      
      There are also advantages to the x86 FPU code: once all drivers
      are decoupled from internals we can move them out of common
      headers and we'll also be able to remove xcr.h.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ce4f5f9b
    • Ingo Molnar's avatar
      x86/fpu: Move xfeature type enumeration to fpu/types.h · 91969d69
      Ingo Molnar authored
      So xsave.h is an internal header that FPU using drivers commonly include,
      to get access to the xstate feature names, amongst other things.
      
      Move these type definitions to fpu/fpu.h to allow simplification
      of FPU using driver code.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      91969d69
    • Ingo Molnar's avatar
      x86/fpu: Enumerate xfeature bits · 677b98bd
      Ingo Molnar authored
      Transform the xstate masks into an enumerated list of xfeature bits.
      
      This removes the hard coding of XFEATURES_NR_MAX.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      677b98bd
    • Ingo Molnar's avatar
      x86/fpu: Simplify print_xstate_features() · 33588b52
      Ingo Molnar authored
      We do a boot time printout of xfeatures in print_xstate_features(),
      simplify this code to make use of the recently introduced cpu_has_xfeature()
      method.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      33588b52
    • Ingo Molnar's avatar
      x86/fpu: Introduce cpu_has_xfeatures(xfeatures_mask, feature_name) · 5b073430
      Ingo Molnar authored
      A lot of FPU using driver code is querying complex CPU features to be
      able to figure out whether a given set of xstate features is supported
      by the CPU or not.
      
      Introduce a simplified API function that can be used on any CPU type
      to get this information. Also add an error string return pointer,
      so that the driver can print a meaningful error message with a
      standardized feature name.
      
      Also mark xfeatures_mask as __read_only.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      5b073430
    • Ingo Molnar's avatar
      x86/fpu: Rename fpu/xsave.c to fpu/xstate.c · 62784854
      Ingo Molnar authored
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      62784854
    • Ingo Molnar's avatar
      x86/fpu: Rename fpu/xsave.h to fpu/xstate.h · 669ebabb
      Ingo Molnar authored
      'xsave' is an x86 instruction name to most people - but xsave.h is
      about a lot more than just the XSAVE instruction: it includes
      definitions and support, both internal and external, related to
      xstate and xfeatures support.
      
      As a first step in cleaning up the various xstate uses rename this
      header to 'fpu/xstate.h' to better reflect what this header file
      is about.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      669ebabb
    • Ingo Molnar's avatar
      x86/fpu: Optimize fpu_copy() some more on lazy switching systems · b1652900
      Ingo Molnar authored
      The current fpu_copy() code on lazy switching CPUs always saves
      into the current fpstate and then copies it over into the child
      context:
      
      		preempt_disable();
      		if (!copy_fpregs_to_fpstate(src_fpu))
      			fpregs_deactivate(src_fpu);
      		preempt_enable();
      		memcpy(&dst_fpu->state, &src_fpu->state, xstate_size);
      
      That memcpy() can be avoided on all lazy switching setups except
      really old FNSAVE-only systems: change fpu_copy() to directly save
      into the child context, for both the lazy and the eager context
      switching case.
      
      Note that we still have to do a memcpy() back into the parent
      context in the FNSAVE case, but this won't be executed on the
      majority of x86 systems that got built in the last 10 years or so.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b1652900
    • Ingo Molnar's avatar
      x86/fpu: Optimize fpu_copy() · 68271c6a
      Ingo Molnar authored
      Optimize fpu_copy() a bit by expanding the ->fpstate_active == 1
      portion of fpu__save() into it.
      
      ( The main purpose of this change is to enable another, larger
        optimization that will be done in the next patch. )
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      68271c6a
    • Ingo Molnar's avatar
      x86/fpu: Optimize fpu__save() · 48c4717f
      Ingo Molnar authored
      So fpu__save() does this currently:
      
      		copy_fpregs_to_fpstate(fpu);
      		if (!use_eager_fpu())
      			fpregs_deactivate(fpu);
      
      ... which deactivates the FPU on lazy switching systems unconditionally.
      
      Both usecases of fpu__save() use this function to save the
      FPU state into a fpstate: fork()/clone() and math error signal handling.
      
      The unconditional disabling of FPU registers in the lazy switching
      case is probably a mistaken conversion of old FNSAVE code (that had
      to disable FPU registers).
      
      So speed up this code by only disabling FPU registers when absolutely
      necessary: when indicated by the copy_fpregs_to_fpstate() return
      code:
      
      		if (!copy_fpregs_to_fpstate(fpu))
      			fpregs_deactivate(fpu);
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      48c4717f
    • Ingo Molnar's avatar
      x86/fpu: Simplify fpu__save() · fea435a2
      Ingo Molnar authored
      Factor out a common call.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      fea435a2
    • Ingo Molnar's avatar
      x86/fpu: Eliminate __save_fpu() · 9f876d67
      Ingo Molnar authored
      The current implementation of __save_fpu():
      
      	if (use_xsave()) {
      		xsave_state(&fpu->state.xsave);
      	} else {
      		fpu_fxsave(fpu);
      	}
      
      Is actually a simplified version of copy_fpregs_to_fpstate(),
      if use_eager_fpu() is true.
      
      But all call sites of __save_fpu() call it only it when use_eager_fpu()
      is true.
      
      So we can eliminate __save_fpu() altogether and use the standard
      copy_fpregs_to_fpstate() function. This cleans up the code
      by making it use fewer variants of FPU register saving.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9f876d67
    • Ingo Molnar's avatar
      x86/fpu: Simplify __save_fpu() · 72ee6f87
      Ingo Molnar authored
      __save_fpu() has this pattern:
      
      		if (unlikely(system_state == SYSTEM_BOOTING))
      			xsave_state_booting(&fpu->state.xsave);
      		else
      			xsave_state(&fpu->state.xsave);
      
      ... but it does not actually get called during system bootup.
      
      So remove the complication and always call xsave_state().
      
      To make sure this assumption is correct, add a WARN_ONCE()
      debug check to xsave_state().
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      72ee6f87
    • Ingo Molnar's avatar
      x86/fpu: Factor out FPU hw activation/deactivation · 32b49b3c
      Ingo Molnar authored
      We have repeat patterns of:
      
      	if (!use_eager_fpu())
      		clts();
      
      ... to activate FPU registers, and:
      
      	if (!use_eager_fpu())
      		stts();
      
      ... to deactivate them.
      
      Encapsulate these in:
      
      	__fpregs_activate_hw();
      	__fpregs_activate_hw();
      
      and use them accordingly.
      
      Doing this synchronizes the idiom with the fpu->fpregs_active
      software-flag's handling functions, creating clear patterns of:
      
      	__fpregs_activate_hw();
      	__fpregs_activate(fpu);
      
      etc., which improves readability.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      32b49b3c
    • Ingo Molnar's avatar
      x86/fpu: Rename fpu__unlazy_stopped() to fpu__activate_stopped() · 67ee658e
      Ingo Molnar authored
      In line with the fpstate_activate() change, name
      fpu__unlazy_stopped() in a similar fashion as well: its purpose
      is to make the fpstate of a stopped task the current and active FPU
      context, which may require unlazying and initialization.
      
      The unlazying is just part of the job, the main concept is to make
      the fpstate active.
      
      Also clarify the function's description to clarify its exact
      usage and the background behind it all.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      67ee658e
    • Ingo Molnar's avatar
      x86/fpu: Simplify fpstate_init_curr() usage · c4d72e2d
      Ingo Molnar authored
      Now that fpstate_init_curr() is not doing implicit allocations
      anymore, almost all uses of it involve a very simple pattern:
      
      	if (!fpu->fpstate_active)
      		fpstate_init_curr(fpu);
      
      which is basically activating the FPU fpstate if it was not active
      before.
      
      So propagate the check into the function itself, and rename the
      function according to its new purpose:
      
      	fpu__activate_curr(fpu);
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c4d72e2d
    • Ingo Molnar's avatar
      x86/fpu, kvm: Simplify fx_init() · 0ee6a517
      Ingo Molnar authored
      Now that fpstate_init() cannot fail the error return of fx_init()
      has lost its purpose. Eliminate the error return and propagate this
      change to all callers.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      0ee6a517
    • Ingo Molnar's avatar
      x86/fpu: Simplify fpu__unlazy_stopped() error handling · 2fb29fc7
      Ingo Molnar authored
      Now that FPU contexts are always allocated, fpu__unlazy_stopped()
      cannot fail. Remove its error return and propagate the changes to
      the callers.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      2fb29fc7
    • Ingo Molnar's avatar
      x86/fpu: Rename fpstate_alloc_init() to fpstate_init_curr() · e62bb3d8
      Ingo Molnar authored
      Now that there are no FPU context allocations, rename fpstate_alloc_init()
      to fpstate_init_curr(), to signal that it initializes the fpstate and
      marks it active, for the current task.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e62bb3d8
    • Ingo Molnar's avatar
      x86/fpu: Remove failure return from fpstate_alloc_init() · 91d93d0e
      Ingo Molnar authored
      Remove the failure code and propagate this down to callers.
      
      Note that this function still has an 'init' aspect, which must be
      called.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      91d93d0e
    • Ingo Molnar's avatar
      x86/fpu: Remove failure paths from fpstate-alloc low level functions · c4d6ee6e
      Ingo Molnar authored
      Now that we always allocate the FPU context as part of task_struct there's
      no need for separate allocations - remove them and their primary failure
      handling code.
      
      ( Note that there's still secondary error codes that have become superfluous,
        those will be removed in separate patches. )
      
      Move the somewhat misplaced setup_xstate_comp() call to the core.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c4d6ee6e
    • Ingo Molnar's avatar
      x86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again) · 7366ed77
      Ingo Molnar authored
      So 6 years ago we made the FPU fpstate dynamically allocated:
      
        aa283f49 ("x86, fpu: lazy allocation of FPU area - v5")
        61c4628b ("x86, fpu: split FPU state from task struct - v5")
      
      In hindsight this was a mistake:
      
         - it complicated context allocation failure handling, such as:
      
      		/* kthread execs. TODO: cleanup this horror. */
      		if (WARN_ON(fpstate_alloc_init(fpu)))
      			force_sig(SIGKILL, tsk);
      
         - it caused us to enable irqs in fpu__restore():
      
                      local_irq_enable();
                      /*
                       * does a slab alloc which can sleep
                       */
                      if (fpstate_alloc_init(fpu)) {
                              /*
                               * ran out of memory!
                               */
                              do_group_exit(SIGKILL);
                              return;
                      }
                      local_irq_disable();
      
         - it (slightly) slowed down task creation/destruction by adding
           slab allocation/free pattens.
      
         - it made access to context contents (slightly) slower by adding
           one more pointer dereference.
      
      The motivation for the dynamic allocation was two-fold:
      
         - reduce memory consumption by non-FPU tasks
      
         - allocate and handle only the necessary amount of context for
           various XSAVE processors that have varying hardware frame
           sizes.
      
      These days, with glibc using SSE memcpy by default and GCC optimizing
      for SSE/AVX by default, the scope of FPU using apps on an x86 system is
      much larger than it was 6 years ago.
      
      For example on a freshly installed Fedora 21 desktop system, with a
      recent kernel, all non-kthread tasks have used the FPU shortly after
      bootup.
      
      Also, even modern embedded x86 CPUs try to support the latest vector
      instruction set - so they'll too often use the larger xstate frame
      sizes.
      
      So remove the dynamic allocation complication by embedding the FPU
      fpstate in task_struct again. This should make the FPU a lot more
      accessible to all sorts of atomic contexts.
      
      We could still optimize for the xstate frame size in the future,
      by moving the state structure to the last element of task_struct,
      and allocating only a part of that.
      
      This change is kept minimal by still keeping the ctx_alloc()/free()
      routines (that now do nothing substantial) - we'll remove them in
      the following patches.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7366ed77
    • Ingo Molnar's avatar
      x86/fpu: Optimize copy_fpregs_to_fpstate() by removing the FNCLEX... · 1bc6b056
      Ingo Molnar authored
      x86/fpu: Optimize copy_fpregs_to_fpstate() by removing the FNCLEX synchronization with FP exceptions
      
      So we have the following ancient code in copy_fpregs_to_fpstate():
      
      	if (unlikely(fpu->state->fxsave.swd & X87_FSW_ES)) {
      		asm volatile("fnclex");
      		goto drop_fpregs;
      	}
      
      which clears pending FPU exceptions and then drops registers, which
      causes the next FP instruction of the saved context to re-load the
      saved FPU state, with all pending exceptions marked properly, and
      will re-start the exception handling mechanism in the hardware.
      
      Since FPU exceptions are always issued on instruction boundaries,
      in particular on the next FP instruction following the exception
      generating instruction, there's no fear of getting an FP exception
      asynchronously.
      
      They were truly asynchronous back in the IRQ13 days, when the FPU was
      a weird and expensive co-processor that did its own processing, and we
      had to synchronize with them, but that code is not working anymore:
      we don't have IRQ13 mapped in the IDT anymore.
      
      With the introduction of optimized XSAVE support there's a new
      complication: if the xstate features bit indicates that a particular
      state component is unused (in 'init state'), then the hardware does
      not guarantee that the XSAVE (et al) instruction keeps the underlying
      FPU state image in memory valid and current. In practice this means
      that the hardware won't write it, and the exceptions flag in the
      state might be an older version, with it still being set. This
      meant that we had to check the xfeatures flag as well, adding
      another memory load and branch to a critical hot path of the scheduler.
      
      So optimize all this by removing both the old quirk and the new check,
      and straight-line optimizing the most common cases with likely()
      hints. Quite a bit of code gets removed this way:
      
        arch/x86/kernel/process_64.o:
      
          text    data     bss     dec     filename
          5484       8       0    5492     process_64.o.before
          5416       8       0    5424     process_64.o.after
      
      Now there's also a chance that some weird behavior or erratum was
      masked by our IRQ13 handling quirk (or that I misunderstood the
      nature of the quirk), and that this change triggers some badness.
      
      There's no real good way to protect against that possibility other
      than keeping this change well isolated, well commented and well
      bisectable. If you bisect a weird (or not so weird) breakage to
      this commit then please let us know!
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1bc6b056
    • Ingo Molnar's avatar
      x86/fpu: Rename fpu_save_init() to copy_fpregs_to_fpstate() · 4f836347
      Ingo Molnar authored
      So fpu_save_init() is a historic name that got its name when the only
      way the FPU state was FNSAVE, which cleared (well, destroyed) the FPU
      state after saving it.
      
      Nowadays the name is misleading, because ever since the introduction of
      FXSAVE (and more modern FPU saving instructions) the 'we need to reload
      the FPU state' part is only true if there's a pending FPU exception [*],
      which is almost never the case.
      
      So rename it to copy_fpregs_to_fpstate() to make it clear what's
      happening. Also add a few comments about why we cannot keep registers
      in certain cases.
      
      Also clean up the control flow a bit, to make it more apparent when
      we are dropping/keeping FP registers, and to optimize the common
      case (of keeping fpregs) some more.
      
      [*] Probably not true anymore, modern instructions always leave the FPU
          state intact, even if exceptions are pending: because pending FP
          exceptions are posted on the next FP instruction, not asynchronously.
      
          They were truly asynchronous back in the IRQ13 case, and we had to
          synchronize with them, but that code is not working anymore: we don't
          have IRQ13 mapped in the IDT anymore.
      
          But a cleanup patch is obviously not the place to change subtle behavior.
      Reviewed-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      4f836347