1. 20 Sep, 2016 8 commits
  2. 06 Sep, 2016 3 commits
  3. 01 Sep, 2016 1 commit
  4. 29 Aug, 2016 13 commits
    • s390/crypto: simplify CPACF encryption / decryption functions · 7bac4f5b
      Martin Schwidefsky authored
      The double while loops of the CTR mode encryption / decryption functions
      are overly complex for little gain. Simplify the functions to a single
      while loop at the cost of an additional memcpy of a few bytes for every
      4K page worth of data.
      Adapt the other crypto functions to make them all look alike.
      Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      7bac4f5b
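The single-loop structure described in the message can be sketched in plain C. This is an illustrative sketch only: toy_kmctr() is a stand-in for the CPACF KMCTR operation (a trivial XOR "cipher" so the example stays self-contained), not the kernel's implementation.

```c
#include <stddef.h>
#include <string.h>

#define BLK 16	/* AES block size */

/* Stand-in for CPACF KMCTR: "encrypt" each counter block by XORing it
 * with the key, XOR that onto the data, then bump the counter. */
static void toy_kmctr(unsigned char *out, const unsigned char *in,
		      unsigned char *ctr, const unsigned char *key,
		      size_t nblocks)
{
	for (size_t b = 0; b < nblocks; b++) {
		for (int i = 0; i < BLK; i++)
			out[b * BLK + i] = in[b * BLK + i] ^ ctr[i] ^ key[i];
		/* increment the big-endian counter */
		for (int i = BLK - 1; i >= 0 && ++ctr[i] == 0; i--)
			;
	}
}

/* Single while loop: full blocks go straight through; a trailing
 * partial block takes a short memcpy detour through a stack buffer. */
static void ctr_crypt(unsigned char *out, const unsigned char *in, size_t n,
		      unsigned char *ctr, const unsigned char *key)
{
	unsigned char buf[BLK];

	while (n >= BLK) {
		size_t len = (n / BLK) * BLK;

		toy_kmctr(out, in, ctr, key, len / BLK);
		out += len;
		in += len;
		n -= len;
	}
	if (n) {			/* trailing partial block */
		memcpy(buf, in, n);
		memset(buf + n, 0, BLK - n);
		toy_kmctr(buf, buf, ctr, key, 1);
		memcpy(out, buf, n);
	}
}
```

The memcpy detour costs copying a few bytes per trailing partial block, which is the "additional memcpy" the message refers to.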
    • s390/crypto: cpacf function detection · 69c0e360
      Martin Schwidefsky authored
      The CPACF code makes some assumptions about the availability of hardware
      support. E.g. if the machine supports KM(AES-256) without chaining it is
      assumed that KMC(AES-256) with chaining is available as well. For the
      existing CPUs this is true but the architecturally correct way is to
      check each CPACF function on its own. This is what the query function
      of each instruction is all about.
      Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      69c0e360
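The per-function check works on the 128-bit status word returned by a CPACF query instruction, in which bit n (counting from the leftmost bit) is set iff function code n is installed. A minimal sketch of the bit test; the mask type and helper name mirror, but are not copied from, the kernel's cpacf.h:

```c
/* 128-bit status word as returned by a CPACF query function. */
typedef struct {
	unsigned char bytes[16];
} cpacf_mask_t;

/* Test whether function code 'func' is installed.  The top bit (0x80)
 * is the decryption modifier and is not part of the function code. */
static int cpacf_test_func(const cpacf_mask_t *mask, unsigned int func)
{
	unsigned int fc = func & 0x7f;	/* strip the modifier bit */

	return (mask->bytes[fc >> 3] & (0x80 >> (fc & 7))) != 0;
}
```

With this, each algorithm registration can depend on exactly the function it needs instead of inferring one function's availability from another's.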
    • s390/crypto: simplify init / exit functions · d863d594
      Martin Schwidefsky authored
      The aes and the des module register multiple crypto algorithms
      dependent on the availability of specific CPACF instructions.
      To simplify the deregistration with crypto_unregister_alg add
      an array with pointers to the successfully registered algorithms
      and use it for the error handling in the init function and in
      the module exit function.
      Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      d863d594
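The registration pattern described above can be sketched generically; struct alg and the helper names here are illustrative, not the kernel crypto API:

```c
#include <stddef.h>

/* Minimal stand-in for a crypto algorithm descriptor. */
struct alg {
	const char *name;
	int registered;
};

#define MAX_ALGS 8

/* Pointers to successfully registered algorithms, in order. */
static struct alg *active_algs[MAX_ALGS];
static int nr_active;

static int register_alg(struct alg *a)
{
	if (nr_active >= MAX_ALGS)
		return -1;
	a->registered = 1;
	active_algs[nr_active++] = a;
	return 0;
}

/* Shared by the init error path and the module exit function:
 * unregister whatever made it into the array, in reverse order. */
static void unregister_all(void)
{
	while (nr_active > 0)
		active_algs[--nr_active]->registered = 0;
}
```

On an init failure the code simply calls unregister_all() and returns the error; module exit calls the same function, so there is one deregistration path instead of several.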
    • s390/crypto: simplify return code handling · 0177db01
      Martin Schwidefsky authored
      The CPACF instructions can complete with three different condition codes:
      CC=0 for successful completion, CC=1 if the protected key verification
      failed, and CC=3 for partial completion.
      
      The inline functions will restart the CPACF instruction for partial
      completion, this removes the CC=3 case. The CC=1 case is only relevant
      for the protected key functions of the KM, KMC, KMAC and KMCTR
      instructions. As the protected key functions are not used by the
      current code, there is no need for any kind of return code handling.
      Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      0177db01
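The CC=3 restart behaviour amounts to a simple retry loop. In this sketch cpacf_op() is a hypothetical stand-in that replays a canned sequence of condition codes instead of issuing a real instruction:

```c
/* Canned condition codes: two partial completions, then success. */
static int cc_sequence[] = { 3, 3, 0 };
static int cc_index;

static int cpacf_op(void)
{
	return cc_sequence[cc_index++];
}

/* The inline wrappers restart the operation while it reports partial
 * completion (CC=3), so callers never see that condition code.  What
 * remains is CC=0, or CC=1 for a failed protected key verification -
 * and since the protected key functions are unused, no caller needs
 * to check the return value at all. */
static int cpacf_run(void)
{
	int cc;

	do {
		cc = cpacf_op();
	} while (cc == 3);	/* restart on partial completion */
	return cc;
}
```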
    • s390/crypto: cleanup cpacf function codes · edc63a37
      Martin Schwidefsky authored
      Use a separate define for the decryption modifier bit instead of
      duplicating the function codes for encryption / decryption.
      In addition use an unsigned type for the function code.
      Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      edc63a37
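A sketch of the define-based scheme: CPACF_DECRYPT matches the 0x80 modifier bit, and 0x12-0x14 are the architected KM-AES function codes; the km_fc() helper itself is illustrative:

```c
/* One modifier bit distinguishes decryption from encryption, so only
 * the encryption function codes need defining.  Function codes are
 * unsigned. */
#define CPACF_DECRYPT		0x80U
#define CPACF_KM_AES_128	0x12U
#define CPACF_KM_AES_192	0x13U
#define CPACF_KM_AES_256	0x14U

static unsigned int km_fc(unsigned int func, int decrypt)
{
	return decrypt ? (func | CPACF_DECRYPT) : func;
}
```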
    • RAID/s390: add SIMD implementation for raid6 gen/xor · 474fd6e8
      Martin Schwidefsky authored
      Using vector registers is slightly faster:
      
      raid6: vx128x8  gen() 19705 MB/s
      raid6: vx128x8  xor() 11886 MB/s
      raid6: using algorithm vx128x8 gen() 19705 MB/s
      raid6: .... xor() 11886 MB/s, rmw enabled
      
      vs the software algorithms:
      
      raid6: int64x1  gen()  3018 MB/s
      raid6: int64x1  xor()  1429 MB/s
      raid6: int64x2  gen()  4661 MB/s
      raid6: int64x2  xor()  3143 MB/s
      raid6: int64x4  gen()  5392 MB/s
      raid6: int64x4  xor()  3509 MB/s
      raid6: int64x8  gen()  4441 MB/s
      raid6: int64x8  xor()  3207 MB/s
      raid6: using algorithm int64x4 gen() 5392 MB/s
      raid6: .... xor() 3509 MB/s, rmw enabled
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      474fd6e8
    • s390/nmi: improve revalidation of fpu / vector registers · 8f149ea6
      Martin Schwidefsky authored
      The machine check handler will do one of two things if the floating-point
      control, a floating point register or a vector register cannot be
      revalidated:
      1) if the PSW indicates user mode the process is terminated
      2) if the PSW indicates kernel mode the system is stopped
      
      Unconditionally stopping the system for 2) is incorrect.
      
      There are three possible outcomes if the floating-point control, a
      floating point register or a vector register cannot be revalidated:
      1) The kernel is inside a kernel_fpu_begin/kernel_fpu_end block and
         needs the register. The system is stopped.
      2) No active kernel_fpu_begin/kernel_fpu_end block and the CIF_FPU bit
         is not set. The user space process needs the register and is killed.
      3) No active kernel_fpu_begin/kernel_fpu_end block and the CIF_FPU bit
         is set. Neither the kernel nor the user space process needs the
         lost register. Just revalidate it and continue.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      8f149ea6
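The three outcomes reduce to a small decision function. This is a sketch of the logic only; the flag names mirror the description above, not the handler's actual code:

```c
enum action { STOP_SYSTEM, KILL_PROCESS, REVALIDATE };

static enum action fpu_reval_action(int in_kernel_fpu_block, int cif_fpu_set)
{
	if (in_kernel_fpu_block)
		return STOP_SYSTEM;	/* the kernel needs the lost register */
	if (!cif_fpu_set)
		return KILL_PROCESS;	/* user space needs the register */
	return REVALIDATE;		/* nobody needs it, just revalidate */
}
```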
    • s390/fpu: improve kernel_fpu_[begin|end] · 7f79695c
      Martin Schwidefsky authored
      In case of nested use of the FPU or vector registers in the kernel
      the current code uses the mask of the FPU/vector registers of the
      previous contexts to decide which registers to save and restore.
      E.g. if the previous context used KERNEL_VXR_V0V7 and the next
      context wants to use KERNEL_VXR_V24V31 the first 8 vector registers
      are stored to the FPU state structure. But this is not necessary
      as the next context does not use these registers.
      
      Rework the FPU/vector register save and restore code. The new code
      does a few things differently:
      1) A lowcore field is used instead of a per-cpu variable.
      2) The kernel_fpu_end function now has two parameters just like
         kernel_fpu_begin. The register flags are required by both
         functions to save / restore the minimal register set.
      3) The inline functions kernel_fpu_begin/kernel_fpu_end now do the
         update of the register masks. If the user space FPU registers
         have already been stored neither save_fpu_regs nor the
         __kernel_fpu_begin/__kernel_fpu_end functions have to be called
         for the first context. In this case kernel_fpu_begin adds 7
         instructions and kernel_fpu_end adds 4 instructions.
      4) The inline assemblies in __kernel_fpu_begin / __kernel_fpu_end
         to save / restore the vector registers are simplified a bit.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      7f79695c
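The key observation in the example above (KERNEL_VXR_V0V7 followed by KERNEL_VXR_V24V31 needs no save) is that only the intersection of the previous context's mask and the new context's mask must be saved. A sketch with illustrative mask values; the kernel's actual KERNEL_* constants may differ:

```c
/* Illustrative register-group masks, one bit per group of 8 vector
 * registers. */
#define KERNEL_VXR_V0V7		0x01U
#define KERNEL_VXR_V8V15	0x02U
#define KERNEL_VXR_V16V23	0x04U
#define KERNEL_VXR_V24V31	0x08U

/* Registers to save on a nested kernel_fpu_begin: only those the
 * previous context used AND the new context will clobber. */
static unsigned int regs_to_save(unsigned int prev_mask, unsigned int new_mask)
{
	return prev_mask & new_mask;
}
```

For the message's example, regs_to_save(KERNEL_VXR_V0V7, KERNEL_VXR_V24V31) is 0: the first eight vector registers need not be stored because the nested context never touches them.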
    • s390/vx: allow to include vx-insn.h with .include · 0eab11c7
      Martin Schwidefsky authored
      To make vx-insn.h more versatile, avoid cpp preprocessor macros and
      allow plain numbers for vector and general purpose register
      operands. With that you can emit an .include from a C file into the
      assembler text and then use the vx-insn macros in inline assemblies.
      
      For example:
      
      asm (".include \"asm/vx-insn.h\"");
      
      static inline void xor_vec(int x, int y, int z)
      {
      	asm volatile("VX %0,%1,%2"
      		     : : "i" (x), "i" (y), "i" (z));
      }
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      0eab11c7
    • s390/time: avoid races when updating tb_update_count · 67f03de5
      David Hildenbrand authored
      The increment might not be atomic and we're not holding the
      timekeeper_lock. Therefore we might lose an update to count, resulting in
      VDSO being trapped in a loop. As other archs also simply update the
      values and count doesn't seem to have an impact on reloading of these
      values in VDSO code, let's just remove the update of tb_update_count.
      Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      67f03de5
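The loop in question is the usual sequence-count reader: the VDSO spins while the counter is odd (an update is in progress) and retries if the counter changed underneath it. If one of the writer's two non-atomic increments is lost, the counter can stay odd forever and the reader never gets out. A simplified single-threaded sketch of the reader; names are illustrative:

```c
struct vdso_data {
	unsigned long tb_update_count;
	long value;
};

static long vdso_read(const struct vdso_data *d)
{
	unsigned long count;
	long v;

	for (;;) {
		count = d->tb_update_count;
		if (count & 1)		/* writer active: retry */
			continue;	/* spins forever if an increment was lost */
		v = d->value;
		if (count == d->tb_update_count)
			return v;	/* no update raced with us */
	}
}
```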
    • s390/time: fixup the clock comparator on all cpus · 0c00b1e0
      David Hildenbrand authored
      Until now fixup_cc was left unset, so only the clock comparator of the
      cpu actually doing the sync was fixed up.
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      0c00b1e0
    • s390/time: cleanup etr leftovers · ca64f639
      David Hildenbrand authored
      There are still some etr leftovers and wrong comments, let's clean that up.
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      ca64f639
    • s390/time: simplify stp time syncs · 41ad0220
      David Hildenbrand authored
      The way we call do_adjtimex() today is broken. It has 0 effect, as
      ADJ_OFFSET_SINGLESHOT (0x0001) in the kernel maps to !ADJ_ADJTIME
      (in contrast to user space where it maps to ADJ_OFFSET_SINGLESHOT |
      ADJ_ADJTIME - 0x8001). !ADJ_ADJTIME will silently ignore all adjustments
      without STA_PLL being active. We could switch to ADJ_ADJTIME or turn
      STA_PLL on, but still we would run into some problems:
      
      - Even when switching to nanoseconds, we lose accuracy.
      - Successive calls to do_adjtimex() will simply overwrite any leftovers
        from the previous call (if not fully handled)
      - Anything that NTP does using the sysctl heavily interferes with our
        use.
      - !ADJ_ADJTIME will silently round stuff > or < than 0.5 seconds
      
      Reusing do_adjtimex() here just feels wrong. The whole STP synchronization
      works right now *somehow* only, as do_adjtimex() does nothing and our
      TOD clock jumps in time, although it shouldn't. This is especially bad
      as the clock could jump backwards in time. We will have to find another
      way to fix this up.
      
      As leap seconds are also not properly handled yet, let's just get rid of
      all this complex logic altogether and use the correct clock_delta for
      fixing up the clock comparator and keeping the sched_clock monotonic.
      
      This change should have 0 effect on the current STP mechanism. Once we
      know how to best handle sync events and leap second updates, we'll start
      with a fresh implementation.
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      41ad0220
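The flag values behind the mismatch, as described in the message; the _KERNEL/_USER suffixes are added here purely for illustration (both sides normally share one timex.h):

```c
#define ADJ_OFFSET_SINGLESHOT_KERNEL	0x0001	/* what the kernel passed */
#define ADJ_ADJTIME			0x8000	/* marks an adjtime-style call */
#define ADJ_OFFSET_SINGLESHOT_USER	0x8001	/* user space: 0x0001 | 0x8000 */

/* do_adjtimex() only takes the adjtime path when the ADJ_ADJTIME bit
 * is set, so the kernel-side 0x0001 silently selects !ADJ_ADJTIME: */
static int is_adjtime(unsigned int modes)
{
	return (modes & ADJ_ADJTIME) == ADJ_ADJTIME;
}
```

This is why the call "has 0 effect": without STA_PLL the !ADJ_ADJTIME path ignores the adjustment entirely.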
  5. 26 Aug, 2016 1 commit
  6. 25 Aug, 2016 1 commit
  7. 24 Aug, 2016 8 commits
  8. 23 Aug, 2016 4 commits
    • Merge tag 'usercopy-v4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 7a1dcf6a
      Linus Torvalds authored
      Pull hardened usercopy fixes from Kees Cook:
       - avoid signed math problems on unexpected compilers
       - avoid false positives at very end of kernel text range checks
      
      * tag 'usercopy-v4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        usercopy: fix overlap check for kernel text
        usercopy: avoid potentially undefined behavior in pointer math
      7a1dcf6a
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · d1fdafa1
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "This fixes a number of memory corruption bugs in the newly added
        sha256-mb/sha512-mb code"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: sha512-mb - fix ctx pointer
        crypto: sha256-mb - fix ctx pointer and digest copy
      d1fdafa1
    • usercopy: fix overlap check for kernel text · 94cd97af
      Josh Poimboeuf authored
      When running with a local patch which moves the '_stext' symbol to the
      very beginning of the kernel text area, I got the following panic with
      CONFIG_HARDENED_USERCOPY:
      
        usercopy: kernel memory exposure attempt detected from ffff88103dfff000 (<linear kernel text>) (4096 bytes)
        ------------[ cut here ]------------
        kernel BUG at mm/usercopy.c:79!
        invalid opcode: 0000 [#1] SMP
        ...
        CPU: 0 PID: 4800 Comm: cp Not tainted 4.8.0-rc3.after+ #1
        Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.5.4 01/22/2016
        task: ffff880817444140 task.stack: ffff880816274000
        RIP: 0010:[<ffffffff8121c796>] __check_object_size+0x76/0x413
        RSP: 0018:ffff880816277c40 EFLAGS: 00010246
        RAX: 000000000000006b RBX: ffff88103dfff000 RCX: 0000000000000000
        RDX: 0000000000000000 RSI: ffff88081f80dfa8 RDI: ffff88081f80dfa8
        RBP: ffff880816277c90 R08: 000000000000054c R09: 0000000000000000
        R10: 0000000000000005 R11: 0000000000000006 R12: 0000000000001000
        R13: ffff88103e000000 R14: ffff88103dffffff R15: 0000000000000001
        FS:  00007fb9d1750800(0000) GS:ffff88081f800000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000021d2000 CR3: 000000081a08f000 CR4: 00000000001406f0
        Stack:
         ffff880816277cc8 0000000000010000 000000043de07000 0000000000000000
         0000000000001000 ffff880816277e60 0000000000001000 ffff880816277e28
         000000000000c000 0000000000001000 ffff880816277ce8 ffffffff8136c3a6
        Call Trace:
         [<ffffffff8136c3a6>] copy_page_to_iter_iovec+0xa6/0x1c0
         [<ffffffff8136e766>] copy_page_to_iter+0x16/0x90
         [<ffffffff811970e3>] generic_file_read_iter+0x3e3/0x7c0
         [<ffffffffa06a738d>] ? xfs_file_buffered_aio_write+0xad/0x260 [xfs]
         [<ffffffff816e6262>] ? down_read+0x12/0x40
         [<ffffffffa06a61b1>] xfs_file_buffered_aio_read+0x51/0xc0 [xfs]
         [<ffffffffa06a6692>] xfs_file_read_iter+0x62/0xb0 [xfs]
         [<ffffffff812224cf>] __vfs_read+0xdf/0x130
         [<ffffffff81222c9e>] vfs_read+0x8e/0x140
         [<ffffffff81224195>] SyS_read+0x55/0xc0
         [<ffffffff81003a47>] do_syscall_64+0x67/0x160
         [<ffffffff816e8421>] entry_SYSCALL64_slow_path+0x25/0x25
        RIP: 0033:[<00007fb9d0c33c00>] 0x7fb9d0c33c00
        RSP: 002b:00007ffc9c262f28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
        RAX: ffffffffffffffda RBX: fffffffffff8ffff RCX: 00007fb9d0c33c00
        RDX: 0000000000010000 RSI: 00000000021c3000 RDI: 0000000000000004
        RBP: 00000000021c3000 R08: 0000000000000000 R09: 00007ffc9c264d6c
        R10: 00007ffc9c262c50 R11: 0000000000000246 R12: 0000000000010000
        R13: 00007ffc9c2630b0 R14: 0000000000000004 R15: 0000000000010000
        Code: 81 48 0f 44 d0 48 c7 c6 90 4d a3 81 48 c7 c0 bb b3 a2 81 48 0f 44 f0 4d 89 e1 48 89 d9 48 c7 c7 68 16 a3 81 31 c0 e8 f4 57 f7 ff <0f> 0b 48 8d 90 00 40 00 00 48 39 d3 0f 83 22 01 00 00 48 39 c3
        RIP  [<ffffffff8121c796>] __check_object_size+0x76/0x413
         RSP <ffff880816277c40>
      
      The checked object's range [ffff88103dfff000, ffff88103e000000) is
      valid, so there shouldn't have been a BUG.  The hardened usercopy code
      got confused because the range's ending address is the same as the
      kernel's text starting address at 0xffff88103e000000.  The overlap check
      is slightly off.
      
      Fixes: f5509cc1 ("mm: Hardened usercopy")
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      94cd97af
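The corrected check is the standard half-open interval overlap test: [ptr, ptr + n) overlaps [low, high) iff it starts before high and ends after low, so an object that merely ends where the kernel text begins (the case in the panic above) does not count. A generic sketch, not the exact kernel helper:

```c
/* Does the object [ptr, ptr + n) overlap the range [low, high)? */
static int overlaps(unsigned long ptr, unsigned long n,
		    unsigned long low, unsigned long high)
{
	return ptr < high && ptr + n > low;
}
```

For the panic above: the object [0xffff88103dfff000, 0xffff88103e000000) against text starting at 0xffff88103e000000 gives ptr + n > low as false, so no overlap is reported and the copy is allowed.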
    • usercopy: avoid potentially undefined behavior in pointer math · 7329a655
      Eric Biggers authored
      check_bogus_address() checked for pointer overflow using this expression,
      where 'ptr' has type 'const void *':
      
      	ptr + n < ptr
      
      Since pointer wraparound is undefined behavior, gcc at -O2 by default
      treats it like the following, which would not behave as intended:
      
      	(long)n < 0
      
      Fortunately, this doesn't currently happen for kernel code because kernel
      code is compiled with -fno-strict-overflow.  But the expression should be
      fixed anyway to use well-defined integer arithmetic, since it could be
      treated differently by different compilers in the future or could be
      reported by tools checking for undefined behavior.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      7329a655
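The well-defined replacement performs the wraparound check in unsigned integer arithmetic, where overflow wraps by definition, rather than on the pointer itself. A sketch:

```c
/* Does [ptr, ptr + n) wrap around the top of the address space?
 * Unsigned overflow is well defined (modulo 2^N), so this cannot be
 * optimized away the way 'ptr + n < ptr' on a pointer could be. */
static int ptr_wraps(const void *ptr, unsigned long n)
{
	unsigned long addr = (unsigned long)ptr;

	return addr + n < addr;
}
```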
  9. 22 Aug, 2016 1 commit