1. 15 Aug, 2017 13 commits
    • arm64: add on_accessible_stack() · 12964443
      Mark Rutland authored
      Both unwind_frame() and dump_backtrace() try to check whether a stack
      address is sane to access, with very similar logic. Both will need
      updating in order to handle overflow stacks.
      
      Factor out this logic into a helper, so that we can avoid further
      duplication when we add overflow stacks.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
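As a rough illustration of the helper described above, the sketch below models each stack as an address range and routes every check through one function. The `struct stack_range` type and the three-argument signature are assumptions made so that the example builds in userspace; the kernel helper has its own signature and data sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one stack: the half-open range [low, high). */
struct stack_range {
	uintptr_t low;
	uintptr_t high;
};

/* True if addr lies within the given stack. */
static bool on_stack(const struct stack_range *s, uintptr_t addr)
{
	return addr >= s->low && addr < s->high;
}

/*
 * The single place that knows about every kind of stack. Supporting an
 * additional stack (e.g. an overflow stack) would mean one more check
 * here, rather than one per caller.
 */
static bool on_accessible_stack(const struct stack_range *task,
				const struct stack_range *irq,
				uintptr_t addr)
{
	return on_stack(task, addr) || on_stack(irq, addr);
}
```

Callers such as an unwinder or a backtrace dumper would then ask only this one question before dereferencing an address.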
    • arm64: add basic VMAP_STACK support · e3067861
      Mark Rutland authored
      This patch enables arm64 to be built with vmap'd task and IRQ stacks.
      
      As vmap'd stacks are mapped at page granularity, stacks must be a multiple of
      PAGE_SIZE. This means that a 64K page kernel must use stacks of at least 64K in
      size.
      
      To minimize the increase in Image size, IRQ stacks are dynamically allocated at
      boot time, rather than embedding the boot CPU's IRQ stack in the kernel image.
      
      This patch was co-authored by Ard Biesheuvel and Mark Rutland.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
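The page-granularity constraint above can be sketched as a small calculation. This is only an illustration: the 16K minimum stack (shift of 14) and the function names are assumptions, not taken from the patch.

```c
/* Assumed minimum stack shift: 1 << 14 = 16K. */
#define MIN_THREAD_SHIFT_SKETCH 14u

/*
 * A vmap'd stack is built from whole pages, so it can never be smaller
 * than one page: pick the larger of the minimum shift and the page shift.
 */
static unsigned int vmap_thread_shift(unsigned int page_shift)
{
	return page_shift > MIN_THREAD_SHIFT_SKETCH ? page_shift
						     : MIN_THREAD_SHIFT_SKETCH;
}

/* Resulting stack size in bytes for a given page shift. */
static unsigned long vmap_thread_size(unsigned int page_shift)
{
	return 1UL << vmap_thread_shift(page_shift);
}
```

Under these assumptions, 4K and 16K page kernels keep 16K stacks, while a 64K page kernel ends up with 64K stacks.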
    • arm64: use an irq stack pointer · f60fe78f
      Mark Rutland authored
      We allocate our IRQ stacks using a percpu array. This allows us to generate our
      IRQ stack pointers with adr_this_cpu, but bloats the kernel Image with the boot
      CPU's IRQ stack. Additionally, these are packed with other percpu variables,
      and aren't guaranteed to have guard pages.
      
      When we enable VMAP_STACK we'll want to vmap our IRQ stacks also, in order to
      provide guard pages and to permit more stringent alignment requirements. Doing
      so will require that we use a percpu pointer to each IRQ stack, rather than
      allocating a percpu IRQ stack in the kernel image.
      
      This patch updates our IRQ stack code to use a percpu pointer to the base of
      each IRQ stack. This will allow us to change the way the stack is allocated
      with minimal changes elsewhere. In some cases we may try to backtrace before
      the IRQ stack pointers are initialised, so on_irq_stack() is updated to account
      for this.
      
      In testing with cyclictest, there was no measurable difference between using
      adr_this_cpu (for irq_stack) and ldr_this_cpu (for irq_stack_ptr) in the IRQ
      entry path.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
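A userspace sketch of the pointer-based scheme might look like the following. An array indexed by CPU number stands in for real percpu storage, `malloc()` stands in for the boot-time allocator, and the `_SKETCH` constants are assumptions; the point is only that a pointer can be NULL before initialisation, and the stack check has to tolerate that.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH        4       /* hypothetical CPU count */
#define IRQ_STACK_SIZE_SKETCH 16384UL /* hypothetical IRQ stack size */

/* One pointer per CPU; NULL until that CPU's stack has been allocated. */
static unsigned char *irq_stack_ptr[NR_CPUS_SKETCH];

/* Allocate every CPU's IRQ stack at "boot". Returns false on failure. */
static bool init_irq_stacks(void)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		irq_stack_ptr[cpu] = malloc(IRQ_STACK_SIZE_SKETCH);
		if (!irq_stack_ptr[cpu])
			return false;
	}
	return true;
}

/* True if sp lies on the given CPU's IRQ stack. */
static bool on_irq_stack(int cpu, uintptr_t sp)
{
	uintptr_t low = (uintptr_t)irq_stack_ptr[cpu];

	/* A backtrace taken before init must not treat NULL as a stack. */
	if (!low)
		return false;

	return sp >= low && sp < low + IRQ_STACK_SIZE_SKETCH;
}
```

Because only the pointer is embedded in the image, the way the stack itself is allocated can change later without touching the callers.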
    • arm64: assembler: allow adr_this_cpu to use the stack pointer · 8ea41b11
      Ard Biesheuvel authored
      Given that adr_this_cpu already requires a temp register in addition
      to the destination register, tweak the instruction sequence so that sp
      may be used as well.
      
      This will simplify switching to per-cpu stacks in subsequent patches. While
      this limits the range of adr_this_cpu to +/-4GiB, we don't currently use
      adr_this_cpu in modules, and this is not problematic for the main kernel image.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      [Mark: add more commit text]
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
    • arm64: factor out entry stack manipulation · b11e5759
      Mark Rutland authored
      In subsequent patches, we will detect stack overflow in our exception
      entry code, by verifying the SP after it has been decremented to make
      space for the exception regs.
      
      This verification code is small, and we can minimize its impact by
      placing it directly in the vectors. To avoid redundant modification of
      the SP, we also need to move the initial decrement of the SP into the
      vectors.
      
      As a preparatory step, this patch introduces kernel_ventry, which
      performs this decrement, and updates the entry code accordingly.
      Subsequent patches will fold SP verification into kernel_ventry.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      [Mark: turn into prep patch, expand commit msg]
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
    • efi/arm64: add EFI_KIMG_ALIGN · 170976bc
      Mark Rutland authored
      The EFI stub is intimately coupled with the kernel, and takes advantage
      of this by relocating the kernel at a weaker alignment than the
      documented boot protocol mandates.
      
      However, it does so by assuming it can align the kernel to the segment
      alignment, and assumes that this is 64K. In subsequent patches, we'll
      have to consider other details to determine this de-facto alignment
      constraint.
      
      This patch adds a new EFI_KIMG_ALIGN definition that will track the
      kernel's de-facto alignment requirements. Subsequent patches will modify
      this as required.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
    • arm64: move SEGMENT_ALIGN to <asm/memory.h> · 8018ba4e
      Mark Rutland authored
      Currently we define SEGMENT_ALIGN directly in our vmlinux.lds.S.
      
      This is unfortunate, as the EFI stub currently open-codes the same
      number, and in future we'll want to fiddle with this.
      
      This patch moves the definition to our <asm/memory.h>, where it can be
      used by both vmlinux.lds.S and the EFI stub code.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
    • arm64: clean up irq stack definitions · f60ad4ed
      Mark Rutland authored
      Before we add yet another stack to the kernel, it would be nice to
      ensure that we consistently organise stack definitions and related
      helper functions.
      
      This patch moves the basic IRQ stack definitions to <asm/memory.h> to
      live with their task stack counterparts. Helpers used for unwinding are
      moved into <asm/stacktrace.h>, where subsequent patches will add helpers
      for other stacks. Includes are fixed up accordingly.
      
      This patch is a pure refactoring -- there should be no functional
      changes as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
    • arm64: clean up THREAD_* definitions · dbc9344a
      Mark Rutland authored
      Currently we define THREAD_SIZE and THREAD_SIZE_ORDER separately, with
      the latter dependent on particular CONFIG_ARM64_*K_PAGES definitions.
      This is somewhat opaque, and will get in the way of future modifications
      to THREAD_SIZE.
      
      This patch cleans this up, defining both in terms of a common
      THREAD_SHIFT, and using PAGE_SHIFT to calculate THREAD_SIZE_ORDER,
      rather than using a number of definitions dependent on config symbols.
      Subsequent patches will make use of this to alter the stack size used in
      some configurations.
      
      At the same time, these are moved into <asm/memory.h>, which will avoid
      circular include issues in subsequent patches. To ensure that existing
      code isn't adversely affected, <asm/thread_info.h> is updated to
      transitively include these definitions.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
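The relationship between the three values can be sketched as follows. The shift of 14 (16K stacks) and the `_SKETCH` names are assumptions for illustration, and the page shift is passed as a parameter here rather than taken from a config-dependent macro.

```c
/* Assumed common shift: 1 << 14 = 16K stacks. */
#define THREAD_SHIFT_SKETCH 14u
#define THREAD_SIZE_SKETCH  (1UL << THREAD_SHIFT_SKETCH)

/*
 * Allocation order, i.e. log2 of the number of pages per stack.
 * Only meaningful when page_shift <= THREAD_SHIFT_SKETCH, that is, when
 * a stack spans at least one whole page.
 */
static unsigned int thread_size_order(unsigned int page_shift)
{
	return THREAD_SHIFT_SKETCH - page_shift;
}
```

With 4K pages this gives an order of 2 (four pages), and with 16K pages an order of 0, without any per-page-size special cases.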
    • arm64: factor out PAGE_* and CONT_* definitions · b6531456
      Mark Rutland authored
      Some headers rely on PAGE_* definitions from <asm/page.h>, but cannot
      include this due to potential circular includes. For example, a number
      of definitions in <asm/memory.h> rely on PAGE_SHIFT, and <asm/page.h>
      includes <asm/memory.h>.
      
      This requires users of these definitions to include both headers, which
      is fragile and error-prone.
      
      This patch ameliorates matters by moving the basic definitions out to a
      new header, <asm/page-def.h>. Both <asm/page.h> and <asm/memory.h> are
      updated to include this, avoiding this fragility, and avoiding the
      possibility of circular include dependencies.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
    • arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP · 34be98f4
      Ard Biesheuvel authored
      For historical reasons, we leave the top 16 bytes of our task and IRQ
      stacks unused, a practice used to ensure that the SP can always be
      masked to find the base of the current stack (historically, where
      thread_info could be found).
      
      However, this is not necessary, as:
      
      * When an exception is taken from a task stack, we decrement the SP by
        S_FRAME_SIZE and stash the exception registers before we compare the
        SP against the task stack. In such cases, the SP must be at least
        S_FRAME_SIZE below the limit, and can be safely masked to determine
        whether the task stack is in use.
      
      * When transitioning to an IRQ stack, we'll place a dummy frame onto the
        IRQ stack before enabling asynchronous exceptions, or executing code
        we expect to trigger faults. Thus, if an exception is taken from the
        IRQ stack, the SP must be at least 16 bytes below the limit.
      
      * We no longer mask the SP to find the thread_info, which is now found
        via sp_el0. Note that historically, the offset was critical to ensure
        that cpu_switch_to() found the correct stack for new threads that
        hadn't yet executed ret_from_fork().
      
      Given that, this initial offset serves no purpose, and can be removed.
      This brings us in line with other architectures (e.g. x86) which do not
      rely on this masking.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      [Mark: rebase, kill THREAD_START_SP, commit msg additions]
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
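The masking argument above can be checked with a little arithmetic. The sketch assumes 16K stacks aligned to their own size; the names are illustrative only.

```c
#include <stdint.h>

/* Assumed stack size; stacks are assumed to be THREAD_SIZE-aligned. */
#define THREAD_SIZE_SKETCH 16384UL

/* Base of the THREAD_SIZE-aligned region that contains sp. */
static uintptr_t stack_base_from_sp(uintptr_t sp)
{
	return sp & ~(uintptr_t)(THREAD_SIZE_SKETCH - 1);
}
```

An SP sitting exactly at the top of an empty stack masks to the region above the stack, which is what the old 16-byte offset guarded against. As soon as anything has been pushed (an exception frame or a dummy frame), the SP is below the top and masks to the correct base, so the initial offset is not needed.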
    • fork: allow arch-override of VMAP stack alignment · 48ac3c18
      Mark Rutland authored
      In some cases, an architecture might wish its stacks to be aligned to a
      boundary larger than THREAD_SIZE. For example, using an alignment of
      double THREAD_SIZE can allow for stack overflows smaller than
      THREAD_SIZE to be detected by checking a single bit of the stack
      pointer.
      
      This patch allows architectures to override the alignment of VMAP'd
      stacks, by defining THREAD_ALIGN. Where not defined, this defaults to
      THREAD_SIZE, as is the case today.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: linux-kernel@vger.kernel.org
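To see why doubling the alignment reduces the overflow check to a single bit, consider the following sketch. It assumes 16K stacks (shift of 14) aligned to 32K; the names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define THREAD_SHIFT_SKETCH 14u                      /* assumed 16K stacks */
#define THREAD_SIZE_SKETCH  (1UL << THREAD_SHIFT_SKETCH)
#define THREAD_ALIGN_SKETCH (2 * THREAD_SIZE_SKETCH) /* double alignment */

/*
 * The stack occupies the lower half of a THREAD_ALIGN-aligned block, so
 * the THREAD_SIZE bit is clear for every address inside it. Running off
 * the bottom by less than THREAD_SIZE lands in the upper half of the
 * block below, where that bit is set.
 *
 * sp is assumed to be below the stack top (i.e. something was pushed).
 */
static bool sp_overflowed(uintptr_t sp)
{
	return (sp & THREAD_SIZE_SKETCH) != 0;
}
```

Overflows of more than THREAD_SIZE are not caught by this test, which matches the "smaller than THREAD_SIZE" qualification above.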
    • arm64: remove __die()'s stack dump · c5bc503c
      Mark Rutland authored
      Our __die() implementation tries to dump the stack memory, in addition
      to a backtrace, which is problematic.
      
      For contemporary 16K stacks, this can be a lot of data, which can take a
      long time to dump, and can push other useful context out of the kernel's
      printk ringbuffer (and/or a user's scrollback buffer on an attached
      console).
      
      Additionally, the code implicitly assumes that the SP is on the task's
      stack, and tries to dump everything between the SP and the highest task
      stack address. When the SP points at an IRQ stack (or is corrupted),
      this makes the kernel attempt to dump vast amounts of VA space. With
      vmap'd stacks, this may result in erroneous accesses to peripherals.
      
      This patch removes the memory dump, leaving us to rely on the backtrace,
      and other means of dumping stack memory such as kdump.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
  2. 09 Aug, 2017 2 commits
    • arm64: unwind: remove sp from struct stackframe · 31e43ad3
      Ard Biesheuvel authored
      The unwind code sets the sp member of struct stackframe to
      'frame pointer + 0x10' unconditionally, without regard for whether
      doing so produces a legal value. So let's simply remove it now that
      we have stopped using it anyway.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
    • arm64: unwind: reference pt_regs via embedded stack frame · 73267498
      Ard Biesheuvel authored
      As it turns out, the unwind code is slightly broken, and probably has
      been for a while. The problem is in the dumping of the exception stack,
      which is intended to dump the contents of the pt_regs struct at each
      level in the call stack where an exception was taken and routed to a
      routine marked as __exception (which means its stack frame is right
      below the pt_regs struct on the stack).
      
      'Right below the pt_regs struct' is ill defined, though: the unwind
      code assigns 'frame pointer + 0x10' to the .sp member of the stackframe
      struct at each level, and dump_backtrace() happily dereferences that as
      the pt_regs pointer when encountering an __exception routine. However,
      the actual size of the stack frame created by this routine (which could
      be one of many __exception routines we have in the kernel) is not known,
      and so frame.sp is pretty useless to figure out where struct pt_regs
      really is.
      
      So it seems the only way to ensure that we can find our struct pt_regs
      when walking the stack frames is to put it at a known fixed offset of
      the stack frame pointer that is passed to such __exception routines.
      The simplest way to do that is to put it inside pt_regs itself, which is
      the main change implemented by this patch. As a bonus, doing this allows
      us to get rid of a fair amount of cruft related to walking from one stack
      to the other, which is especially nice since we intend to introduce yet
      another stack for overflow handling once we add support for vmapped
      stacks. It also fixes an inconsistency where we only add a stack frame
      pointing to ELR_EL1 if we are executing from the IRQ stack but not when
      we are executing from the task stack.
      
      To consistently identify exception regs even in the presence of exceptions
      taken from entry code, we must check whether the next frame was created
      by entry text, rather than whether the current frame was created by
      exception text.
      
      To avoid backtracing using PCs that fall in the idmap, or are controlled
      by userspace, we must explicitly zero the FP and LR in startup paths, and
      must ensure that the frame embedded in pt_regs is zeroed upon entry from
      EL0. To avoid these NULL entries showing in the backtrace, unwind_frame()
      is updated to avoid them.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      [Mark: compare current frame against .entry.text, avoid bogus PCs]
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
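The idea of embedding the frame record in the register dump can be sketched in plain C. The layout below is a simplified, hypothetical stand-in for the real struct pt_regs (which has more fields); what matters is that the frame record sits at a fixed, compiler-known offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical register dump with an embedded frame record. */
struct pt_regs_sketch {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
	uint64_t stackframe[2]; /* { saved FP, saved LR } */
};

/*
 * Given a frame pointer that points at the embedded frame record,
 * recover the enclosing register dump by subtracting the fixed offset.
 */
static struct pt_regs_sketch *regs_from_fp(uint64_t *fp)
{
	return (struct pt_regs_sketch *)((char *)fp -
			offsetof(struct pt_regs_sketch, stackframe));
}
```

Because the offset is a property of the struct rather than of whichever handler happened to run, the unwinder no longer has to guess the size of that handler's stack frame.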
  3. 08 Aug, 2017 5 commits
    • arm64: unwind: disregard frame.sp when validating frame pointer · c7365330
      Ard Biesheuvel authored
      Currently, when unwinding the call stack, we validate the frame pointer
      of each frame against frame.sp, whose value is not clearly defined, and
      which makes it more difficult to link stack frames together across
      different stacks. It is far better to simply check whether the frame
      pointer itself points into a valid stack.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
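A minimal userspace model of an unwind step that validates only the frame pointer might look like this. The struct names, the single `[low, high)` stack range, and the function signature are assumptions for illustration; a real unwinder would consult every valid stack.

```c
#include <stdbool.h>
#include <stdint.h>

/* AArch64-style frame record: saved FP followed by saved LR. */
struct frame_record {
	uint64_t fp;
	uint64_t lr;
};

/* Hypothetical unwinder state; note there is no sp member. */
struct stackframe_sketch {
	uintptr_t fp;
	uintptr_t pc;
};

/*
 * Step to the caller's frame. Returns false, leaving *frame untouched,
 * if the frame pointer is misaligned or does not point into the stack
 * described by [low, high).
 */
static bool unwind_frame_sketch(struct stackframe_sketch *frame,
				uintptr_t low, uintptr_t high)
{
	uintptr_t fp = frame->fp;
	const struct frame_record *rec;

	if (fp & 0xf)
		return false;
	if (fp < low || fp >= high)
		return false;

	rec = (const struct frame_record *)fp;
	frame->fp = (uintptr_t)rec->fp;
	frame->pc = (uintptr_t)rec->lr;
	return true;
}
```

Since the check depends only on where the frame pointer points, consecutive frames are free to live on different stacks.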
    • arm64: unwind: avoid percpu indirection for irq stack · 09668372
      Mark Rutland authored
      Our IRQ_STACK_PTR() and on_irq_stack() helpers both take a cpu argument,
      used to generate a percpu address. In all cases, they are passed
      {raw_,}smp_processor_id(), so this parameter is redundant.
      
      Since {raw_,}smp_processor_id() use a percpu variable internally, this
      approach means we generate a percpu offset to find the current cpu, then
      use this to index an array of percpu offsets, which we then use to find
      the current CPU's IRQ stack pointer. Thus, most of the work is
      redundant.
      
      Instead, we can consistently use raw_cpu_ptr() to generate the CPU's
      irq_stack pointer by simply adding the percpu offset to the irq_stack
      address, which is simpler in both respects.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
    • arm64: move non-entry code out of .entry.text · ed84b4e9
      Mark Rutland authored
      Currently, cpu_switch_to and ret_from_fork both live in .entry.text,
      though neither forms part of the critical path for an exception entry.
      
      In subsequent patches, we will require that code in .entry.text is part
      of the critical path for exception entry, for which we can assume
      certain properties (e.g. the presence of exception regs on the stack).
      
      Neither cpu_switch_to nor ret_from_fork will meet these requirements, so
      we must move them out of .entry.text. To ensure that neither are kprobed
      after being moved out of .entry.text, we must explicitly blacklist them,
      requiring a new NOKPROBE() asm helper.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
    • arm64: consistently use bl for C exception entry · 2d0e751a
      Mark Rutland authored
      In most cases, our exception entry assembly branches to C handlers with
      a BL instruction, but in cases where we do not expect to return, we use
      B instead.
      
      While this is correct today, it means that backtraces for fatal
      exceptions miss the entry assembly (as the LR is stale at the point we
      call C code), while non-fatal exceptions have the entry assembly in the
      LR. In subsequent patches, we will need the LR to be set in these cases
      in order to backtrace reliably.
      
      This patch updates these sites to use a BL, ensuring consistency, and
      preparing for backtrace rework. An ASM_BUG() is added after each of
      these new BLs, which both catches unexpected returns, and ensures that
      the LR value doesn't point to another function label.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
    • arm64: Add ASM_BUG() · db44e9c5
      Mark Rutland authored
      Currently, we can only use BUG() from C code, though there are
      situations where we would like an equivalent mechanism in assembly code.
      
      This patch refactors our BUG() definition such that it can be used in
      either C or assembly, in the form of a new ASM_BUG().
      
      The refactoring requires the removal of escape sequences, such as '\n'
      and '\t', but these aren't strictly necessary as we can use ';' to
      terminate assembler statements.
      
      The low-level assembly is factored out into <asm/asm-bug.h>, with
      <asm/bug.h> retained as the C wrapper.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
  4. 30 Jul, 2017 5 commits
  5. 29 Jul, 2017 1 commit
  6. 28 Jul, 2017 14 commits
    • Merge tag 'nfs-for-4.13-3' of git://git.linux-nfs.org/projects/anna/linux-nfs · 286ba844
      Linus Torvalds authored
      Pull NFS client fixes from Anna Schumaker:
       "More NFS client bugfixes for 4.13.
      
        Most of these fix locking bugs that Ben and Neil noticed, but I also
        have a patch to fix one more access bug that was reported after last
        week.
      
        Stable fixes:
         - Fix a race where CB_NOTIFY_LOCK fails to wake a waiter
         - Invalidate file size when taking a lock to prevent corruption
      
        Other fixes:
         - Don't excessively generate tiny writes with fallocate
         - Use the raw NFS access mask in nfs4_opendata_access()"
      
      * tag 'nfs-for-4.13-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiter
        NFS: Optimize fallocate by refreshing mapping when needed.
        NFS: invalidate file size when taking a lock.
        NFS: Use raw NFS access mask in nfs4_opendata_access()
    • Merge tag 'xfs-4.13-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 19993e73
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
      
       - fix firstfsb variables that we left uninitialized, which could lead
         to locking problems.
      
       - check for NULL metadata buffer pointers before using them.
      
       - don't allow btree cursor manipulation if the btree block is corrupt.
         Better to just shut down.
      
       - fix infinite loop problems in quotacheck.
      
       - fix buffer overrun when validating directory blocks.
      
       - fix deadlock problem in bunmapi.
      
      * tag 'xfs-4.13-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: fix multi-AG deadlock in xfs_bunmapi
        xfs: check that dir block entries don't off the end of the buffer
        xfs: fix quotacheck dquot id overflow infinite loop
        xfs: check _alloc_read_agf buffer pointer before using
        xfs: set firstfsb to NULLFSBLOCK before feeding it to _bmapi_write
        xfs: check _btree_check_block value
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 81554693
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "s390:
         - SRCU fix
      
        PPC:
         - host crash fixes
      
        x86:
         - bugfixes, including making nested posted interrupts really work
      
        Generic:
         - tweaks to kvm_stat and to uevents"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: LAPIC: Fix reentrancy issues with preempt notifiers
        tools/kvm_stat: add '-f help' to get the available event list
        tools/kvm_stat: use variables instead of hard paths in help output
        KVM: nVMX: Fix loss of L2's NMI blocking state
        KVM: nVMX: Fix posted intr delivery when vcpu is in guest mode
        x86: irq: Define a global vector for nested posted interrupts
        KVM: x86: do mask out upper bits of PAE CR3
        KVM: make pid available for uevents without debugfs
        KVM: s390: take srcu lock when getting/setting storage keys
        KVM: VMX: remove unused field
        KVM: PPC: Book3S HV: Fix host crash on changing HPT size
        KVM: PPC: Book3S HV: Enable TM before accessing TM registers
    • Merge tag 'for-linus-4.13b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 8562e89e
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
       "Three minor cleanups for xen related drivers"
      
      * tag 'for-linus-4.13b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: dont fiddle with event channel masking in suspend/resume
        xen: selfballoon: remove unnecessary static in frontswap_selfshrink()
        xen: Drop un-informative message during boot
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 3d9d7405
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "I'd been collecting these whilst we debugged a CPU hotplug failure,
        but we ended up diagnosing that one to tglx, who has taken a fix via
        the -tip tree separately.
      
        We're seeing some NFS issues that we haven't gotten to the bottom of
        yet, and we've uncovered some issues with our backtracing too so there
        might be another fixes pull before we're done.
      
        Summary:
      
         - Ensure we have a guard page after the kernel image in vmalloc
      
         - Fix incorrect prefetch stride in copy_page
      
         - Ensure irqs are disabled in die()
      
         - Fix for event group validation in QCOM L2 PMU driver
      
         - Fix requesting of PMU IRQs on AMD Seattle
      
         - Minor cleanups and fixes"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: mmu: Place guard page after mapping of kernel image
        drivers/perf: arm_pmu: Request PMU SPIs with IRQF_PER_CPU
        arm64: sysreg: Fix unprotected macro argmuent in write_sysreg
        perf: qcom_l2: fix column exclusion check
        arm64/lib: copy_page: use consistent prefetch stride
        arm64/numa: Drop duplicate message
        perf: Convert to using %pOF instead of full_name
        arm64: Convert to using %pOF instead of full_name
        arm64: traps: disable irq in die()
        arm64: atomics: Remove '&' from '+&' asm constraint in lse atomics
        arm64: uaccess: Remove redundant __force from addr cast in __range_ok
    • Merge tag 'powerpc-4.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 080012ba
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "The highlight is Ben's patch to work around a host killing bug when
        running KVM guests with the Radix MMU on Power9. See the long change
        log of that commit for more detail.
      
        And then three fairly minor fixes:
      
         - fix of_node_put() underflow during reconfig remove, using old DLPAR
           tools.
      
         - fix recently introduced ld version check with 64-bit LE-only
           toolchain.
      
         - free the subpage_prot_table correctly, avoiding a memory leak.
      
        Thanks to: Aneesh Kumar K.V, Benjamin Herrenschmidt, Laurent Vivier"
      
      * tag 'powerpc-4.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/mm/hash: Free the subpage_prot_table correctly
        powerpc/Makefile: Fix ld version check with 64-bit LE-only toolchain
        powerpc/pseries: Fix of_node_put() underflow during reconfig remove
        powerpc/mm/radix: Workaround prefetch issue with KVM
    • NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiter · b7dbcc0e
      Benjamin Coddington authored
      nfs4_retry_setlk() sets the task's state to TASK_INTERRUPTIBLE within the
      same region protected by the wait_queue's lock after checking for a
      notification from CB_NOTIFY_LOCK callback.  However, after releasing that
      lock, a wakeup for that task may race in before the call to
      freezable_schedule_timeout_interruptible() and set TASK_WAKING, then
      freezable_schedule_timeout_interruptible() will set the state back to
      TASK_INTERRUPTIBLE before the task will sleep.  The result is that the task
      will sleep for the entire duration of the timeout.
      
      Since we've already set TASK_INTERRUPTIBLE in the locked section, just use
      freezable_schedule_timeout() instead.
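
The race described above can be sketched as a small single-threaded model in userspace C. This is only an illustration of the state transitions, not the kernel code: every `model_*` name is hypothetical, the racing wakeup is replayed at a fixed point instead of running concurrently, and the wakeup is modelled as going straight to a running state rather than through TASK_WAKING.

```c
#include <assert.h>

/* Hypothetical, simplified task states (the real kernel has more). */
enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE };

struct model_task {
    enum model_state state;
};

/* Model of a wakeup: the task becomes runnable again. */
static void model_wake_up(struct model_task *t)
{
    t->state = MODEL_RUNNING;
}

/*
 * Model of a timed sleep that trusts the state the caller already set.
 * Returns the time left: the full timeout if the task was already woken
 * (it never sleeps), or 0 if it slept until the timer expired.
 */
static long model_schedule_timeout(struct model_task *t, long timeout)
{
    assert(timeout >= 0);
    if (t->state == MODEL_RUNNING)
        return timeout;            /* wakeup already happened: no sleep */
    t->state = MODEL_RUNNING;      /* nobody woke us: the timer fires */
    return 0;
}

/*
 * Model of the "_interruptible" variant: it sets the sleeping state itself,
 * which overwrites the effect of a wakeup that raced in earlier.
 */
static long model_schedule_timeout_interruptible(struct model_task *t,
                                                 long timeout)
{
    t->state = MODEL_INTERRUPTIBLE;
    return model_schedule_timeout(t, timeout);
}

/*
 * Replay the sequence from the commit message:
 *   1. state set to interruptible inside the locked region,
 *   2. lock dropped, a wakeup races in,
 *   3. the task calls one of the two sleep helpers.
 */
static long model_retry_wait(int reset_state_before_sleep, long timeout)
{
    struct model_task t = { MODEL_RUNNING };

    t.state = MODEL_INTERRUPTIBLE;   /* step 1: under the wait queue lock */
    model_wake_up(&t);               /* step 2: racing wakeup */

    /* step 3 */
    return reset_state_before_sleep
        ? model_schedule_timeout_interruptible(&t, timeout)
        : model_schedule_timeout(&t, timeout);
}
```

In this model, calling `model_retry_wait(1, timeout)` loses the wakeup and sleeps for the whole timeout, while `model_retry_wait(0, timeout)` returns without sleeping, which is the behaviour the fix relies on.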
      
      Fixes: a1d617d8 ("nfs: allow blocking locks to be awoken by lock callbacks")
      Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      b7dbcc0e
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · e26f1bea
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
      
       - remove broken dt bindings in inside-secure
      
       - fix authencesn crash when used with digest_null
      
       - fix cavium/nitrox firmware path
      
       - fix SHA3 failure in brcm
      
       - fix Kconfig dependency for brcm
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: authencesn - Fix digest_null crash
        crypto: brcm - remove BCM_PDC_MBOX dependency in Kconfig
        Documentation/bindings: crypto: remove the dma-mask property
        crypto: inside-secure - do not parse the dma mask from dt
        crypto: cavium/nitrox - Change in firmware path.
        crypto: brcm - Fix SHA3-512 algorithm failure
      e26f1bea
    • Linus Torvalds's avatar
      Merge branch 'for-4.13-part3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 0a2a1330
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Fixes addressing problems reported by users, and there's one more
        regression fix"
      
      * 'for-4.13-part3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: round down size diff when shrinking/growing device
        Btrfs: fix early ENOSPC due to delalloc
        btrfs: fix lockup in find_free_extent with read-only block groups
        Btrfs: fix dir item validation when replaying xattr deletes
      0a2a1330
    • Linus Torvalds's avatar
      Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md · 9583f1c9
      Linus Torvalds authored
      Pull MD fixes from Shaohua Li:
       "This fixes several bugs, three of them are marked for stable:
      
         - an initialization issue fixed by Ming
      
         - a bio clone race issue fixed by me
      
         - an async tx flush issue fixed by Ofer
      
         - other cleanups"
      
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
        MD: fix warnning for UP case
        md/raid5: add thread_group worker async_tx_issue_pending_all
        md: simplify code with bio_io_error
        md/raid1: fix writebehind bio clone
        md: raid1-10: move raid1/raid10 common code into raid1-10.c
        md: raid1/raid10: initialize bvec table via bio_add_page()
        md: remove 'idx' from 'struct resync_pages'
      9583f1c9
    • Linus Torvalds's avatar
      Merge tag 'for-4.13/dm-fixes' of... · 1731a474
      Linus Torvalds authored
      Merge tag 'for-4.13/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - a few DM integrity fixes that improve performance. One addresses
         inefficiencies in the on-disk journal device layout. Another
         makes use of the block layer's on-stack plugging when writing the
         journal.
      
       - a dm-bufio fix for the blk_status_t conversion that went in during
         the merge window.
      
       - a few DM raid fixes that address correctness when suspending the
         device, and a fix for the validation that occurs during device
         activation.
      
       - a couple of DM zoned target fixes. The important one is the fix to
         not use GFP_KERNEL in the IO path due to concerns about deadlock in
         low-memory conditions (e.g. swap over a DM zoned device, etc).
      
       - a DM DAX device fix to make sure dm_dax_flush() is called if the
         underlying DAX device is operating as a write cache.
      
      * tag 'for-4.13/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm, dax: Make sure dm_dax_flush() is called if device supports it
        dm verity fec: fix GFP flags used with mempool_alloc()
        dm zoned: use GFP_NOIO in I/O path
        dm zoned: remove test for impossible REQ_OP_FLUSH conditions
        dm raid: bump target version
        dm raid: avoid mddev->suspended access
        dm raid: fix activation check in validate_raid_redundancy()
        dm raid: remove WARN_ON() in raid10_md_layout_to_format()
        dm bufio: fix error code in dm_bufio_write_dirty_buffers()
        dm integrity: test for corrupted disk format during table load
        dm integrity: WARN_ON if variables representing journal usage get out of sync
        dm integrity: use plugging when writing the journal
        dm integrity: fix inefficient allocation of journal space
      1731a474
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 0fa8dc42
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A small collection of fixes that should go into this series. This
        contains:
      
         - NVMe pull request from Christoph, with various fixes for nvme
           proper and nvme-fc.
      
         - disable runtime PM for blk-mq for now.
      
           With scsi now defaulting to using blk-mq, this reared its head as
           an issue. Longer term we'll fix up runtime PM for blk-mq, for now
           just disable it to prevent a hang on laptop resume for some folks.
      
         - blk-mq CPU <-> hw queue map fix from Christoph.
      
         - xen/blkfront pull request from Konrad, with two small fixes for the
           blkfront driver.
      
         - a few fixups for nbd from Joseph.
      
         - a stable fix for pblk from Javier"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        lightnvm: pblk: advance bio according to lba index
        nvme: validate admin queue before unquiesce
        nbd: clear disconnected on reconnect
        nvme-pci: fix HMB size calculation
        nvme-fc: revise TRADDR parsing
        nvme-fc: address target disconnect race conditions in fcp io submit
        nvme: fabrics commands should use the fctype field for data direction
        nvme: also provide a UUID in the WWID sysfs attribute
        xen/blkfront: always allocate grants first from per-queue persistent grants
        xen-blkfront: fix mq start/stop race
        blk-mq: map queues to all present CPUs
        block: disable runtime-pm for blk-mq
        xen-blkfront: Fix handling of non-supported operations
        nbd: only set sndtimeo if we have a timeout set
        nbd: take tx_lock before disconnecting
        nbd: allow multiple disconnects to be sent
      0fa8dc42
    • Linus Torvalds's avatar
      Merge tag 'mmc-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · a2d48756
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "Here are a couple of mmc fixes intended for v4.13-rc1.
      
        I have also included a couple of cleanup patches in this pull request
        for OMAP2+, related to the omap_hsmmc driver. The reason is that the
        changes also depend on OMAP SoC specific code, so including them here
        simplifies how to deal with this.
      
        Summary:
      
        MMC host:
         - sunxi: Correct time phase settings
         - omap_hsmmc: Clean up some dead code
         - dw_mmc: Fix message printed for deprecated num-slots DT binding
         - dw_mmc: Fix DT documentation"
      
      * tag 'mmc-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        Documentation: dw-mshc: deprecate num-slots
        mmc: dw_mmc: fix the wrong condition check of getting num-slots from DT
        mmc: host: omap_hsmmc: remove unused platform callbacks
        ARM: OMAP2+: hsmmc.c: Remove dead code
        mmc: sunxi: Keep default timing phase settings for new timing mode
      a2d48756
    • Javier González's avatar
      lightnvm: pblk: advance bio according to lba index · 75cb8e93
      Javier González authored
      When a lba either hits the cache or corresponds to an empty entry in the
      L2P table, we need to advance the bio according to the position in which
      the lba is located. Otherwise, we will copy data in the wrong page, thus
      causing data corruption for the application.
      
      In case of a cache hit, we assumed that bio->bi_iter.bi_idx would
      contain the correct index, but this is not necessarily true. Instead, use
      the local bio advance counter and iterator. This guarantees that lbas
      hitting the cache are copied into the right bv_page.
      
      In case of an empty L2P entry, we omitted to advance the bio. In the
      cases when the same I/O also contains a cache hit, data corresponding
      to this lba will be copied to the wrong bv_page. Fix this by advancing
      the bio as we do in the case of a cache hit.
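
The effect of skipping the advance can be sketched with a small userspace C model. This is not the pblk code: the `sketch_*` names, the tiny page size, and the flat destination buffer standing in for the bio are all assumptions made for illustration. The `off` variable plays the role of the local advance counter, and the `advance_on_empty` flag switches between the fixed behaviour and the behaviour described as buggy above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4   /* hypothetical tiny "page" to keep the example short */

/* Possible outcomes of looking up one LBA in a (modelled) L2P table. */
enum sketch_l2p_state { SKETCH_EMPTY, SKETCH_CACHED, SKETCH_ON_MEDIA };

struct sketch_l2p_entry {
    enum sketch_l2p_state state;
    const char *cached;   /* SKETCH_PAGE_SIZE bytes, used only for SKETCH_CACHED */
};

/*
 * Walk nr_lbas entries and copy cache hits into buf, which must hold
 * nr_lbas pages. The destination offset is meant to move forward by one
 * page for every LBA, whatever the lookup result was.
 *
 * advance_on_empty == 0 reproduces the bug: an empty entry leaves the
 * offset untouched, so later cache hits land one page too early.
 *
 * Returns the number of LBAs that would still need a read from media.
 */
static int sketch_fill_read(const struct sketch_l2p_entry *l2p, size_t nr_lbas,
                            char *buf, int advance_on_empty)
{
    size_t off = 0;          /* local advance counter, in bytes */
    int media_reads = 0;

    memset(buf, 0, nr_lbas * SKETCH_PAGE_SIZE);

    for (size_t i = 0; i < nr_lbas; i++) {
        switch (l2p[i].state) {
        case SKETCH_EMPTY:
            if (!advance_on_empty)
                continue;    /* buggy path: next LBA reuses this page */
            break;           /* fixed path: page stays zeroed, offset moves on */
        case SKETCH_CACHED:
            assert(off + SKETCH_PAGE_SIZE <= nr_lbas * SKETCH_PAGE_SIZE);
            memcpy(buf + off, l2p[i].cached, SKETCH_PAGE_SIZE);
            break;
        case SKETCH_ON_MEDIA:
            media_reads++;   /* page would be filled later by the device read */
            break;
        }
        off += SKETCH_PAGE_SIZE;
    }
    return media_reads;
}
```

For a request whose three LBAs resolve to a cache hit, an empty entry, and another cache hit, the fixed path leaves the middle page zeroed and places the second hit in the third page. With `advance_on_empty` set to 0 the second hit is written into the middle page instead, which is the kind of misplaced copy the commit message describes.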
      
      Fixes: a4bd217b ("lightnvm: physical block device (pblk) target")
      Signed-off-by: Javier González <javier@javigon.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      75cb8e93