1. 25 Feb, 2016 12 commits
    • perf: Robustify task_function_call() · 0da4cf3e
      Peter Zijlstra authored
      Since there is no serialization between task_function_call() doing
      task_curr() and the other CPU doing context switches, we could end
      up not sending an IPI even if we had to.
      
      And I'm not sure I still buy my own argument we're OK.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174948.340031200@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Fix scaling vs. perf_install_in_context() · a096309b
      Peter Zijlstra authored
      Completely reworks perf_install_in_context() (again!) in order to
      ensure that there will be no ctx time hole between add_event_to_ctx()
      and any potential ctx_sched_in().
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174948.279399438@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Fix scaling vs. perf_event_enable() · bd2afa49
      Peter Zijlstra authored
      Similar to the perf_enable_on_exec(), ensure that event timings are
      consistent across perf_event_enable().
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174948.218288698@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Fix scaling vs. perf_event_enable_on_exec() · 7fce2509
      Peter Zijlstra authored
      The recent commit 3e349507 ("perf: Fix perf_enable_on_exec() event
      scheduling") caused this by moving task_ctx_sched_out() from before
      __perf_event_mask_enable() to after it.
      
      The overlooked consequence of that change is that task_ctx_sched_out()
      would update the ctx time fields, and now __perf_event_mask_enable()
      uses stale time.
      
      In order to fix this, explicitly stop our context's time before
      enabling the event(s).
      Reported-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Fixes: 3e349507 ("perf: Fix perf_enable_on_exec() event scheduling")
      Link: http://lkml.kernel.org/r/20160224174948.159242158@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Fix ctx time tracking by introducing EVENT_TIME · 3cbaa590
      Peter Zijlstra authored
      Currently any ctx_sched_in() call will re-start the ctx time tracking,
      this means that calls like:
      
      	ctx_sched_in(.event_type = EVENT_PINNED);
      	ctx_sched_in(.event_type = EVENT_FLEXIBLE);
      
      will have a hole in their ctx time tracking. This is likely harmless
      but can confuse things a little. By adding EVENT_TIME, we can have the
      first ctx_sched_in() (is_active: 0 -> !0) start the time and any
      further ctx_sched_in() will leave the timestamps alone.
      
      Secondly, this allows for an early disable like:
      
      	ctx_sched_out(.event_type = EVENT_TIME);
      
      which would update the ctx time (if the ctx is active) and any further
      calls to ctx_sched_out() would not further modify the ctx time.
      
      For ctx_sched_in() any 0 -> !0 transition will automatically include
      EVENT_TIME.
      
      For ctx_sched_out(), any transition that clears EVENT_ALL will
      automatically clear EVENT_TIME.
      
      These two rules ensure that under normal circumstances we need not
      bother with EVENT_TIME and get natural ctx time behaviour.
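
      As a rough illustration of the two rules above, the following
      user-space C sketch models is_active as a bitmask. The toy_ names, the
      flag values, the explicit 'now' argument and the clock_starts counter
      are assumptions made for this example; it is not the kernel
      implementation.

```c
/* Hypothetical flag values; only the relationships between them matter here. */
enum toy_event_type {
	EVENT_FLEXIBLE = 0x1,
	EVENT_PINNED   = 0x2,
	EVENT_TIME     = 0x4,
	EVENT_ALL      = EVENT_FLEXIBLE | EVENT_PINNED,
};

struct toy_ctx {
	int is_active;           /* bitmask of enum toy_event_type */
	unsigned long timestamp; /* when the ctx clock was last (re)started */
	unsigned long time;      /* accumulated ctx time */
	int clock_starts;        /* how often the clock was started (for inspection) */
};

static void toy_ctx_sched_in(struct toy_ctx *ctx, int event_type,
			     unsigned long now)
{
	int was_active = ctx->is_active;

	/* Any sched-in implies EVENT_TIME, so a 0 -> !0 transition sets it. */
	ctx->is_active |= event_type | EVENT_TIME;

	/* Only start the clock if EVENT_TIME was not already set. */
	if (!(was_active & EVENT_TIME)) {
		ctx->timestamp = now;
		ctx->clock_starts++;
	}
}

static void toy_ctx_sched_out(struct toy_ctx *ctx, int event_type,
			      unsigned long now)
{
	int was_active = ctx->is_active;

	/* Update the ctx time only while the clock is still running. */
	if (was_active & EVENT_TIME) {
		ctx->time += now - ctx->timestamp;
		ctx->timestamp = now;
	}

	ctx->is_active &= ~event_type;

	/* Clearing the last of EVENT_ALL also clears EVENT_TIME. */
	if (!(ctx->is_active & EVENT_ALL))
		ctx->is_active &= ~EVENT_TIME;
}
```

      In this model, scheduling in EVENT_PINNED and then EVENT_FLEXIBLE starts
      the clock only once, and an early toy_ctx_sched_out(EVENT_TIME) freezes
      the accumulated time so that later sched-out calls leave it alone.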
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174948.100446561@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Cure event->pending_disable race · 28a967c3
      Peter Zijlstra authored
      Because event_sched_out() checks event->pending_disable _before_
      actually disabling the event, it can happen that the event fires after
      it checks but before it gets disabled.
      
      This would leave event->pending_disable set and the queued irq_work
      will try and process it.
      
      However, if the event trigger was during schedule(), the event might
      have been de-scheduled by the time the irq_work runs, and
      perf_event_disable_local() will fail.
      
      Fix this by checking event->pending_disable _after_ we call
      event->pmu->del(). This depends on the latter being a compiler
      barrier, such that the compiler does not lift the load and re-creates
      the problem.
      Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174948.040469884@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Fix race between event install and jump_labels · 9107c89e
      Peter Zijlstra authored
      perf_install_in_context() relies upon the context switch hooks to have
      scheduled in events when the IPI misses its target -- after all, if
      the task has moved from the CPU (or wasn't running at all), it will
      have to context switch to run elsewhere.
      
      This however doesn't appear to be happening.
      
      It is possible for the IPI to not happen (task wasn't running) only to
      later observe the task running with an inactive context.
      
      The only possible explanation is that the context switch hooks are not
      called. Therefore put in a sync_sched() after toggling the jump_label
      to guarantee all CPUs will have them enabled before we install an
      event.
      
      A simple if (0->1) sync_sched() will not in fact work, because any
      further increment can race and complete before the sync_sched().
      Therefore we must jump through some hoops.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174947.980211985@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Fix cloning · a69b0ca4
      Peter Zijlstra authored
      Alexander reported that when the 'original' context gets destroyed, no
      new clones happen.
      
      This can happen irrespective of the ctx switch optimization, any task
      can die, even the parent, and we want to continue monitoring the task
      hierarchy until we either close the event or no tasks are left in the
      hierarchy.
      
      perf_event_init_context() will attempt to pin the 'parent' context
      during clone(). At that point current is the parent, and since current
      cannot have exited while executing clone(), its context cannot have
      passed through perf_event_exit_task_context(). Therefore
      perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE.
      
      However, since inherit_event() does:
      
      	if (parent_event->parent)
      		parent_event = parent_event->parent;
      
      it looks at the 'original' event when it does: is_orphaned_event().
      This can return true if the context that contains this event has
      passed through perf_event_exit_task_context(). And thus we'll fail to
      clone the perf context.
      
      Fix this by adding a new state: STATE_DEAD, which is set by
      perf_release() to indicate that the filedesc (or kernel reference) is
      dead and there are no observers for our data left.
      
      Only for STATE_DEAD will is_orphaned_event() be true and inhibit
      cloning.
      
      STATE_EXIT is otherwise preserved such that is_event_hup() remains
      functional and will report when the observed task hierarchy becomes
      empty.
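
      The state split described above can be sketched in a few lines of
      user-space C. The toy_ names and the numeric state values are
      assumptions made for illustration; only the rule "orphaned means
      STATE_DEAD, not STATE_EXIT" is taken from the text.

```c
#include <stddef.h>

/* Illustrative states; names and numeric values are assumptions of this sketch. */
enum toy_event_state {
	TOY_STATE_DEAD     = -2, /* filedesc (or kernel reference) released */
	TOY_STATE_EXIT     = -1, /* the task context holding the event exited */
	TOY_STATE_INACTIVE =  0,
};

struct toy_event {
	enum toy_event_state state;
	struct toy_event *parent; /* NULL for the 'original' event */
};

/* Only a dead event counts as orphaned; an exited one does not. */
static int toy_is_orphaned_event(const struct toy_event *event)
{
	return event->state == TOY_STATE_DEAD;
}

/* Mirrors the inherit_event() step quoted above: check the 'original' event. */
static int toy_may_inherit(const struct toy_event *parent_event)
{
	if (parent_event->parent)
		parent_event = parent_event->parent;

	return !toy_is_orphaned_event(parent_event);
}
```

      With this rule, an 'original' event whose context has exited still
      allows new clones, while one whose last observer has gone away does not.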
      Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Fixes: c6e5b732 ("perf: Synchronously clean up child events")
      Link: http://lkml.kernel.org/r/20160224174947.919845295@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Only update context time when active · 6f932e5b
      Peter Zijlstra authored
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174947.860690919@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Allow perf_release() with !event->ctx · a4f4bb6d
      Peter Zijlstra authored
      In the err_file: fput(event_file) case, the event will not yet have
      been attached to a context. However perf_release() does assume it has
      been. Cure this.
      Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174947.793996260@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Do not double free · 13005627
      Peter Zijlstra authored
      In case of: err_file: fput(event_file), we'll end up calling
      perf_release() which in turn will free the event.
      
      Do not then free the event _again_.
      Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174947.697350349@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Close install vs. exit race · 84c4e620
      Peter Zijlstra authored
      Consider the following scenario:
      
        CPU0					CPU1
      
        ctx = find_get_ctx();
      					perf_event_exit_task_context()
        mutex_lock(&ctx->mutex);
        perf_install_in_context(ctx, ...);
          /* NO-OP */
        mutex_unlock(&ctx->mutex);
      
        ...
      
        perf_release()
          WARN_ON_ONCE(event->state != STATE_EXIT);
      
      Since the event doesn't pass through perf_remove_from_context()
      because perf_install_in_context() NO-OPs because the ctx is dead, and
      perf_event_exit_task_context() will not observe the event because it's
      not attached yet, the event->state will not be set.
      
      Solve this by revalidating ctx->task after we acquire ctx->mutex and
      failing the event creation as a whole.
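
      The "revalidate after taking the lock" pattern from the paragraph above
      can be sketched in user-space C with a pthread mutex standing in for
      ctx->mutex. All toy_ names, the tombstone sentinel and the -ESRCH
      return value are assumptions of this example, not a copy of the kernel
      change.

```c
#include <errno.h>
#include <pthread.h>

struct toy_task {
	int pid;
};

/* Sentinel marking a context whose task has exited (hypothetical stand-in). */
static struct toy_task toy_tombstone;
#define TOY_TASK_TOMBSTONE (&toy_tombstone)

struct toy_ctx {
	pthread_mutex_t mutex;
	struct toy_task *task;
	int nr_events;
};

/* Exit path: marks the context dead while holding the mutex. */
static void toy_exit_task_context(struct toy_ctx *ctx)
{
	pthread_mutex_lock(&ctx->mutex);
	ctx->task = TOY_TASK_TOMBSTONE;
	pthread_mutex_unlock(&ctx->mutex);
}

/*
 * Install path: the context may have died between lookup and lock, so
 * ctx->task is checked again under the mutex and the whole creation fails
 * instead of silently turning the install into a no-op.
 */
static int toy_install_event(struct toy_ctx *ctx)
{
	int err = 0;

	pthread_mutex_lock(&ctx->mutex);
	if (ctx->task == TOY_TASK_TOMBSTONE) {
		err = -ESRCH;
		goto out;
	}
	ctx->nr_events++;
out:
	pthread_mutex_unlock(&ctx->mutex);
	return err;
}
```

      Because the failure is reported to the caller, no half-installed event
      is left behind for a later release path to trip over.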
      Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174947.626853419@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 24 Feb, 2016 13 commits
    • Merge tag 'arc-4.5-rc6-fixes-upd' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 6dc390ad
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
       - Fix for csd deadlock due to missing self IPI
       - Accompanying IPI cleanups / optimization
       - Brown paper bag bug in one of the cleanups above
       - Boot reporting updates for new hardware features
       - Don't force DEVTMPFS if INITRAMFS
      
      * tag 'arc-4.5-rc6-fixes-upd' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        arc: SMP: CONFIG_ARC_IPI_DBG cleanup
        ARC: SMP: No need for CONFIG_ARC_IPI_DBG
        ARCv2: Elide sending new cross core intr if receiver didn't ack prev
        ARCv2: SMP: Push IPI_IRQ into IPI provider
        ARC: [intc-compact] Remove IPI setup from ARCompact port
        ARCv2: SMP: Emulate IPI to self using software triggered interrupt
        arc: get rid of DEVTMPFS dependency on INITRAMFS_SOURCE
        ARCv2: boot report CCMs (Closely Coupled Memories)
        ARCv2: boot print Low Latency Memory
        ARC: Assume multiplier is always present
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · aa263c43
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "Assorted fixes - xattr one from this cycle, the rest - stable fodder"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs/pnode.c: treat zero mnt_group_id-s as unequal
        affs_do_readpage_ofs(): just use kmap_atomic() around memcpy()
        xattr handlers: plug a lock leak in simple_xattr_list
        fs: allow no_seek_end_llseek to actually seek
    • thp: call pmdp_invalidate() with correct virtual address · 2ac015e2
      Kirill A. Shutemov authored
      Sebastian Ott and Gerald Schaefer reported random crashes on s390.
      It was bisected to my THP refcounting patchset.
      
      The problem is that pmdp_invalidate() is called with the wrong virtual
      address: the loop over the ptes offsets it up by HPAGE_PMD_SIZE.

      The solution is to introduce a new variable for use in the loop and
      leave 'haddr' untouched.
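
      The bug pattern can be shown with a small user-space C sketch. The toy_
      names, the 4 KiB page size and the 512 ptes per huge page are
      assumptions chosen for illustration (the real values depend on the
      architecture); the return value stands for the address that would
      later be handed to the invalidate call.

```c
#define TOY_PAGE_SIZE     4096UL
#define TOY_HPAGE_PMD_NR  512
#define TOY_HPAGE_PMD_SIZE (TOY_PAGE_SIZE * TOY_HPAGE_PMD_NR)

/* Buggy shape: the loop advances 'haddr' itself. */
static unsigned long toy_split_buggy(unsigned long haddr,
				     unsigned long *pte_addrs)
{
	for (int i = 0; i < TOY_HPAGE_PMD_NR; i++, haddr += TOY_PAGE_SIZE)
		pte_addrs[i] = haddr;

	/* 'haddr' now points one huge page past the start. */
	return haddr;
}

/* Fixed shape: a separate cursor walks the ptes, 'haddr' stays intact. */
static unsigned long toy_split_fixed(unsigned long haddr,
				     unsigned long *pte_addrs)
{
	unsigned long addr = haddr;

	for (int i = 0; i < TOY_HPAGE_PMD_NR; i++, addr += TOY_PAGE_SIZE)
		pte_addrs[i] = addr;

	return haddr;
}
```

      Both variants fill in the same per-pte addresses; they differ only in
      the address left over for the call that follows the loop.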
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-and-tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arc: SMP: CONFIG_ARC_IPI_DBG cleanup · 9ef2d8be
      Valentin Rothberg authored
      The previous commit ("ARC: SMP: No need for CONFIG_ARC_IPI_DBG") removed
      the Kconfig option ARC_IPI_DBG.  Remove the last reference to this
      option.
      Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: SMP: No need for CONFIG_ARC_IPI_DBG · d73b73f5
      Vineet Gupta authored
      This was more relevant during SMP bringup.
      
      The warning for bogus msg better be visible always.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARCv2: Elide sending new cross core intr if receiver didn't ack prev · 3dea30ca
      Vineet Gupta authored
      ARConnect/MCIP IPI sending has a retry-wait loop in case caller had
      not seen a previous such interrupt. Turns out that it is not needed at
      all. Linux cross core calling allows coalescing multiple IPIs to same
      receiver - it is fine as long as there is one.
      
      This logic is built into upper layer already, at a higher level of
      abstraction. ipi_send_msg_one() sets the actual msg payload, but it only
      calls MCIP IPI sending if msg holder was empty (using
      atomic-set-new-and-get-old construct). Thus it is unlikely that the
      retry-wait looping was ever getting exercised at all.
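
      The coalescing idea can be sketched in user-space C with C11 atomics.
      The toy_ names, the use of atomic_fetch_or() as the
      "set-new-and-get-old" step and the hw_ipis_raised counter are
      assumptions of this example and are not taken from the ARC code.

```c
#include <stdatomic.h>

struct toy_cpu {
	atomic_ulong ipi_data; /* pending message bits for this CPU */
	int hw_ipis_raised;    /* how many hardware IPIs were actually sent */
};

static void toy_cpu_init(struct toy_cpu *cpu)
{
	atomic_init(&cpu->ipi_data, 0);
	cpu->hw_ipis_raised = 0;
}

/* Sender: publish the message, raise a hardware IPI only if none is pending. */
static void toy_ipi_send_msg_one(struct toy_cpu *cpu, unsigned int msg)
{
	unsigned long old = atomic_fetch_or(&cpu->ipi_data, 1UL << msg);

	/* A non-empty holder means an earlier IPI is still on its way. */
	if (old == 0)
		cpu->hw_ipis_raised++;
}

/* Receiver: take all pending messages at once and empty the holder. */
static unsigned long toy_ipi_handler(struct toy_cpu *cpu)
{
	return atomic_exchange(&cpu->ipi_data, 0);
}
```

      Two messages sent back to back therefore cost a single hardware
      interrupt, and the receiver still sees both bits when it runs.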
      
      Cc: Chuck Jordan <cjordan@synopsys.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • 96817879
    • ARC: [intc-compact] Remove IPI setup from ARCompact port · dbcbc7e7
      Vineet Gupta authored
      There is no real ARC700 based SMP SoC so remove IPI definition.
      EZChip's SMP ARC700 is going to use a different intc and IPI provider
      anyways.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARCv2: SMP: Emulate IPI to self using software triggered interrupt · bb143f81
      Vineet Gupta authored
      ARConnect/MCIP Inter-Core-Interrupt module can't send interrupt to
      local core. So use core intc capability to trigger software
      interrupt to self, using an unused IRQ #21.
      
      This showed up as csd deadlock with LTP trace_sched on a dual core
      system. This test acts as scheduler fuzzer, triggering all sorts of
      scheduling activity. Trouble starts with IPI to self, which doesn't get
      delivered (effectively lost due to H/w capability), but the msg intended
      to be sent remains enqueued in per-cpu @ipi_data.
      
      All subsequent IPIs to this core from other cores get elided due to the
      IPI coalescing optimization in ipi_send_msg_one() where a pending msg
      implies an IPI already sent and assumes other core is yet to ack it.
      After the elided IPI, other core simply goes into csd_lock_wait()
      but never comes out as this core never sees the interrupt.
      
      Fixes STAR 9001008624
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>        [4.2]
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • Merge tag 'dm-4.5-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · 84e54c46
      Linus Torvalds authored
      Pull device mapper fix from Mike Snitzer:
       "Fix a 112 byte leak for each IO request that is requeued while DM
        multipath is handling faults due to path failures.
      
        This leak does not happen if blk-mq DM multipath is used.  It only
        occurs if .request_fn DM multipath is stacked on top of blk-mq paths
        (e.g. scsi-mq devices)"
      
      * tag 'dm-4.5-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm: fix dm_rq_target_io leak on faults with .request_fn DM w/ blk-mq paths
    • Merge tag 'mmc-v4.5-rc4' of git://git.linaro.org/people/ulf.hansson/mmc · 0ecdcd3a
      Linus Torvalds authored
      Pull MMC fix from Ulf Hansson:
       "Here's an mmc fix intended for v4.5 rc6.
      
        MMC host:
         - omap_hsmmc: Fix PM regression for deferred probe"
      
      * tag 'mmc-v4.5-rc4' of git://git.linaro.org/people/ulf.hansson/mmc:
        mmc: omap_hsmmc: Fix PM regression with deferred probe for pm_runtime_reinit
    • Merge tag 'nfs-for-4.5-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 420eb6d7
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "Stable bugfixes:
         - Fix nfs_size_to_loff_t
         - NFSv4: Fix a dentry leak on alias use
      
        Other bugfixes:
         - Don't schedule a layoutreturn if the layout segment can be freed
           immediately.
         - Always set NFS_LAYOUT_RETURN_REQUESTED with lo->plh_return_iomode
         - rpcrdma_bc_receive_call() should init rq_private_buf.len
         - fix stateid handling for the NFS v4.2 operations
         - pnfs/blocklayout: fix a memeory leak when using,vmalloc_to_page
         - fix panic in gss_pipe_downcall() in fips mode
         - Fix a race between layoutget and pnfs_destroy_layout
         - Fix a race between layoutget and bulk recalls"
      
      * tag 'nfs-for-4.5-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFSv4.x/pnfs: Fix a race between layoutget and bulk recalls
        NFSv4.x/pnfs: Fix a race between layoutget and pnfs_destroy_layout
        auth_gss: fix panic in gss_pipe_downcall() in fips mode
        pnfs/blocklayout: fix a memeory leak when using,vmalloc_to_page
        nfs4: fix stateid handling for the NFS v4.2 operations
        NFSv4: Fix a dentry leak on alias use
        xprtrdma: rpcrdma_bc_receive_call() should init rq_private_buf.len
        pNFS: Always set NFS_LAYOUT_RETURN_REQUESTED with lo->plh_return_iomode
        pNFS: Fix pnfs_mark_matching_lsegs_return()
        nfs: fix nfs_size_to_loff_t
    • x86: fix SMAP in 32-bit environments · de9e478b
      Linus Torvalds authored
      In commit 11f1a4b9 ("x86: reorganize SMAP handling in user space
      accesses") I changed how the stac/clac instructions were generated
      around the user space accesses, which then made it possible to do
      batched accesses efficiently for user string copies etc.
      
      However, in doing so, I completely spaced out, and didn't even think
      about the 32-bit case.  And nobody really even seemed to notice, because
      SMAP doesn't even exist until modern Skylake processors, and you'd have
      to be crazy to run 32-bit kernels on a modern CPU.
      
      Which brings us to Andy Lutomirski.
      
      He actually tested the 32-bit kernel on new hardware, and noticed that
      it doesn't work.  My bad.  The trivial fix is to add the required
      uaccess begin/end markers around the raw accesses in <asm/uaccess_32.h>.
      
      I feel a bit bad about this patch, just because that header file really
      should be cleaned up to avoid all the duplicated code in it, and this
      commit just expands on the problem.  But this just fixes the bug without
      any bigger cleanup surgery.
      Reported-and-tested-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 23 Feb, 2016 1 commit
    • arc: get rid of DEVTMPFS dependency on INITRAMFS_SOURCE · 3e5177c1
      Alexey Brodkin authored
      Even though DEVTMPFS is required when our pre-built initramfs
      is used it is not the case in general. It is perfectly possible
      to use initramfs with device nodes already populated or there
      could be other usages, see discussion below for more details:
      http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/37819/focus=37821
      
      This change removes mentioned dependency from arch/arc/Kconfig
      updating instead those defconfigs that are usually used with this
      kind of pre-built initramfs.
      
      And while at it all touched defconfigs were regenerated via
      savedefconfig and some options were removed:
       * USB is selected by other options implicitly
       * VGA_CONSOLE is disabled for ARC since
         031e29b5
       * EXT3_FS automatically selects EXT4_FS
       * MTDxxx and JFFS2_FS make no sense for AXS because
         AXS NAND controller is not upstreamed
       * NET_OSCI_LAN is not in upstream as well
       * ARCPGU_xxx options make no sense because ARC PGU is not yet
         in upstream and when it gets there all config options would
         be taken from devicetree
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  4. 22 Feb, 2016 14 commits
    • NFSv4.x/pnfs: Fix a race between layoutget and bulk recalls · 9fd4b9fc
      Trond Myklebust authored
      Replace another case where the layout 'plh_block_lgets' can trigger
      infinite loops in send_layoutget().
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • NFSv4.x/pnfs: Fix a race between layoutget and pnfs_destroy_layout · 2454dfea
      Trond Myklebust authored
      If the server reboots while there is a layoutget outstanding, then
      the call to pnfs_choose_layoutget_stateid() will fail with an EAGAIN
      error, which causes an infinite loop in send_layoutget(). The reason
      why we never break out of the loop is that the layout 'plh_block_lgets'
      field is never cleared.
      
      Fix is to replace plh_block_lgets with NFS_LAYOUT_INVALID_STID, which
      can be reset after a new layoutget.
      
      Fixes: ab7d763e ("pNFS: Ensure nfs4_layoutget_prepare returns...")
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
    • Merge tag 'trace-fixes-v4.5-rc5' of... · 4de8ebef
      Linus Torvalds authored
      Merge tag 'trace-fixes-v4.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull tracing fixes from Steven Rostedt:
       "Two more small fixes.
      
        One is by Yang Shi who added a READ_ONCE_NOCHECK() to the scan of the
        stack made by the stack tracer.  As the stack tracer scans the entire
        kernel stack, KASAN flags it as a "stack out of bounds" error,
        because the scan looks at the contents of the stack from parent
        functions.  The NOCHECK() tells KASAN that this is done on
        purpose, and is not some kind of stack overflow.
      
        The second fix is to the ftrace selftests, to retrieve the PID of
        executed commands from the shell with '$!' and not by parsing 'jobs'"
      
      * tag 'trace-fixes-v4.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing, kasan: Silence Kasan warning in check_stack of stack_tracer
        ftracetest: Fix instance test to use proper shell command for pids
      4de8ebef
    • Linus Torvalds's avatar
      Merge tag 'for-linus-4.5-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 692b8c66
      Linus Torvalds authored
      Pull xen bug fixes from David Vrabel:
      
       - Two scsiback fixes (resource leak and spurious warning).
      
       - Fix DMA mapping of compound pages on arm/arm64.
      
       - Fix some pciback regressions in MSI-X handling.
      
       - Fix a pcifront crash due to some uninitialized state.
      
      * tag 'for-linus-4.5-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted.
        xen/pcifront: Report the errors better.
        xen/pciback: Save the number of MSI-X entries to be copied later.
        xen/pciback: Check PF instead of VF for PCI_COMMAND_MEMORY
        xen: fix potential integer overflow in queue_reply
        xen/arm: correctly handle DMA mapping of compound pages
        xen/scsiback: avoid warnings when adding multiple LUNs to a domain
        xen/scsiback: correct frontend counting
      692b8c66
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · dea08e60
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Looks like a lot, but mostly driver fixes scattered all over as usual.
      
        Of note:
      
         1) Add conditional sched in nf conntrack in cleanup to avoid NMI
            watchdogs.  From Florian Westphal.
      
         2) Fix deadlock in nfnetlink cttimeout, also from Florian.
      
         3) Fix handling of slaves in bonding ARP monitor validation, from Jay
            Vosburgh.
      
         4) Callers of ip_cmsg_send() are responsible for freeing IP options,
            some were not doing so.  Fix from Eric Dumazet.
      
         5) Fix per-cpu bugs in mvneta driver, from Gregory CLEMENT.
      
         6) Fix vlan handling in mv88e6xxx DSA driver, from Vivien Didelot.
      
         7) bcm7xxx PHY driver bug fixes from Florian Fainelli.
      
         8) Avoid unaligned accesses to protocol headers wrt.  GRE, from
            Alexander Duyck.
      
         9) SKB leaks and other problems in arc_emac driver, from Alexander
            Kochetkov.
      
        10) tcp_v4_inbound_md5_hash() releases listener socket instead of
            request socket on error path, oops.  Fix from Eric Dumazet.
      
        11) Missing socket release in pppoe_rcv_core() that seems to have
            existed basically forever.  From Guillaume Nault.
      
        12) Missing slave_dev unregister in dsa_slave_create() error path,
            from Florian Fainelli.
      
        13) crypto_alloc_hash() never returns NULL, fix return value check in
            __tcp_alloc_md5sig_pool.  From Insu Yun.
      
        14) Properly expire exception route entries in ipv4, from Xin Long.
      
        15) Fix races in tcp/dccp listener socket dismantle, from Eric
            Dumazet.
      
        16) Don't set IFF_TX_SKB_SHARING in vxlan, geneve, or GRE, it's not
            legal.  These drivers modify the SKB on transmit.  From Jiri Benc.
      
        17) Fix regression in the initialization of netdev->tx_queue_len.
            From Phil Sutter.
      
        18) Missing unlock in tipc_nl_add_bc_link() error path, from Insu Yun.
      
        19) SCTP port hash sizing does not properly ensure that table is a
            power of two in size.  From Neil Horman.
      
        20) Fix initializing of software copy of MAC address in fmvj18x_cs
            driver, from Ken Kawasaki"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (129 commits)
        bnx2x: Fix 84833 phy command handler
        bnx2x: Fix led setting for 84858 phy.
        bnx2x: Correct 84858 PHY fw version
        bnx2x: Fix 84833 RX CRC
        bnx2x: Fix link-forcing for KR2
        net: ethernet: davicom: fix devicetree irq resource
        fmvj18x_cs: fix incorrect indexing of dev->dev_addr[] when copying the MAC address
        Driver: Vmxnet3: Update Rx ring 2 max size
        net: netcp: rework the code for get/set sw_data in dma desc
        soc: ti: knav_dma: rename pad in struct knav_dma_desc to sw_data
        net: ti: netcp: restore get/set_pad_info() functionality
        MAINTAINERS: Drop myself as xen netback maintainer
        sctp: Fix port hash table size computation
        can: ems_usb: Fix possible tx overflow
        Bluetooth: hci_core: Avoid mixing up req_complete and req_complete_skb
        net: bcmgenet: Fix internal PHY link state
        af_unix: Don't use continue to re-execute unix_stream_read_generic loop
        unix_diag: fix incorrect sign extension in unix_lookup_by_ino
        bnxt_en: Failure to update PHY is not fatal condition.
        bnxt_en: Remove unnecessary call to update PHY settings.
        ...
      dea08e60
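Item 13 in the list above turns on a common kernel convention: some allocators report failure through an error value encoded in the returned pointer rather than through NULL, so a NULL check never fires. The following is a minimal userspace sketch of that convention, not the actual patch. The helpers mirror the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() in spirit, while alloc_thing(), setup_buggy(), and setup_fixed() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creation of the error-pointer helpers: a small negative
 * errno value is stored in the pointer itself. */
static void *ERR_PTR(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

static long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static int IS_ERR(const void *ptr)
{
    /* Error pointers occupy the top MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_object;

/* Hypothetical allocator that never returns NULL: it returns either a
 * valid object or an encoded error. */
static void *alloc_thing(int should_fail)
{
    if (should_fail)
        return ERR_PTR(-ENOMEM);
    return &dummy_object;
}

/* Buggy caller: the NULL test cannot detect an encoded error, so a
 * failed allocation is reported as success. */
static int setup_buggy(int should_fail)
{
    void *p = alloc_thing(should_fail);

    if (!p)
        return -ENOMEM;
    return 0;
}

/* Fixed caller: test with IS_ERR() and propagate the encoded errno. */
static int setup_fixed(int should_fail)
{
    void *p = alloc_thing(should_fail);

    if (IS_ERR(p))
        return (int)PTR_ERR(p);
    return 0;
}
```

In this sketch setup_buggy(1) reports success even though the allocation failed, whereas setup_fixed(1) returns -ENOMEM.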
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-linus-v4.5-rc6' of... · 5c102d0e
      Linus Torvalds authored
      Merge tag 'hwmon-for-linus-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
       "Two fixes headed for stable:
      
         - Remove an unnecessary speed_index lookup for thermal hook in the
           gpio-fan driver.  The unnecessary speed lookup can hog the system.
      
         - Handle negative conversion values correctly in the ads1015 driver"
      
      * tag 'hwmon-for-linus-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (gpio-fan) Remove un-necessary speed_index lookup for thermal hook
        hwmon: (ads1015) Handle negative conversion values correctly
      5c102d0e
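The ads1015 fix above concerns negative conversion values. As a rough illustration of the general problem, and not of the driver's actual code, the sketch below shows how a two's-complement sample narrower than the integer holding it must be sign-extended before use. The 12-bit width and the left-justified 16-bit register layout are assumptions made for the example, and sign_extend(), decode_naive(), and decode_signed() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of `raw`, treating them as a
 * two's-complement number. */
static int32_t sign_extend(uint32_t raw, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);

    uint32_t sign = (uint32_t)1 << (bits - 1);
    uint32_t mask = (bits == 32) ? UINT32_MAX : (((uint32_t)1 << bits) - 1);

    raw &= mask;
    /* XOR-then-subtract maps values with the sign bit set to negatives. */
    return (int32_t)((raw ^ sign) - sign);
}

/* Naive decode: shifts the sample down but ignores the sign bit, so
 * negative readings come out as large positive numbers. */
static int32_t decode_naive(uint16_t reg)
{
    return (int32_t)(reg >> 4);
}

/* Sign-aware decode of an assumed left-justified 12-bit sample. */
static int32_t decode_signed(uint16_t reg)
{
    return sign_extend((uint32_t)reg >> 4, 12);
}
```

Under these assumptions, a register value of 0xFFF0 decodes to 4095 with the naive version and to -1 with the sign-aware one.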
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · a16152c8
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford:
       "One ocrdma fix:
      
         - The new CQ API support was added to ocrdma, but they got the arming
           logic wrong, so without this, transfers eventually fail when they
           fail to arm the interrupt properly under load
      
        Two related fixes for mlx4:
      
         - When we added the 64bit extended counters support to the core IB
           code, they forgot to update the RoCE side of the mlx4 driver (the
           IB side they properly updated).
      
           I debated whether or not to include these patches as they could be
           considered feature enablement patches, but the existing code will
           blindly copy the 32bit counters, whether any counters were requested
           at all (a bug).
      
           These two patches make it (a) check to see that counters were
           requested and (b) copy the right counters (the 64bit support is
           new, the 32bit is not).  For that reason I went ahead and took
           them"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
        IB/mlx4: Add support for the port info class for RoCE ports
        IB/mlx4: Add support for extended counters over RoCE ports
        RDMA/ocrdma: Fix arm logic to align with new cq API
      a16152c8
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 7ee302f6
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Some bugfixes from I2C for you:
      
        A fix for a RuntimePM regression with OMAP, a fix to enable TCO for
        Lewisburg platforms, and a typo fix while we are here"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: i801: Adding Intel Lewisburg support for iTCO
        i2c: uniphier: fix typos in error messages
        i2c: omap: Fix PM regression with deferred probe for pm_runtime_reinit
      7ee302f6
    • David S. Miller's avatar
      Merge tag 'linux-can-fixes-for-4.5-20160221' of... · d856626d
      David S. Miller authored
      Merge tag 'linux-can-fixes-for-4.5-20160221' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2016-02-21
      
      this is a pull request of one patch for net/master.
      
      The patch is by Gerhard Uttenthaler and fixes a potential tx overflow in the
      ems_usb driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d856626d
    • David S. Miller's avatar
      Merge branch 'bnx2x-848xx-phy-fixes' · dd78dac8
      David S. Miller authored
      Yuval Mintz says:
      
      ====================
      bnx2x: Fix 848xx phys
      
      This series contains link-related fixes, mostly for the 848xx phys
      [2 patches are for 84833, and 2 patches are for 84858].
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dd78dac8
    • Yuval Mintz's avatar
      bnx2x: Fix 84833 phy command handler · 4ec0b6d5
      Yuval Mintz authored
      Current initialization sequence is lacking, causing some configurations
      to fail.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ec0b6d5
    • Yuval Mintz's avatar
      bb1187af
    • Yuval Mintz's avatar
      bnx2x: Correct 84858 PHY fw version · 27ba2d2d
      Yuval Mintz authored
      The phy's firmware version isn't being parsed properly as it's
      currently parsed like the rest of the 848xx phys.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      27ba2d2d
    • Yuval Mintz's avatar
      bnx2x: Fix 84833 RX CRC · 512ab9a0
      Yuval Mintz authored
      There's a problem in current 84833 phy configuration -
      in case 1Gb link is configured and jumbo-sized packets are being
      used, device will experience RX crc errors.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      512ab9a0