1. 04 May, 2018 4 commits
    • sched/core: Don't schedule threads on pre-empted vCPUs · 247f2f6f
      Rohit Jain authored
      In paravirt configurations today, spinlocks check whether a vCPU is
      running to decide whether spinning is worthwhile. We can use the same
      logic to prioritize CPUs when scheduling threads. If a vCPU has been
      pre-empted, a thread placed on it incurs the extra cost of VMENTER plus
      the wait until the vCPU actually runs on a host CPU again. If other
      vCPUs are idle and actually running on a host CPU, we should schedule
      threads there.
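The selection rule described above can be sketched in user-space C. This is a minimal illustration and not the patch itself: `struct vcpu_state`, `vcpu_available()` and `pick_vcpu()` are hypothetical names, and the idle/pre-empted flags are assumed to be supplied by the caller rather than read from the scheduler or the paravirt layer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vCPU snapshot; a real guest would obtain this from the
 * scheduler and the paravirt layer, not from a plain struct. */
struct vcpu_state {
    bool idle;       /* no runnable task on this vCPU */
    bool preempted;  /* the host has descheduled this vCPU */
};

/* An idle vCPU is only a good target if the host is actually running it. */
static bool vcpu_available(const struct vcpu_state *v)
{
    return v->idle && !v->preempted;
}

/*
 * Return the index of the first idle, non-preempted vCPU. If every idle
 * vCPU is preempted, fall back to the first idle one. Return -1 if no
 * vCPU is idle at all.
 */
static int pick_vcpu(const struct vcpu_state *v, size_t n)
{
    int fallback = -1;

    for (size_t i = 0; i < n; i++) {
        if (vcpu_available(&v[i]))
            return (int)i;
        if (v[i].idle && fallback < 0)
            fallback = (int)i;
    }
    return fallback;
}
```

The fallback to a pre-empted idle vCPU is a design choice of this sketch only; the commit message does not say what happens when no running idle vCPU exists.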
      
      Performance numbers:
      
      Note: In the following, "Paravirt" refers to a kernel with this patch
      and "Base" to a kernel without it.
      
      1) When only 1 VM is running:
      
          a) Hackbench test on KVM 8 vCPUs, 10,000 loops (lower is better):
      
      	+-------+-----------------+----------------+
      	|Number |Paravirt         |Base            |
      	|of     +---------+-------+-------+--------+
      	|Threads|Average  |Std Dev|Average| Std Dev|
      	+-------+---------+-------+-------+--------+
      	|1      |1.817    |0.076  |1.721  | 0.067  |
      	|2      |3.467    |0.120  |3.468  | 0.074  |
      	|4      |6.266    |0.035  |6.314  | 0.068  |
      	|8      |11.437   |0.105  |11.418 | 0.132  |
      	|16     |21.862   |0.167  |22.161 | 0.129  |
      	|25     |33.341   |0.326  |33.692 | 0.147  |
      	+-------+---------+-------+-------+--------+
      
      2) When two VMs are running with same CPU affinities:
      
          a) tbench test on VM 8 cpus
      
          Base:
      
      	VM1:
      
      	Throughput 220.59 MB/sec   1 clients  1 procs  max_latency=12.872 ms
      	Throughput 448.716 MB/sec  2 clients  2 procs  max_latency=7.555 ms
      	Throughput 861.009 MB/sec  4 clients  4 procs  max_latency=49.501 ms
      	Throughput 1261.81 MB/sec  7 clients  7 procs  max_latency=76.990 ms
      
      	VM2:
      
      	Throughput 219.937 MB/sec  1 clients  1 procs  max_latency=12.517 ms
      	Throughput 470.99 MB/sec   2 clients  2 procs  max_latency=12.419 ms
      	Throughput 841.299 MB/sec  4 clients  4 procs  max_latency=37.043 ms
      	Throughput 1240.78 MB/sec  7 clients  7 procs  max_latency=77.489 ms
      
          Paravirt:
      
      	VM1:
      
      	Throughput 222.572 MB/sec  1 clients  1 procs  max_latency=7.057 ms
      	Throughput 485.993 MB/sec  2 clients  2 procs  max_latency=26.049 ms
      	Throughput 947.095 MB/sec  4 clients  4 procs  max_latency=45.338 ms
      	Throughput 1364.26 MB/sec  7 clients  7 procs  max_latency=145.124 ms
      
      	VM2:
      
      	Throughput 224.128 MB/sec  1 clients  1 procs  max_latency=4.564 ms
      	Throughput 501.878 MB/sec  2 clients  2 procs  max_latency=11.061 ms
      	Throughput 965.455 MB/sec  4 clients  4 procs  max_latency=45.370 ms
      	Throughput 1359.08 MB/sec  7 clients  7 procs  max_latency=168.053 ms
      
          b) Hackbench with 4 fd 1,000,000 loops
      
      	+-------+--------------------------------------+----------------------------------------+
      	|Number |Paravirt                              |Base                                    |
      	|of     +----------+--------+---------+--------+----------+--------+---------+----------+
      	|Threads|Average1  |Std Dev1|Average2 |Std Dev2|Average1  |Std Dev1|Average2 | Std Dev2 |
      	+-------+----------+--------+---------+--------+----------+--------+---------+----------+
      	|  1    | 3.748    | 0.620  | 3.576   | 0.432  | 4.006    | 0.395  | 3.446   | 0.787    |
      	+-------+----------+--------+---------+--------+----------+--------+---------+----------+
      
          Note that this test was run just to show the interference effect
          over-subscription can have in the baseline.
      
          c) schbench results with 2 message groups on 8 vCPU VMs
      
      	+-----------+-------+---------------+--------------+------------+
      	|           |       | Paravirt      | Base         |            |
      	+-----------+-------+-------+-------+-------+------+------------+
      	|           |Threads| VM1   | VM2   |  VM1  | VM2  |%Improvement|
      	+-----------+-------+-------+-------+-------+------+------------+
      	|50.0000th  |    1  | 52    | 53    |  58   | 54   |  +6.25%    |
      	|75.0000th  |    1  | 69    | 61    |  83   | 59   |  +8.45%    |
      	|90.0000th  |    1  | 80    | 80    |  89   | 83   |  +6.98%    |
      	|95.0000th  |    1  | 83    | 83    |  93   | 87   |  +7.78%    |
      	|*99.0000th |    1  | 92    | 94    |  99   | 97   |  +5.10%    |
      	|99.5000th  |    1  | 95    | 100   |  102  | 103  |  +4.88%    |
      	|99.9000th  |    1  | 107   | 123   |  105  | 203  |  +25.32%   |
      	+-----------+-------+-------+-------+-------+------+------------+
      	|50.0000th  |    2  | 56    | 62    |  67   | 59   |  +6.35%    |
      	|75.0000th  |    2  | 69    | 75    |  80   | 71   |  +4.64%    |
      	|90.0000th  |    2  | 80    | 82    |  90   | 81   |  +5.26%    |
      	|95.0000th  |    2  | 85    | 87    |  97   | 91   |  +8.51%    |
      	|*99.0000th |    2  | 98    | 99    |  107  | 109  |  +8.79%    |
      	|99.5000th  |    2  | 107   | 105   |  109  | 116  |  +5.78%    |
      	|99.9000th  |    2  | 9968  | 609   |  875  | 3116 | -165.02%   |
      	+-----------+-------+-------+-------+-------+------+------------+
      	|50.0000th  |    4  | 78    | 77    |  78   | 79   |  +1.27%    |
      	|75.0000th  |    4  | 98    | 106   |  100  | 104  |   0.00%    |
      	|90.0000th  |    4  | 987   | 1001  |  995  | 1015 |  +1.09%    |
      	|95.0000th  |    4  | 4136  | 5368  |  5752 | 5192 |  +13.16%   |
      	|*99.0000th |    4  | 11632 | 11344 |  11024| 10736|  -5.59%    |
      	|99.5000th  |    4  | 12624 | 13040 |  12720| 12144|  -3.22%    |
      	|99.9000th  |    4  | 13168 | 18912 |  14992| 17824|  +2.24%    |
      	+-----------+-------+-------+-------+-------+------+------------+
      
          Note: Improvement is measured for (VM1+VM2)
      Signed-off-by: Rohit Jain <rohit.k.jain@oracle.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dhaval.giani@oracle.com
      Cc: matt@codeblueprint.co.uk
      Cc: steven.sistare@oracle.com
      Cc: subhra.mazumdar@oracle.com
      Link: http://lkml.kernel.org/r/1525294330-7759-1-git-send-email-rohit.k.jain@oracle.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Avoid calling sync_entity_load_avg() unnecessarily · c976a862
      Viresh Kumar authored
      Call sync_entity_load_avg() directly from find_idlest_cpu() instead of
      select_task_rq_fair(), as that's where we need the task's utilization
      value. Also, call sync_entity_load_avg() only after making sure the
      sched domain spans at least one of the CPUs allowed for the task.
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Link: http://lkml.kernel.org/r/cd019d1753824c81130eae7b43e2bbcec47cc1ad.1524738578.git.viresh.kumar@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Rearrange select_task_rq_fair() to optimize it · f1d88b44
      Viresh Kumar authored
      Rearrange select_task_rq_fair() a bit to avoid executing some
      conditional statements in a few specific code paths. That gets rid of
      the goto as well.
      
      This shouldn't result in any functional changes.
      Tested-by: Rohit Jain <rohit.k.jain@oracle.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Link: http://lkml.kernel.org/r/20831b8d237bf3a20e4e328286f678b425ff04c9.1524738578.git.viresh.kumar@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/core: Introduce set_special_state() · b5bf9a90
      Peter Zijlstra authored
      Gaurav reported a perceived problem with TASK_PARKED, which turned out
      to be a broken wait-loop pattern in __kthread_parkme(), but the
      reported issue can (and does) in fact happen for states that do not do
      condition based sleeps.
      
      When the 'current->state = TASK_RUNNING' store of a previous
      (concurrent) try_to_wake_up() collides with the setting of a 'special'
      sleep state, we can lose the sleep state.
      
      Normal condition-based wait-loops are immune to this problem, but
      sleep states that are not condition based are subject to it.
      
      There already is a fix for TASK_DEAD. Abstract that and also apply it
      to TASK_STOPPED and TASK_TRACED, both of which also lack a
      condition-based wait-loop.
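The lost-store problem for sleeps without a condition can be modelled in a few lines of single-threaded C. This is a toy model under stated assumptions, not kernel code: all `model_*` names are hypothetical, the "in-flight wakeup" flag stands in for a concurrent try_to_wake_up() that has not yet stored TASK_RUNNING, and the serialized variant simply assumes that the special store waits for that wakeup to finish (how the kernel achieves the serialization is not described in the message above).

```c
#include <stdbool.h>

enum t_state { T_RUNNING, T_STOPPED };

struct model_task {
    enum t_state state;
    bool wakeup_in_flight;  /* an earlier wakeup has not yet stored T_RUNNING */
};

/* The late T_RUNNING store of an in-flight wakeup lands. */
static void model_finish_wakeup(struct model_task *t)
{
    if (t->wakeup_in_flight) {
        t->wakeup_in_flight = false;
        t->state = T_RUNNING;
    }
}

/* Plain store: worst case, the wakeup's store lands afterwards and wins. */
static void model_plain_store(struct model_task *t, enum t_state s)
{
    t->state = s;
    model_finish_wakeup(t);
}

/* Serialized store (assumption): the in-flight wakeup completes first. */
static void model_special_store(struct model_task *t, enum t_state s)
{
    model_finish_wakeup(t);
    t->state = s;
}

/* A schedule() call only blocks when the state is not T_RUNNING. */
static bool model_would_block(const struct model_task *t)
{
    return t->state != T_RUNNING;
}
```

With a plain store the task ends up in T_RUNNING and would not block; with the serialized store the sleep state survives. Because there is no loop to retry the store, the plain variant has no second chance.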
      Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 03 May, 2018 7 commits
    • kthread, sched/wait: Fix kthread_parkme() completion issue · 85f1abe0
      Peter Zijlstra authored
      Even with the wait-loop fixed, there is a further issue with
      kthread_parkme(). Upon hotplug, when we do takedown_cpu(),
      smpboot_park_threads() can return before all those threads are in fact
      blocked, due to the placement of the complete() in __kthread_parkme().
      
      When that happens, sched_cpu_dying() -> migrate_tasks() can end up
      migrating such a still runnable task onto another CPU.
      
      Normally the task will have hit schedule() and gone to sleep by the
      time we do kthread_unpark(), which will then do __kthread_bind() to
      re-bind the task to the correct CPU.
      
      However, when we lose the initial TASK_PARKED store to the concurrent
      wakeup issue described previously, do the complete(), and get migrated,
      it is possible either:
      
       - to observe kthread_unpark()'s clearing of SHOULD_PARK, terminate
         the park and set TASK_RUNNING, or
      
       - for __kthread_bind()'s wait_task_inactive() to observe the competing
         TASK_RUNNING store.
      
      Either way the WARN() in __kthread_bind() will trigger and fail to
      correctly set the CPU affinity.
      
      Fix this by only issuing the complete() when the kthread has scheduled
      out. This does away with all the icky 'still running' nonsense.
      
      The alternative is to promote TASK_PARKED to a special state; this
      guarantees wait_task_inactive() cannot observe a 'stale' TASK_RUNNING
      and we'll end up doing the right thing, but this preserves the whole
      icky business of potentially migrating the still runnable thing.
      Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • kthread, sched/wait: Fix kthread_parkme() wait-loop · 741a76b3
      Peter Zijlstra authored
      Gaurav reported a problem with __kthread_parkme() where a concurrent
      try_to_wake_up() could result in competing stores to ->state. When
      the TASK_PARKED store got lost, bad things would happen.
      
      The comment near set_current_state() actually mentions this competing
      store, but only mentions the case against TASK_RUNNING. This same
      store, with different timing, can happen against a subsequent !RUNNING
      store.
      
      This normally is not a problem, because as per that same comment, the
      !RUNNING state store is inside a condition based wait-loop:
      
        for (;;) {
          set_current_state(TASK_UNINTERRUPTIBLE);
          if (!need_sleep)
            break;
          schedule();
        }
        __set_current_state(TASK_RUNNING);
      
      If we lose the (first) TASK_UNINTERRUPTIBLE store to a previous
      (concurrent) wakeup, the schedule() will NO-OP and we'll go around the
      loop once more.
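Why the condition-based loop tolerates a lost store can be traced with a small single-threaded model. This is a sketch under assumptions, not kernel code: the `sim_*` names are hypothetical, a counter of "stale wakeups" stands in for the late TASK_RUNNING store of an earlier try_to_wake_up(), and the modelled schedule() is assumed to return only after a real wakeup has cleared the sleep condition.

```c
#include <stdbool.h>

enum task_state { RUNNING, UNINTERRUPTIBLE };

struct sim_task {
    enum task_state state;
    bool need_sleep;     /* the wait condition */
    int stale_wakeups;   /* late RUNNING stores still pending from old wakeups */
    int blocked;         /* times sim_schedule() actually blocked */
    int noops;           /* times sim_schedule() returned immediately */
};

/* Model of set_current_state(): a stale wakeup may overwrite the store. */
static void sim_set_state(struct sim_task *t, enum task_state s)
{
    t->state = s;
    if (t->stale_wakeups > 0) {
        t->stale_wakeups--;
        t->state = RUNNING;   /* the sleep state is lost */
    }
}

/* Model of schedule(): blocks only if the task is not RUNNING. */
static void sim_schedule(struct sim_task *t)
{
    if (t->state == RUNNING) {
        t->noops++;
        return;
    }
    t->blocked++;
    t->need_sleep = false;    /* the real waker satisfies the condition */
    t->state = RUNNING;
}

/* Same shape as the wait-loop quoted in the commit message. */
static void sim_wait_loop(struct sim_task *t)
{
    for (;;) {
        sim_set_state(t, UNINTERRUPTIBLE);
        if (!t->need_sleep)
            break;
        sim_schedule(t);
    }
    t->state = RUNNING;
}
```

With one stale wakeup pending, the first pass through the loop turns schedule() into a no-op, the second pass stores the sleep state again and blocks properly, and the third pass sees the condition cleared and exits.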
      
      The problem here is that the TASK_PARKED store is not inside the
      KTHREAD_SHOULD_PARK condition wait-loop.
      
      There is a genuine issue with sleeps that do not have a condition;
      this is addressed in a subsequent patch.
      Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Fix the update of blocked load when newly idle · 457be908
      Vincent Guittot authored
      With commit:
      
        31e77c93 ("sched/fair: Update blocked load when newly idle")
      
      ... we release the rq->lock when updating blocked load of idle CPUs.
      
      This opens a time window during which another CPU can add a task to this
      CPU's cfs_rq.
      
      The check for a newly added task in idle_balance() is not in the common
      path. Move the out label to include this check.
      Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 31e77c93 ("sched/fair: Update blocked load when newly idle")
      Link: http://lkml.kernel.org/r/20180426103133.GA6953@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock · 0b26351b
      Peter Zijlstra authored
      Matt reported the following deadlock:
      
      CPU0					CPU1
      
      schedule(.prev=migrate/0)		<fault>
        pick_next_task()			  ...
          idle_balance()			    migrate_swap()
            active_balance()			      stop_two_cpus()
      						spin_lock(stopper0->lock)
      						spin_lock(stopper1->lock)
      						ttwu(migrate/0)
      						  smp_cond_load_acquire() -- waits for schedule()
              stop_one_cpu(1)
      	  spin_lock(stopper1->lock) -- waits for stopper lock
      
      Fix this deadlock by taking the wakeups out from under stopper->lock.
      This allows the active_balance() to queue the stop work and finish the
      context switch, which in turn allows the wakeup from migrate_swap() to
      observe the context and complete the wakeup.
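The general pattern of the fix, recording wakeups while the lock is held and issuing them only after it is dropped, can be sketched as follows. This is a single-threaded illustration with hypothetical `sketch_*` names; the boolean "lock" only models stopper->lock so that the ordering can be checked, and nothing here reproduces the kernel's actual data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU stopper; not the kernel's struct. */
struct sketch_stopper {
    bool locked;              /* models stopper->lock being held */
    int queued;               /* work items queued under the lock */
    int woken;                /* wakeups delivered to this stopper */
    bool woken_while_locked;  /* set if a wakeup ran while any lock was held */
};

static int sketch_locks_held;  /* number of modelled locks currently held */

static void sketch_lock(struct sketch_stopper *s)
{
    s->locked = true;
    sketch_locks_held++;
}

static void sketch_unlock(struct sketch_stopper *s)
{
    s->locked = false;
    sketch_locks_held--;
}

/* A wakeup may have to wait for the target; it must not run under a lock. */
static void sketch_wake(struct sketch_stopper *s)
{
    if (sketch_locks_held > 0)
        s->woken_while_locked = true;
    s->woken++;
}

/* Queue work on two stoppers, deferring both wakeups until after unlock. */
static void sketch_queue_on_two(struct sketch_stopper *a,
                                struct sketch_stopper *b)
{
    struct sketch_stopper *pending[2];
    size_t n = 0;

    sketch_lock(a);
    sketch_lock(b);
    a->queued++;
    pending[n++] = a;   /* remember the wakeup instead of issuing it here */
    b->queued++;
    pending[n++] = b;
    sketch_unlock(b);
    sketch_unlock(a);

    for (size_t i = 0; i < n; i++)
        sketch_wake(pending[i]);
}
```

Because the wakeups run after both locks are released, a CPU that needs one of those locks to finish its context switch is no longer blocked by a waker that is itself waiting for that context switch.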
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reported-by: Matt Fleming <matt@codeblueprint.co.uk>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180420095005.GH4064@hirez.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'trace-v4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · f4ef6a43
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "Various fixes in tracing:
      
         - Tracepoints should not give warning on OOM failures
      
         - Use special field for function pointer in trace event
      
         - Fix igrab issues in uprobes
      
         - Fixes to the new histogram triggers"
      
      * tag 'trace-v4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracepoint: Do not warn on ENOMEM
        tracing: Add field modifier parsing hist error for hist triggers
        tracing: Add field parsing hist error for hist triggers
        tracing: Restore proper field flag printing when displaying triggers
        tracing: initcall: Ordered comparison of function pointers
        tracing: Remove igrab() iput() call from uprobes.c
        tracing: Fix bad use of igrab in trace_uprobe.c
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · ecd649b3
      Linus Torvalds authored
      Pull input updates from Dmitry Torokhov:
       "Just a few driver fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: atmel_mxt_ts - add missing compatible strings to OF device table
        Input: atmel_mxt_ts - fix the firmware update
        Input: atmel_mxt_ts - add touchpad button mapping for Samsung Chromebook Pro
        MAINTAINERS: Rakesh Iyer can't be reached anymore
        Input: hideep_ts - fix a typo in Kconfig
        Input: alps - fix reporting pressure of v3 trackstick
        Input: leds - fix out of bound access
        Input: synaptics-rmi4 - fix an unchecked out of memory error path
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 3b6f9793
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Three small bug fixes: an illegally overlapping memcmp in target code,
        a potential infinite loop in isci under certain rare phy conditions
        and an ATA queue depth (performance) correction for storvsc"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: target: Fix fortify_panic kernel exception
        scsi: isci: Fix infinite loop in while loop
        scsi: storvsc: Set up correct queue depth values for IDE devices
  3. 02 May, 2018 1 commit
  4. 01 May, 2018 6 commits
  5. 30 Apr, 2018 5 commits
  6. 29 Apr, 2018 6 commits
    • Linux v4.17-rc3 · 6da6c0db
      Linus Torvalds authored
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c61a56ab
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Another set of x86 related updates:
      
         - Fix the long-broken x32 version of the IPC user space headers, which
           was noticed by Arnd Bergmann in the course of his ongoing y2038 work.
           GLIBC seems to have non-broken private copies of these headers, so
           this went unnoticed.
      
         - Two microcode fixlets which address some more fallout from the
           recent modifications in that area:
      
            - Unconditionally save the microcode patch, which was only saved
              when CPU_HOTPLUG was enabled causing failures in the late
              loading mechanism
      
            - Make the late loader synchronization finally work under all
              circumstances. It was exiting early and causing timeout failures
              due to a missing synchronization point.
      
         - Do not use mwait_play_dead() on AMD systems to prevent excessive
           power consumption as the CPU cannot go into deep power states from
           there.
      
         - Address an annoying sparse warning due to lost type qualifiers of
           the vmemmap and vmalloc base address constants.
      
         - Prevent reserving crash kernel region on Xen PV as this leads to
           the wrong perception that crash kernels actually work there which
           is not the case. Xen PV has its own crash mechanism handled by the
           hypervisor.
      
         - Add missing TLB cpuid values to the table to make the printout on
           certain machines correct.
      
         - Enumerate the new CLDEMOTE instruction
      
         - Fix an incorrect SPDX identifier
      
         - Remove stale macros"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ipc: Fix x32 version of shmid64_ds and msqid64_ds
        x86/setup: Do not reserve a crash kernel region if booted on Xen PV
        x86/cpu/intel: Add missing TLB cpuid values
        x86/smpboot: Don't use mwait_play_dead() on AMD systems
        x86/mm: Make vmemmap and vmalloc base address constants unsigned long
        x86/vector: Remove the unused macro FPU_IRQ
        x86/vector: Remove the macro VECTOR_OFFSET_START
        x86/cpufeatures: Enumerate cldemote instruction
        x86/microcode: Do not exit early from __reload_late()
        x86/microcode/intel: Save microcode patch unconditionally
        x86/jailhouse: Fix incorrect SPDX identifier
    • Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 65f4d6d0
      Linus Torvalds authored
      Pull x86 pti fixes from Thomas Gleixner:
       "A set of updates for the x86/pti related code:
      
         - Preserve r8-r11 in int $0x80. r8-r11 need to be preserved, but the
           int$80 entry code removed that quite some time ago. Make it correct
           again.
      
         - A set of fixes for the Global Bit work which went into 4.17 and
           caused a bunch of interesting regressions:
      
            - Triggering a BUG in the page attribute code due to a missing
              check for early boot stage
      
            - Warnings in the page attribute code about holes in the kernel
              text mapping which are caused by the freeing of the init code.
              Handle such holes gracefully.
      
            - Reduce the amount of kernel memory which is set global to the
              actual text and do not incidentally overlap with data.
      
            - Disable the global bit when RANDSTRUCT is enabled as it
              partially defeats the hardening.
      
            - Make the page protection setup correct for vma->page_prot
              population again. The adjustment of the protections fell through
              the cracks during the Global bit rework and triggers warnings on
              machines which do not support certain features, e.g. NX"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/entry/64/compat: Preserve r8-r11 in int $0x80
        x86/pti: Filter at vma->vm_page_prot population
        x86/pti: Disallow global kernel text with RANDSTRUCT
        x86/pti: Reduce amount of kernel text allowed to be Global
        x86/pti: Fix boot warning from Global-bit setting
        x86/pti: Fix boot problems from Global-bit setting
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 810fb07a
      Linus Torvalds authored
      Pull timer fixes from Thomas Gleixner:
       "Two fixes from the timer department:
      
         - Fix a long standing issue in the NOHZ tick code which causes RB
           tree corruption, delayed timers and other malfunctions. The cause
           for this is code which modifies the expiry time of an enqueued
           hrtimer.
      
         - Revert the CLOCK_MONOTONIC/CLOCK_BOOTTIME unification due to
           regression reports. Seems userspace _is_ relying on the documented
           behaviour despite our hope that it won't"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME
        tick/sched: Do not mess with an enqueued hrtimer
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7d9e55fe
      Linus Torvalds authored
      Pull perf fixes from Thomas Gleixner:
       "The perf update contains the following bits:
      
        x86:
         - Prevent setting freeze_on_smi on PerfMon V1 CPUs to avoid #GP
      
        perf stat:
         - Keep the '/' event modifier separator in fallback, for example when
           falling back from 'cpu/cpu-cycles/' to user level only, where it
           should become 'cpu/cpu-cycles/u' and not 'cpu/cpu-cycles/:u' (Jiri
           Olsa)
      
         - Fix PMU events parsing rule, improving error reporting for invalid
           events (Jiri Olsa)
      
         - Disable write_backward and other event attributes for !group events
           in a group. This fixes, for instance, the group '{cycles,msr/aperf/}:S',
           which has leader sampling (:S) and where only 'cycles', the leader
           event, should have the write_backward attribute set. Without the fix
           it all fails because the PMU where 'msr/aperf/' lives doesn't accept
           write_backward style sampling (Jiri Olsa)
      
         - Only fall back group read for leader (Kan Liang)
      
         - Fix core PMU alias list for x86 platform (Kan Liang)
      
         - Print out hint for mixed PMU group error (Kan Liang)
      
         - Fix duplicate PMU name for interval print (Kan Liang)
      
        Core:
         - Set main kernel end address properly when reading kernel and module
           maps (Namhyung Kim)
      
        perf mem:
         - Fix incorrect entries and add missing man options (Sangwon Hong)
      
        s/390:
         - Remove s390 specific strcmp_cpuid_cmp function (Thomas Richter)
      
         - Adapt 'perf test' case record+probe_libc_inet_pton.sh for s390
      
         - Fix s390 undefined record__auxtrace_init() return value in 'perf
           record' (Thomas Richter)"
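The separator rule from the first 'perf stat' item above can be illustrated with a short helper. This is a hypothetical sketch, not the perf source: `add_user_modifier()` is an invented name, and it assumes that a name ending in '/' already carries its separator while any other name needs a ':' before the 'u' modifier.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: append the user-level modifier 'u' to an event name.
 * A PMU-style name ending in '/' already provides the separator, so no ':'
 * is added in that case. Returns 0 on success, -1 if the result would not
 * fit in the output buffer.
 */
static int add_user_modifier(const char *name, char *out, size_t outlen)
{
    size_t len = strlen(name);
    const char *sep = (len > 0 && name[len - 1] == '/') ? "" : ":";
    int n = snprintf(out, outlen, "%s%su", name, sep);

    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

Under these assumptions, 'cpu/cpu-cycles/' becomes 'cpu/cpu-cycles/u', while a plain name such as 'cycles' becomes 'cycles:u'.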
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel: Don't enable freeze-on-smi for PerfMon V1
        perf stat: Fix duplicate PMU name for interval print
        perf evsel: Only fall back group read for leader
        perf stat: Print out hint for mixed PMU group error
        perf pmu: Fix core PMU alias list for X86 platform
        perf record: Fix s390 undefined record__auxtrace_init() return value
        perf mem: Document incorrect and missing options
        perf evsel: Disable write_backward for leader sampling group events
        perf pmu: Fix pmu events parsing rule
        perf stat: Keep the / modifier separator in fallback
        perf test: Adapt test case record+probe_libc_inet_pton.sh for s390
        perf list: Remove s390 specific strcmp_cpuid_cmp function
        perf machine: Set main kernel end address properly
    • Merge tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · cdface52
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Fix misc bugs and a regression for ext4"
      
      * tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: add MODULE_SOFTDEP to ensure crc32c is included in the initramfs
        ext4: fix bitmap position validation
        ext4: set h_journal if there is a failure starting a reserved handle
        ext4: prevent right-shifting extents beyond EXT_MAX_BLOCKS
  7. 28 Apr, 2018 6 commits
    • <linux/stringhash.h>: fix end_name_hash() for 64bit long · 19b9ad67
      Amir Goldstein authored
      The comment claims that this helper will try not to lose bits, but for
      64bit long it loses the high bits before hashing 64bit long into 32bit
      int.  Use the helper hash_long() to do the right thing for 64bit long.
      For 32bit long, there is no change.
      
      All the callers of end_name_hash() either assign the result to
      qstr->hash, which is u32 or return the result as an int value (e.g.
      full_name_hash()).  Change the helper return type to int to conform to
      its users.
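The difference between truncating a 64-bit value and hashing all of its bits down to 32 can be shown with two small functions. This is an illustrative sketch, not the kernel helper: the function names are hypothetical, and the multiplier is assumed to match the 64-bit golden-ratio constant used by the kernel's multiplicative hash.

```c
#include <stdint.h>

/* Assumed multiplier, modelled on a 64-bit golden-ratio hash constant. */
#define SKETCH_GOLDEN_RATIO_64 0x61C8864680B583EBull

/* Truncation keeps only the low 32 bits; the high bits never contribute. */
static uint32_t fold_by_truncation(uint64_t hash)
{
    return (uint32_t)hash;
}

/* Multiplicative folding: every input bit can influence the top 32 bits. */
static uint32_t fold_by_multiply(uint64_t hash)
{
    return (uint32_t)((hash * SKETCH_GOLDEN_RATIO_64) >> 32);
}
```

Two inputs that differ only in their upper 32 bits, such as 1 << 32 and 2 << 32, collide under truncation (both fold to 0) but fold to different values under the multiplicative variant.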
      
      [ It took me a while to apply this, because my initial reaction to it
        was - incorrectly - that it could make for slower code.
      
        After having looked more at it, I take back all my complaints about
        the patch, Amir was right and I was mis-reading things or just being
        stupid.
      
        I also don't worry too much about the possible performance impact of
        this on 64-bit, since most architectures that actually care about
        performance end up not using this very much (the dcache code is the
        most performance-critical, but the word-at-a-time case uses its own
        hashing anyway).
      
        So this ends up being mostly used for filesystems that do their own
        degraded hashing (usually because they want a case-insensitive
        comparison function).
      
        A _tiny_ worry remains, in that not everybody uses DCACHE_WORD_ACCESS,
        and then this potentially makes things more expensive on 64-bit
        architectures with slow or lacking multipliers even for the normal
        case.
      
        That said, realistically the only such architecture I can think of is
        PA-RISC. Nobody really cares about performance on that, it's more of a
        "look ma, I've got warts^W an odd machine" platform.
      
        So the patch is fine, and all my initial worries were just misplaced
        from not looking at this properly.   - Linus ]
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: add myself as maintainer of AFFS · bf8f5de1
      David Sterba authored
      The AFFS filesystem is still in use by the m68k community (Link #2), but
      as there was no code activity and no maintainer, the filesystem appeared
      on the list of candidates for staging/removal (Link #1).
      
      I volunteer to act as a maintainer of AFFS to collect any fixes that
      might show up and to guard fs/affs/ against another spring cleaning.
      
      Link: https://lkml.kernel.org/r/20180425154602.GA8546@bombadil.infradead.org
      Link: https://lkml.kernel.org/r/1613268.lKBQxPXt8J@merkaba
      CC: Martin Steigerwald <martin@lichtvoll.de>
      CC: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bf8f5de1
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · a97d8efd
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
      
       - two driver fixes
      
       - better parameter check for the core
      
       - Documentation updates
      
       - part of a tree-wide HAS_DMA cleanup
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: sprd: Fix the i2c count issue
        i2c: sprd: Prevent i2c accesses after suspend is called
        i2c: dev: prevent ZERO_SIZE_PTR deref in i2cdev_ioctl_rdwr()
        Documentation/i2c: adopt kernel commenting style in examples
        Documentation/i2c: sync docs with current state of i2c-tools
        Documentation/i2c: whitespace cleanup
        i2c: Remove depends on HAS_DMA in case of platform dependency
      a97d8efd
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 6e041ffc
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
      
       - crypto API regression that may cause sporadic alloc failures
      
       - double-free bug in drbg
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: drbg - set freed buffers to NULL
        crypto: api - fix finding algorithm currently being tested
      6e041ffc
    • Linus Torvalds's avatar
      Merge tag '4.17-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6 · cac26428
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "A few security related fixes for SMB3, most importantly for SMB3.11
        encryption"
      
      * tag '4.17-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: smbd: Avoid allocating iov on the stack
        cifs: smbd: Don't use RDMA read/write when signing is used
        SMB311: Fix reconnect
        SMB3: Fix 3.11 encryption to Windows and handle encrypted smb3 tcon
        CIFS: set *resp_buf_type to NO_BUFFER on error
      cac26428
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 0d95cfa9
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "A bunch of fixes, mostly for existing code and going to stable.
      
        Our memory hot-unplug path wasn't flushing the cache before removing
        memory. That is a problem now that we are doing memory hotplug on bare
        metal.
      
        Three fixes for the NPU code that supports devices connected via
        NVLink (ie. GPUs). The main one tweaks the TLB flush algorithm to
        avoid soft lockups for large flushes.
      
        A fix for our memory error handling where we would loop infinitely,
        returning back to the bad access and hard lockup the CPU.
      
        Fixes for the OPAL RTC driver, which wasn't handling some error cases
        correctly.
      
        A fix for a hardlockup in the powernv cpufreq driver.
      
        And finally two fixes to our smp_send_stop(), required due to a recent
        change to use it on shutdown.
      
        Thanks to: Alistair Popple, Balbir Singh, Laurentiu Tudor, Mahesh
        Salgaonkar, Mark Hairgrove, Nicholas Piggin, Rashmica Gupta, Shilpasri
        G Bhat"
      
      * tag 'powerpc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/kvm/booke: Fix altivec related build break
        powerpc: Fix deadlock with multiple calls to smp_send_stop
        cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer interrupt
        powerpc: Fix smp_send_stop NMI IPI handling
        rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops
        powerpc/mce: Fix a bug where mce loops on memory UE.
        powerpc/powernv/npu: Do a PID GPU TLB flush when invalidating a large address range
        powerpc/powernv/npu: Prevent overwriting of pnv_npu2_init_contex() callback parameters
        powerpc/powernv/npu: Add lock to prevent race in concurrent context init/destroy
        powerpc/powernv/memtrace: Let the arch hotunplug code flush cache
        powerpc/mm: Flush cache on memory hot(un)plug
      0d95cfa9
  8. 27 Apr, 2018 5 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 46dc111d
      Linus Torvalds authored
      Pull KVM fixes from Radim Krčmář:
       "ARM:
         - PSCI selection API, a leftover from 4.16 (for stable)
         - Kick vcpu on active interrupt affinity change
         - Plug a VMID allocation race on oversubscribed systems
         - Silence debug messages
         - Update Christoffer's email address (linaro -> arm)
      
        x86:
         - Expose userspace-relevant bits of a newly added feature
         - Fix TLB flushing on VMX with VPID, but without EPT"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        x86/headers/UAPI: Move DISABLE_EXITS KVM capability bits to the UAPI
        kvm: apic: Flush TLB after APIC mode/address change if VPIDs are in use
        arm/arm64: KVM: Add PSCI version selection API
        KVM: arm/arm64: vgic: Kick new VCPU on interrupt migration
        arm64: KVM: Demote SVE and LORegion warnings to debug only
        MAINTAINERS: Update e-mail address for Christoffer Dall
        KVM: arm/arm64: Close VMID generation race
      46dc111d
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 19b522db
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Nothing too bad, but the spectre updates to smatch identified a few
        places that may need sanitising so we've got those covered.
      
        Details:
      
         - Close some potential spectre-v1 vulnerabilities found by smatch
      
         - Add missing list sentinel for CPUs that don't require KPTI
      
         - Removal of unused 'addr' parameter for I/D cache coherency
      
         - Removal of redundant set_fs(KERNEL_DS) calls in ptrace
      
         - Fix single-stepping state machine handling in response to kernel
           traps
      
         - Clang support for 128-bit integers
      
         - Avoid instrumenting our out-of-line atomics in preparation for
           enabling LSE atomics by default in 4.18"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: avoid instrumenting atomic_ll_sc.o
        KVM: arm/arm64: vgic: fix possible spectre-v1 in vgic_mmio_read_apr()
        KVM: arm/arm64: vgic: fix possible spectre-v1 in vgic_get_irq()
        arm64: fix possible spectre-v1 in ptrace_hbp_get_event()
        arm64: support __int128 with clang
        arm64: only advance singlestep for user instruction traps
        arm64/kernel: rename module_emit_adrp_veneer->module_emit_veneer_for_adrp
        arm64: ptrace: remove addr_limit manipulation
        arm64: mm: drop addr parameter from sync icache and dcache
        arm64: add sentinel to kpti_safe_list
      19b522db
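
The "possible spectre-v1" fixes listed above sanitise an index that can be influenced from user space before it is used to address an array. As a rough userspace sketch of that general technique (not the code from these commits): `index_mask` and `clamp_index` are hypothetical names, and the mask expression assumes that right-shifting a negative `long` is an arithmetic shift, which holds for GCC and Clang.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Branch-free mask: all ones when index < size, zero otherwise.
 * Assumes size is non-zero and no larger than LONG_MAX, and that
 * >> on a negative long is an arithmetic shift.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

/*
 * Clamp an untrusted index so that even a mispredicted bounds check
 * cannot speculatively read outside the array: out-of-range values
 * collapse to 0 without a conditional branch.
 */
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
    return index & index_mask(index, size);
}
```

A caller would still perform the normal bounds check first and then use `clamp_index(idx, size)` for the actual array access, so the architectural behaviour is unchanged and only the speculative path is constrained.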
    • Linus Torvalds's avatar
      Merge tag 'modules-for-v4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux · 7b87308e
      Linus Torvalds authored
      Pull modules fix from Jessica Yu:
       "Fix display of module section addresses in sysfs, which were getting
        hashed with %pK and breaking tools like perf"
      
      * tag 'modules-for-v4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
        module: Fix display of wrong module .text address
      7b87308e
    • Linus Torvalds's avatar
      Merge tag 'ceph-for-4.17-rc3' of git://github.com/ceph/ceph-client · 64ebe312
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "A CephFS quota follow-up and fixes for two older issues in the
        messenger layer, marked for stable"
      
      * tag 'ceph-for-4.17-rc3' of git://github.com/ceph/ceph-client:
        libceph: validate con->state at the top of try_write()
        libceph: reschedule a tick in finish_hunting()
        libceph: un-backoff on tick when we have a authenticated session
        ceph: check if mds create snaprealm when setting quota
      64ebe312
    • Linus Torvalds's avatar
      Merge tag 'char-misc-4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · d8a33273
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are some small char and misc driver fixes for 4.17-rc3
      
        A variety of small things that have fallen out after 4.17-rc1 was out.
        Some vboxguest fixes for systems with lots of memory, amba bus fixes,
        some MAINTAINERS updates, uio_hv_generic driver fixes, and a few other
        minor things that resolve problems that people reported.
      
        The amba bus fixes took twice to get right, the first time I messed up
        applying the patches in the wrong order, hence the revert and later
        addition again with the correct fix, sorry about that.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'char-misc-4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        ARM: amba: Fix race condition with driver_override
        ARM: amba: Make driver_override output consistent with other buses
        Revert "ARM: amba: Fix race condition with driver_override"
        ARM: amba: Don't read past the end of sysfs "driver_override" buffer
        ARM: amba: Fix race condition with driver_override
        virt: vbox: Log an error when we fail to get the host version
        virt: vbox: Use __get_free_pages instead of kmalloc for DMA32 memory
        virt: vbox: Add vbg_req_free() helper function
        virt: vbox: Move declarations of vboxguest private functions to private header
        slimbus: Fix out-of-bounds access in slim_slicesize()
        MAINTAINERS: add dri-devel&linaro-mm for Android ION
        fpga-manager: altera-ps-spi: preserve nCONFIG state
        MAINTAINERS: update my email address
        uio_hv_generic: fix subchannel ring mmap
        uio_hv_generic: use correct channel in isr
        uio_hv_generic: make ring buffer attribute for primary channel
        uio_hv_generic: set size of ring buffer attribute
        ANDROID: binder: prevent transactions into own process.
      d8a33273