1. 07 Jul, 2016 20 commits
    • Thomas Gleixner's avatar
      timers: Only wake softirq if necessary · 4e85876a
      Thomas Gleixner authored
      With the wheel forwading in place and with the HZ=1000 4ms folding we can
      avoid running the softirq at all.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.607650550@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      4e85876a
    • Thomas Gleixner's avatar
      timers: Forward the wheel clock whenever possible · a683f390
      Thomas Gleixner authored
      The wheel clock is stale when a CPU goes into a long idle sleep. This has the
      side effect that timers which are queued end up in the outer wheel levels.
      That results in coarser granularity.
      
      To solve this, we keep track of the idle state and forward the wheel clock
      whenever possible.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.512039360@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      a683f390
    • Thomas Gleixner's avatar
      timers/nohz: Remove pointless tick_nohz_kick_tick() function · ff006732
      Thomas Gleixner authored
      This was a failed attempt to optimize the timer expiry in idle, which was
      disabled and never revisited. Remove the cruft.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.431073782@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ff006732
    • Anna-Maria Gleixner's avatar
      timers: Optimize collect_expired_timers() for NOHZ · 23696838
      Anna-Maria Gleixner authored
      After a NOHZ idle sleep the timer wheel must be forwarded to current jiffies.
      There might be expired timers so the current code loops and checks the expired
      buckets for timers. This can take quite some time for long NOHZ idle periods.
      
      The pending bitmask in the timer base allows us to do a quick search for the
      next expiring timer and therefore a fast forward of the base time which
      prevents pointless long lasting loops.
      
      For a 3 seconds idle sleep this reduces the catchup time from ~1ms to 5us.
      Signed-off-by: default avatarAnna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.351296290@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      23696838
    • Anna-Maria Gleixner's avatar
      timers: Move __run_timers() function · 73420fea
      Anna-Maria Gleixner authored
      Move __run_timers() below __next_timer_interrupt() and next_pending_bucket()
      in preparation for __run_timers() NOHZ optimization.
      
      No functional change.
      Signed-off-by: default avatarAnna-Maria Gleixner <anna-maria@linutronix.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.271872665@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      73420fea
    • Thomas Gleixner's avatar
      timers: Remove set_timer_slack() leftovers · 53bf837b
      Thomas Gleixner authored
      We now have implicit batching in the timer wheel. The slack API is no longer
      used, so remove it.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Andrew F. Davis <afd@ti.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jaehoon Chung <jh80.chung@samsung.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathias Nyman <mathias.nyman@intel.com>
      Cc: Pali Rohár <pali.rohar@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sebastian Reichel <sre@kernel.org>
      Cc: Ulf Hansson <ulf.hansson@linaro.org>
      Cc: linux-block@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mmc@vger.kernel.org
      Cc: linux-pm@vger.kernel.org
      Cc: linux-usb@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.189813118@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      53bf837b
    • Thomas Gleixner's avatar
      timers: Switch to a non-cascading wheel · 500462a9
      Thomas Gleixner authored
      The current timer wheel has some drawbacks:
      
      1) Cascading:
      
         Cascading can be an unbound operation and is completely pointless in most
         cases because the vast majority of the timer wheel timers are canceled or
         rearmed before expiration. (They are used as timeout safeguards, not as
         real timers to measure time.)
      
      2) No fast lookup of the next expiring timer:
      
         In NOHZ scenarios the first timer soft interrupt after a long NOHZ period
         must fast forward the base time to the current value of jiffies. As we
         have no way to find the next expiring timer fast, the code loops linearly
         and increments the base time one by one and checks for expired timers
         in each step. This causes unbound overhead spikes exactly in the moment
         when we should wake up as fast as possible.
      
      After a thorough analysis of real world data gathered on laptops,
      workstations, webservers and other machines (thanks Chris!) I came to the
      conclusion that the current 'classic' timer wheel implementation can be
      modified to address the above issues.
      
      The vast majority of timer wheel timers is canceled or rearmed before
      expiry. Most of them are timeouts for networking and other I/O tasks. The
      nature of timeouts is to catch the exception from normal operation (TCP ack
      timed out, disk does not respond, etc.). For these kinds of timeouts the
      accuracy of the timeout is not really a concern. Timeouts are very often
      approximate worst-case values and in case the timeout fires, we already
      waited for a long time and performance is down the drain already.
      
      The few timers which actually expire can be split into two categories:
      
       1) Short expiry times which expect halfways accurate expiry
      
       2) Long term expiry times are inaccurate today already due to the
          batching which is done for NOHZ automatically and also via the
          set_timer_slack() API.
      
      So for long term expiry timers we can avoid the cascading property and just
      leave them in the less granular outer wheels until expiry or
      cancelation. Timers which are armed with a timeout larger than the wheel
      capacity are no longer cascaded. We expire them with the longest possible
      timeout (6+ days). We have not observed such timeouts in our data collection,
      but at least we handle them, applying the rule of the least surprise.
      
      To avoid extending the wheel levels for HZ=1000 so we can accomodate the
      longest observed timeouts (5 days in the network conntrack code) we reduce the
      first level granularity on HZ=1000 to 4ms, which effectively is the same as
      the HZ=250 behaviour. From our data analysis there is nothing which relies on
      that 1ms granularity and as a side effect we get better batching and timer
      locality for the networking code as well.
      
      Contrary to the classic wheel the granularity of the next wheel is not the
      capacity of the first wheel. The granularities of the wheels are in the
      currently chosen setting 8 times the granularity of the previous wheel.
      
      So for HZ=250 we end up with the following granularity levels:
      
       Level Offset   Granularity                  Range
           0      0          4 ms                 0 ms -        252 ms
           1     64         32 ms               256 ms -       2044 ms (256ms - ~2s)
           2    128        256 ms              2048 ms -      16380 ms (~2s   - ~16s)
           3    192       2048 ms (~2s)       16384 ms -     131068 ms (~16s  - ~2m)
           4    256      16384 ms (~16s)     131072 ms -    1048572 ms (~2m   - ~17m)
           5    320     131072 ms (~2m)     1048576 ms -    8388604 ms (~17m  - ~2h)
           6    384    1048576 ms (~17m)    8388608 ms -   67108863 ms (~2h   - ~18h)
           7    448    8388608 ms (~2h)    67108864 ms -  536870911 ms (~18h  - ~6d)
      
      That's a worst case inaccuracy of 12.5% for the timers which are queued at the
      beginning of a level.
      
      So the new wheel concept addresses the old issues:
      
      1) Cascading is avoided completely
      
      2) By keeping the timers in the bucket until expiry/cancelation we can track
         the buckets which have timers enqueued in a bucket bitmap and therefore can
         look up the next expiring timer very fast and O(1).
      
      A further benefit of the concept is that the slack calculation which is done
      on every timer start is no longer necessary because the granularity levels
      provide natural batching already.
      
      Our extensive testing with various loads did not show any performance
      degradation vs. the current wheel implementation.
      
      This patch does not address the 'fast lookup' issue as we wanted to make sure
      that there is no regression introduced by the wheel redesign. The
      optimizations are in follow up patches.
      
      This patch contains fixes from Anna-Maria Gleixner and Richard Cochran.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.108621834@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      500462a9
    • Thomas Gleixner's avatar
      timers: Reduce the CPU index space to 256k · b0d6e2dc
      Thomas Gleixner authored
      We want to store the array index in the flags space. 256k CPUs should be
      enough for a while.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094342.030144293@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b0d6e2dc
    • Thomas Gleixner's avatar
      timers: Give a few structs and members proper names · 494af3ed
      Thomas Gleixner authored
      Some of the names in the internal implementation of the timer code
      are not longer correct and others are simply too long to type.
      
      Clean it up before we switch the wheel implementation over to
      the new scheme.
      
      No functional change.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.948752516@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      494af3ed
    • Thomas Gleixner's avatar
      hlist: Add hlist_is_singular_node() helper · 15dba1e3
      Thomas Gleixner authored
      Required to figure out whether the entry is the only one in the hlist.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.867631372@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      15dba1e3
    • Thomas Gleixner's avatar
      signals: Use hrtimer for sigtimedwait() · 2b1ecc3d
      Thomas Gleixner authored
      We've converted most timeout related syscalls to hrtimers, but
      sigtimedwait() did not get this treatment.
      
      Convert it so we get a reasonable accuracy and remove the
      user space exposure to the timer wheel properties.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Cyril Hrubis <chrubis@suse.cz>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.787164909@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      2b1ecc3d
    • Thomas Gleixner's avatar
      timers: Remove the deprecated mod_timer_pinned() API · 177ec0a0
      Thomas Gleixner authored
      We switched all users to initialize the timers as pinned and call
      mod_timer(). Remove the now unused timer API function.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.706205231@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      177ec0a0
    • Thomas Gleixner's avatar
      timers, net/ipv4/inet: Initialize connection request timers as pinned · f3438bc7
      Thomas Gleixner authored
      Pinned timers must carry the pinned attribute in the timer structure
      itself, so convert the code to the new API.
      
      No functional change.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.617891430@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f3438bc7
    • Thomas Gleixner's avatar
      timers, drivers/tty/mips_ejtag: Initialize the poll timer as pinned · 853f90d4
      Thomas Gleixner authored
      Pinned timers must carry the pinned attribute in the timer structure
      itself, so convert the code to the new API.
      
      No functional change.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.537448301@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      853f90d4
    • Thomas Gleixner's avatar
      timers, drivers/tty/metag_da: Initialize the poll timer as pinned · 01bab536
      Thomas Gleixner authored
      Pinned timers must carry the pinned attribute in the timer structure
      itself, so convert the code to the new API.
      
      No functional change.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.456452642@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      01bab536
    • Thomas Gleixner's avatar
      timers, driver/net/ethernet/tile: Initialize the egress timer as pinned · 25df5760
      Thomas Gleixner authored
      Pinned timers must carry the pinned attribute in the timer structure
      itself, so convert the code to the new API.
      
      No functional change.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.376394205@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      25df5760
    • Thomas Gleixner's avatar
      timers, cpufreq/powernv: Initialize the gpstate timer as pinned · 7bc54b65
      Thomas Gleixner authored
      Pinned timers must carry the pinned attribute in the timer structure
      itself, so convert the code to the new API.
      
      No functional change.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.297014487@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7bc54b65
    • Thomas Gleixner's avatar
      timers, x86/mce: Initialize MCE restart timer as pinned · f9c287ba
      Thomas Gleixner authored
      Pinned timers must carry the pinned attribute in the timer structure
      itself, so convert the code to the new API.
      
      No functional change.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.215783439@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f9c287ba
    • Thomas Gleixner's avatar
      timers, x86/apic/uv: Initialize the UV heartbeat timer as pinned · 920a4a70
      Thomas Gleixner authored
      Pinned timers must carry the pinned attribute in the timer structure
      itself, so convert the code to the new API.
      
      No functional change.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.133837204@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      920a4a70
    • Thomas Gleixner's avatar
      timers: Make 'pinned' a timer property · e675447b
      Thomas Gleixner authored
      We want to move the timer migration logic from a 'push' to a 'pull' model.
      
      Under the current 'push' model pinned timers are handled via
      a runtime API variant: mod_timer_pinned().
      
      The 'pull' model requires us to store the pinned attribute of a timer
      in the timer_list structure itself, as a new TIMER_PINNED bit in
      timer->flags.
      
      This flag must be set at initialization time and the timer APIs
      recognize the flag.
      
      This patch:
      
       - Implements the new flag and associated new-style initialization
         methods
      
       - makes mod_timer() recognize new-style pinned timers,
      
       - and adds some migration helper facility to allow
         step by step conversion of old-style to new-style
         pinned timers.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160704094341.049338558@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e675447b
  2. 04 Jul, 2016 1 commit
  3. 03 Jul, 2016 5 commits
  4. 02 Jul, 2016 6 commits
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-for-v4.7-rc6' of git://people.freedesktop.org/~airlied/linux · 99b0f54e
      Linus Torvalds authored
      Pull drm fixes frlm Dave Airlie:
       "Just some AMD and Intel fixes, the AMD ones are further production
        Polaris fixes, and the Intel ones fix some early timeouts, some PCI ID
        changes and a couple of other fixes.
      
        Still a bit Internet challenged here, hopefully end of next week will
        solve it"
      
      * tag 'drm-fixes-for-v4.7-rc6' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: Fix missing unlock on error in i915_ppgtt_info()
        drm/amd/powerplay: workaround for UVD clock issue
        drm/amdgpu: add ACLK_CNTL setting for polaris10
        drm/amd/powerplay: fix issue uvd dpm can't enabled on Polaris11.
        drm/amd/powerplay: Workaround for Memory EDC Error on Polaris10.
        drm/i915: Removing PCI IDs that are no longer listed as Kabylake.
        drm/i915: Add more Kabylake PCI IDs.
        drm/i915: Avoid early timeout during AUX transfers
        drm/i915/hsw: Avoid early timeout during LCPLL disable/restore
        drm/i915/lpt: Avoid early timeout during FDI PHY reset
        drm/i915/bxt: Avoid early timeout during PLL enable
        drm/i915: Refresh cached DP port register value on resume
        drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation
        drm/amd/powerplay: disable FFC.
        drm/amd/powerplay: add some definition for FFC feature on polaris.
      99b0f54e
    • Linus Torvalds's avatar
      Merge tag 'spi-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 467ce769
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A few small driver-specific fixes for SPI, all in the normal important
        if you hit them category especially the rockchip driver fix which
        addresses a race which has been exposed more frequently with some
        recent performance improvements"
      
      * tag 'spi-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: sunxi: fix transfer timeout
        spi: sun4i: fix FIFO limit
        spi: rockchip: Signal unfinished DMA transfers
        spi: spi-ti-qspi: Suspend the queue before removing the device
      467ce769
    • Linus Torvalds's avatar
      Merge tag 'regulator-fix-v4.7-rc5' of... · a2b0db5b
      Linus Torvalds authored
      Merge tag 'regulator-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
      
      Pull regulator fixes from Mark Brown:
       "Two small fixes for the regulator subsystem - one fixing a crash with
        one of the devices supported by the max77620 driver, another fixing
        startup for the anatop regulator when it starts up with the regulator
        in bypass mode"
      
      * tag 'regulator-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: max77620: check for valid regulator info
        regulator: anatop: allow regulator to be in bypass mode
      a2b0db5b
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 44385120
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "A small fix for the newly added oxnas clk driver and a handful of
        rockchip clk driver fixes for newly added rk3399 support"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: Fix return value check in oxnas_stdclk_probe()
        clk: rockchip: release io resource when failing to init clk on rk3399
        clk: rockchip: fix cpuclk registration error handling
        clk: rockchip: Revert "clk: rockchip: reset init state before mmc card initialization"
        clk: rockchip: fix incorrect parent for rk3399's {c,g}pll_aclk_perihp_src
        clk: rockchip: mark rk3399 GIC clocks as critical
        clk: rockchip: initialize flags of clk_init_data in mmc-phase clock
      44385120
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel into drm-fixes · 88c08710
      Dave Airlie authored
      here's a batch of i915 fixes for 4.7.
      
      * tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel:
        drm/i915: Fix missing unlock on error in i915_ppgtt_info()
        drm/i915: Removing PCI IDs that are no longer listed as Kabylake.
        drm/i915: Add more Kabylake PCI IDs.
        drm/i915: Avoid early timeout during AUX transfers
        drm/i915/hsw: Avoid early timeout during LCPLL disable/restore
        drm/i915/lpt: Avoid early timeout during FDI PHY reset
        drm/i915/bxt: Avoid early timeout during PLL enable
        drm/i915: Refresh cached DP port register value on resume
      88c08710
    • Dave Airlie's avatar
      Merge branch 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 40793e85
      Dave Airlie authored
      Just a few more late fixes for Polaris cards.
      
      * 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux:
        drm/amd/powerplay: workaround for UVD clock issue
        drm/amdgpu: add ACLK_CNTL setting for polaris10
        drm/amd/powerplay: fix issue uvd dpm can't enabled on Polaris11.
        drm/amd/powerplay: Workaround for Memory EDC Error on Polaris10.
        drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation
        drm/amd/powerplay: disable FFC.
        drm/amd/powerplay: add some definition for FFC feature on polaris.
      40793e85
  5. 01 Jul, 2016 8 commits
    • Ralf Baechle's avatar
      MIPS: Fix possible corruption of cache mode by mprotect. · 6d037de9
      Ralf Baechle authored
      The following testcase may result in a page table entries with a invalid
      CCA field being generated:
      
      static void *bindstack;
      
      static int sysrqfd;
      
      static void protect_low(int protect)
      {
      	mprotect(bindstack, BINDSTACK_SIZE, protect);
      }
      
      static void sigbus_handler(int signal, siginfo_t * info, void *context)
      {
      	void *addr = info->si_addr;
      
      	write(sysrqfd, "x", 1);
      
      	printf("sigbus, fault address %p (should not happen, but might)\n",
      	       addr);
      	abort();
      }
      
      static void run_bind_test(void)
      {
      	unsigned int *p = bindstack;
      
      	p[0] = 0xf001f001;
      
      	write(sysrqfd, "x", 1);
      
      	/* Set trap on access to p[0] */
      	protect_low(PROT_NONE);
      
      	write(sysrqfd, "x", 1);
      
      	/* Clear trap on access to p[0] */
      	protect_low(PROT_READ | PROT_WRITE | PROT_EXEC);
      
      	write(sysrqfd, "x", 1);
      
      	/* Check the contents of p[0] */
      	if (p[0] != 0xf001f001) {
      		write(sysrqfd, "x", 1);
      
      		/* Reached, but shouldn't be */
      		printf("badness, shouldn't happen but does\n");
      		abort();
      	}
      }
      
      int main(void)
      {
      	struct sigaction sa;
      
      	sysrqfd = open("/proc/sysrq-trigger", O_WRONLY);
      
      	if (sigprocmask(SIG_BLOCK, NULL, &sa.sa_mask)) {
      		perror("sigprocmask");
      		return 0;
      	}
      
      	sa.sa_sigaction = sigbus_handler;
      	sa.sa_flags = SA_SIGINFO | SA_NODEFER | SA_RESTART;
      	if (sigaction(SIGBUS, &sa, NULL)) {
      		perror("sigaction");
      		return 0;
      	}
      
      	bindstack = mmap(NULL,
      			 BINDSTACK_SIZE,
      			 PROT_READ | PROT_WRITE | PROT_EXEC,
      			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      	if (bindstack == MAP_FAILED) {
      		perror("mmap bindstack");
      		return 0;
      	}
      
      	printf("bindstack: %p\n", bindstack);
      
      	run_bind_test();
      
      	printf("done\n");
      
      	return 0;
      }
      
      There are multiple ingredients for this:
      
       1) PAGE_NONE is defined to _CACHE_CACHABLE_NONCOHERENT, which is CCA 3
          on all platforms except SB1 where it's CCA 5.
       2) _page_cachable_default must have bits set which are not set
          _CACHE_CACHABLE_NONCOHERENT.
       3) Either the defective version of pte_modify for XPA or the standard
          version must be in used.  However pte_modify for the 36 bit address
          space support is no affected.
      
      In that case additional bits in the final CCA mode may generate an invalid
      value for the CCA field.  On the R10000 system where this was tracked
      down for example a CCA 7 has been observed, which is Uncached Accelerated.
      
      Fixed by:
      
       1) Using the proper CCA mode for PAGE_NONE just like for all the other
          PAGE_* pte/pmd bits.
       2) Fix the two affected variants of pte_modify.
      
      Further code inspection also shows the same issue to exist in pmd_modify
      which would affect huge page systems.
      
      Issue in pte_modify tracked down by Alastair Bridgewater, PAGE_NONE
      and pmd_modify issue found by me.
      
      The history of this goes back beyond Linus' git history.  Chris Dearman's
      commit 35133692 ("[MIPS] Allow setting of
      the cache attribute at run time.") missed the opportunity to fix this
      but it was originally introduced in lmo commit
      d523832cf12007b3242e50bb77d0c9e63e0b6518 ("Missing from last commit.")
      and 32cc38229ac7538f2346918a09e75413e8861f87 ("New configuration option
      CONFIG_MIPS_UNCACHED.")
      Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Reported-by: default avatarAlastair Bridgewater <alastair.bridgewater@gmail.com>
      6d037de9
    • Linus Torvalds's avatar
      Merge tag 'acpi-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · dbdc3bb7
      Linus Torvalds authored
      Pull ACPI fix from Rafael Wysocki:
       "Fix an expression in the ACPI PCI IRQ management code added by a
        recent commit that overlooked missing parens in it, so the result of
        the computation is incorrect in some cases (Sinan Kaya)"
      
      * tag 'acpi-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI,PCI,IRQ: correct operator precedence
      dbdc3bb7
    • Linus Torvalds's avatar
      Merge tag 'pm-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 81dbd6f5
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "Three cpufreq fixes, one in the core (stable-candidate) and two in
        drivers (intel_pstate and cpufreq-dt).
      
        Specifics:
      
         - Fix a recent intel_pstate regression that caused the number of
           wakeups to increase significantly on an idle system in some cases
           due to excessive synchronize_sched() invocations (Rafael Wysocki).
      
         - Fix unnecessary invocations of WARN_ON() in the cpufreq core after
           cpufreq has been suspended introduced during the 4.6 cycla (Rafael
           Wysocki).
      
         - Fix an error code path in the cpufreq-dt-platdev driver that
           forgets to drop a reference to a DT node (Masahiro Yamada)"
      
      * tag 'pm-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: Avoid false-positive WARN_ON()s in cpufreq_update_policy()
        cpufreq: dt: call of_node_put() before error out
        intel_pstate: Do not clear utilization update hooks on policy changes
      81dbd6f5
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 48c4565e
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "Tmpfs readdir throughput regression fix (this cycle) + some -stable
        fodder all over the place.
      
        One missing bit is Miklos' tonight locks.c fix - NFS folks had already
        grabbed that one by the time I woke up ;-)"
      
      [ The locks.c fix came through the nfsd tree just moments ago ]
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        namespace: update event counter when umounting a deleted dentry
        9p: use file_dentry()
        ceph: fix d_obtain_alias() misuses
        lockless next_positive()
        libfs.c: new helper - next_positive()
        dcache_{readdir,dir_lseek}(): don't bother with nested ->d_lock
      48c4565e
    • Linus Torvalds's avatar
      Merge tag 'nfsd-4.7-3' of git://linux-nfs.org/~bfields/linux · 2728c57f
      Linus Torvalds authored
      Pull lockd/locks fixes from Bruce Fields:
       "One fix for lockd soft lookups in an error path, and one fix for file
        leases on overlayfs"
      
      * tag 'nfsd-4.7-3' of git://linux-nfs.org/~bfields/linux:
        locks: use file_inode()
        lockd: unregister notifier blocks if the service fails to come up completely
      2728c57f
    • Linus Torvalds's avatar
      Merge tag 'mfd-fixes-4.7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · 0d064a7b
      Linus Torvalds authored
      Pull more MFD fixes from Lee Jones:
       "Apologies for missing these from the first pull request.
      
        Final patches fixing Reset API change"
      
      * tag 'mfd-fixes-4.7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
        usb: dwc3: st: Use explicit reset_control_get_exclusive() API
        phy: phy-stih407-usb: Use explicit reset_control_get_exclusive() API
        phy: miphy28lp: Inform the reset framework that our reset line may be shared
      0d064a7b
    • Linus Torvalds's avatar
      Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · f3683ccd
      Linus Torvalds authored
      Pull libnvdimm fixes from Dan Williams:
       "1/ Two regression fixes since v4.6: one for the byte order of a sysfs
           attribute (bz121161) and another for QEMU 2.6's NVDIMM _DSM (ACPI
           Device Specific Method) implementation that gets tripped up by new
           auto-probing behavior in the NFIT driver.
      
        2/ A fix tagged for -stable that stops the kernel from
           clobbering/ignoring changes to the configuration of a 'pfn'
           instance ("struct page" driver).  For example changing the
           alignment from 2M to 1G may silently revert to 2M if that value is
           currently stored on media.
      
        3/ A fix from Eric for an xfstests failure in dax.  It is not
           currently tagged for -stable since it requires an 8-exabyte file
           system to trigger, and there appear to be no user visible side
           effects"
      
      * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        nfit: fix format interface code byte order
        dax: fix offset overflow in dax_io
        acpi, nfit: fix acpi_check_dsm() vs zero functions implemented
        libnvdimm, pfn, dax: fix initialization vs autodetect for mode + alignment
      f3683ccd
    • Linus Torvalds's avatar
      Merge tag 'staging-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 6e5c4f13
      Linus Torvalds authored
      Pull staging and IIO fixes from Greg KH:
       "Here are a few small staging and iio driver fixes for 4.7-rc6.
      
        Nothing major here, just a number of small fixes, all have been in
        linux-next for a while, and the full details are in the shortlog"
      
      * tag 'staging-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        iio:ad7266: Fix probe deferral for vref
        iio:ad7266: Fix support for optional regulators
        iio:ad7266: Fix broken regulator error handling
        iio: accel: kxsd9: fix the usage of spi_w8r8()
        staging: iio: accel: fix error check
        staging: iio: ad5933: fix order of cycle conditions
        staging: iio: fix ad7606_spi regression
        iio: inv_mpu6050: Fix use-after-free in ACPI code
      6e5c4f13