1. 26 Sep, 2012 1 commit
    • rcu: New rcu_user_enter() and rcu_user_exit() APIs · adf5091e
      Frederic Weisbecker authored
      RCU currently insists that only idle tasks can enter RCU idle mode, which
      prohibits an adaptive tickless kernel (AKA nohz cpusets), which in turn
      would mean that usermode execution would always take scheduling-clock
      interrupts, even when there is only one task runnable on the CPU in
      question.
      
      This commit therefore adds rcu_user_enter() and rcu_user_exit(), which
      allow non-idle tasks to enter RCU idle mode.  These are quite similar
      to rcu_idle_enter() and rcu_idle_exit(), respectively, except that they
      omit the idle-task checks.
      
      [ Updated to use "user" flag rather than separate check functions. ]
      
      [ paulmck: Updated to drop exports of new functions based on Josh's patch
        getting rid of the need for them. ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
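      The difference between the idle and user entry points described above can be
      pictured with a toy single-CPU model. This is only a sketch: the struct, the
      function names and the boolean return value are assumptions made for
      illustration and are not the kernel's code. The one point it mirrors from the
      commit message is that a shared helper takes a "user" flag and skips the
      idle-task check when that flag is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model; every name here is illustrative, not kernel code. */
struct cpu_model {
    bool in_eqs;          /* true while RCU may ignore this CPU */
    bool current_is_idle; /* is the running task the idle task? */
};

/* Shared helper: the "user" flag decides whether the idle-task check applies.
 * Returns false when the check rejects the request (a modelling choice). */
static bool model_eqs_enter(struct cpu_model *cpu, bool user)
{
    if (!user && !cpu->current_is_idle)
        return false;     /* idle-mode entry is reserved for the idle task */
    cpu->in_eqs = true;
    return true;
}

static void model_eqs_exit(struct cpu_model *cpu)
{
    cpu->in_eqs = false;
}

/* Idle variant: keeps the idle-task check. */
static bool model_idle_enter(struct cpu_model *cpu)
{
    return model_eqs_enter(cpu, false);
}

/* User variant: same helper, check omitted. */
static bool model_user_enter(struct cpu_model *cpu)
{
    return model_eqs_enter(cpu, true);
}
```

      In this model a non-idle task is refused by model_idle_enter() but accepted by
      model_user_enter(), which is the behaviour the commit message asks for.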
  2. 25 Sep, 2012 4 commits
  3. 24 Sep, 2012 1 commit
  4. 23 Sep, 2012 34 commits
    • Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · 56bae802
      Linus Torvalds authored
      Pull kbuild fixes from Michal Marek:
       "There are two more kbuild fixes for 3.6.
      
        One fixes a race between x86's archscripts target and the rule
        (re)building scripts/basic/fixdep.  The second is a fix for the
        previous attempt at fixing make firmware_install with make 3.82.
        This new solution should work with any version of GNU make"
      
      * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        x86/kbuild: archscripts depends on scripts_basic
        firmware: fix directory creation rule matching with make 3.80
    • Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · 0737c8d7
      Linus Torvalds authored
      Pull hwmon subsystem fixes from Jean Delvare.
      
      * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
        hwmon: (fam15h_power) Tweak runavg_range on resume
        hwmon: (coretemp) Use get_online_cpus to avoid races involving CPU hotplug
        hwmon: (via-cputemp) Use get_online_cpus to avoid races involving CPU hotplug
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 0bf7a705
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is a set of four essential fixes: two oops related (bnx2i,
        virtio-scsi), one data corruption related (hpsa) and one failure to
        boot due to interrupt routing issues (mpt2sas).
      
        Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        [SCSI] hpsa: fix handling of protocol error
        [SCSI] mpt2sas: Fix for issue - Unable to boot from the drive connected to HBA
        [SCSI] bnx2i: Fixed NULL ptr deference for 1G bnx2 Linux iSCSI offload
        [SCSI] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list
    • edac_mc: edac_mc_free() cannot assume mem_ctl_info is registered in sysfs. · faa2ad09
      Shaun Ruffell authored
      Fix potential NULL pointer dereference in edac_unregister_sysfs() on
      system boot introduced in 3.6-rc1.
      
      Since commit 7a623c03 ("edac: rewrite the sysfs code to use struct
      device") edac_mc_alloc() no longer initializes embedded kobjects in
      struct mem_ctl_info.  Therefore edac_mc_free() can no longer simply
      decrement a kobject reference count to free the allocated memory unless
      the memory controller driver module had also called edac_mc_add_mc().
      
      Now edac_mc_free() will check if the newly embedded struct device has
      been registered with sysfs before using either the standard device
      release functions or freeing the data structures itself with logic
      pulled out of the error path of edac_mc_alloc().
      
      The BUG this patch resolves for me:
      
        BUG: unable to handle kernel NULL pointer dereference at   (null)
        EIP is at __wake_up_common+0x1a/0x6a
        Process modprobe (pid: 933, ti=f3dc6000 task=f3db9520 task.ti=f3dc6000)
        Call Trace:
          complete_all+0x3f/0x50
          device_pm_remove+0x23/0xa2
          device_del+0x34/0x142
          edac_unregister_sysfs+0x3b/0x5c [edac_core]
          edac_mc_free+0x29/0x2f [edac_core]
          e7xxx_probe1+0x268/0x311 [e7xxx_edac]
          e7xxx_init_one+0x56/0x61 [e7xxx_edac]
          local_pci_probe+0x13/0x15
        ...
      
      Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
      Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
      Signed-off-by: Shaun Ruffell <sruffell@digium.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • edac_mc: fix messy kfree calls in the error path · ef6e7816
      Fengguang Wu authored
      coccinelle warns about:
      
      + drivers/edac/edac_mc.c:429:9-23: ERROR: reference preceded by free on line 429
      
         421         if (mci->csrows) {
       > 422                 for (chn = 0; chn < tot_channels; chn++) {
         423                         csr = mci->csrows[chn];
         424                         if (csr) {
       > 425                                 for (chn = 0; chn < tot_channels; chn++)
         426                                         kfree(csr->channels[chn]);
         427                                 kfree(csr);
         428                         }
       > 429                         kfree(mci->csrows[i]);
         430                 }
         431                 kfree(mci->csrows);
         432         }
      
      and that code block seems to mess things up in several ways (double free, memory
      leak, out-of-bound reads etc.):

      L422: The iterator "chn" and bound "tot_channels" are totally wrong. They should be
            "row" and "tot_csrows" respectively. This means either a memory leak or
            out-of-bound reads (which, if they do not trigger an immediate page fault,
            will further lead to kfree() on random addresses).

      L425: The inner loop reuses the same iterator "chn" as the outer loop,
            which could end the outer loop prematurely and hence leak memory.

      L429: The array index 'i' in mci->csrows[i] is a temporary value used in
            previous loops, and won't change at all in the current loop. This
            means either an out-of-bound read and possibly kfree(random number), or the
            same mci->csrows[i] gets freed again and again, and possibly a double free
            for the kfree(csr) in L427.
      
      L426/L427: a kfree(csr->channels) is needed in between to avoid leaking the memory.
      
      The buggy code was introduced by commit de3910eb ("edac: change the mem
      allocation scheme to make Documentation/kobject.txt happy") in the 3.6-rc1
      merge window. Fix it by freeing up resources in this order:
      
        free csrows[i]->channels[j]
        free csrows[i]->channels
        free csrows[i]
        free csrows
      
      CC: Mauro Carvalho Chehab <mchehab@redhat.com>
      CC: Shaun Ruffell <sruffell@digium.com>
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
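      The release order listed in this commit message (innermost objects first, one
      iterator per loop) can be sketched in userspace C. The structures below are
      hypothetical stand-ins that copy only the nesting of csrows and channels;
      calloc()/free() replace the kernel allocator, and the live_allocs counter
      exists purely so that leaks become visible. None of this is the actual
      drivers/edac code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the EDAC structures; only the nesting matters. */
struct channel { int idx; };
struct csrow {
    struct channel **channels;  /* array of nr_channels pointers */
    unsigned int nr_channels;
};
struct mc {
    struct csrow **csrows;      /* array of nr_csrows pointers */
    unsigned int nr_csrows;
};

static int live_allocs;         /* outstanding allocations, for leak checks */

static void *xcalloc(size_t n, size_t sz)
{
    void *p = calloc(n, sz);
    if (p)
        live_allocs++;
    return p;
}

static void xfree(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Free in the order: channels[j], channels, csrows[i], csrows. */
static void mc_free(struct mc *mci)
{
    unsigned int row, chn;      /* distinct iterators for the two loops */

    if (!mci)
        return;
    if (mci->csrows) {
        for (row = 0; row < mci->nr_csrows; row++) {
            struct csrow *csr = mci->csrows[row];
            if (!csr)
                continue;
            if (csr->channels) {
                for (chn = 0; chn < csr->nr_channels; chn++)
                    xfree(csr->channels[chn]);
                xfree(csr->channels);
            }
            xfree(csr);
        }
        xfree(mci->csrows);
    }
    xfree(mci);
}

/* Allocate the nested layout; on any failure, unwind through mc_free(). */
static struct mc *mc_alloc(unsigned int rows, unsigned int chans)
{
    unsigned int row, chn;
    struct mc *mci = xcalloc(1, sizeof(*mci));

    if (!mci)
        return NULL;
    mci->nr_csrows = rows;
    mci->csrows = xcalloc(rows, sizeof(*mci->csrows));
    if (!mci->csrows)
        goto fail;
    for (row = 0; row < rows; row++) {
        struct csrow *csr = xcalloc(1, sizeof(*csr));
        if (!csr)
            goto fail;
        mci->csrows[row] = csr;
        csr->nr_channels = chans;
        csr->channels = xcalloc(chans, sizeof(*csr->channels));
        if (!csr->channels)
            goto fail;
        for (chn = 0; chn < chans; chn++) {
            csr->channels[chn] = xcalloc(1, sizeof(struct channel));
            if (!csr->channels[chn])
                goto fail;
        }
    }
    return mci;
fail:
    mc_free(mci);
    return NULL;
}
```

      Because every array is zero-initialised, mc_free() can also serve as the error
      path of mc_alloc(): partially built rows simply contain NULL pointers that the
      loops skip.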
    • hwmon: (fam15h_power) Tweak runavg_range on resume · 5f0ecb90
      Andreas Herrmann authored
      The quirk introduced with commit 00250ec9 ("hwmon: fam15h_power: fix
      bogus values with current BIOSes") is required not only during driver
      load but also when the system resumes from suspend. The BIOS might set the
      previously recommended (but unsuitable) initialization value for the
      running average range register during resume.
      Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
      Tested-by: Andreas Hartmann <andihartmann@01019freenet.de>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Cc: stable@vger.kernel.org # 3.0+
    • hwmon: (coretemp) Use get_online_cpus to avoid races involving CPU hotplug · 641f1456
      Silas Boyd-Wickizer authored
      coretemp_init loops with for_each_online_cpu, adding platform_devices
      and sysfs interfaces, then calls register_hotcpu_notifier.  There is a
      race if a CPU is offlined or onlined after the loop, but before
      register_hotcpu_notifier.  The race might result in the absence of a
      platform_device+sysfs interface for an online CPU, or the presence of
      a platform_device+sysfs interface for an offline CPU.  A similar race
      occurs during coretemp_exit, after the module calls
      unregister_hotcpu_notifier, but before it unregisters all devices, a
      CPU might offline and a device for an offline CPU will exist for a
      short while.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier
      with get_online_cpus+put_online_cpus; and surrounds
      unregister_hotcpu_notifier and device unregistering with
      get_online_cpus+put_online_cpus.
      
      Build tested.
      Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
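      The race and its fix can be illustrated with a small single-threaded model.
      This is a sketch under stated assumptions: the model_* functions, the arrays
      and the reference counter are invented for illustration, and a hotplug attempt
      that would have to wait in a real kernel is modelled as a call that returns
      false. It is not the coretemp driver code.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NCPUS 4

static bool cpu_online[MODEL_NCPUS] = { true, true, false, false };
static bool dev_present[MODEL_NCPUS]; /* per-CPU device created by the driver */
static bool notifier_registered;
static int hotplug_refs;              /* > 0: hotplug operations have to wait */

/* Stand-ins for get_online_cpus()/put_online_cpus(). */
static void model_get_online_cpus(void) { hotplug_refs++; }
static void model_put_online_cpus(void)
{
    assert(hotplug_refs > 0);
    hotplug_refs--;
}

/* Hotplug side: returns false where a real operation would have to wait. */
static bool model_hotplug(int cpu, bool online)
{
    if (hotplug_refs > 0)
        return false;
    cpu_online[cpu] = online;
    if (notifier_registered)
        dev_present[cpu] = online;    /* the notifier keeps devices in sync */
    return true;
}

/* Driver init: scan the online CPUs and register the notifier as one step. */
static void model_driver_init(void)
{
    model_get_online_cpus();
    for (int cpu = 0; cpu < MODEL_NCPUS; cpu++)
        if (cpu_online[cpu])
            dev_present[cpu] = true;
    /* No CPU can change state between the scan above and this line. */
    notifier_registered = true;
    model_put_online_cpus();
}

/* True when every online CPU has a device and no offline CPU has one. */
static bool model_consistent(void)
{
    for (int cpu = 0; cpu < MODEL_NCPUS; cpu++)
        if (dev_present[cpu] != cpu_online[cpu])
            return false;
    return true;
}
```

      Without the get/put pair, a model_hotplug() call could land between the scan
      and the registration, leaving dev_present and cpu_online out of step, which is
      the inconsistency the commit message describes.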
    • hwmon: (via-cputemp) Use get_online_cpus to avoid races involving CPU hotplug · 1ec3ddfd
      Silas Boyd-Wickizer authored
      via_cputemp_init loops with for_each_online_cpu, adding
      platform_devices, then calls register_hotcpu_notifier.  If a CPU is
      offlined between the loop and register_hotcpu_notifier, then later
      onlined, via_cputemp_device_add will attempt to add platform devices
      with the same ID.  A similar race occurs during via_cputemp_exit,
      after the module calls unregister_hotcpu_notifier, a CPU might offline
      and a device will exist for a CPU that is offline.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier
      with get_online_cpus+put_online_cpus; and surrounds
      unregister_hotcpu_notifier and device unregistering with
      get_online_cpus+put_online_cpus.
      
      Build tested.
      Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
      Acked-by: Harald Welte <laforge@gnumonks.org>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
    • ia64: Add missing RCU idle APIs on idle loop · 93482f4e
      Paul E. McKenney authored
      Traditionally, the entire idle task served as an RCU quiescent state.
      But when RCU read side critical sections started appearing within the
      idle loop, this traditional strategy became untenable.  The fix was to
      create new RCU APIs named rcu_idle_enter() and rcu_idle_exit(), which
      must be called by each architecture's idle loop so that RCU can tell
      when it is safe to ignore a given idle CPU.
      
      Unfortunately, this fix was never applied to ia64, a shortcoming remedied
      by this commit.
      
      Reported-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Tony Luck <tony.luck@intel.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • xtensa: Add missing RCU idle APIs on idle loop · 11ad47a0
      Frederic Weisbecker authored
      In the old times, the whole idle task was considered
      as an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the xtensa's idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
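      The bracketing that this series of per-architecture commits adds can be shown
      with a userspace model of one pass through an idle loop. It is a sketch only:
      the model_* functions are hypothetical stand-ins for rcu_idle_enter(),
      rcu_idle_exit() and an architecture's low-power wait, and the assertions
      encode the rule that the bracketed region must not rely on RCU.

```c
#include <assert.h>
#include <stdbool.h>

static bool rcu_watching = true; /* model: does RCU consider this CPU active? */
static int low_power_entries;    /* how often the low-power wait ran */

/* Stand-in for rcu_idle_enter(): RCU may ignore the CPU from here on. */
static void model_rcu_idle_enter(void)
{
    assert(rcu_watching);
    rcu_watching = false;
}

/* Stand-in for rcu_idle_exit(): RCU read-side sections are legal again. */
static void model_rcu_idle_exit(void)
{
    assert(!rcu_watching);
    rcu_watching = true;
}

/* Hypothetical low-power wait; it must not contain RCU readers. */
static void model_cpu_sleep(void)
{
    assert(!rcu_watching);
    low_power_entries++;
}

/* One pass of an idle loop: bracket only the low-power part. */
static void model_idle_once(void)
{
    model_rcu_idle_enter();
    model_cpu_sleep();
    model_rcu_idle_exit();
    /* Tracing or other RCU readers may run again from this point. */
}
```

      Keeping the bracket as tight as possible around the low-power wait leaves the
      rest of the idle task free to use RCU read-side critical sections.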
    • score: Add missing RCU idle APIs on idle loop · 0ee23fda
      Frederic Weisbecker authored
      In the old times, the whole idle task was considered
      as an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in score's idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • parisc: Add missing RCU idle APIs on idle loop · fbe75218
      Frederic Weisbecker authored
      In the old times, the whole idle task was considered
      as an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the parisc's idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Parisc <linux-parisc@vger.kernel.org>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • mn10300: Add missing RCU idle APIs on idle loop · 5b0753a9
      Frederic Weisbecker authored
      In the old times, the whole idle task was considered
      as an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the mn10300's idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • m68k: Add missing RCU idle APIs on idle loop · 5b57ba37
      Frederic Weisbecker authored
      In the old times, the whole idle task was considered
      as an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the m68k's idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: m68k <linux-m68k@lists.linux-m68k.org>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • m32r: Add missing RCU idle APIs on idle loop · 48ae077c
      Frederic Weisbecker authored
      In the old times, the whole idle task was considered
      as an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the m32r's idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • h8300: Add missing RCU idle APIs on idle loop · b2fe1430
      Frederic Weisbecker authored
      In the old times, the whole idle task was considered
      as an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the h8300's idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • frv: Add missing RCU idle APIs on idle loop · 41d8fe5b
      Frederic Weisbecker authored
      In the old times, the whole idle task was considered
      as an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the Frv's idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: <stable@vger.kernel.org> # 3.3+
      Acked-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • cris: Add missing RCU idle APIs on idle loop · c633f9e7
      Frederic Weisbecker authored
      In the old times, the whole idle task was considered
      as an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the Cris's idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Cris <linux-cris-kernel@axis.com>
      Cc: <stable@vger.kernel.org> # 3.3+
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • alpha: Add missing RCU idle APIs on idle loop · 4c94cada
      Frederic Weisbecker authored
      In the old times, the whole idle task was considered
      as an RCU quiescent state. But as RCU became more and
      more successful over time, some RCU read-side critical
      sections have been added even in the code of some
      architectures' idle tasks, for tracing for example.
      
      So nowadays, rcu_idle_enter() and rcu_idle_exit() must
      be called by the architecture to tell RCU about the part
      in the idle loop that doesn't make use of rcu read side
      critical sections, typically the part that puts the CPU
      in low power mode.
      
      This is necessary for RCU to find the quiescent states in
      idle in order to complete grace periods.
      
      Add this missing pair of calls in the Alpha's idle loop.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Tested-by: Michael Cree <mcree@orcon.net.nz>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: alpha <linux-alpha@vger.kernel.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: <stable@vger.kernel.org> # 3.3+
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • alpha: Fix preemption handling in idle loop · 6a6c0272
      Frederic Weisbecker authored
      cpu_idle() is called on the boot CPU by the init code with
      preemption disabled. But the cpu_idle() function in alpha
      doesn't handle this when it calls schedule() directly.
      
      Fix it by converting it into schedule_preempt_disabled().
      
      Also disable preemption before calling cpu_idle() from
      secondary CPU entry code to stay consistent with this
      state.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Tested-by: Michael Cree <mcree@orcon.net.nz>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: alpha <linux-alpha@vger.kernel.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • Use get_online_cpus to avoid races involving CPU hotplug · 429227bb
      Silas Boyd-Wickizer authored
      If arch/x86/kernel/cpuid.c is a module, a CPU might offline or online
      between the for_each_online_cpu() loop and the call to
      register_hotcpu_notifier in cpuid_init or the call to
      unregister_hotcpu_notifier in cpuid_exit.  The potential races can
      lead to leaks/duplicates, attempts to destroy nonexistent devices, or
      random pointer dereferences.
      
      For example, in cpuid_exit if:
      
              for_each_online_cpu(cpu)
                      cpuid_device_destroy(cpu);
              class_destroy(cpuid_class);
              __unregister_chrdev(CPUID_MAJOR, 0, NR_CPUS, "cpu/cpuid");
              <----- CPU onlines
              unregister_hotcpu_notifier(&cpuid_class_cpu_notifier);
      
      the hotcpu notifier will attempt to create a device for the
      cpuid_class, which the module already destroyed.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
      unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.
      
      Tested on a VM.
      Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • Use get_online_cpus to avoid races involving CPU hotplug · a2db672a
      Silas Boyd-Wickizer authored
      If arch/x86/kernel/msr.c is a module, a CPU might offline or online
      between the for_each_online_cpu(i) loop and the call to
      register_hotcpu_notifier in msr_init or the call to
      unregister_hotcpu_notifier in msr_exit. The potential races can lead
      to leaks/duplicates, attempts to destroy nonexistent devices, or
      random pointer dereferences.
      
      For example, in msr_init if:
      
              for_each_online_cpu(i) {
                      err = msr_device_create(i);
                      if (err != 0)
                              goto out_class;
              }
              <----- CPU offlines
              register_hotcpu_notifier(&msr_class_cpu_notifier);
      
      and the CPU never onlines before msr_exit, then the module will never
      call msr_device_destroy for the associated CPU.
      
      This fix surrounds for_each_online_cpu and register_hotcpu_notifier or
      unregister_hotcpu_notifier with get_online_cpus+put_online_cpus.
      
      Tested on a VM.
      Signed-off-by: Silas Boyd-Wickizer <sbw@mit.edu>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • sched: Fix load avg vs cpu-hotplug · 5d180232
      Peter Zijlstra authored
      Rakib and Paul reported two different issues related to the same few
      lines of code.

      Rakib's issue is that the nr_uninterruptible migration code is wrong in
      that he sees artifacts due to this (Rakib please do expand in more
      detail).

      Paul's issue is that this code as it stands relies on us using
      stop_machine() for unplug; we all would like to remove this assumption
      so that eventually we can remove this stop_machine() usage altogether.

      The only reason we'd have to migrate nr_uninterruptible is so that we
      could use for_each_online_cpu() loops in favour of
      for_each_possible_cpu() loops; however, since nr_uninterruptible() is the
      only such loop and it's using possible, let's not bother at all.

      The problem Rakib sees is (probably) caused by the fact that by
      migrating nr_uninterruptible we screw rq->calc_load_active for both rqs
      involved.
      
      So don't bother with fancy migration schemes (meaning we now have to
      keep using for_each_possible_cpu()) and instead fold any nr_active delta
      after we migrate all tasks away to make sure we don't have any skewed
      nr_active accounting.
      
      [ paulmck: Move call to calc_load_migration to CPU_DEAD to avoid
      miscounting noted by Rakib. ]
      Reported-by: Rakib Mullick <rakib.mullick@gmail.com>
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
    • rcu: Disallow callback registry on offline CPUs · 0d8ee37e
      Paul E. McKenney authored
      Posting a callback after the CPU_DEAD notifier effectively leaks
      that callback unless/until that CPU comes back online.  Silence is
      unhelpful when attempting to track down such leaks, so this commit emits
      a WARN_ON_ONCE() and unconditionally leaks the callback when an offline
      CPU attempts to register a callback.  The rdp->nxttail[RCU_NEXT_TAIL] is
      set to NULL in the CPU_DEAD notifier and restored in the CPU_UP_PREPARE
      notifier, allowing _call_rcu() to determine exactly when posting callbacks
      is illegal.
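
The NULL-tail convention described above can be illustrated with a toy callback list. This is a sketch under assumptions, not the kernel's data structure: `struct cb`, `struct cb_list`, and the `cb_*` functions are hypothetical, a single tail pointer stands in for the segmented `nxttail[]` array, and the model assumes any queued callbacks have already been handed off when the CPU goes offline.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct cb {
    struct cb *next;
};

struct cb_list {
    struct cb *head;
    struct cb **tail;   /* points at the last ->next; NULL means "CPU offline" */
    long qlen;
};

/* Models the CPU_UP_PREPARE step: restore a usable tail pointer. */
void cb_list_online(struct cb_list *l)
{
    l->head = NULL;
    l->tail = &l->head;
    l->qlen = 0;
}

/* Models the CPU_DEAD step: a NULL tail marks posting as illegal. */
void cb_list_offline(struct cb_list *l)
{
    l->head = NULL;
    l->tail = NULL;
    l->qlen = 0;
}

/* Returns false (after warning once) when the callback is deliberately leaked. */
bool cb_post(struct cb_list *l, struct cb *c)
{
    static bool warned;

    c->next = NULL;
    if (l->tail == NULL) {
        if (!warned) {
            warned = true;
            fprintf(stderr, "warning: callback posted on offline CPU\n");
        }
        return false;
    }
    *l->tail = c;
    l->tail = &c->next;
    l->qlen++;
    return true;
}
```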
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Remove _rcu_barrier() dependency on __stop_machine() · 1331e7a1
      Paul E. McKenney authored
      Currently, _rcu_barrier() relies on preempt_disable() to prevent
      any CPU from going offline, which in turn depends on CPU hotplug's
      use of __stop_machine().
      
      This patch therefore makes _rcu_barrier() use get_online_cpus() to
      block CPU-hotplug operations.  This has the added benefit of removing
      the need for _rcu_barrier() to adopt callbacks:  Because CPU-hotplug
      operations are excluded, there can be no callbacks to adopt.  This
      commit simplifies the code accordingly.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Fix CONFIG_RCU_FAST_NO_HZ stall warning message · 86f343b5
      Paul E. McKenney authored
      The print_cpu_stall_fast_no_hz() function attempts to print -1 when
      the ->idle_gp_timer is not pending, but unsigned arithmetic causes it
      to instead print ULONG_MAX, which is 4294967295 on 32-bit systems and
      18446744073709551615 on 64-bit systems.  Neither of these is the most
      reader-friendly values, so this commit instead causes "timer not pending"
      to be printed when ->idle_gp_timer is not pending.
      Reported-by: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Move TINY_RCU quiescent state out of extended quiescent state · 22a76726
      Li Zhong authored
      TINY_RCU's rcu_idle_enter_common() invokes rcu_sched_qs() in order
      to inform the RCU core of the quiescent state implied by idle entry.
      Of course, idle is also an extended quiescent state, so that the call
      to rcu_sched_qs() speeds up RCU's invoking of any callbacks that might
      be queued.  This speed-up is important when entering into dyntick-idle
      mode -- if there are no further scheduling-clock interrupts, the callbacks
      might never be invoked, which could result in a system hang.
      
      However, processing callbacks does event tracing, which in turn
      implies RCU read-side critical sections, which are illegal in extended
      quiescent states.  This patch therefore moves the call to rcu_sched_qs()
      so that it precedes the point at which we inform lockdep that RCU has
      entered an extended quiescent state.
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • kmemleak: Replace list_for_each_continue_rcu with new interface · 58fac095
      Michael Wang authored
      This patch replaces list_for_each_continue_rcu() with
      list_for_each_entry_continue_rcu() to save a few lines
      of code and allow removing list_for_each_continue_rcu().
      Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • time: RCU permitted to stop idle entry via softirq · 803b0eba
      Paul E. McKenney authored
      The can_stop_idle_tick() function complains if a softirq vector is
      raised too late in the idle-entry process, presumably in order to
      prevent dangling softirq invocations from being delayed across the
      full idle period, which might be indefinitely long -- and if softirq
      was asserted any later than the call to this function, such a delay
      might well happen.
      
      However, RCU needs to be able to use softirq to stop idle entry in
      order to be able to drain RCU callbacks from the current CPU, which in
      turn enables faster entry into dyntick-idle mode, which in turn reduces
      power consumption.  Because RCU takes this action at a well-defined
      point in the idle-entry path, it is safe for RCU to take this approach.
      
      This commit therefore silences the error message that is sometimes
      produced when the going-idle CPU suddenly finds that it has an RCU_SOFTIRQ
      to process.  The error message will continue to be issued for other
      softirq vectors.
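
One way to express "complain about every pending softirq except RCU's" is a bitmask test. The sketch below uses hypothetical names and illustrative vector numbers; it is not presented as the code this commit added.

```c
#include <stdbool.h>

/* Illustrative vector numbers; only the masking idea matters here. */
enum {
    TIMER_SOFTIRQ_MODEL  = 1,
    NET_RX_SOFTIRQ_MODEL = 3,
    RCU_SOFTIRQ_MODEL    = 9
};

/* Every vector except RCU's counts as a reason to complain. */
#define STOP_IDLE_MASK_MODEL (~(1u << RCU_SOFTIRQ_MODEL))

/* pending is a bitmask with one bit per raised softirq vector. */
bool should_warn_pending_softirq(unsigned int pending)
{
    return (pending & STOP_IDLE_MASK_MODEL) != 0;
}
```

With this mask, a lone RCU softirq is tolerated, while RCU plus any other vector still triggers the warning.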
      Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Move TINY_PREEMPT_RCU away from raw_local_irq_save() · 7a11e205
      Paul E. McKenney authored
      The use of raw_local_irq_save() is unnecessary, given that local_irq_save()
      really does disable interrupts.  Also, it appears to interfere with lockdep.
      Therefore, this commit moves to local_irq_save().
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Fengguang Wu <fengguang.wu@intel.com>
    • rcu: Remove redundant memory barrier from __call_rcu() · fdab649b
      Paul E. McKenney authored
      The first memory barrier in __call_rcu() is supposed to order any
      updates done beforehand by the caller against the actual queuing
      of the callback.  However, the second memory barrier (which is intended
      to order incrementing the queue lengths before queuing the callback)
      is also between the caller's updates and the queuing of the callback.
      The second memory barrier can therefore serve both purposes.
      
      This commit therefore removes the first memory barrier.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Avoid spurious RCU CPU stall warnings · c96ea7cf
      Paul E. McKenney authored
      If a given CPU avoids the idle loop but also avoids starting a new
      RCU grace period for a full minute, RCU can issue spurious RCU CPU
      stall warnings.  This commit fixes this issue by adding a check for an
      ongoing grace period to avoid these spurious stall warnings.
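
The added condition can be sketched as "only report a stall if a grace period is actually in progress". The structure and field names below are assumptions made for illustration (a started/completed counter pair plus a deadline), not a copy of the kernel's state.

```c
#include <stdbool.h>

struct gp_state_model {
    unsigned long gpnum;          /* most recently started grace period */
    unsigned long completed;      /* most recently completed grace period */
    unsigned long stall_deadline; /* time at which a stall would be reported */
};

/* A grace period is in flight when one has started but not yet completed. */
bool gp_in_progress(const struct gp_state_model *s)
{
    return s->completed != s->gpnum;
}

bool should_report_stall(const struct gp_state_model *s, unsigned long now)
{
    /* Wrap-safe "now >= deadline" comparison on a free-running counter. */
    bool past_deadline = (long)(now - s->stall_deadline) >= 0;

    return gp_in_progress(s) && past_deadline;
}
```

Without the `gp_in_progress()` term, a stale deadline left over from an earlier grace period would be enough to trigger a warning even though nothing is stalled.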
      Reported-by: Becky Bruce <bgillbruce@gmail.com>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Protect rcu_node accesses during CPU stall warnings · c8020a67
      Paul E. McKenney authored
      The print_other_cpu_stall() function accesses a number of rcu_node
      fields without protection from the ->lock.  In theory, this is not
      a problem because the fields accessed are all integers, but in
      practice the compiler can get nasty.  Therefore, this commit extends
      the existing critical section to cover the entire loop body.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Avoid rcu_print_detail_task_stall_rnp() segfault · 5fd4dc06
      Paul E. McKenney authored
      The rcu_print_detail_task_stall_rnp() function invokes
      rcu_preempt_blocked_readers_cgp() to verify that there are some preempted
      RCU readers blocking the current grace period outside of the protection
      of the rcu_node structure's ->lock.  This means that the last blocked
      reader might exit its RCU read-side critical section and remove itself
      from the ->blkd_tasks list before the ->lock is acquired, resulting in
      a segmentation fault when the subsequent code attempts to dereference
      the now-NULL gp_tasks pointer.
      
      This commit therefore moves the test under the lock.  This will not
      have a measurable effect on lock contention because this code is invoked
      only when printing RCU CPU stall warnings, in other words, in the common
      case, never.
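
The shape of the fix, testing the pointer only while holding the lock that protects it, can be shown with a small userspace model. Everything here is hypothetical: `struct node_model`, the toy spinlock built on `atomic_flag`, and `first_blocked_reader()` stand in for the rcu_node structure, its ->lock, and the stall-printing code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node_model {
    atomic_flag lock;   /* toy spinlock standing in for ->lock */
    int *gp_tasks;      /* NULL when no reader blocks the grace period */
};

static void node_lock(struct node_model *n)
{
    while (atomic_flag_test_and_set_explicit(&n->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void node_unlock(struct node_model *n)
{
    atomic_flag_clear_explicit(&n->lock, memory_order_release);
}

/* Returns true and stores the first blocked reader's id, if there is one. */
bool first_blocked_reader(struct node_model *n, int *id_out)
{
    bool found = false;

    node_lock(n);
    if (n->gp_tasks != NULL) {   /* test and dereference under the same lock */
        *id_out = *n->gp_tasks;
        found = true;
    }
    node_unlock(n);
    return found;
}
```

Had the NULL test been made before `node_lock()`, another thread could clear `gp_tasks` between the test and the dereference, which is the race the commit message describes.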
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>