1. 30 Apr, 2012 12 commits
    • rcu: Add rcutorture test for call_srcu() · 9059c940
      Lai Jiangshan authored
      Add srcu_torture_deferred_free() for srcu_ops so as to test the new
      call_srcu().  Rename the original srcu_ops to srcu_sync_ops.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Implement per-domain single-threaded call_srcu() state machine · 931ea9d1
      Lai Jiangshan authored
      This commit implements an SRCU state machine in support of call_srcu().
      The state machine is preemptible, light-weight, and single-threaded,
      minimizing synchronization overhead.  In particular, there is no longer
      any need for synchronize_srcu() to be guarded by a mutex.
      
      Expedited processing is handled, at least in the absence of concurrent
      grace-period operations on that same srcu_struct structure, by having
      the synchronize_srcu_expedited() thread take on the role of the
      workqueue thread for one iteration.
      
      There is a reasonable probability that a given SRCU callback will
      be invoked on the same CPU that registered it; however, there is no
      guarantee.  Concurrent SRCU grace-period primitives can cause callbacks
      to be executed elsewhere, even in the absence of CPU-hotplug operations.
      
      Callbacks execute in process context, but under the influence of
      local_bh_disable(), so it is illegal to sleep in an SRCU callback
      function.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Use single value to handle expedited SRCU grace periods · d9792edd
      Lai Jiangshan authored
      The earlier algorithm used an "expedited" flag combined with a "trycount"
      counter to differentiate between normal and expedited SRCU grace periods.
      However, the difference can be encoded into a single counter with a cutoff
      value and different initial values for expedited and normal SRCU grace
      periods.  This commit makes that change.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      
      Conflicts:
      
      	kernel/srcu.c
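The single-counter idea in the d9792edd message can be illustrated with a small user-space sketch. It is not the kernel code: the names `wait_for_readers`, `drains_after`, and `struct wait_stats`, the two initial budgets, and the cutoff of zero are all assumptions made for illustration. The caller picks the initial value of one counter, and the wait loop switches from cheap retries to blocking retries once the counter reaches the cutoff.

```c
#include <stdbool.h>

/* Hypothetical initial budgets, chosen only for illustration. */
enum {
	SKETCH_NORMAL_TRYCOUNT = 2,     /* normal grace period: few cheap retries */
	SKETCH_EXPEDITED_TRYCOUNT = 12, /* expedited: more cheap retries first */
};

struct wait_stats {
	int spins;  /* retries taken while the budget was above the cutoff */
	int sleeps; /* retries taken after the budget was exhausted */
};

/*
 * Poll readers_gone() until it reports true.  A single counter encodes the
 * mode: while trycount is above the (assumed) cutoff of zero, a retry is a
 * short spin; once it reaches zero, every further retry is a sleep.
 */
static struct wait_stats wait_for_readers(bool (*readers_gone)(void *),
					  void *arg, int trycount)
{
	struct wait_stats st = { 0, 0 };

	while (!readers_gone(arg)) {
		if (trycount > 0) {
			trycount--;
			st.spins++;
		} else {
			st.sleeps++;
		}
	}
	return st;
}

/* Toy predicate: reports "readers still active" for *remaining more polls. */
static bool drains_after(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

With a predicate that needs five retries, the normal budget yields two spins followed by three sleeps, while the expedited budget never has to sleep.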
    • rcu: Improve srcu_readers_active_idx()'s cache locality · dc879175
      Lai Jiangshan authored
      Expand the calls to srcu_readers_active_idx() from srcu_readers_active()
      inline.  This change improves cache locality by iterating over the CPUs
      once rather than twice.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
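The locality argument in the dc879175 message can be sketched in user-space C. This is an illustration, not the kernel source: `struct cpu_ref`, `SKETCH_NR_CPUS`, and the function names are hypothetical, and a plain array stands in for per-CPU data. The two-pass version walks every CPU's counters once per index; the one-pass version reads both counters of a CPU while that CPU's data is already in cache.

```c
#define SKETCH_NR_CPUS 4 /* hypothetical CPU count for the sketch */

/* Stand-in for one CPU's pair of reader counters. */
struct cpu_ref {
	unsigned long c[2];
};

/* Sum one index across all CPUs. */
static unsigned long sum_idx(const struct cpu_ref *refs, int idx)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += refs[cpu].c[idx];
	return sum;
}

/* Before: two passes, so each CPU's data is visited twice. */
static unsigned long readers_active_two_pass(const struct cpu_ref *refs)
{
	return sum_idx(refs, 0) + sum_idx(refs, 1);
}

/* After: one pass that reads both counters of each CPU together. */
static unsigned long readers_active_one_pass(const struct cpu_ref *refs)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		sum += refs[cpu].c[0];
		sum += refs[cpu].c[1];
	}
	return sum;
}
```

Both functions return the same total; only the memory access pattern differs.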
    • rcu: Remove unused srcu_barrier() · 966f58c2
      Lai Jiangshan authored
      The old srcu_barrier() macro is now unused.  This commit removes it so
      that it may be used for the SRCU flavor of rcu_barrier(), which will in
      turn be needed to allow the upcoming call_srcu() to be used from within
      modules.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Implement a variant of Peter's SRCU algorithm · b52ce066
      Lai Jiangshan authored
      This commit implements a variant of Peter's algorithm, which may be found
      at https://lkml.org/lkml/2012/2/1/119.
      
      o	Make the checking lock-free to enable parallel checking.
      	Parallel checking is required when (1) the original checking
      	task is preempted for a long time, (2) synchronize_srcu_expedited()
      	starts during an ongoing SRCU grace period, or (3) we wish to
      	avoid acquiring a lock.
      
      o	Since the checking is lock-free, we avoid a mutex in the state
      	machine for call_srcu().
      
      o	Remove the SRCU_REF_MASK and remove the coupling with the flipping.
      	This might allow us to remove the preempt_disable() in future
      	versions, though such removal will need great care because it
      	rescinds the one-old-reader-per-CPU guarantee.
      
      o	Remove a smp_mb(), simplify the comments and make the smp_mb() pairs
      	more intuitive.
      Inspired-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Improve SRCU's wait_idx() comments · 18108ebf
      Lai Jiangshan authored
      The safety of SRCU is provided by wait_idx() rather than flipping.
      The flipping actually prevents starvation.
      
      This commit therefore updates the comments to more accurately and
      precisely describe what is going on.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Flip ->completed only once per SRCU grace period · 944ce9af
      Lai Jiangshan authored
      This is an optimization of the SRCU grace period.  To guard against
      preempted readers with old values of the counter, it suffices to scan the
      old counters once more, then flip ->completed only one time.  The reason
      this works is that the old readers must have incremented the old set of
      counters (if they have not yet incremented, then their critical section
      starts after this grace period, so they may be safely ignored).
      
      This commit therefore optimizes the second flip out in favor of a simple
      rescan.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Increment upper bit only for srcu_read_lock() · 440253c1
      Lai Jiangshan authored
      The purpose of the upper bit of SRCU's per-CPU counters is to guarantee
      that no reasonable series of srcu_read_lock() and srcu_read_unlock()
      operations can return the value of the counter to its original value.
      This guarantee is required only after the index has been switched to
      the other set of counters, so at most one srcu_read_lock() can affect
      a given CPU's counter.  The number of srcu_read_unlock() operations
      on a given counter is limited to the number of tasks in the system,
      which given the Linux kernel's current structure is limited to far less
      than 2^30 on 32-bit systems and far less than 2^62 on 64-bit systems.
      (Something about a limited number of bytes in the kernel's address space.)
      
      Therefore, if srcu_read_lock() increments the upper bits, then
      srcu_read_unlock() need not do so.  In this case, an srcu_read_lock() and
      an srcu_read_unlock() will flip the lower bit of the upper field of the
      counter.  An unreasonably large additional number of srcu_read_unlock()
      operations would be required to return the counter to its initial value,
      thus preserving the guarantee.
      
      This commit takes this approach, which further allows it to shrink
      the size of the upper field to one bit, making the number of
      srcu_read_unlock() operations required to return the counter to its
      initial value even more unreasonable than before.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
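The counter layout described in the 440253c1 message can be sketched as follows. This is a user-space illustration under stated assumptions, not the kernel implementation: the `SKETCH_*` macros and `sketch_*` functions are hypothetical names, and a single `unsigned long` stands in for one CPU's counter. The lock side adds one to the nesting count and also bumps the one-bit upper field; the unlock side only subtracts one.

```c
#include <limits.h>

/* Hypothetical layout: one upper "usage" bit, the rest counts nesting. */
#define SKETCH_USAGE_BITS  1
#define SKETCH_REF_MASK    (ULONG_MAX >> SKETCH_USAGE_BITS)
#define SKETCH_USAGE_COUNT (SKETCH_REF_MASK + 1)

/* Reader entry: bump the nesting count and the upper field together. */
static void sketch_read_lock(unsigned long *ctr)
{
	*ctr += SKETCH_USAGE_COUNT + 1;
}

/* Reader exit: only the nesting count changes. */
static void sketch_read_unlock(unsigned long *ctr)
{
	*ctr -= 1;
}

/* Extract the nesting count, ignoring the upper field. */
static unsigned long sketch_nesting(unsigned long ctr)
{
	return ctr & SKETCH_REF_MASK;
}
```

After one lock and unlock pair the nesting count is back to zero, but the counter as a whole differs from its starting value because the upper bit has flipped. Only an unreasonably large number of additional unlock operations could bring it back, which is the property the commit message relies on.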
    • rcu: Remove fast check path from __synchronize_srcu() · 4b7a3e9e
      Lai Jiangshan authored
      The fastpath in __synchronize_srcu() is designed to handle cases where
      there are a large number of concurrent calls for the same srcu_struct
      structure.  However, the Linux kernel currently does not use SRCU in
      this manner, so remove the fastpath checks for simplicity.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Direct algorithmic SRCU implementation · cef50120
      Paul E. McKenney authored
      The current implementation of synchronize_srcu_expedited() can cause
      severe OS jitter due to its use of synchronize_sched(), which in turn
      invokes try_stop_cpus(), which causes each CPU to be sent an IPI.
      This can result in severe performance degradation for real-time workloads
      and especially for short-iteration-length HPC workloads.  Furthermore,
      because only one instance of try_stop_cpus() can be making forward progress
      at a given time, only one instance of synchronize_srcu_expedited() can
      make forward progress at a time, even if they are all operating on
      distinct srcu_struct structures.
      
      This commit, inspired by an earlier implementation by Peter Zijlstra
      (https://lkml.org/lkml/2012/1/31/211) and by further offline discussions,
      takes a strictly algorithmic bits-in-memory approach.  This has the
      disadvantage of requiring one explicit memory-barrier instruction in
      each of srcu_read_lock() and srcu_read_unlock(), but on the other hand
      completely dispenses with OS jitter and furthermore allows SRCU to be
      used freely by CPUs that RCU believes to be idle or offline.
      
      The update-side implementation handles the single read-side memory
      barrier by rechecking the per-CPU counters after summing them and
      by running through the update-side state machine twice.
      
      This implementation has passed moderate rcutorture testing on both
      x86 and Power.  Also updated to use this_cpu_ptr() instead of per_cpu_ptr(),
      as suggested by Peter Zijlstra.
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
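The "recheck after summing" step mentioned in the cef50120 message can be sketched in user-space C11. Everything here is an assumption made for illustration: the `sketch_*` names, the four-entry array standing in for per-CPU counters, C11 atomics standing in for kernel memory barriers, and the counter layout, which is borrowed from the 440253c1 entry above rather than from this commit. The updater sums the counters, rejects the result if any reader is still counted, and then sums again to confirm that no reader entered while the first sum was in progress.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS     4
#define SKETCH_REF_MASK    (ULONG_MAX >> 1)
#define SKETCH_USAGE_COUNT (SKETCH_REF_MASK + 1)

/* Stand-in for per-CPU counters; static storage starts out zeroed. */
static atomic_ulong sketch_ctr[SKETCH_NR_CPUS];

/* Reader entry on a given CPU: nesting +1 plus an upper-field bump. */
static void sketch_read_lock(int cpu)
{
	atomic_fetch_add(&sketch_ctr[cpu], SKETCH_USAGE_COUNT + 1);
}

/* Reader exit, possibly on a different CPU than the entry. */
static void sketch_read_unlock(int cpu)
{
	atomic_fetch_sub(&sketch_ctr[cpu], 1UL);
}

static unsigned long sketch_sum(void)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += atomic_load(&sketch_ctr[cpu]);
	return sum;
}

/* True only if no reader is counted and the sum is stable across a recheck. */
static bool sketch_readers_quiesced(void)
{
	unsigned long v = sketch_sum();

	if ((v & SKETCH_REF_MASK) != 0)
		return false; /* at least one reader is still inside */
	atomic_thread_fence(memory_order_seq_cst);
	/* A reader that entered during the first sum changes the upper field. */
	return sketch_sum() == v;
}
```

Because only the sum matters, a reader that locks on one CPU and unlocks on another still cancels out: the individual counters wrap, but their modular sum returns to a zero nesting count.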
    • rcu: Introduce rcutorture testing for rcu_barrier() · fae4b54f
      Paul E. McKenney authored
      Although rcutorture does invoke rcu_barrier() and friends, it cannot
      really be called a torture test given that it invokes them only once
      at the end of the test.  This commit therefore introduces heavy-duty
      rcutorture testing for rcu_barrier(), which may be carried out
      concurrently with normal rcutorture testing.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  2. 25 Apr, 2012 1 commit
  3. 21 Apr, 2012 23 commits
  4. 20 Apr, 2012 4 commits
    • Merge tag 'for-torvalds-20120418' of... · 3b422e9c
      Linus Torvalds authored
      Merge tag 'for-torvalds-20120418' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
      
      Pull pinctrl fixes from Linus Walleij:
       - Fixed compilation errors and warnings
       - Stricter checks on the ops vtable
      
      * tag 'for-torvalds-20120418' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: implement pinctrl_check_ops
        pinctrl: include <linux/bug.h> to prevent compile errors
        pinctrl: fix compile error if not select PINMUX support
    • Merge tag 'tty-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 3a537430
      Linus Torvalds authored
      Pull 3 tiny tty bugfixes from Greg Kroah-Hartman.
      
      * tag 'tty-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        drivers/tty/amiserial.c: add missing tty_unlock
        pch_uart: Fix dma channel unallocated issue
        ARM: clps711x: serial driver hungs are a result of call disable_irq within ISR
    • Merge tag 'usb-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 1cd653a6
      Linus Torvalds authored
      Pull USB fixes from Greg Kroah-Hartman:
       "Here are a number of tiny USB fixes for 3.4-rc4.
      
        Most of them are in the USB gadget area, but a few other minor USB
        driver and core fixes are here as well.
      
        Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
      
      * tag 'usb-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (36 commits)
        USB: serial: cp210x: Fixed usb_control_msg timeout values
        USB: ehci-tegra: don't call set_irq_flags(IRQF_VALID)
        USB: yurex: Fix missing URB_NO_TRANSFER_DMA_MAP flag in urb
        USB: yurex: Remove allocation of coherent buffer for setup-packet buffer
        drivers/usb/misc/usbtest.c: add kfrees
        USB: ehci-fsl: Fix kernel crash on mpc5121e
        uwb: fix error handling
        uwb: fix use of del_timer_sync() in interrupt
        EHCI: always clear the STS_FLR status bit
        EHCI: fix criterion for resuming the root hub
        USB: sierra: avoid QMI/wwan interface on MC77xx
        usb: usbtest: avoid integer overflow in alloc_sglist()
        usb: usbtest: avoid integer overflow in test_ctrl_queue()
        USB: fix deadlock in bConfigurationValue attribute method
        usb: gadget: eliminate NULL pointer dereference (bugfix)
        usb: gadget: uvc: Remove non-required locking from 'uvc_queue_next_buffer' routine
        usb: gadget: rndis: fix Missing req->context assignment
        usb: musb: omap: fix the error check for pm_runtime_get_sync
        usb: gadget: udc-core: fix asymmetric calls in remove_driver
        usb: musb: omap: fix crash when musb glue (omap) gets initialized
        ...
    • Merge tag 'stable/for-linus-3.4-rc3-tag' of... · c1acb0ba
      Linus Torvalds authored
      Merge tag 'stable/for-linus-3.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
      
      Pull xen fixes from Konrad Rzeszutek Wilk:
       - mechanism to work with misconfigured backends (where they are
         advertised but in reality don't exist).
       - two tiny compile warning fixes.
       - proper error handling in gnttab_resume
       - Not using VM_PFNMAP anymore to allow backends in the same domain.
      
      * tag 'stable/for-linus-3.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
        Revert "xen/p2m: m2p_find_override: use list_for_each_entry_safe"
        xen/resume: Fix compile warnings.
        xen/xenbus: Add quirk to deal with misconfigured backends.
        xen/blkback: Fix warning error.
        xen/p2m: m2p_find_override: use list_for_each_entry_safe
        xen/gntdev: do not set VM_PFNMAP
        xen/grant-table: add error-handling code on failure of gnttab_resume