percpu-refcount: use RCU-sched insted of normal RCU
percpu-refcount was incorrectly using preempt_disable/enable() for RCU critical sections against call_rcu(). 6a24474d ("percpu-refcount: consistently use plain (non-sched) RCU") fixed it by converting the preepmtion operations with rcu_read_[un]lock() citing that there isn't any advantage in using sched-RCU over using the usual one; however, rcu_read_[un]lock() for the preemptible RCU implementation - CONFIG_TREE_PREEMPT_RCU, chosen when CONFIG_PREEMPT - are slightly more expensive than preempt_disable/enable(). In a contrived microbench which repeats the followings, - percpu_ref_get() - copy 32 bytes of data into percpu buffer - percpu_put_get() - copy 32 bytes of data into percpu buffer rcu_read_[un]lock() used in percpu_ref_get/put() makes it go slower by about 15% when compared to using sched-RCU. As the RCU critical sections are extremely short, using sched-RCU shouldn't have any latency implications. Convert to RCU-sched. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Kent Overstreet <koverstreet@google.com> Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Rusty Russell <rusty@rustcorp.com.au>
Showing
Please register or sign in to comment