1. 25 Mar, 2013 4 commits
    • workqueue: protect wq->pwqs and iteration with wq->mutex · b09f4fd3
      Lai Jiangshan authored
      We're expanding wq->mutex to cover all fields specific to each
      workqueue with the end goal of replacing pwq_lock which will make
      locking simpler and easier to understand.
      
      init_and_link_pwq() and pwq_unbound_release_workfn() already grab
      wq->mutex when adding or removing a pwq from wq->pwqs list.  This
      patch makes it official that the list is wq->mutex protected for
      writes and updates readers accordingly.  Explicit IRQ toggles for
      sched-RCU read-locking in flush_workqueue_prep_pwqs() and
      drain_workqueues() are removed as the surrounding wq->mutex can
      provide sufficient synchronization.
      
      Also, assert_rcu_or_pwq_lock() is renamed to assert_rcu_or_wq_mutex()
      and checks for wq->mutex too.
      
      pwq_lock locking and assertion are not removed by this patch and a
      couple of for_each_pwq() iterations are still protected by it.
      They'll be removed by future patches.
      
      tj: Rebased on top of the current dev branch.  Updated description.
          Folded in assert_rcu_or_wq_mutex() renaming from a later patch
          along with associated comment updates.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: protect wq->nr_drainers and ->flags with wq->mutex · 87fc741e
      Lai Jiangshan authored
      We're expanding wq->mutex to cover all fields specific to each
      workqueue with the end goal of replacing pwq_lock which will make
      locking simpler and easier to understand.
      
      wq->nr_drainers and ->flags are specific to each workqueue.  Protect
      ->nr_drainers and ->flags with wq->mutex instead of pool_mutex.
      
      tj: Rebased on top of the current dev branch.  Updated description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: rename wq->flush_mutex to wq->mutex · 3c25a55d
      Lai Jiangshan authored
      Currently wq->flush_mutex protects many fields of a workqueue
      including, especially, the pwqs list.  We're going to expand this
      mutex to protect most of a workqueue and eventually replace pwq_lock,
      which will make locking simpler and easier to understand.
      
      Drop the "flush_" prefix in preparation.
      
      This patch is pure rename.
      
      tj: Rebased on top of the current dev branch.  Updated description.
          Use WQ: and WR: instead of Q: and QR: for synchronization labels.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: rename wq_mutex to wq_pool_mutex · 68e13a67
      Lai Jiangshan authored
      wq->flush_mutex will be renamed to wq->mutex and cover all fields
      specific to each workqueue and eventually replace pwq_lock, which will
      make locking simpler and easier to understand.
      
      Rename wq_mutex to wq_pool_mutex to avoid confusion with wq->mutex.
      After the scheduled changes, wq_pool_mutex won't be protecting
      anything specific to each workqueue instance anyway.
      
      This patch is pure rename.
      
      tj: s/wqs_mutex/wq_pool_mutex/.  Rewrote description.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
  2. 20 Mar, 2013 5 commits
  3. 19 Mar, 2013 5 commits
    • workqueue: restore CPU affinity of unbound workers on CPU_ONLINE · 7dbc725e
      Tejun Heo authored
      With the recent addition of the custom attributes support, unbound
      pools may have allowed cpumask which isn't full.  As long as some of
      CPUs in the cpumask are online, its workers will maintain cpus_allowed
      as set on worker creation; however, once no online CPU is left in
      cpus_allowed, the scheduler will reset cpus_allowed of any workers
      which get scheduled so that they can execute.
      
      To remain compliant with the user-specified configuration, CPU
      affinity needs to be restored when a CPU comes online for an unbound
      pool which had no online CPUs before.
      
      This patch implements restore_unbound_workers_cpumask(), which is
      called from CPU_ONLINE for all unbound pools, checks whether the
      coming up CPU is the first allowed online one, and, if so, invokes
      set_cpus_allowed_ptr() with the configured cpumask on all workers.
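
      The "first allowed online CPU" check described above can be sketched
      in userspace C.  This is only an illustration: cpu_bits and
      first_allowed_cpu_online() are hypothetical names, a plain 64-bit
      word stands in for a cpumask, and the online mask is assumed to
      already include the CPU that is coming up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t cpu_bits;

/*
 * Returns true when @cpu is allowed by the pool and is the only CPU in
 * (@allowed & @online), i.e. it is the first allowed CPU to come online.
 * Assumption: @online already includes @cpu.
 */
static bool first_allowed_cpu_online(cpu_bits allowed, cpu_bits online,
				     unsigned int cpu)
{
	cpu_bits self;

	assert(cpu < 64);		/* limit of this toy mask type */
	self = (cpu_bits)1 << cpu;

	if (!(allowed & self))
		return false;		/* the pool may not run on this CPU */
	return (allowed & online) == self;
}
```

      Only when this kind of check succeeds would the configured cpumask
      need to be reapplied to the pool's workers.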
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: directly restore CPU affinity of workers from CPU_ONLINE · a9ab775b
      Tejun Heo authored
      Rebinding workers of a per-cpu pool after a CPU comes online involves
      a lot of back-and-forth mostly because only the task itself could
      adjust CPU affinity if PF_THREAD_BOUND was set.
      
      As CPU_ONLINE itself couldn't adjust affinity, it had to somehow
      coerce the workers themselves to perform set_cpus_allowed_ptr().  Due
      to the various states a worker can be in, this led to three different
      paths by which a worker may be rebound.  worker->rebind_work is queued
      to busy workers.  Idle ones are signaled by unlinking worker->entry and
      call idle_worker_rebind().  The manager isn't covered by either and
      implements its own mechanism.
      
      PF_THREAD_BOUND has been replaced with PF_NO_SETAFFINITY and CPU_ONLINE
      itself now can manipulate CPU affinity of workers.  This patch
      replaces the existing rebind mechanism with a direct one where
      CPU_ONLINE iterates over all workers using for_each_pool_worker(),
      restores CPU affinity, and clears WORKER_UNBOUND.
      
      There are a couple subtleties.  All bound idle workers should have
      their runqueues set to that of the bound CPU; however, if the target
      task isn't running, set_cpus_allowed_ptr() just updates the
      cpus_allowed mask deferring the actual migration to when the task
      wakes up.  This is worked around by waking up idle workers after
      restoring CPU affinity before any workers can become bound.
      
      Another subtlety stems from matching @pool->nr_running with the
      number of running unbound workers.  While DISASSOCIATED, all workers
      are unbound and nr_running is zero.  As workers become bound again,
      nr_running needs to be adjusted accordingly; however, there is no good
      way to tell whether a given worker is running without poking into
      scheduler internals.  Instead of clearing UNBOUND directly,
      rebind_workers() replaces UNBOUND with another new NOT_RUNNING flag -
      REBOUND, which will later be cleared by the workers themselves while
      preparing for the next round of work item execution.  The only change
      needed for the workers is clearing REBOUND along with PREP.
      
      * This patch leaves for_each_busy_worker() without any user.  Removed.
      
      * idle_worker_rebind(), busy_worker_rebind_fn(), worker->rebind_work
        and rebind logic in manager_workers() removed.
      
      * worker_thread() now looks at WORKER_DIE instead of testing whether
        @worker->entry is empty to determine whether it needs to do
        something special as dying is the only special thing now.
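
      The UNBOUND to REBOUND hand-off described in this message can be
      modelled as a small flag exercise.  The sketch below is not the
      kernel code: the bit values, struct worker layout and helper names
      are assumptions made for illustration; only the flag names come from
      the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values; the names follow the text, the bits are assumed. */
#define WORKER_PREP		(1u << 0)
#define WORKER_UNBOUND		(1u << 1)
#define WORKER_REBOUND		(1u << 2)
#define WORKER_NOT_RUNNING	(WORKER_PREP | WORKER_UNBOUND | WORKER_REBOUND)

struct worker {
	unsigned int flags;
};

/* A worker counts toward nr_running only when no NOT_RUNNING flag is set. */
static bool worker_counts_as_running(const struct worker *w)
{
	return !(w->flags & WORKER_NOT_RUNNING);
}

/* Hotplug side: swap UNBOUND for REBOUND so the worker stays "not running". */
static void rebind_one(struct worker *w)
{
	assert(w->flags & WORKER_UNBOUND);
	w->flags |= WORKER_REBOUND;	/* set first so no gap is visible */
	w->flags &= ~WORKER_UNBOUND;
}

/* Worker side: drop PREP and REBOUND together before running work items. */
static void worker_start_round(struct worker *w)
{
	w->flags &= ~(WORKER_PREP | WORKER_REBOUND);
}
```

      Because REBOUND is itself a NOT_RUNNING flag, the hotplug path never
      has to guess whether the worker is running; the worker settles the
      accounting itself when it clears the flag.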
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: relocate rebind_workers() · bd7c089e
      Tejun Heo authored
      rebind_workers() will be reimplemented in a way which makes it mostly
      decoupled from the rest of worker management.  Move rebind_workers()
      so that it's located with other CPU hotplug related functions.
      
      This patch is pure function relocation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: convert worker_pool->worker_ida to idr and implement for_each_pool_worker() · 822d8405
      Tejun Heo authored
      Make worker_ida an idr - worker_idr and use it to implement
      for_each_pool_worker() which will be used to simplify worker rebinding
      on CPU_ONLINE.
      
      pool->worker_idr is protected by both pool->manager_mutex and
      pool->lock so that it can be iterated while holding either lock.
      
      * create_worker() allocates ID without installing worker pointer and
        installs the pointer later using idr_replace().  This is because
        worker ID is needed when creating the actual task to name it and the
        new worker shouldn't be visible to iterations before fully
        initialized.
      
      * In destroy_worker(), ID removal is moved before kthread_stop().
        This is again to guarantee that only fully working workers are
        visible to for_each_pool_worker().
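
      The reserve-then-install ordering in the two bullets above can be
      sketched with a toy ID table.  This does not use the kernel idr API;
      struct id_table and the id_*() helpers are hypothetical stand-ins
      that only show why a reserved slot with no pointer stays invisible
      to iteration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDS 8

/* Hypothetical stand-in for an idr: a slot is free, reserved, or filled. */
struct id_table {
	int   used[MAX_IDS];	/* 1 once the ID has been handed out */
	void *ptr[MAX_IDS];	/* NULL until the object is installed */
};

/* Reserve the lowest free ID without publishing any pointer; -1 if full. */
static int id_reserve(struct id_table *t)
{
	for (int id = 0; id < MAX_IDS; id++) {
		if (!t->used[id]) {
			t->used[id] = 1;
			t->ptr[id] = NULL;
			return id;
		}
	}
	return -1;
}

/* Publish the fully initialized object under an ID reserved earlier. */
static void id_install(struct id_table *t, int id, void *obj)
{
	assert(id >= 0 && id < MAX_IDS && t->used[id]);
	t->ptr[id] = obj;
}

/* Unpublish and free the ID; done before the object is torn down. */
static void id_remove(struct id_table *t, int id)
{
	assert(id >= 0 && id < MAX_IDS);
	t->used[id] = 0;
	t->ptr[id] = NULL;
}

/* Count what an iteration would see: reserved-but-empty slots are skipped. */
static int id_count_visible(const struct id_table *t)
{
	int n = 0;

	for (int id = 0; id < MAX_IDS; id++)
		if (t->used[id] && t->ptr[id])
			n++;
	return n;
}
```

      The ID is available early (for naming the task), while the object
      only becomes visible once id_install() runs and disappears again
      before teardown starts.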
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • sched: replace PF_THREAD_BOUND with PF_NO_SETAFFINITY · 14a40ffc
      Tejun Heo authored
      PF_THREAD_BOUND was originally used to mark kernel threads which were
      bound to a specific CPU using kthread_bind() and a task with the flag
      set allows cpus_allowed modifications only to itself.  Workqueue is
      currently abusing it to prevent userland from meddling with
      cpus_allowed of workqueue workers.
      
      What we need is a flag to prevent userland from messing with
      cpus_allowed of certain kernel tasks.  In kernel, anyone can
      (incorrectly) squash the flag, and, for worker-type usages,
      restricting cpus_allowed modification to the task itself doesn't
      provide meaningful extra protection as other tasks can inject work
      items to the task anyway.
      
      This patch replaces PF_THREAD_BOUND with PF_NO_SETAFFINITY.
      sched_setaffinity() checks the flag and returns -EINVAL if set.
      set_cpus_allowed_ptr() is no longer affected by the flag.
      
      This will allow simplifying workqueue worker CPU affinity management.
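
      The split between the userland-facing path and the in-kernel path
      can be modelled as follows.  This is a userspace sketch, not the
      scheduler code: struct task, the model_*() helpers and the flag
      value are assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>

#define PF_NO_SETAFFINITY 0x04000000u	/* value assumed for illustration */

/* Hypothetical, heavily reduced task descriptor. */
struct task {
	unsigned int  flags;
	unsigned long cpus_allowed;
};

/* Models the in-kernel path: it does not look at the flag at all. */
static int model_set_cpus_allowed(struct task *p, unsigned long mask)
{
	assert(mask != 0);	/* an empty mask is out of scope here */
	p->cpus_allowed = mask;
	return 0;
}

/* Models the userland-facing path: refuses tasks that carry the flag. */
static int model_sched_setaffinity(struct task *p, unsigned long mask)
{
	if (p->flags & PF_NO_SETAFFINITY)
		return -EINVAL;
	return model_set_cpus_allowed(p, mask);
}
```

      With this split, kernel code such as a hotplug callback can adjust a
      worker's affinity directly while userland requests are still
      rejected.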
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
  4. 14 Mar, 2013 7 commits
    • workqueue: rename workqueue_lock to wq_mayday_lock · 2e109a28
      Tejun Heo authored
      With the recent locking updates, the only thing protected by
      workqueue_lock is workqueue->maydays list.  Rename workqueue_lock to
      wq_mayday_lock.
      
      This patch is pure rename.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: separate out pool_workqueue locking into pwq_lock · 794b18bc
      Tejun Heo authored
      This patch continues locking cleanup from the previous patch.  It
      breaks out pool_workqueue synchronization from workqueue_lock into a
      new spinlock - pwq_lock.  The following are protected by pwq_lock.
      
      * workqueue->pwqs
      * workqueue->saved_max_active
      
      The conversion is straight-forward.  workqueue_lock usages which cover
      the above two are converted to pwq_lock.  New locking label PW added
      for things protected by pwq_lock and FR is updated to mean flush_mutex
      + pwq_lock + sched-RCU.
      
      This patch shouldn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: separate out pool and workqueue locking into wq_mutex · 5bcab335
      Tejun Heo authored
      Currently, workqueue_lock protects most shared workqueue resources -
      the pools, workqueues, pool_workqueues, draining, ID assignments,
      mayday handling and so on.  The coverage has grown organically and
      there is no identified bottleneck coming from workqueue_lock, but it
      has grown a bit too much and scheduled rebinding changes need the
      pools and workqueues to be protected by a mutex instead of a spinlock.
      
      This patch breaks out pool and workqueue synchronization from
      workqueue_lock into a new mutex - wq_mutex.  The following are
      protected by wq_mutex.
      
      * worker_pool_idr and unbound_pool_hash
      * pool->refcnt
      * workqueues list
      * workqueue->flags, ->nr_drainers
      
      Most changes are straightforward.  workqueue_lock is replaced
      with wq_mutex where applicable and workqueue_lock lock/unlocks are
      added where wq_mutex conversion leaves data structures not protected
      by wq_mutex without locking.  irq / preemption flippings were added
      where the conversion affects them.  Things worth noting are
      
      * New WQ and WR locking labels added along with
        assert_rcu_or_wq_mutex().
      
      * worker_pool_assign_id() now expects to be called under wq_mutex.
      
      * create_mutex is removed from get_unbound_pool().  It now just holds
        wq_mutex.
      
      This patch shouldn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: relocate global variable defs and function decls in workqueue.c · 7d19c5ce
      Tejun Heo authored
      They're split across debugobj code for some reason.  Collect them.
      
      This patch is pure relocation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: better define locking rules around worker creation / destruction · cd549687
      Tejun Heo authored
      When a manager creates or destroys workers, the operations are always
      done with the manager_mutex held; however, initial worker creation or
      worker destruction during pool release don't grab the mutex.  They are
      still correct as initial worker creation doesn't require
      synchronization and grabbing manager_arb provides enough exclusion for
      pool release path.
      
      Still, let's make everyone follow the same rules for consistency and
      such that lockdep annotations can be added.
      
      Update create_and_start_worker() and put_unbound_pool() to grab
      manager_mutex around thread creation and destruction respectively and
      add lockdep assertions to create_worker() and destroy_worker().
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: factor out initial worker creation into create_and_start_worker() · ebf44d16
      Tejun Heo authored
      get_unbound_pool(), workqueue_cpu_up_callback() and init_workqueues()
      have similar code pieces to create and start the initial worker.
      Factor those out into create_and_start_worker().
      
      This patch doesn't introduce any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: rename worker_pool->assoc_mutex to ->manager_mutex · bc3a1afc
      Tejun Heo authored
      Manager operations are currently governed by two mutexes -
      pool->manager_arb and ->assoc_mutex.  The former is used to decide who
      gets to be the manager and the latter to exclude the actual manager
      operations including creation and destruction of workers.  Anyone who
      grabs ->manager_arb must perform manager role; otherwise, the pool
      might stall.
      
      Grabbing ->assoc_mutex blocks everyone else from performing manager
      operations but doesn't require the holder to perform manager duties as
      it's merely blocking manager operations without becoming the manager.
      
      Because the blocking was necessary when [dis]associating per-cpu
      workqueues during CPU hotplug events, the latter was named
      assoc_mutex.  The mutex is scheduled to be used for other purposes, so
      this patch gives it a more fitting generic name - manager_mutex - and
      updates / adds comments to explain synchronization around the manager
      role and operations.
      
      This patch is pure rename / doc update.
      Signed-off-by: Tejun Heo <tj@kernel.org>
  5. 13 Mar, 2013 7 commits
    • workqueue: inline trivial wrappers · 8425e3d5
      Tejun Heo authored
      There's no reason to make these trivial wrappers full (exported)
      functions.  Inline the following.
      
       queue_work()
       queue_delayed_work()
       mod_delayed_work()
       schedule_work_on()
       schedule_work()
       schedule_delayed_work_on()
       schedule_delayed_work()
       keventd_up()
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: rename @id to @pi in for_each_pool() · 611c92a0
      Tejun Heo authored
      Rename @id argument of for_each_pool() to @pi so that it doesn't get
      reused accidentally when for_each_pool() is used in combination with
      other iterators.
      
      This patch is purely cosmetic.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: update comments and a warning message · c5aa87bb
      Tejun Heo authored
      * Update incorrect and add missing synchronization labels.
      
      * Update incorrect or misleading comments.  Add new comments where
        clarification is necessary.  Reformat / rephrase some comments.
      
      * drain_workqueue() can be used separately from destroy_workqueue()
        but its warning message was incorrectly referring to destruction.
      
      Other than the warning message change, this patch doesn't make any
      functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: fix max_active handling in init_and_link_pwq() · 983ca25e
      Tejun Heo authored
      Since 9e8cd2f5 ("workqueue: implement apply_workqueue_attrs()"),
      init_and_link_pwq() may be called to initialize a new pool_workqueue
      for a workqueue which is already online, but the function was setting
      pwq->max_active to wq->saved_max_active without proper
      synchronization.
      
      Fix it by calling pwq_adjust_max_active() under proper locking instead
      of manually setting max_active.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: implement and use pwq_adjust_max_active() · 699ce097
      Tejun Heo authored
      Rename pwq_set_max_active() to pwq_adjust_max_active() and move
      pool_workqueue->max_active synchronization and max_active
      determination logic into it.
      
      The new function should be called with workqueue_lock held for stable
      workqueue->saved_max_active, determines the current max_active value
      the target pool_workqueue should be using from @wq->saved_max_active
      and the state of the associated pool, and applies it with proper
      synchronization.
      
      The current two users - workqueue_set_max_active() and
      thaw_workqueues() - are updated accordingly.  In addition, the manual
      freezing handling in __alloc_workqueue_key() and
      freeze_workqueues_begin() are replaced with calls to
      pwq_adjust_max_active().
      
      This centralizes max_active handling so that it's less error-prone.
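
      A simplified model of the centralized decision is sketched below.
      The rule used here (a freezable workqueue on a freezing pool gets
      max_active 0, everything else gets saved_max_active) is an
      assumption made for illustration, as are the model_* types; locking
      and any follow-up activation of pending work items are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced versions of the structures named in the text. */
struct model_wq {
	bool freezable;
	int  saved_max_active;
};

struct model_pool {
	bool freezing;
};

struct model_pwq {
	struct model_wq   *wq;
	struct model_pool *pool;
	int                max_active;
};

/*
 * Derive max_active from the workqueue's saved value and the pool state.
 * Assumed rule: freezable workqueue + freezing pool => 0, else saved value.
 */
static void model_adjust_max_active(struct model_pwq *pwq)
{
	assert(pwq->wq->saved_max_active > 0);

	if (pwq->wq->freezable && pwq->pool->freezing)
		pwq->max_active = 0;
	else
		pwq->max_active = pwq->wq->saved_max_active;
}
```

      Callers that change either input (the saved value or the pool state)
      call the one helper again instead of writing max_active themselves.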
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: relocate pwq_set_max_active() · 0fbd95aa
      Tejun Heo authored
      pwq_set_max_active() is gonna be modified and used during
      pool_workqueue init.  Move it above init_and_link_pwq().
      
      This patch is pure code reorganization and doesn't introduce any
      functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: implement current_is_workqueue_rescuer() · e6267616
      Tejun Heo authored
      Implement a function which queries whether it currently is running off
      a workqueue rescuer.  This will be used to convert writeback to
      workqueue.
      Signed-off-by: Tejun Heo <tj@kernel.org>
  6. 12 Mar, 2013 12 commits
    • workqueue: implement sysfs interface for workqueues · 226223ab
      Tejun Heo authored
      There are cases where workqueue users want to expose control knobs to
      userland.  e.g. Unbound workqueues with custom attributes are
      scheduled to be used for writeback workers and depending on
      configuration it can be useful to allow admins to tinker with the
      priority or allowed CPUs.
      
      This patch implements workqueue_sysfs_register(), which makes the
      workqueue visible under /sys/bus/workqueue/devices/WQ_NAME.  There
      currently are two attributes common to both per-cpu and unbound pools
      and extra attributes for unbound pools including nice level and
      cpumask.
      
      If alloc_workqueue*() is called with WQ_SYSFS,
      workqueue_sysfs_register() is called automatically as part of
      workqueue creation.  This is the preferred method unless the workqueue
      user wants to apply workqueue_attrs before making the workqueue
      visible to userland.
      
      v2: Disallow exposing ordered workqueues as ordered workqueues can't
          be tuned in any way.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • driver/base: implement subsys_virtual_register() · d73ce004
      Tejun Heo authored
      Kay tells me the most appropriate place to expose workqueues to
      userland would be /sys/devices/virtual/workqueues/WQ_NAME which is
      symlinked to /sys/bus/workqueue/devices/WQ_NAME and that we're lacking
      a way to do that outside of driver core as virtual_device_parent()
      isn't exported and there's no interface to conveniently create a
      virtual subsystem.
      
      This patch implements subsys_virtual_register() by factoring out
      subsys_register() from subsys_system_register() and using it with
      virtual_device_parent() as the origin directory.  It's identical to
      subsys_system_register() other than the origin directory but we aren't
      gonna restrict the device names which should be used under it.
      
      This will be used to expose workqueue attributes to userland.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
    • cpumask: implement cpumask_parse() · ba630e49
      Tejun Heo authored
      We have cpulist_parse() but not cpumask_parse().  Implement it using
      bitmap_parse().
      
      bitmap_parse() is weird in that it takes @len for a string in
      kernel-memory which also is inconsistent with bitmap_parselist().
      Make cpumask_parse() calculate the length and don't expose the
      inconsistency to cpumask users.  Maybe we can fix up bitmap_parse()
      later.
      
      This will be used to expose workqueue cpumask knobs to userland via
      sysfs.
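
      The idea of hiding the length argument behind a wrapper can be
      sketched in userspace.  Both functions below are hypothetical:
      parse_hex_n() is a toy length-taking parser for a single hex word
      (no comma-separated chunks, no overflow check) and parse_hex() is the
      wrapper that computes the length for its callers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy length-taking parser: accepts hex digits only, rejects empty input. */
static int parse_hex_n(const char *buf, size_t len, unsigned long *out)
{
	unsigned long v = 0;

	if (len == 0)
		return -1;
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)buf[i];

		if (!isxdigit(c))
			return -1;
		v <<= 4;
		v |= (unsigned long)(isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
	}
	*out = v;
	return 0;
}

/* Wrapper: callers pass a NUL-terminated string and never see the length. */
static int parse_hex(const char *buf, unsigned long *out)
{
	assert(buf != NULL && out != NULL);
	return parse_hex_n(buf, strlen(buf), out);
}
```

      Keeping the length calculation inside the wrapper means the odd
      signature of the underlying parser can be changed later without
      touching any caller.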
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
    • workqueue: reject adjusting max_active or applying attrs to ordered workqueues · 8719dcea
      Tejun Heo authored
      Adjusting max_active of or applying new workqueue_attrs to an ordered
      workqueue breaks its ordering guarantee.  The former is obvious.  The
      latter is because applying attrs creates a new pwq (pool_workqueue)
      and there is no ordering constraint between the old and new pwqs.
      
      Make apply_workqueue_attrs() and workqueue_set_max_active() trigger
      WARN_ON() if those operations are requested on an ordered workqueue
      and fail / ignore respectively.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: make it clear that WQ_DRAINING is an internal flag · 618b01eb
      Tejun Heo authored
      We're gonna add another internal WQ flag.  Let's make the distinction
      clear.  Prefix WQ_DRAINING with __ and move it to bit 16.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: implement apply_workqueue_attrs() · 9e8cd2f5
      Tejun Heo authored
      Implement apply_workqueue_attrs() which applies workqueue_attrs to the
      specified unbound workqueue by creating a new pwq (pool_workqueue)
      linked to worker_pool with the specified attributes.
      
      A new pwq is linked at the head of wq->pwqs instead of tail and
      __queue_work() verifies that the first unbound pwq has positive refcnt
      before choosing it for the actual queueing.  This is to cover the case
      where creation of a new pwq races with queueing.  As base ref on a pwq
      won't be dropped without making another pwq the first one,
      __queue_work() is guaranteed to make progress and not add work item to
      a dead pwq.
      
      init_and_link_pwq() is updated to return the previous first pwq the new
      pwq replaced, which is put by apply_workqueue_attrs().
      
      Note that apply_workqueue_attrs() is almost identical to unbound pwq
      part of alloc_and_link_pwqs().  The only difference is that there is
      no previous first pwq.  apply_workqueue_attrs() is implemented to
      handle such cases and replaces unbound pwq handling in
      alloc_and_link_pwqs().
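
      The head-linking and refcount check described above can be modelled
      with a singly linked list.  This is a single-threaded sketch under
      assumed names (model_pwq, model_wq, link_at_head(), pick_pwq()); it
      ignores locking and only shows why queueing always finds a live
      first pwq or knows to retry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced pool_workqueue: just a refcount and a list link. */
struct model_pwq {
	int               refcnt;
	struct model_pwq *next;
};

struct model_wq {
	struct model_pwq *first;	/* head of the pwqs list */
};

/* Link a new pwq at the head and return the previous first pwq (or NULL). */
static struct model_pwq *link_at_head(struct model_wq *wq, struct model_pwq *pwq)
{
	struct model_pwq *prev = wq->first;

	assert(pwq->refcnt > 0);	/* a new pwq starts with its base ref */
	pwq->next = prev;
	wq->first = pwq;
	return prev;
}

/* Queueing side: use the first pwq only if it is alive; NULL means retry. */
static struct model_pwq *pick_pwq(const struct model_wq *wq)
{
	struct model_pwq *pwq = wq->first;

	if (pwq && pwq->refcnt > 0)
		return pwq;
	return NULL;
}
```

      Since the old pwq's base reference is dropped only after the new one
      has become first, a queuer that sees a dead first pwq can simply look
      again and will find the replacement.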
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: perform non-reentrancy test when queueing to unbound workqueues too · c9178087
      Tejun Heo authored
      Because per-cpu workqueues have multiple pwqs (pool_workqueues) to
      serve the CPUs, to guarantee that a single work item isn't queued on
      one pwq while still executing another, __queue_work() takes a look at
      the previous pool the target work item was on and if it's still
      executing there, queue the work item on that pool.
      
      To support changing workqueue_attrs on the fly, unbound workqueues too
      will have multiple pwqs and thus need non-reentrancy test when
      queueing.  This patch modifies __queue_work() such that the reentrancy
      test is performed regardless of the workqueue type.
      
      per_cpu_ptr(wq->cpu_pwqs, cpu) used to be used to determine the
      matching pwq for the last pool.  This can't be used for unbound
      workqueues and is replaced with worker->current_pwq which also happens
      to be simpler.
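
      The non-reentrancy test boils down to one decision at queueing time,
      sketched below.  The names are hypothetical and the model tracks a
      single running work item per pool, which is far simpler than the real
      bookkeeping; it only illustrates the rule "stay on the last pool
      while the item is still executing there".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool: remembers the one work item it is executing, if any. */
struct model_pool {
	const void *running_work;
};

/*
 * Choose where to queue @work.  If the pool it was last on is still
 * executing it, queue there so the item never runs in two places at once;
 * otherwise use the pool the caller would normally pick.
 */
static struct model_pool *pick_pool(const void *work,
				    struct model_pool *last,
				    struct model_pool *preferred)
{
	assert(work != NULL && preferred != NULL);

	if (last && last->running_work == work)
		return last;
	return preferred;
}
```

      Nothing in this check depends on whether the workqueue is per-cpu or
      unbound, which is why the same test can serve both.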
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: prepare flush_workqueue() for dynamic creation and destruction of unbound pool_workqueues · 75ccf595
      Tejun Heo authored
      Unbound pwqs (pool_workqueues) will be dynamically created and
      destroyed with the scheduled unbound workqueue w/ custom attributes
      support.  This patch synchronizes pwq linking and unlinking against
      flush_workqueue() so that its operation isn't disturbed by pwqs coming
      and going.
      
      Linking and unlinking a pwq into wq->pwqs is now protected also by
      wq->flush_mutex and a new pwq's work_color is initialized to
      wq->work_color during linking.  This ensures that pwqs changes don't
      disturb flush_workqueue() in progress and the new pwq's work coloring
      stays in sync with the rest of the workqueue.
      
      flush_mutex during unlinking isn't strictly necessary but it's simpler
      to do it anyway.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: implement get/put_pwq() · 8864b4e5
      Tejun Heo authored
      Add pool_workqueue->refcnt along with get/put_pwq().  Both per-cpu and
      unbound pwqs have refcnts and any work item inserted on a pwq
      increments the refcnt which is dropped when the work item finishes.
      
      For per-cpu pwqs the base ref is never dropped and destroy_workqueue()
      frees the pwqs as before.  For unbound ones, destroy_workqueue()
      simply drops the base ref on the first pwq.  When the refcnt reaches
      zero, pwq_unbound_release_workfn() is scheduled on system_wq, which
      unlinks the pwq, puts the associated pool and frees the pwq and wq as
      necessary.  This needs to be done from a work item as put_pwq() needs
      to be protected by pool->lock but release can't happen with the lock
      held - e.g. put_unbound_pool() involves blocking operations.
      
      Unbound pool->locks are marked with lockdep subclass 1 as put_pwq()
      will schedule the release work item on system_wq while holding the
      unbound pool's lock, which would otherwise trigger a spurious
      recursive locking warning.
      
      This will be used to implement dynamic creation and destruction of
      unbound pwqs.
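
      The reference counting and deferred release can be sketched as
      follows.  This is a single-threaded model with assumed names; the
      release_scheduled flag merely stands in for scheduling the release
      work item, and no locking is shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced pool_workqueue with only the refcount state. */
struct model_pwq {
	int  refcnt;		/* starts at 1: the base reference */
	bool release_scheduled;	/* stands in for queueing the release work */
};

/* Taken for every work item inserted on the pwq. */
static void model_get_pwq(struct model_pwq *pwq)
{
	assert(pwq->refcnt > 0);	/* never revive a dead pwq */
	pwq->refcnt++;
}

/*
 * Dropped when a work item finishes or when the base ref is released.
 * The last put does not free anything itself; it only defers the release,
 * because release may block and the caller holds a lock.
 */
static void model_put_pwq(struct model_pwq *pwq)
{
	assert(pwq->refcnt > 0);
	if (--pwq->refcnt == 0)
		pwq->release_scheduled = true;
}
```

      In this model a pwq with work still in flight survives the drop of
      its base reference and is released only when the last work item
      finishes.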
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: restructure __alloc_workqueue_key() · d2c1d404
      Tejun Heo authored
      * Move initialization and linking of pool_workqueues into
        init_and_link_pwq().
      
      * Make the failure path use destroy_workqueue() once pool_workqueue
        initialization succeeds.
      
      These changes are to prepare for dynamic management of pool_workqueues
      and don't introduce any functional changes.
      
      While at it, convert list_del(&wq->list) to list_del_init() as a
      precaution as scheduled changes will make destruction more complex.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    • workqueue: drop WQ_RESCUER and test workqueue->rescuer for NULL instead · 493008a8
      Tejun Heo authored
      WQ_RESCUER is superfluous.  WQ_MEM_RECLAIM indicates that the user
      wants a rescuer and testing wq->rescuer for NULL can answer whether a
      given workqueue has a rescuer or not.  Drop WQ_RESCUER and test
      wq->rescuer directly.
      
      This will help simplify the __alloc_workqueue_key() failure path by
      allowing it to use destroy_workqueue() on a partially constructed
      workqueue, which in turn will help implement dynamic management of
      pool_workqueues.
      
      While at it, clear wq->rescuer after freeing it in
      destroy_workqueue().  This is a precaution as scheduled changes will
      make destruction more complex.
      
      This patch doesn't introduce any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>