1. 02 Mar, 2024 3 commits
    • dm-verity: Convert from tasklet to BH workqueue · c375b223
      Tejun Heo authored
      The only generic interface to execute asynchronously in the BH context is
      tasklet; however, it's marked deprecated and has some design flaws. To
      replace tasklets, BH workqueue support was recently added. A BH workqueue
      behaves similarly to regular workqueues except that the queued work items
      are executed in the BH context.
      
      This commit converts dm-verity from tasklet to BH workqueue. It
      backfills the tasklet code that was removed with commit 0a9bab39
      ("dm-crypt, dm-verity: disable tasklets") and tweaks it to use a BH
      workqueue (and does some renaming).
      
      This is a minimal conversion which doesn't rename related identifiers,
      including the "try_verify_in_tasklet" option. If this patch is applied, a
      follow-up patch will be needed; I couldn't decide whether the option
      name should be updated too.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      [snitzer: rename 'use_tasklet' to 'use_bh_wq' and 'in_tasklet' to 'in_bh']
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
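      As a rough illustration of the conversion pattern described above (a
      hedged sketch, not the actual dm-verity diff; the field and function
      names here are hypothetical):

```c
/* Before: deferring completion work to BH context with a (deprecated)
 * tasklet. */
tasklet_init(&io->tasklet, verity_tasklet_fn, (unsigned long)io);
tasklet_schedule(&io->tasklet);

/* After: a regular work item queued on the system BH workqueue; the
 * callback still runs in BH (softirq) context, like the tasklet did. */
INIT_WORK(&io->bh_work, verity_bh_work_fn);
queue_work(system_bh_wq, &io->bh_work);
```

      The appeal of the conversion is that the call sites keep the familiar
      workqueue API; only the target workqueue changes.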
    • dm-crypt: Convert from tasklet to BH workqueue · fb6ad4ae
      Tejun Heo authored
      The only generic interface to execute asynchronously in the BH context is
      tasklet; however, it's marked deprecated and has some design flaws. To
      replace tasklets, BH workqueue support was recently added. A BH workqueue
      behaves similarly to regular workqueues except that the queued work items
      are executed in the BH context.
      
      This commit converts dm-crypt from tasklet to BH workqueue. It
      backfills the tasklet code that was removed with commit 0a9bab39
      ("dm-crypt, dm-verity: disable tasklets") and tweaks it to use a BH
      workqueue.
      
      Like a regular workqueue, a BH workqueue allows freeing the currently
      executing work item. Converting from tasklet to BH workqueue removes the
      need for deferring bio_endio() again to a work item, which was buggy anyway.
      
      I tested this lightly with "--perf-no_read_workqueue
      --perf-no_write_workqueue" + some code modifications, but would really
      appreciate it if someone who knows the code base better could take a look.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Link: http://lkml.kernel.org/r/82b964f0-c2c8-a2c6-5b1f-f3145dc2c8e5@redhat.com
      [snitzer: rebase on top of commit 0a9bab39 reduced this commit's changes]
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
  2. 29 Feb, 2024 1 commit
    • workqueue: Drain BH work items on hot-unplugged CPUs · 1acd92d9
      Tejun Heo authored
      Boqun pointed out that workqueues aren't handling BH work items on offlined
      CPUs. Unlike tasklets, which transfer out their pending tasks in
      CPUHP_SOFTIRQ_DEAD, BH workqueues would just leave them pending, which is
      problematic. Note that this behavior is specific to BH workqueues, as
      non-BH per-CPU workers simply become unbound when the CPU goes offline.
      
      This patch fixes the issue by draining the pending BH work items of an
      offlined CPU from CPUHP_SOFTIRQ_DEAD. Because work items carry more context,
      it's not as easy to transfer the pending work items from one pool to
      another. Instead, the BH work items of the offlined pools are executed on
      an online CPU.
      
      Note that this assumes no further BH work items will be queued on the
      offlined CPUs. This assumption is shared with tasklet and should be fine for
      conversions. However, the issue also exists for per-CPU workqueues, which
      will just keep executing work items queued after CPU offline on unbound
      workers; workqueue should reject per-CPU and BH work items queued on
      offline CPUs. This will be addressed separately later.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-and-reviewed-by: Boqun Feng <boqun.feng@gmail.com>
      Link: http://lkml.kernel.org/r/Zdvw0HdSXcU3JZ4g@boqun-archlinux
  3. 27 Feb, 2024 1 commit
    • workqueue: Introduce from_work() helper for cleaner callback declarations · 60b2ebf4
      Allen Pais authored
      To streamline the transition from tasklets to workqueues, a new helper
      function, from_work(), is introduced. This helper, inspired by existing
      from_*() patterns, utilizes container_of() and eliminates the redundancy
      of declaring variable types, leading to more concise and readable code.
      
      The modified code snippet demonstrates the enhanced clarity achieved
      with from_work():
      
        void callback(struct work_struct *w)
        {
        -	struct some_data_structure *local = container_of(w,
        						struct some_data_structure,
        						work);
        +	struct some_data_structure *local = from_work(local, w, work);
        }
      
      This change aims to facilitate a smoother transition and uphold code
      quality standards.
      
      Based on:
        git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git disable_work-v3
      Signed-off-by: Allen Pais <allen.lkml@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  4. 22 Feb, 2024 1 commit
    • workqueue: Control intensive warning threshold through cmdline · ccdec921
      Xuewen Yan authored
      When CONFIG_WQ_CPU_INTENSIVE_REPORT is set, the kernel reports work
      functions which repeatedly violate intensive_threshold_us. Currently,
      a warning is only triggered once the violation count exceeds 4 and is
      a power of 2.
      
      However, even a single long work execution can delay other work for a
      long time, which may also cause problems.
      
      To allow the warning threshold to be controlled freely, add a boot
      argument so that the user can choose the threshold at which the warning
      is printed. At the same time, keep the exponential backoff to prevent
      reporting too much.
      
      By default, the warning threshold is 4.
      
      tj: Updated kernel-parameters.txt description.
      Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
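      A hedged usage example (the parameter name below is the one documented
      in kernel-parameters.txt for this feature; double-check it against your
      kernel version):

```
# Boot with a lower warning threshold so a single over-threshold
# execution is reported, instead of waiting for the default of 4:
workqueue.cpu_intensive_warning_thresh=1
```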
  5. 21 Feb, 2024 10 commits
  6. 20 Feb, 2024 15 commits
  7. 16 Feb, 2024 1 commit
    • workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK · fd0a68a2
      Tejun Heo authored
      2f34d733 ("workqueue: Fix queue_work_on() with BH workqueues") added
      irq_work usage to workqueue; however, it turns out irq_work is actually
      optional and the change breaks the build on configurations which don't
      have CONFIG_IRQ_WORK enabled.
      
      Fix build by making workqueue use irq_work only when CONFIG_SMP and enabling
      CONFIG_IRQ_WORK when CONFIG_SMP is set. It's reasonable to argue that it may
      be better to just always enable it. However, this still saves a small bit of
      memory for tiny UP configs and also the least amount of change, so, for now,
      let's keep it conditional.
      
      Verified to do the right thing for x86_64 allnoconfig and defconfig, and
      aarch64 allnoconfig, allnoconfig + printk disabled (SMP but nothing selects
      IRQ_WORK), and a modified aarch64 Kconfig where !SMP and nothing selects
      IRQ_WORK.
      
      v2: `depends on SMP` leads to Kconfig warnings when CONFIG_IRQ_WORK is
          selected by something else when !CONFIG_SMP. Use `def_bool y if SMP`
          instead.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Tested-by: Anders Roxell <anders.roxell@linaro.org>
      Fixes: 2f34d733 ("workqueue: Fix queue_work_on() with BH workqueues")
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
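      The v2 note above corresponds to a Kconfig fragment along these lines
      (a sketch of the approach, not the verbatim hunk): IRQ_WORK defaults to
      enabled on SMP without a hard `depends on SMP`, so other options that
      select IRQ_WORK on !SMP don't trigger Kconfig warnings.

```
config IRQ_WORK
	def_bool y if SMP
```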
  8. 14 Feb, 2024 1 commit
    • workqueue: Fix queue_work_on() with BH workqueues · 2f34d733
      Tejun Heo authored
      When queue_work_on() is used to queue a BH work item on a remote CPU, the
      work item is queued on that CPU but kick_pool() raises softirq on the local
      CPU. This leads to stalls as the work item won't be executed until something
      else on the remote CPU schedules a BH work item or tasklet locally.
      
      Fix it by bouncing the softirq raising to the target CPU using per-CPU
      irq_work.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Fixes: 4cb1ef64 ("workqueue: Implement BH workqueues to eventually replace tasklets")
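      A hedged sketch of the fix's shape (names and the exact softirq used
      here are illustrative, not the verbatim patch): instead of raising the
      softirq locally, kick the remote CPU with a per-CPU irq_work whose
      handler raises the softirq on that CPU.

```c
/* Runs on the target CPU via IPI, so the softirq is raised there. */
static void bh_pool_kick_fn(struct irq_work *irq_work)
{
	raise_softirq_irqoff(TASKLET_SOFTIRQ);
}

static DEFINE_PER_CPU(struct irq_work, bh_pool_kick) =
	IRQ_WORK_INIT_HARD(bh_pool_kick_fn);

/* In kick_pool(), for a BH pool on a remote CPU, something like: */
irq_work_queue_on(per_cpu_ptr(&bh_pool_kick, pool->cpu), pool->cpu);
```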
  9. 09 Feb, 2024 3 commits
  10. 08 Feb, 2024 4 commits
    • workqueue: Bind unbound workqueue rescuer to wq_unbound_cpumask · 49584bb8
      Waiman Long authored
      Commit 85f0ab43 ("kernel/workqueue: Bind rescuer to unbound
      cpumask for WQ_UNBOUND") modified init_rescuer() to bind the rescuer of
      an unbound workqueue to the cpumask in wq->unbound_attrs. However,
      unbound_attrs->cpumask of every workqueue is initialized to
      cpu_possible_mask and only changes if the workqueue has the WQ_SYSFS
      flag, which exposes a cpumask sysfs file that users can write to. So the
      patch doesn't achieve what it was intended to do.
      
      If an unbound workqueue is created after wq_unbound_cpumask is modified
      and there is no more unbound cpumask update after that, the unbound
      rescuer will be bound to all CPUs unless the workqueue is created
      with the WQ_SYSFS flag and a user explicitly modified its cpumask
      sysfs file.  Fix this problem by binding directly to wq_unbound_cpumask
      in init_rescuer().
      
      Fixes: 85f0ab43 ("kernel/workqueue: Bind rescuer to unbound cpumask for WQ_UNBOUND")
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • kernel/workqueue: Let rescuers follow unbound wq cpumask changes · d64f2fa0
      Juri Lelli authored
      When workqueue cpumask changes are committed, the affinity of the
      associated rescuer (if one exists) is not touched, and this might be a
      problem down the line for isolated setups.
      
      Make sure rescuer affinity is updated every time a workqueue cpumask
      changes, so that rescuers can't break isolation.
      
       [longman: set_cpus_allowed_ptr() will block until the designated task
        is enqueued on an allowed CPU, no wake_up_process() needed. Also use
        the unbound_effective_cpumask() helper as suggested by Tejun.]
      Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
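      A minimal sketch of the update described above, assuming it runs where
      cpumask changes are committed (unbound_effective_cpumask() is the helper
      mentioned in the note; surrounding locking and context are omitted):

```c
/* Move the rescuer whenever the workqueue's effective unbound cpumask
 * changes. set_cpus_allowed_ptr() blocks until the task is running on
 * an allowed CPU, so no explicit wake_up_process() is needed. */
if (wq->rescuer)
	set_cpus_allowed_ptr(wq->rescuer->task,
			     unbound_effective_cpumask(wq));
```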
    • workqueue: Enable unbound cpumask update on ordered workqueues · 4c065dbc
      Waiman Long authored
      Ordered workqueues do not currently follow changes made to the
      global unbound cpumask because per-pool workqueue changes may break
      the ordering guarantee. IOW, a work function in an ordered workqueue
      may run on an isolated CPU.
      
      This patch enables ordered workqueues to follow changes made to the
      global unbound cpumask by temporarily plugging (suspending) the newly
      allocated pool_workqueue, preventing it from executing newly queued
      work items until the old pwq has been properly drained. For ordered
      workqueues, only one pwq should be unplugged at a time; the rest should
      be plugged.
      
      This enables ordered workqueues to follow unbound cpumask changes
      like other unbound workqueues, at the expense of some delay in the
      execution of work functions during the transition period.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: Link pwq's into wq->pwqs from oldest to newest · 26fb7e3d
      Waiman Long authored
      Add a new pwq into the tail of wq->pwqs so that pwq iteration will
      start from the oldest pwq to the newest. This ordering will facilitate
      the inclusion of ordered workqueues in a wq_unbound_cpumask update.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>