1. 24 Sep, 2022 2 commits
  2. 23 Sep, 2022 3 commits
  3. 07 Sep, 2022 1 commit
  4. 06 Sep, 2022 3 commits
    • cgroup: Remove CFTYPE_PRESSURE · 8a693f77
      Tejun Heo authored
      CFTYPE_PRESSURE is used to flag PSI related files so that they are not
      created if PSI is disabled during boot. It's a bit weird to use a generic
      flag to mark a specific file type. Let's instead move the PSI files into their
      own cftypes array and add/rm them conditionally. This is a bit more code but
      cleaner.
      
      No userland visible changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
    • cgroup: Improve cftype add/rm error handling · 0083d27b
      Tejun Heo authored
      Let's track whether a cftype is currently added or not using a new flag
      __CFTYPE_ADDED so that duplicate operations fail safely and empty cftypes
      can be used consistently.
      Signed-off-by: Tejun Heo <tj@kernel.org>
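The flag-based tracking described in this commit can be illustrated with a small userspace model. This is only a sketch of the idea, not the kernel code: the `CfType` class, the `add_cftype`/`rm_cftype` helpers, and the bit value chosen for `CFTYPE_ADDED` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Hypothetical bit value; the commit message does not state the real one.
CFTYPE_ADDED = 1 << 0


@dataclass
class CfType:
    """Toy stand-in for a control-file type descriptor."""
    name: str
    flags: int = 0


def add_cftype(cft: CfType) -> bool:
    """Mark cft as added; return False if it is already added."""
    if cft.flags & CFTYPE_ADDED:
        return False
    cft.flags |= CFTYPE_ADDED
    return True


def rm_cftype(cft: CfType) -> bool:
    """Clear the added mark; return False if cft is not currently added."""
    if not cft.flags & CFTYPE_ADDED:
        return False
    cft.flags &= ~CFTYPE_ADDED
    return True
```

With a single bit recording the state, a second add or a second remove is rejected instead of silently corrupting state.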
    • cpuset: Add Waiman Long as a cpuset maintainer · a81e18e9
      Tejun Heo authored
      Waiman has been very active with cpuset recently and I've been cc'ing him
      for cpuset related changes for a while now. Let's make him a cpuset
      maintainer.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Zefan Li <lizefan.x@bytedance.com>
      Cc: Waiman Long <longman@redhat.com>
  5. 04 Sep, 2022 16 commits
    • kselftest/cgroup: Add cpuset v2 partition root state test · a8c52eba
      Waiman Long authored
      Add a test script test_cpuset_prs.sh with a helper program wait_inotify
      for exercising the cpuset v2 partition root state code.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup/cpuset: Update description of cpuset.cpus.partition in cgroup-v2.rst · 8cbfdc24
      Waiman Long authored
      Update Documentation/admin-guide/cgroup-v2.rst on the newly introduced
      "isolated" cpuset partition type as well as other changes made in other
      cpuset patches.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup/cpuset: Make partition invalid if cpumask change violates exclusivity rule · d7c8142d
      Waiman Long authored
      Currently, changes to "cpuset.cpus" of a partition root are not allowed if
      they violate the sibling cpu exclusivity rule when the check is done
      in the validate_change() function. That is inconsistent with the
      other cpuset changes that are always allowed but may make a partition
      invalid.
      
      Update the cpuset code to allow cpumask change even if it violates the
      sibling cpu exclusivity rule, but invalidate the partition instead
      just like the other changes. However, other sibling partitions with
      conflicting cpumask will also be invalidated in order not to violate
      the exclusivity rule. This behavior is specific to this partition
      rule violation.
      
      Note that a previous commit has made sibling cpu exclusivity rule check
      the last check of validate_change(). So if -EINVAL is returned, we can
      be sure that sibling cpu exclusivity rule violation is the only rule
      that is broken.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup/cpuset: Relocate a code block in validate_change() · 74027a65
      Waiman Long authored
      This patch moves down the exclusive cpu and memory check in
      validate_change(). There is no functional change.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup/cpuset: Show invalid partition reason string · 7476a636
      Waiman Long authored
      There are a number of different reasons which can cause a partition to
      become invalid. A user seeing an invalid partition may not know exactly
      why. To help users better understand the underlying reason, the
      cpuset.cpus.partition control file, when read, will now report the
      reason why a partition became invalid. When a partition does become
      invalid, reading the control file will show "root invalid (<reason>)"
      where <reason> is a string that describes why the partition is invalid.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
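A monitoring script could split the new output format into its parts. The sketch below assumes the text read from cpuset.cpus.partition has the shape "root invalid (<reason>)" described above. The helper name `parse_partition_state` is hypothetical, and accepting a state word other than "root" before "invalid" is an assumption, not something the commit message states.

```python
import re
from typing import Optional, Tuple

# Matches "<state> invalid (<reason>)". The set of possible reason strings is
# not listed in the commit message, so the reason is treated as opaque text.
_INVALID_RE = re.compile(r"^(?P<state>\S+) invalid \((?P<reason>.*)\)$")


def parse_partition_state(text: str) -> Tuple[str, bool, Optional[str]]:
    """Return (state, is_invalid, reason) for one read of the control file."""
    line = text.strip()
    match = _INVALID_RE.match(line)
    if match:
        return match.group("state"), True, match.group("reason")
    # Any other content is passed through as a plain, valid state.
    return line, False, None
```

For example, `parse_partition_state(open(path).read())` would let a caller log the reason only when the second element is true.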
    • cgroup/cpuset: Add a new isolated cpus.partition type · f28e2244
      Waiman Long authored
      Cpuset v1 uses the sched_load_balance control file to determine if load
      balancing should be enabled.  Cpuset v2 gets rid of sched_load_balance
      as its use may require disabling load balancing at cgroup root.
      
      For workloads that require very low latency like DPDK, the latency
      jitters caused by periodic load balancing may exceed the desired
      latency limit.
      
      When cpuset v2 is in use, the only way to avoid this latency cost is to
      use the "isolcpus=" kernel boot option to isolate a set of CPUs. After
      the kernel boot, however, there is no way to add or remove CPUs from
      this isolated set. For workloads that are more dynamic in nature, that
      means users have to provision enough CPUs for the worst case situation
      resulting in excess idle CPUs.
      
      To address this issue for cpuset v2, a new cpuset.cpus.partition type
      "isolated" is added which allows the creation of a cpuset partition
      without load balancing. This will allow system administrators to
      dynamically adjust the size of isolated partition to the current need
      of the workload without rebooting the system.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
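From userspace, switching a cpuset to the new type amounts to writing a string to its cpuset.cpus.partition file. The following is a minimal sketch under that assumption: the helper name `set_partition_type` is hypothetical, the cgroup directory path is supplied by the caller, and on a real system the write needs suitable privileges and may still leave the partition invalid.

```python
from pathlib import Path

# "member" and "root" are the existing types; "isolated" is added by this commit.
PARTITION_TYPES = ("member", "root", "isolated")


def set_partition_type(cgroup_dir: str, partition_type: str) -> None:
    """Write a partition type to <cgroup_dir>/cpuset.cpus.partition."""
    if partition_type not in PARTITION_TYPES:
        raise ValueError(f"unknown partition type: {partition_type!r}")
    Path(cgroup_dir, "cpuset.cpus.partition").write_text(partition_type)
```

Because the type is just file content, an administrator can grow or shrink the isolated CPU set at run time by adjusting "cpuset.cpus" of the same cgroup, which is the dynamic behaviour the commit message contrasts with "isolcpus=".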
    • cgroup/cpuset: Relax constraints to partition & cpus changes · f0af1bfc
      Waiman Long authored
      Currently, enabling a partition root is only allowed if all the
      constraints of a valid partition are satisfied. Even changes to
      "cpuset.cpus" may not be allowed in some cases. Moreover, there are
      limits to changes made to a parent cpuset if it is a valid partition
      root. This is contrary to the general cgroup v2 philosophy.
      
      This patch relaxes the constraints of changing the state of "cpuset.cpus"
      and "cpuset.cpus.partition". Now all valid changes ("member" or "root")
      to "cpuset.cpus.partition" are allowed even if there are child cpusets
      underneath it.
      
      Trying to make a cpuset a partition root, however, will cause its state
      to become invalid if the following constraints of a valid partition
      root are not satisfied.
      
       1) The "cpuset.cpus" is non-empty and exclusive.
       2) The parent cpuset is a valid partition root.
       3) The "cpuset.cpus" overlaps parent's "cpuset.cpus".
      
      Similarly, almost all changes to "cpuset.cpus" are allowed with the
      exception that if the underlying CS_CPU_EXCLUSIVE flag is set, the
      exclusivity rule will still apply.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
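The three constraints listed in this commit can be read as a simple predicate over CPU sets. The sketch below is a userspace model only, not the kernel's validate_change() logic. The function name and parameters are hypothetical, and "exclusive" is interpreted here as sharing no CPU with any sibling cpuset, which is an assumption.

```python
from typing import Iterable, Set


def is_valid_partition_root(
    cpus: Set[int],
    parent_cpus: Set[int],
    parent_is_valid_root: bool,
    sibling_cpus: Iterable[Set[int]],
) -> bool:
    """Check the three constraints of a valid partition root."""
    # 1) "cpuset.cpus" is non-empty and exclusive (no CPU shared with a sibling).
    if not cpus or any(cpus & other for other in sibling_cpus):
        return False
    # 2) The parent cpuset is a valid partition root.
    if not parent_is_valid_root:
        return False
    # 3) "cpuset.cpus" overlaps the parent's "cpuset.cpus".
    return bool(cpus & parent_cpus)
```

Under the relaxed rules, a request that fails this predicate is still accepted; the cpuset simply ends up as an invalid partition root instead of the write being rejected.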
    • cgroup/cpuset: Allow no-task partition to have empty cpuset.cpus.effective · e2d59900
      Waiman Long authored
      Currently, a partition root cannot have empty "cpuset.cpus.effective".
      As a result, a parent partition root cannot distribute out all its
      CPUs to child partitions with no CPUs left. However in most cases,
      there shouldn't be any tasks associated with intermediate nodes of the
      default hierarchy. So the current rule is too restrictive and can waste
      valuable CPU resource.
      
      To address this issue, we are now allowing a partition to have empty
      "cpuset.cpus.effective" as long as it has no task. Since cpuset is
      threaded, no-internal-process rule does not apply. So it is possible
      to have tasks in a partition root with child sub-partitions even though
      that should be uncommon.
      
      A parent partition with no task can now have all its CPUs distributed out
      to its child partitions. The top cpuset always has some house-keeping
      tasks running and so its list of effective CPUs can't be empty.
      
      Once a partition with empty "cpuset.cpus.effective" is formed, no
      new task can be moved into it until "cpuset.cpus.effective" becomes
      non-empty.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup/cpuset: Miscellaneous cleanups & add helper functions · 18065ebe
      Waiman Long authored
      The partition root state (PRS) macro names do not currently match the
      external names. Change them to match the external names and add helper
      functions to read or change the state.
      
      Shorten the cpuset argument of update_parent_subparts_cpumask() to cs
      to match other cpuset functions.
      
      Remove the new_prs argument from notify_partition_change() as the
      cs->partition_root_state has already been set to new_prs before it
      is called.
      
      There is no functional change.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset · ec5fbdfb
      Waiman Long authored
      Previously, update_tasks_cpumask() was not supposed to be called with
      top cpuset. With cpuset partition that takes CPUs away from the top
      cpuset, adjusting the cpus_mask of the tasks in the top cpuset is
      necessary. Percpu kthreads, however, are ignored.
      
      Fixes: ee8dde0c ("cpuset: Add new v2 cpuset.sched.partition flag")
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup: add pids.peak interface for pids controller · 5251c6c4
      Josh Don authored
      pids.peak tracks the high watermark of usage for number of pids. This
      helps give a better baseline on which to set pids.max. Polling
      pids.current isn't really feasible, since it would potentially miss
      short-lived spikes.
      
      This interface is analogous to memory.peak.
      Signed-off-by: Josh Don <joshdon@google.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
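The argument that polling pids.current can miss short-lived spikes is easy to see in a toy model. The sketch below is not the kernel's pids controller: the `PidCounter` class and its `charge`/`uncharge` methods are hypothetical names. It only shows how a high watermark updated on every charge captures a spike that falls between two polls.

```python
class PidCounter:
    """Toy model of a pid counter that also keeps a high watermark."""

    def __init__(self) -> None:
        self.current = 0
        self.peak = 0

    def charge(self, n: int = 1) -> None:
        self.current += n
        # The watermark only ever moves up.
        self.peak = max(self.peak, self.current)

    def uncharge(self, n: int = 1) -> None:
        self.current -= n


counter = PidCounter()
samples = []

counter.charge(2)
samples.append(counter.current)  # first poll of "current"
counter.charge(50)               # short-lived spike ...
counter.uncharge(50)             # ... gone before the next poll
samples.append(counter.current)  # second poll of "current"
```

Here every polled sample is 2, while `counter.peak` is 52, which is the figure an administrator would want when choosing a value for pids.max.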
    • cgroup: Remove data-race around cgrp_dfl_visible · dc79ec1b
      Tejun Heo authored
      There's a seemingly harmless data-race around cgrp_dfl_visible detected by
      kernel concurrency sanitizer. Let's remove it by throwing WRITE/READ_ONCE at
      it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Abhishek Shah <abhishek.shah@columbia.edu>
      Cc: Gabriel Ryan <gabe@cs.columbia.edu>
      Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      Link: https://lore.kernel.org/netdev/20220819072256.fn7ctciefy4fc4cu@wittgenstein/
    • Merge tag 'powerpc-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 59954972
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix handling of PCI domains in /proc on 32-bit systems using the
         recently added support for numbering buses from zero for each domain.
      
       - A fix and a revert for some changes to use READ/WRITE_ONCE() which
         caused problems with KASAN enabled due to sanitisation calls being
         introduced in low-level paths that can't cope with it.
      
       - Fix build errors on 32-bit caused by the syscall table being
         misaligned sometimes.
      
       - Two fixes to get IBM Cell native machines booting again, which had
         bit-rotted while my QS22 was temporarily out of action.
      
       - Fix the papr_scm driver to not assume the order of events returned by
         the hypervisor is stable, and a related compile fix.
      
      Thanks to Aneesh Kumar K.V, Christophe Leroy, Jordan Niethe, Kajol Jain,
      Masahiro Yamada, Nathan Chancellor, Pali Rohár, Vaibhav Jain, and Zhouyi
      Zhou.
      
      * tag 'powerpc-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/papr_scm: Ensure rc is always initialized in papr_scm_pmu_register()
        Revert "powerpc/irq: Don't open code irq_soft_mask helpers"
        powerpc: Fix hard_irq_disable() with sanitizer
        powerpc/rtas: Fix RTAS MSR[HV] handling for Cell
        Revert "powerpc: Remove unused FW_FEATURE_NATIVE references"
        powerpc: align syscall table for ppc32
        powerpc/pci: Enable PCI domains in /proc when PCI bus numbers are not unique
        powerpc/papr_scm: Fix nvdimm event mappings
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 685ed983
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "s390:
      
         - PCI interpretation compile fixes
      
        RISC-V:
      
         - fix unused variable warnings in vcpu_timer.c
      
         - move extern sbi_ext declarations to a header
      
        x86:
      
         - check validity of argument to KVM_SET_MP_STATE
      
         - use guest's global_ctrl to completely disable guest PEBS
      
         - fix a memory leak on memory allocation failure
      
         - mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES
      
         - fix build failure with Clang integrated assembler
      
         - fix MSR interception
      
         - always flush TLBs when enabling dirty logging"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: check validity of argument to KVM_SET_MP_STATE
        perf/x86/core: Completely disable guest PEBS via guest's global_ctrl
        KVM: x86: fix memoryleak in kvm_arch_vcpu_create()
        KVM: x86: Mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES
        KVM: s390: pci: Hook to access KVM lowlevel from VFIO
        riscv: kvm: move extern sbi_ext declarations to a header
        riscv: kvm: vcpu_timer: fix unused variable warnings
        KVM: selftests: Fix ambiguous mov in KVM_ASM_SAFE()
        KVM: selftests: Fix KVM_EXCEPTION_MAGIC build with Clang
        KVM: VMX: Heed the 'msr' argument in msr_write_intercepted()
        kvm: x86: mmu: Always flush TLBs when enabling dirty logging
        kvm: x86: mmu: Drop the need_remote_flush() function
    • Makefile.extrawarn: re-enable -Wformat for clang; take 2 · b0839b28
      Nick Desaulniers authored
      -Wformat was recently re-enabled for builds with clang, then quickly
      re-disabled, due to concerns stemming from the frequency of default
      argument promotion related warning instances.
      
      commit 258fafcd ("Makefile.extrawarn: re-enable -Wformat for clang")
      commit 21f9c8a1 ("Revert "Makefile.extrawarn: re-enable -Wformat for clang"")
      
      ISO WG14 has ratified N2562 to address default argument promotion
      explicitly for printf, as part of the upcoming ISO C2X standard.
      
      The behavior of clang was changed in clang-16 to not warn for the cited
      cases in all language modes.
      
      Add a version check, so that users of clang-16 now get the full effect
      of -Wformat. For older clang versions, re-enable flags under the
      -Wformat group so that users still get some useful checks related to
      format strings, without noisy default argument promotion warnings. I
      intentionally omitted -Wformat-y2k and -Wformat-security from being
      re-enabled, which are also part of -Wformat in clang-16.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/378
      Link: https://github.com/llvm/llvm-project/issues/57102
      Link: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2562.pdf
      Suggested-by: Justin Stitt <jstitt007@gmail.com>
      Suggested-by: Nathan Chancellor <nathan@kernel.org>
      Suggested-by: Youngmin Nam <youngmin.nam@samsung.com>
      Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'gpio-fixes-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 7726d4c3
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
       "A set of fixes from the GPIO subsystem.
      
        Most are small driver fixes except the realtek-otto driver patch which
        is pretty big but addresses a significant flaw that can cause the CPU
        to stay infinitely busy on uncleared ISR on some platforms.
      
        Summary:
         - MAINTAINERS update
         - fix resource leaks in gpio-mockup and gpio-pxa
         - add missing locking in gpio-pca953x
         - use 32-bit I/O in gpio-realtek-otto
         - make irq_chip structures immutable in four more drivers"
      
      * tag 'gpio-fixes-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpio: ws16c48: Make irq_chip immutable
        gpio: 104-idio-16: Make irq_chip immutable
        gpio: 104-idi-48: Make irq_chip immutable
        gpio: 104-dio-48e: Make irq_chip immutable
        gpio: realtek-otto: switch to 32-bit I/O
        gpio: pca953x: Add mutex_lock for regcache sync in PM
        gpio: mockup: remove gpio debugfs when remove device
        gpio: pxa: use devres for the clock struct
        MAINTAINERS: rectify entry for XILINX GPIO DRIVER
  6. 03 Sep, 2022 15 commits