    cgroup/cpuset: Introduce remote partition · 181c8e09
    Waiman Long authored
    One can use "cpuset.cpus.partition" to create multiple scheduling domains
    or to produce a set of isolated CPUs where load balancing is disabled.
    The former use case is less common, but the latter is used frequently,
    especially in telco use cases such as DPDK.
    
    The existing "isolated" partition can be used to produce isolated
    CPUs if the applications have full control of a system. However, in a
    containerized environment where all applications run in containers,
    it is hard to distribute isolated CPUs from the root down, given
    the unified hierarchy of cgroup v2.
    
    The container running on isolated CPUs can be several layers down from
    the root. The current partition feature requires that all the ancestors
    of a leaf partition root be partition roots themselves. This can
    be hard to configure.
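
    As a rough illustration of that requirement, the sketch below turns
    every cgroup on the path into a partition root before the leaf can
    become one. The function name, its arguments and the choice to hand
    the same CPU list to every level are hypothetical; it also assumes
    the cpuset controller is available and that no tasks remain in the
    intermediate cgroups.

```shell
# Sketch only: build a chain of local partitions down to a leaf.
# usage: make_local_partition_chain <top-dir> <cpu-list> <leaf-type> <level>...
# All names and arguments are illustrative assumptions.
make_local_partition_chain() {
    cur=$1; cpus=$2; leaf_type=$3
    shift 3
    while [ "$#" -gt 0 ]; do
        # Enable the cpuset controller for children of the current cgroup.
        printf '+cpuset\n' > "$cur/cgroup.subtree_control" || return 1
        cur=$cur/$1
        mkdir -p "$cur" || return 1
        printf '%s\n' "$cpus" > "$cur/cpuset.cpus" || return 1
        # Every ancestor must itself be a partition root; only the last
        # level gets the requested type (for example "isolated").
        if [ "$#" -gt 1 ]; then ptype=root; else ptype=$leaf_type; fi
        printf '%s\n' "$ptype" > "$cur/cpuset.cpus.partition" || return 1
        shift
    done
}
```

    For example, "make_local_partition_chain /sys/fs/cgroup 2-3 isolated
    a b c" would have to make both a and b partition roots just so that
    c can hold the isolated CPUs.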
    
    This patch introduces a new type of partition called remote partition.
    A remote partition is a partition whose parent is not itself a
    partition root and whose CPUs are acquired directly from the available
    CPUs in the top cpuset through a hierarchical distribution of exclusive
    CPUs down from it.
    
    By contrast, partitions of the existing type, whose parents have to be
    valid partition roots, are referred to as local partitions because they
    have to be clustered around a parent partition root.
    
    Child local partitions can be created under a remote partition, but
    a remote partition cannot be created under a local partition. We may
    relax this limitation in the future if there are use cases for such
    a configuration.
    
    Manually writing to the "cpuset.cpus.exclusive" file is not necessary
    when creating local partitions.  However, writing proper values to
    "cpuset.cpus.exclusive" down the cgroup hierarchy before the target
    remote partition root is mandatory for the creation of a remote
    partition.
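
    The steps above can be sketched as a small shell helper. This is a
    minimal sketch, not a tested recipe: the function name, its
    arguments and the example paths are hypothetical, and whether
    "cpuset.cpus" must also be set at each level is not covered here.

```shell
# Sketch only: create a remote partition several levels below the top.
# usage: make_remote_partition <top-dir> <cpu-list> <type> <level>...
# All names and arguments are illustrative assumptions.
make_remote_partition() {
    cur=$1; cpus=$2; ptype=$3
    shift 3
    for level in "$@"; do
        # Enable the cpuset controller for children of the current cgroup.
        printf '+cpuset\n' > "$cur/cgroup.subtree_control" || return 1
        cur=$cur/$level
        mkdir -p "$cur" || return 1
        # Pass the exclusive CPUs down one level; the intermediate
        # cgroups are not turned into partition roots.
        printf '%s\n' "$cpus" > "$cur/cpuset.cpus.exclusive" || return 1
    done
    # Only the target cgroup becomes a partition root.
    printf '%s\n' "$ptype" > "$cur/cpuset.cpus.partition" || return 1
    # Read the state back so the caller can check what was accepted.
    cat "$cur/cpuset.cpus.partition"
}
```

    For example, "make_remote_partition /sys/fs/cgroup 2-3 isolated
    a b c" writes "2-3" to "cpuset.cpus.exclusive" in a, a/b and a/b/c,
    then requests an isolated partition in a/b/c only.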
    
    The value in a cgroup's "cpuset.cpus.exclusive.effective" may change
    if its own "cpuset.cpus" or its parent's
    "cpuset.cpus.exclusive.effective" changes.
    
    Signed-off-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>