    Merge tag 'v6.4/kernel.user_worker' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 3323ddce
    Linus Torvalds authored
    Pull user work thread updates from Christian Brauner:
     "This contains the work generalizing the ability to create a kernel
      worker from a userspace process.
    
      Such user workers will run with the same credentials as the userspace
      process they were created from, providing stronger security and
      accounting guarantees than the traditional override_creds() approach
      ever could've hoped for.
    
      The original work was heavily based on and optimized for the needs of
      io_uring, which was the first user. However, as it quickly turned out,
      the ability to create user workers inheriting properties from a
      userspace process is generally useful.
    
      The vhost subsystem currently creates workers using the kthread api.
      As a consequence, RLIMITs don't work correctly because they are
      inherited from kthreadd. This leads to bugs where more workers are
      created than would be allowed by the RLIMITs of the userspace process
      on whose behalf the workers are created.
    
      Problems like this disappear with user workers created from the
      userspace processes for which they perform the work. In addition,
      providing this api allows vhost to remove additional complexity. For
      example, cgroup and mm sharing will just work out of the box with user
      workers based on the relevant userspace process instead of manually
      ensuring the correct cgroup and mm contexts are used.
    
      So the vhost subsystem should simply be made to use the same mechanism
      as io_uring. To this end the original mechanism used for
      create_io_thread() is generalized into user workers:
    
       - Introduce PF_USER_WORKER as a generic indicator that a given task
         is a user worker, i.e., a kernel task that was created from a
         userspace process. Now a PF_IO_WORKER thread is just a specialized
         version of PF_USER_WORKER. So io_uring io workers raise both flags.
    
       - Make copy_process() available to core kernel code
    
       - Extend struct kernel_clone_args with the following bitfields,
         which tell copy_process():
           - to create a user worker (raise PF_USER_WORKER)
           - to not inherit any files from the userspace process
           - to ignore signals
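
The effect of these bits can be illustrated with a small userspace model in C. This is only a sketch under stated assumptions: struct clone_args_model, struct task_model, and the model_*() helpers are hypothetical names, the bitfield names other than io_thread are guesses based on the descriptions above, the flag values are placeholders rather than the kernel's definitions, and the real copy_process() does far more than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder flag bits for the model; NOT taken from the kernel headers. */
#define PF_IO_WORKER   0x00000010u
#define PF_USER_WORKER 0x00004000u

/* Hypothetical stand-in for the new request bits in kernel_clone_args. */
struct clone_args_model {
    uint32_t io_thread:1;      /* io_uring style worker */
    uint32_t user_worker:1;    /* raise PF_USER_WORKER */
    uint32_t no_files:1;       /* do not inherit files from the creator */
    uint32_t ignore_signals:1; /* new task ignores signals */
};

/* Hypothetical stand-in for the properties of the created task. */
struct task_model {
    uint32_t flags;
    int inherits_files;
    int signals_ignored;
};

/* Model of how a copy_process()-like function could act on the bits. */
static struct task_model model_copy_process(const struct clone_args_model *args)
{
    struct task_model t = { .flags = 0, .inherits_files = 1, .signals_ignored = 0 };

    assert(args != NULL);
    if (args->user_worker)
        t.flags |= PF_USER_WORKER;
    if (args->io_thread)
        t.flags |= PF_IO_WORKER;
    if (args->no_files)
        t.inherits_files = 0;
    if (args->ignore_signals)
        t.signals_ignored = 1;
    return t;
}

/* An io worker is a specialized user worker, so it requests both flags. */
static struct clone_args_model model_io_worker_args(void)
{
    struct clone_args_model a = { .io_thread = 1, .user_worker = 1 };
    return a;
}

/* Illustrative generic user worker that requests all three new behaviors. */
static struct clone_args_model model_user_worker_args(void)
{
    struct clone_args_model a = {
        .user_worker = 1, .no_files = 1, .ignore_signals = 1,
    };
    return a;
}
```

In this model, passing the result of model_io_worker_args() to model_copy_process() yields a task with both PF_IO_WORKER and PF_USER_WORKER set, while model_user_worker_args() yields a task with only PF_USER_WORKER set, no inherited files, and signals ignored.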
    
      After all generic changes are in place, the vhost subsystem implements
      a new dedicated vhost api based on user workers. Finally, vhost is
      switched to rely on the new api, moving it off of kthreads.
    
      Thanks to Mike for sticking it out and making it through this rather
      arduous journey"
    
    * tag 'v6.4/kernel.user_worker' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
      vhost: use vhost_tasks for worker threads
      vhost: move worker thread fields to new struct
      vhost_task: Allow vhost layer to use copy_process
      fork: allow kernel code to call copy_process
      fork: Add kernel_clone_args flag to ignore signals
      fork: add kernel_clone_args flag to not dup/clone files
      fork/vm: Move common PF_IO_WORKER behavior to new flag
      kernel: Make io_thread and kthread bit fields
      kthread: Pass in the thread's name during creation
      kernel: Allow a kernel thread's name to be set in copy_process
      csky: Remove kernel_thread declaration