• Mike Christie's avatar
    fork, vhost: Use CLONE_THREAD to fix freezer/ps regression · f9010dbd
    Mike Christie authored
    When switching from kthreads to vhost_tasks two bugs were added:
    1. The vhost worker tasks's now show up as processes so scripts doing
    ps or ps a would not incorrectly detect the vhost task as another
    process.  2. kthreads disabled freeze by setting PF_NOFREEZE, but
    vhost tasks's didn't disable or add support for them.
    
    To fix both bugs, this switches the vhost task to be thread in the
    process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call
    get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that
    SIGKILL/STOP support is required because CLONE_THREAD requires
    CLONE_SIGHAND which requires those 2 signals to be supported.
    
    This is a modified version of the patch written by Mike Christie
    <michael.christie@oracle.com> which was a modified version of patch
    originally written by Linus.
    
    Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER.
    Including ignoring signals, setting up the register state, and having
    get_signal return instead of calling do_group_exit.
    
    Tidied up the vhost_task abstraction so that the definition of
    vhost_task only needs to be visible inside of vhost_task.c.  Making
    it easier to review the code and tell what needs to be done where.
    As part of this the main loop has been moved from vhost_worker into
    vhost_task_fn.  vhost_worker now returns true if work was done.
    
    The main loop has been updated to call get_signal which handles
    SIGSTOP, freezing, and collects the message that tells the thread to
    exit as part of process exit.  This collection clears
    __fatal_signal_pending.  This collection is not guaranteed to
    clear signal_pending() so clear that explicitly so the schedule()
    sleeps.
    
    For now the vhost thread continues to exist and run work until the
    last file descriptor is closed and the release function is called as
    part of freeing struct file.  To avoid hangs in the coredump
    rendezvous and when killing threads in a multi-threaded exec.  The
    coredump code and de_thread have been modified to ignore vhost threads.
    
    Remvoing the special case for exec appears to require teaching
    vhost_dev_flush how to directly complete transactions in case
    the vhost thread is no longer running.
    
    Removing the special case for coredump rendezvous requires either the
    above fix needed for exec or moving the coredump rendezvous into
    get_signal.
    
    Fixes: 6e890c5d ("vhost: use vhost_tasks for worker threads")
    Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
    Co-developed-by: default avatarMike Christie <michael.christie@oracle.com>
    Signed-off-by: default avatarMike Christie <michael.christie@oracle.com>
    Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    f9010dbd
fork.c 87.6 KB