Commit 4b13cbef authored by Mike Christie's avatar Mike Christie Committed by Michael S. Tsirkin

vhost: Fix worker hangs due to missed wake up calls

We can race where we have added work to the work_list, but
vhost_task_fn has passed that check but not yet set us into
TASK_INTERRUPTIBLE. wake_up_process will see us in TASK_RUNNING and
just return.

This bug was intoduced in commit f9010dbd ("fork, vhost: Use
CLONE_THREAD to fix freezer/ps regression") when I moved the setting
of TASK_INTERRUPTIBLE to simplfy the code and avoid get_signal from
logging warnings about being in the wrong state. This moves the setting
of TASK_INTERRUPTIBLE back to before we test if we need to stop the
task to avoid a possible race there as well. We then have vhost_worker
set TASK_RUNNING if it finds work similar to before.

Fixes: f9010dbd ("fork, vhost: Use CLONE_THREAD to fix freezer/ps regression")
Signed-off-by: default avatarMike Christie <michael.christie@oracle.com>
Message-Id: <20230607192338.6041-3-michael.christie@oracle.com>
Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
parent a284f09e
......@@ -341,6 +341,8 @@ static bool vhost_worker(void *data)
node = llist_del_all(&worker->work_list);
if (node) {
__set_current_state(TASK_RUNNING);
node = llist_reverse_order(node);
/* make sure flag is seen after deletion */
smp_wmb();
......
......@@ -28,10 +28,6 @@ static int vhost_task_fn(void *data)
for (;;) {
bool did_work;
/* mb paired w/ vhost_task_stop */
if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags))
break;
if (!dead && signal_pending(current)) {
struct ksignal ksig;
/*
......@@ -48,11 +44,17 @@ static int vhost_task_fn(void *data)
clear_thread_flag(TIF_SIGPENDING);
}
did_work = vtsk->fn(vtsk->data);
if (!did_work) {
/* mb paired w/ vhost_task_stop */
set_current_state(TASK_INTERRUPTIBLE);
schedule();
if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags)) {
__set_current_state(TASK_RUNNING);
break;
}
did_work = vtsk->fn(vtsk->data);
if (!did_work)
schedule();
}
complete(&vtsk->exited);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment