Commit ed0ca140 authored by Josef Bacik's avatar Josef Bacik

Btrfs: set no_trans_join after trying to expand the transaction

We can lockup if we try to allow new writers join the transaction and we have
flushoncommit set or have a pending snapshot.  This is because we set
no_trans_join and then loop around and try to wait for ordered extents again.
The problem is the ordered endio stuff needs to join the transaction, which it
can't do because no_trans_join is set.  So instead wait until after this loop to
set no_trans_join and then make sure to wait for num_writers == 1 in case
anybody got started in between us exiting the loop and setting no_trans_join.
This could easily be reproduced by mounting -o flushoncommit and running xfstest
13.  It cannot be reproduced with this patch.  Thanks,
Reported-by: default avatarJim Schutt <jaschut@sandia.gov>
Signed-off-by: default avatarJosef Bacik <josef@redhat.com>
parent 8351583e
......@@ -1241,12 +1241,20 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
schedule_timeout(1);
finish_wait(&cur_trans->writer_wait, &wait);
spin_lock(&root->fs_info->trans_lock);
root->fs_info->trans_no_join = 1;
spin_unlock(&root->fs_info->trans_lock);
} while (atomic_read(&cur_trans->num_writers) > 1 ||
(should_grow && cur_trans->num_joined != joined));
/*
* Ok now we need to make sure to block out any other joins while we
* commit the transaction. We could have started a join before setting
* no_join so make sure to wait for num_writers to == 1 again.
*/
spin_lock(&root->fs_info->trans_lock);
root->fs_info->trans_no_join = 1;
spin_unlock(&root->fs_info->trans_lock);
wait_event(cur_trans->writer_wait,
atomic_read(&cur_trans->num_writers) == 1);
ret = create_pending_snapshots(trans, root->fs_info);
BUG_ON(ret);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment