• Dave Chinner's avatar
    xfs: don't sleep in xlog_cil_force_lsn on shutdown · ac983517
    Dave Chinner authored
    Reports of a shutdown hang when fsyncing a directory have surfaced,
    such as this:
    
    [ 3663.394472] Call Trace:
    [ 3663.397199]  [<ffffffff815f1889>] schedule+0x29/0x70
    [ 3663.402743]  [<ffffffffa01feda5>] xlog_cil_force_lsn+0x185/0x1a0 [xfs]
    [ 3663.416249]  [<ffffffffa01fd3af>] _xfs_log_force_lsn+0x6f/0x2f0 [xfs]
    [ 3663.429271]  [<ffffffffa01a339d>] xfs_dir_fsync+0x7d/0xe0 [xfs]
    [ 3663.435873]  [<ffffffff811df8c5>] do_fsync+0x65/0xa0
    [ 3663.441408]  [<ffffffff811dfbc0>] SyS_fsync+0x10/0x20
    [ 3663.447043]  [<ffffffff815fc7d9>] system_call_fastpath+0x16/0x1b
    
    If we trigger a shutdown in xlog_cil_push() from xlog_write(), we
    will never wake waiters on the current push sequence number, so
    anything waiting in xlog_cil_force_lsn() for that push sequence
    number to come up will not get woken and hence stall the shutdown.
    
    Fix this by ensuring we call wake_up_all(&cil->xc_commit_wait) in
    the push abort handling, in the log shutdown code when waking all
    waiters, and adding a shutdown check in the sequence completion wait
    loops to ensure they abort when a wakeup due to a shutdown occurs.
    Reported-by: default avatarBoris Ranto <branto@redhat.com>
    Reported-by: default avatarEric Sandeen <esandeen@redhat.com>
    Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
    Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
    Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
    ac983517
xfs_log_cil.c 28.9 KB