• Brian Foster's avatar
    xfs: sync eofblocks scans under iolock are livelock prone · c3155097
    Brian Foster authored
    The xfs_eofblocks.eof_scan_owner field is an internal field to
    facilitate invoking eofb scans from the kernel while under the iolock.
    This is necessary because the eofb scan acquires the iolock of each
    inode. Synchronous scans are invoked on certain buffered write failures
    while under iolock. In such cases, the scan owner indicates that the
    context for the scan already owns the particular iolock and prevents a
    double lock deadlock.
    
    eofblocks scans while under iolock are still livelock prone in the event
    of multiple parallel scans, however. If multiple buffered writes to
    different inodes fail and invoke eofblocks scans at the same time, each
    scan avoids a deadlock with its own inode by virtue of the
    eof_scan_owner field, but will never be able to acquire the iolock of
    the inode from the parallel scan. Because the low free space scans are
    invoked with SYNC_WAIT, the scan will not return until it has processed
    every tagged inode and thus both scans will spin indefinitely on the
    iolock being held across the opposite scan. This problem can be
    reproduced reliably by generic/224 on systems with higher cpu counts
    (x16).
    
    To avoid this problem, simplify the semantics of eofblocks scans to
    never invoke a scan while under iolock. This means that the buffered
    write context must drop the iolock before the scan. It must reacquire
    the lock before the write retry and also repeat the initial write
    checks, as the original state might no longer be valid once the iolock
    was dropped.
    Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
    Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
    Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
    c3155097
xfs_file.c 39.6 KB