    xfs: fix log recovery corruption error due to tail overwrite · 47db1fc6
    Brian Foster authored
    commit 4a4f66ea upstream.
    
    If we consider the case where the tail (T) of the log is pinned long
    enough for the head (H) to push and block behind the tail, we can
    end up blocked in the following state without enough free space (f)
    in the log to satisfy a transaction reservation:
    
    	0	phys. log	N
    	[-------HffT---H'--T'---]
    
    The last good record in the log (before H) refers to T. The tail
    eventually pushes forward (T') leaving more free space in the log
    for writes to H. At this point, suppose space frees up in the log
    for the maximum of 8 in-core log buffers to start flushing out to
    the log. If this pushes the head from H to H', these next writes
    overwrite the previous tail T. This is safe because the items logged
    from T to T' have been written back and removed from the AIL.
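
To make the overwrite condition concrete, here is a minimal sketch of the circular-log arithmetic, assuming the log is modeled as `size` equally sized blocks. The helpers `log_distance` and `write_covers` are hypothetical names for illustration and are not functions from the XFS source.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of blocks from 'from' forward to 'to'
 * in a circular log of 'size' blocks (wraps past the physical end).
 */
static uint32_t log_distance(uint32_t from, uint32_t to, uint32_t size)
{
	return (to + size - from) % size;
}

/*
 * Hypothetical helper: true if a write of 'len' blocks starting at
 * 'head' lands on block 'pos', i.e. the write overwrites that block.
 */
static bool write_covers(uint32_t head, uint32_t len, uint32_t pos,
			 uint32_t size)
{
	return log_distance(head, pos, size) < len;
}
```

Using the block positions from the diagram (H at 7, T at 10, H' at 14, T' at 17, 21 blocks in total), a write from H to H' covers T but stops short of T', which is the state the description relies on.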
    
    If the next log writes (H -> H') happen to fail and result in
    partial records in the log, the filesystem shuts down having
    overwritten T with invalid data. Log recovery correctly locates H on
    the subsequent mount, but H still refers to the now corrupted tail
    T. This results in log corruption errors and recovery failure.
    
    Since the tail overwrite results from otherwise correct runtime
    behavior, it is up to log recovery to try to deal with this
    situation. Update log recovery tail verification to run a CRC pass
    from the first record past the tail to the head. This facilitates
    error detection at T and moves the recovery tail to the first good
    record past H' (similar to truncating the head on torn write
    detection). If corruption is detected beyond the range possibly
    affected by the max number of iclogs, the log is legitimately
    corrupted and log recovery failure is expected.
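
The verification policy described above can be sketched as follows. This is a simplified model and not the code in xfs_log_recover.c: it assumes one record per slot of a circular array, a precomputed `crc_ok` flag per record, and a caller-supplied `max_skip` standing in for the range that the in-core log buffers could have overwritten. `struct rec` and `verify_tail` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a log record: only its CRC result matters here. */
struct rec {
	bool crc_ok;
};

/*
 * Walk the circular log from 'tail' up to (not including) 'head'.
 * Bad records at the tail are skipped as long as no more than
 * 'max_skip' of them are seen; the first good record becomes the
 * new tail. Any bad record after that point is treated as genuine
 * corruption. Returns the new tail index, or -1 on failure.
 */
static long verify_tail(const struct rec *recs, size_t nrecs,
			size_t tail, size_t head, size_t max_skip)
{
	size_t pos = tail;
	size_t skipped = 0;
	size_t new_tail;

	/* Phase 1: step over a possibly overwritten tail. */
	while (pos != head && !recs[pos].crc_ok) {
		if (skipped == max_skip)
			return -1;	/* damage exceeds the expected range */
		pos = (pos + 1) % nrecs;
		skipped++;
	}
	new_tail = pos;

	/* Phase 2: everything from the new tail to the head must be good. */
	while (pos != head) {
		if (!recs[pos].crc_ok)
			return -1;
		pos = (pos + 1) % nrecs;
	}
	return (long)new_tail;
}
```

In this model, two bad records at the tail with `max_skip` of 2 move the tail forward past them, while a bad record in the middle of the log, or a bad run longer than `max_skip`, is reported as a failure.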
    
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xfs_log_recover.c 161 KB