• Robbie Ko's avatar
    Btrfs: fix unexpected failure of nocow buffered writes after snapshotting when low on space · 8ecebf4d
    Robbie Ko authored
    Commit e9894fd3 ("Btrfs: fix snapshot vs nocow writting") forced
    nocow writes to fallback to COW, during writeback, when a snapshot is
    created. This resulted in writes made before creating the snapshot to
    unexpectedly fail with ENOSPC during writeback when success (0) was
    returned to user space through the write system call.
    
    The steps leading to this problem are:
    
    1. When it's not possible to allocate data space for a write, the
       buffered write path checks if a NOCOW write is possible.  If it is,
       it will not reserve space and success (0) is returned to user space.
    
    2. Then when a snapshot is created, the root's will_be_snapshotted
       atomic is incremented and writeback is triggered for all inode's that
       belong to the root being snapshotted. Incrementing that atomic forces
       all previous writes to fallback to COW during writeback (running
       delalloc).
    
    3. This results in the writeback for the inodes to fail and therefore
       setting the ENOSPC error in their mappings, so that a subsequent
       fsync on them will report the error to user space. So it's not a
       completely silent data loss (since fsync will report ENOSPC) but it's
       a very unexpected and undesirable behaviour, because if a clean
       shutdown/unmount of the filesystem happens without previous calls to
       fsync, it is expected to have the data present in the files after
       mounting the filesystem again.
    
    So fix this by adding a new atomic named snapshot_force_cow to the
    root structure which prevents this behaviour and works the following way:
    
    1. It is incremented when we start to create a snapshot after triggering
       writeback and before waiting for writeback to finish.
    
    2. This new atomic is now what is used by writeback (running delalloc)
       to decide whether we need to fallback to COW or not. Because we
       incremented this new atomic after triggering writeback in the
       snapshot creation ioctl, we ensure that all buffered writes that
       happened before snapshot creation will succeed and not fallback to
       COW (which would make them fail with ENOSPC).
    
    3. The existing atomic, will_be_snapshotted, is kept because it is used
       to force new buffered writes, that start after we started
       snapshotting, to reserve data space even when NOCOW is possible.
       This makes these writes fail early with ENOSPC when there's no
       available space to allocate, preventing the unexpected behaviour of
       writeback later failing with ENOSPC due to a fallback to COW mode.
    
    Fixes: e9894fd3 ("Btrfs: fix snapshot vs nocow writting")
    Signed-off-by: default avatarRobbie Ko <robbieko@synology.com>
    Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    8ecebf4d
disk-io.c 123 KB