• Jeff Layton's avatar
    [CIFS] Fix potential data corruption when writing out cached dirty pages · cea21805
    Jeff Layton authored
    Fix RedHat bug 329431
    
    The idea here is separate "conscious" from "unconscious" flushes.
    Conscious flushes are those due to a fsync() or close(). Unconscious
    ones are flushes that occur as a side effect of some other operation or
    due to memory pressure.
    
    Currently, when an error occurs during an unconscious flush (ENOSPC or
    EIO), we toss out the page and don't preserve that error to report to
    the user when a conscious flush occurs. If after the unconscious flush,
    there are no more dirty pages for the inode, the conscious flush will
    simply return success even though there were previous errors when writing
    out pages. This can lead to data corruption.
    
    The easiest way to reproduce this is to mount up a CIFS share that's
    very close to being full or where the user is very close to quota. mv
    a file to the share that's slightly larger than the quota allows. The
    writes will all succeed (since they go to pagecache). The mv will do a
    setattr to set the new file's attributes. This calls
    filemap_write_and_wait,
    which will return an error since all of the pages can't be written out.
    Then later, when the flush and release ops occur, there are no more
    dirty pages in pagecache for the file and those operations return 0. mv
    then assumes that the file was written out correctly and deletes the
    original.
    
    CIFS already has a write_behind_rc variable where it stores the results
    from earlier flushes, but that value is only reported in cifs_close.
    Since the VFS ignores the return value from the release operation, this
    isn't helpful. We should be reporting this error during the flush
    operation.
    
    This patch does the following:
    
    1) changes cifs_fsync to use filemap_write_and_wait and cifs_flush and also
    sync to check its return code. If it returns successful, they then check
    the value of write_behind_rc to see if an earlier flush had reported any
    errors. If so, they return that error and clear write_behind_rc.
    
    2) sets write_behind_rc in a few other places where pages are written
    out as a side effect of other operations and the code waits on them.
    
    3) changes cifs_setattr to only call filemap_write_and_wait for
    ATTR_SIZE changes.
    
    4) makes cifs_writepages accurately distinguish between EIO and ENOSPC
    errors when writing out pages.
    
    Some simple testing indicates that the patch works as expected and that
    it fixes the reproduceable known problem.
    Acked-by: default avatarDave Kleikamp <shaggy@austin.rr.com>
    Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
    Signed-off-by: default avatarSteve French <sfrench@us.ibm.com>
    cea21805
README 31.7 KB