    [PATCH] fadvise(): write commands · ebcf28e1
    Andrew Morton authored
    Add two new Linux-specific fadvise() extensions:
    
    LINUX_FADV_ASYNC_WRITE: start async writeout of any dirty pages between file
    offsets `offset' and `offset+len'.  Any pages which are currently under
    writeout are skipped, whether or not they are dirty.
    
    LINUX_FADV_WRITE_WAIT: wait upon writeout of any dirty pages between file
    offsets `offset' and `offset+len'.
    
    By combining these two operations the application may do several things:
    
    LINUX_FADV_ASYNC_WRITE: push some or all of the dirty pages at the disk.
    
    LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE: push all of the currently dirty
    pages at the disk.
    
    LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE, LINUX_FADV_WRITE_WAIT: push all
    of the currently dirty pages at the disk, wait until they have been written.
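
    The three combinations above can be sketched from userspace roughly as
    follows.  This is only an illustration: the numeric values of the two
    advice constants are assumed, and flush_mode, flush_steps() and
    flush_range() are hypothetical names which are not part of this patch.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>

/* Assumed values, for illustration only; the text above does not give them. */
#define LINUX_FADV_ASYNC_WRITE 32
#define LINUX_FADV_WRITE_WAIT  33

/* Hypothetical names for the three combinations listed above. */
enum flush_mode { FLUSH_SOME, FLUSH_ALL, FLUSH_ALL_AND_WAIT };

/* Return the advice sequence for a mode and store its length in *n. */
const int *flush_steps(enum flush_mode mode, size_t *n)
{
    static const int some[] = { LINUX_FADV_ASYNC_WRITE };
    static const int all[] = { LINUX_FADV_WRITE_WAIT,
                               LINUX_FADV_ASYNC_WRITE };
    static const int all_wait[] = { LINUX_FADV_WRITE_WAIT,
                                    LINUX_FADV_ASYNC_WRITE,
                                    LINUX_FADV_WRITE_WAIT };

    switch (mode) {
    case FLUSH_SOME:
        *n = 1;
        return some;
    case FLUSH_ALL:
        *n = 2;
        return all;
    default:
        *n = 3;
        return all_wait;
    }
}

/*
 * Issue each step in order over the byte range starting at `offset'.
 * posix_fadvise() returns 0 or an error number, so stop at the first
 * failing step and hand that number back to the caller.
 */
int flush_range(int fd, off_t offset, off_t len, enum flush_mode mode)
{
    size_t n, i;
    const int *steps = flush_steps(mode, &n);

    for (i = 0; i < n; i++) {
        int err = posix_fadvise(fd, offset, len, steps[i]);

        if (err)
            return err;
    }
    return 0;
}
```

    A caller would use something like flush_range(fd, 0, 0, FLUSH_ALL_AND_WAIT)
    after a batch of write()s.  A kernel which does not recognise these advice
    values should be expected to fail the first step with EINVAL.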
    
    It should be noted that none of these operations write out the file's
    metadata.  So unless the application is strictly performing overwrites of
    already-instantiated disk blocks, there are no guarantees here that the data
    will be available after a crash.
    
    To complete this suite of operations I guess we should have a "sync file
    metadata only" operation.  This gives applications access to all the building
    blocks needed for all sorts of sync operations.  But sync-metadata doesn't fit
    well with the fadvise() interface.  Probably it should be a new syscall:
    sys_fmetadatasync().
    
    The patch also diddles with the meaning of `endbyte' in sys_fadvise64_64().
    It is made to represent the last affected byte in the file (i.e. it is
    inclusive).  Generally, all these byterange and pagerange functions are
    inclusive so we can easily represent EOF with -1.
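
    As a rough sketch of the inclusive convention, assuming non-negative
    `offset' and `len' and the usual fadvise() rule that a zero `len' means
    "through to the end of the file".  inclusive_endbyte() is a hypothetical
    helper for illustration, not the code in this patch:

```c
#include <limits.h>

/*
 * Map (offset, len) to the last affected byte, inclusive.
 * Returns -1 to mean "through EOF" when len is zero or when
 * offset + len would overflow.  Assumes offset >= 0 and len >= 0.
 */
long long inclusive_endbyte(long long offset, long long len)
{
    if (len == 0 || offset > LLONG_MAX - len)
        return -1;
    return offset + len - 1;
}
```

    With this convention a one-byte range at offset 4096 has endbyte 4096, and
    no separate flag is needed to say "the rest of the file".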
    
    As Ulrich notes, these two functions are somewhat abusive of the fadvise()
    concept, which appears to be "set the future policy for this fd".
    
    But these commands are a perfect fit with the fadvise() implementation, and
    several of the existing fadvise() commands are synchronous and don't affect
    future policy either.   I think we can live with the slight incongruity.
    
    Cc: Michael Kerrisk <mtk-manpages@gmx.net>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>