• Neil Brown's avatar
    NFS: Fix directory caching problem - with test case and patch. · 83672d39
    Neil Brown authored
    Try running this script in an NFS mounted directory (Client relatively
    recent - 2.6.18 has the problem as does 2.6.20).
    
    ------------------------------------------------------
    #!/bin/bash
    #
    # This script will produce the following errormessage from tar:
    #
    #   tar: newdir/innerdir/innerfile: file changed as we read it
    
    # create dirs
    rm -rf nfstest
    mkdir -p nfstest/dir/innerdir
    
    # create files (should not be empty)
    echo "Hello World!" >nfstest/dir/file
    echo "Hello World!" >nfstest/dir/innerdir/innerfile
    
    # problem only happens if we sleep before chmod
    sleep 1
    
    # change file modes
    chmod -R a+r nfstest
    
    # rename dir
    mv nfstest/dir nfstest/newdir
    
    # tar it
    tar -cf nfstest/nfstest.tar -C nfstest newdir
    
    # restore old dir name
    mv nfstest/newdir nfstest/dir
    --------------------------------------------------------
    
    What happens:
    
    The 'chmod -R' does a readdir_plus in each directory and the results
    get cached in the page cache.  It then updates the ctime on each file
    by one second.  When this happens, the post-op attributes are used to
    update the ctime stored on the client to match the value in the kernel.
    
    The 'mv' calls shrink_dcache_parent on the directory tree which
    flushes all the dentries (so a new lookup will be required) but
    doesn't flush the inodes or pagecache.
    
    The 'tar' does a readdir on each directory, but (in the case of
    'innerdir' at least) satisfies it from the pagecache and uses the
    READDIRPLUS data to update all the inodes.  In the case of
    'innerdir/innerfile', the ctime is out of date.
    
    'tar' then calls 'lstat' on innerdir/innerfile getting an old ctime.
    It then opens the file (triggering a GETATTR), reads the content, and
    then calls fstat to see if anything has changed.  It finds that ctime
    has changed and so complains.
    
    The problem seems to be that the cache readdirplus info is kept around
    for too long.
    
    My patch below discards pagecache data for directories when
    dentry_iput is called on them.  This effectively removes the symptom
    which convinces me that I correctly understand the problem.  However
    I'm not convinced that is a proper solution, as there could easily be
    other races that trigger the same problem without being affected by
    this 'fix'.
    
    One possibility would be to require that readdirplus pagecache data be
    only used *once* to instantiate an inode.  Somehow it should then be
    invalidated so that if the dentry subsequently disappears, it will
    cause a new request to the server to fill in the stat data.
    
    Another possibility is to compare the cache_change_attribute on the
    inode with something similar for the readdirplus info and reject the
    info from readdirplus if it is too old.
    
    I haven't tried to implement these and would value other opinions
    before I do.
    
    Thanks,
    NeilBrown
    Signed-off-by: default avatarNeil Brown <neilb@suse.de>
    Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
    83672d39
dir.c 53.5 KB