    afs: Overhaul invalidation handling to better support RO volumes · 453924de
    David Howells authored
    Overhaul the third party-induced invalidation handling, making use of the
    previously added volume-level event counters (cb_scrub and cb_ro_snapshot)
    that are now being parsed out of the VolSync record returned by the
    fileserver in many of its replies.
    
    This allows better handling of RO (and Backup) volumes.  Since these are
    snapshots of a RW volume that are updated atomically and simultaneously across
    all servers that host them, they only require a single callback promise for
    the entire volume.  The current upstream code assumes that RO volumes
    operate in the same manner as RW volumes, and that each file has its own
    individual callback - which means that it does a status fetch for *every*
    file in a RO volume, whether or not the volume got "released" (volume
    callback breaks can occur for other reasons too, such as the volumeserver
    taking ownership of a volume from a fileserver).
    
    To this end, make the following changes:
    
     (1) Change the meaning of the volume's cb_v_break counter so that it is
         now a hint that we need to issue a status fetch to work out the state
         of a volume.  cb_v_break is incremented by volume break callbacks and
         by server initialisation callbacks.
    
     (2) Add a second counter, cb_v_check, to the afs_volume struct such that
         if this differs from cb_v_break, we need to do a check.  When the
         check is complete, cb_v_check is advanced to what cb_v_break was at
         the start of the status fetch.
    
     (3) Move the list of mmap'd vnodes to the volume and trigger removal of
         PTEs that map to files on a volume break rather than on a server
         break.
    
     (4) When a server reinitialisation callback comes in, use the
         server-to-volume reverse mapping added in a preceding patch to iterate
         over all the volumes using that server and clear the volume callback
         promises for that server and the general volume promise as a whole to
         trigger reanalysis.
    
     (5) Replace the AFS_VNODE_CB_PROMISED flag with an AFS_NO_CB_PROMISE
         (TIME64_MIN) value in the cb_expires_at field, reducing the number of
         checks we need to make.
    
     (6) Change afs_check_validity() to quickly see if various event counters
         have been incremented or if the vnode or volume callback promise is
         due to expire/has expired without making any changes to the state.
         That is now left to afs_validate() as this may get more complicated in
         future as we may have to examine server records too.
    
     (7) Overhaul afs_validate() so that it does a single status fetch if we
         need to check the state of either the vnode or the volume - and do so
         under appropriate locking.  The function does the following steps:
    
         (A) If the vnode/volume is no longer seen as valid, then we take the
         vnode validation lock and, if the volume promise has expired, the
         volume check lock also.  The latter prevents redundant checks being
         made to find out if a new version of the volume got released.
    
         (B) If a previous RPC call found that the volsync changed unexpectedly
         or that a RO volume was updated, then we unmap all PTEs pointing to
         the file to stop mmap being used for access.
    
         (C) If the vnode is still seen to be of uncertain validity, then we
         perform an FS.FetchStatus RPC op to jointly update the volume status
         and the vnode status.  This assessment is done as part of parsing the
         reply:
    
	If the RO volume creation timestamp advances, cb_ro_snapshot is
	incremented; if either the creation or update timestamp changes in
	an unexpected way, the cb_scrub counter is incremented.
    
    	If the Data Version returned doesn't match the copy we have
    	locally, then we ask for the pagecache to be zapped.  This takes
	care of handling RO updates.
    
         (D) If cb_scrub differs between volume and vnode, the vnode's
         pagecache is zapped and the vnode's cb_scrub is updated unless the
         file is marked as having been deleted.
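
    As an illustration of points (1), (2) and (5) only, the break/check
    counter pairing and the "no promise" sentinel can be sketched in userspace
    C roughly as follows.  This is a simplified model, not the code in this
    patch: struct toy_volume, the toy_*() helpers and NO_CB_PROMISE are
    hypothetical stand-ins, locking is omitted, and time is a plain int64_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for AFS_NO_CB_PROMISE (TIME64_MIN): "no callback promise held". */
#define NO_CB_PROMISE INT64_MIN

/* Hypothetical, much-reduced model of the relevant afs_volume fields. */
struct toy_volume {
	unsigned int cb_v_break;    /* Bumped by volume-break and server-init callbacks */
	unsigned int cb_v_check;    /* Value of cb_v_break covered by the last check */
	int64_t      cb_expires_at; /* Promise expiry, or NO_CB_PROMISE */
};

/* A volume break callback only leaves a hint that a status fetch is needed. */
static void toy_volume_break(struct toy_volume *v)
{
	v->cb_v_break++;
}

/* A server reinitialisation also drops the promise to force reanalysis. */
static void toy_server_reinit(struct toy_volume *v)
{
	v->cb_v_break++;
	v->cb_expires_at = NO_CB_PROMISE;
}

/*
 * Read-only test that changes no state.  Because the sentinel is the minimum
 * time value, "no promise" and "promise expired" collapse into a single
 * comparison, so no separate "promised" flag has to be tested.
 */
static bool toy_volume_needs_check(const struct toy_volume *v, int64_t now)
{
	if (v->cb_v_check != v->cb_v_break)
		return true;
	return v->cb_expires_at <= now;
}

/* Sample the break counter before the status fetch is issued. */
static unsigned int toy_check_begin(const struct toy_volume *v)
{
	return v->cb_v_break;
}

/*
 * Advance cb_v_check only to the value sampled at the start, so a break that
 * arrived while the fetch was in flight still leaves the counters unequal.
 */
static void toy_check_end(struct toy_volume *v, unsigned int snapshot,
			  int64_t new_expiry)
{
	v->cb_v_check = snapshot;
	v->cb_expires_at = new_expiry;
}
```

    In this model, a break that lands between toy_check_begin() and
    toy_check_end() is not lost: the counters still differ afterwards, so the
    next toy_volume_needs_check() call reports that another fetch is required.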

    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Marc Dionne <marc.dionne@auristor.com>
    cc: linux-afs@lists.infradead.org