    pnfs/blocklayout: rewrite extent tracking · 8067253c
    Christoph Hellwig authored
    Currently the block layout driver tracks extents in three separate
    data structures:
    
     - the two lists of pnfs_block_extent structures returned by the server
     - the list of sectors that were in invalid state but have been written to
     - a list of pnfs_block_short_extent structures for LAYOUTCOMMIT
    
    All of these are not only highly inefficient data structures, but the
    operations on them are also even more inefficient than necessary.
    
    In addition there are various implementation defects like:
    
     - using an int to track sectors, causing corruption for large offsets
     - incorrect normalization of page or block granularity ranges
     - insufficient error handling
     - incorrect synchronization as extents can be modified while they are in
       use
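
    The first defect can be illustrated with a small userspace sketch. It is
    not code from this patch: the helper names are hypothetical, and uint64_t
    stands in for the kernel's sector_t. With 512-byte sectors, any offset at
    or beyond 1 TiB yields a sector number that no longer fits in a 32-bit int.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9  /* 512-byte sectors, as in the Linux block layer */

/* The fixed variant relies on a 64-bit sector type. */
static_assert(sizeof(uint64_t) == 8, "sector type must be 64 bits wide");

/* Buggy variant (hypothetical): an int cannot hold sectors past 2^31 - 1. */
static int offset_to_sector_int(uint64_t byte_offset)
{
    return (int)(byte_offset >> SECTOR_SHIFT);
}

/* Fixed variant (hypothetical): keep the full 64-bit sector number. */
static uint64_t offset_to_sector_u64(uint64_t byte_offset)
{
    return byte_offset >> SECTOR_SHIFT;
}
```

    For a 2 TiB offset the 64-bit helper returns sector 2^32, a value the int
    variant cannot represent, so any extent lookup based on it would address
    the wrong range.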
    
    This patch replaces all three data structures with a single unified rbtree
    structure tracking all extents, as well as their in-memory state, although
    we still need two instances, one for read-only and one for read-write
    extents, due to the arcane client side COW feature in the block layouts
    spec.
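
    As a rough illustration of the idea, the sketch below keeps
    non-overlapping extents in a tree ordered by start sector and looks up the
    extent covering a given sector. It is a simplified userspace model, not
    the code in this patch: it uses a plain unbalanced binary search tree
    instead of the kernel rbtree, and struct extent, ext_insert() and
    ext_lookup() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent states, loosely following the block layout spec. */
enum ext_state { EXT_READWRITE, EXT_READ, EXT_INVALID, EXT_NONE };

struct extent {
    uint64_t start;              /* first sector covered */
    uint64_t len;                /* length in sectors */
    enum ext_state state;
    struct extent *left, *right;
};

/* Insert a non-overlapping extent, ordered by start sector. */
static void ext_insert(struct extent **root, struct extent *ext)
{
    assert(ext->len > 0);
    while (*root)
        root = (ext->start < (*root)->start) ? &(*root)->left
                                             : &(*root)->right;
    ext->left = ext->right = NULL;
    *root = ext;
}

/* Return the extent containing @sector, or NULL if it falls in a hole. */
static struct extent *ext_lookup(struct extent *node, uint64_t sector)
{
    while (node) {
        if (sector < node->start)
            node = node->left;
        else if (sector - node->start >= node->len)
            node = node->right;
        else
            return node;
    }
    return NULL;
}
```

    In this model the two instances mentioned above would simply be two
    separate root pointers, one per tree.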
    
    To fix the problem of extents possibly being modified while in use we make
    sure to return a copy of the extent for use in the write path - the
    extent can only be invalidated by a layout recall or return, which has
    to wait until the I/O operations have finished due to refcounts on the
    layout segment.
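
    The copy-out pattern can be sketched as follows. This is a userspace
    approximation under stated assumptions, not the code in this patch: a
    pthread mutex stands in for the kernel lock, a sorted array stands in for
    the tree, and struct extent_table and ext_lookup_copy() are hypothetical
    names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct extent {
    uint64_t start;  /* first sector */
    uint64_t len;    /* length in sectors */
    int state;       /* hypothetical state code */
};

struct extent_table {
    pthread_mutex_t lock;  /* stands in for the kernel lock */
    struct extent *exts;   /* sorted, non-overlapping */
    size_t count;
};

/*
 * Copy the extent covering @sector into @ret while holding the lock.
 * The caller then works on its private copy after the lock is dropped,
 * so later changes to the table cannot alter what the caller sees.
 */
static bool ext_lookup_copy(struct extent_table *t, uint64_t sector,
                            struct extent *ret)
{
    bool found = false;

    assert(t != NULL && ret != NULL);
    pthread_mutex_lock(&t->lock);
    for (size_t i = 0; i < t->count; i++) {
        const struct extent *e = &t->exts[i];

        if (sector >= e->start && sector - e->start < e->len) {
            *ret = *e;  /* struct copy, not a pointer into the table */
            found = true;
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return found;
}
```

    Handing out a copy keeps the lock hold time short and means the I/O path
    never dereferences tree memory that a concurrent update could change.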
    
    The new extent tree works similarly to the schemes used by block based
    filesystems like XFS or ext4.
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
blocklayout.c 24 KB