• Dave Chinner's avatar
    xfs: extent size hints can round up extents past MAXEXTLEN · 6dea405e
    Dave Chinner authored
    This results in BMBT corruption, as seen by this test:
    
    # mkfs.xfs -f -d size=40051712b,agcount=4 /dev/vdc
    ....
    # mount /dev/vdc /mnt/scratch
    # xfs_io -ft -c "extsize 16m" -c "falloc 0 30g" -c "bmap -vp" /mnt/scratch/foo
    
    which results in this failure on a debug kernel:
    
    XFS: Assertion failed: (blockcount & xfs_mask64hi(64-BMBT_BLOCKCOUNT_BITLEN)) == 0, file: fs/xfs/libxfs/xfs_bmap_btree.c, line: 211
    ....
    Call Trace:
     [<ffffffff814cf0ff>] xfs_bmbt_set_allf+0x8f/0x100
     [<ffffffff814cf18d>] xfs_bmbt_set_all+0x1d/0x20
     [<ffffffff814f2efe>] xfs_iext_insert+0x9e/0x120
     [<ffffffff814c7956>] ? xfs_bmap_add_extent_hole_real+0x1c6/0xc70
     [<ffffffff814c7956>] xfs_bmap_add_extent_hole_real+0x1c6/0xc70
     [<ffffffff814caaab>] xfs_bmapi_write+0x72b/0xed0
     [<ffffffff811c72ac>] ? kmem_cache_alloc+0x15c/0x170
     [<ffffffff814fe070>] xfs_alloc_file_space+0x160/0x400
     [<ffffffff81ddcc29>] ? down_write+0x29/0x60
     [<ffffffff815063eb>] xfs_file_fallocate+0x29b/0x310
     [<ffffffff811d2bc8>] ? __sb_start_write+0x58/0x120
     [<ffffffff811e3e18>] ? do_vfs_ioctl+0x318/0x570
     [<ffffffff811cd680>] vfs_fallocate+0x140/0x260
     [<ffffffff811ce6f8>] SyS_fallocate+0x48/0x80
     [<ffffffff81ddec09>] system_call_fastpath+0x12/0x17
    
    The tracepoint that indicates the extent that triggered the assert
    failure is:
    
    xfs_iext_insert:   idx 0 offset 0 block 16777224 count 2097152 flag 1
    
    Clearly indicating that the extent length is greater than MAXEXTLEN,
    which is 2097151. A prior trace point shows the allocation was an
    exact size match and that a length greater than MAXEXTLEN was asked
    for:
    
    xfs_alloc_size_done:  agno 1 agbno 8 minlen 2097152 maxlen 2097152
    					    ^^^^^^^        ^^^^^^^
    
    We don't see this problem with extent size hints through the IO path
    because we can't do single IOs large enough to trigger MAXEXTLEN
    allocation. fallocate(), OTOH, is not limited in it's allocation
    sizes and so needs help here.
    
    The issue is that the extent size hint alignment is rounding up the
    extent size past MAXEXTLEN, because xfs_bmapi_write() is not taking
    into account extent size hints when calculating the maximum extent
    length to allocate. xfs_bmapi_reserve_delalloc() is already doing
    this, but direct extent allocation is not.
    
    Unfortunately, the calculation in xfs_bmapi_reserve_delalloc() is
    wrong, and it works only because delayed allocation extents are not
    limited in size to MAXEXTLEN in the in-core extent tree. hence this
    calculation does not work for direct allocation, and the delalloc
    code needs fixing. This may, in fact be the underlying bug that
    occassionally causes transaction overruns in delayed allocation
    extent conversion, so now we know it's wrong we should fix it, too.
    Many thanks to Brian Foster for finding this problem during review
    of this patch.
    
    Hence the fix, after much code reading, is to allow
    xfs_bmap_extsize_align() to align partial extents when full
    alignment would extend the alignment past MAXEXTLEN. We can safely
    do this because all callers have higher layer allocation loops that
    already handle short allocations, and so will simply run another
    allocation to cover the remainder of the requested allocation range
    that we ignored during alignment. The advantage of this approach is
    that it also removes the need for callers to do anything other than
    limit their requests to MAXEXTLEN - they don't really need to be
    aware of extent size hints at all.
    Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
    Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
    Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
    6dea405e
xfs_bmap.c 170 KB