• Linus Torvalds's avatar
    Clarify (and fix) MAX_LFS_FILESIZE macros · 03e58884
    Linus Torvalds authored
    commit 0cc3b0ec upstream.
    
    We have a MAX_LFS_FILESIZE macro that is meant to be filled in by
    filesystems (and other IO targets) that know they are 64-bit clean and
    don't have any 32-bit limits in their IO path.
    
    It turns out that our 32-bit value for that limit was bogus.  On 32-bit,
    the VM layer is limited by the page cache to only 32-bit index values,
    but our logic for that was confusing and actually wrong.  We used to
    define that value to
    
    	(((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1)
    
    which is actually odd in several ways: it limits the index to 31 bits,
    and then it limits files so that they can't have data in that last byte
    of a page that has the highest 31-bit index (ie page index 0x7fffffff).
    
    Neither of those limitations make sense.  The index is actually the full
    32 bit unsigned value, and we can use that whole full page.  So the
    maximum size of the file would logically be "PAGE_SIZE << BITS_PER_LONG".
    
    However, we do wan tto avoid the maximum index, because we have code
    that iterates over the page indexes, and we don't want that code to
    overflow.  So the maximum size of a file on a 32-bit host should
    actually be one page less than the full 32-bit index.
    
    So the actual limit is ULONG_MAX << PAGE_SHIFT.  That means that we will
    not actually be using the page of that last index (ULONG_MAX), but we
    can grow a file up to that limit.
    
    The wrong value of MAX_LFS_FILESIZE actually caused problems for Doug
    Nazar, who was still using a 32-bit host, but with a 9.7TB 2 x RAID5
    volume.  It turns out that our old MAX_LFS_FILESIZE was 8TiB (well, one
    byte less), but the actual true VM limit is one page less than 16TiB.
    
    This was invisible until commit c2a9737f ("vfs,mm: fix a dead loop
    in truncate_inode_pages_range()"), which started applying that
    MAX_LFS_FILESIZE limit to block devices too.
    
    NOTE! On 64-bit, the page index isn't a limiter at all, and the limit is
    actually just the offset type itself (loff_t), which is signed.  But for
    clarity, on 64-bit, just use the maximum signed value, and don't make
    people have to count the number of 'f' characters in the hex constant.
    
    So just use LLONG_MAX for the 64-bit case.  That was what the value had
    been before too, just written out as a hex constant.
    
    Fixes: c2a9737f ("vfs,mm: fix a dead loop in truncate_inode_pages_range()")
    Reported-and-tested-by: default avatarDoug Nazar <nazard@nazar.ca>
    Cc: Andreas Dilger <adilger@dilger.ca>
    Cc: Mark Fasheh <mfasheh@versity.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Dave Kleikamp <shaggy@kernel.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    03e58884
fs.h 103 KB