1. 10 Dec, 2016 3 commits
    • ext4: do not perform data journaling when data is encrypted · 73b92a2a
      Sergey Karamov authored
      Currently data journaling is incompatible with encryption: enabling both
      at the same time has never been supported by design, and would result in
      unpredictable behavior. However, users are not precluded from turning on
      both features simultaneously. This change programmatically replaces data
      journaling for encrypted regular files with ordered data journaling mode.
      
      Background:
      Journaling encrypted data has not been supported because data
      journaling operates on the buffer heads of the page in the page cache.
      Namely, when the commit happens, which could be up to five seconds
      after caching, the commit thread uses the buffer heads attached to the
      page to copy the contents of the page to the journal. With encryption,
      it would have been required to keep the bounce buffer with ciphertext
      for up to the aforementioned five seconds, since the page cache can
      only hold plaintext and could not be used for journaling.
      Alternatively, it would be required to set up the journal to initiate
      a callback at commit time to perform deferred encryption; in this
      case, not only would the data have to be written twice, but it would
      also have to be encrypted twice. This level of complexity was not
      justified for a mode that in practice is very rarely used because of
      the overhead of data journaling.
      
      Solution:
      If data=journal has been set as a mount option for a filesystem, or if
      journaling is enabled on a regular file, do not perform journaling if the
      file is also encrypted; instead, fall back to the data=ordered mode for
      the file.
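
The fallback rule above can be sketched in userspace C. This is only an illustration of the decision, not the kernel code: the enum, the struct, and the function name are hypothetical stand-ins for the real mount and inode state.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the mount option and inode state. */
enum data_mode { DATA_JOURNAL, DATA_ORDERED, DATA_WRITEBACK };

struct file_info {
    bool is_regular;        /* regular file, not a directory or symlink */
    bool is_encrypted;      /* file has an encryption policy */
    bool journal_data_flag; /* data journaling requested on this file */
};

/* Pick the effective data mode for one file. */
static enum data_mode effective_data_mode(enum data_mode mount_mode,
                                          const struct file_info *f)
{
    bool wants_journal = (mount_mode == DATA_JOURNAL) || f->journal_data_flag;

    /* Encrypted regular files never journal their data: use ordered mode. */
    if (wants_journal && f->is_regular && f->is_encrypted)
        return DATA_ORDERED;
    if (wants_journal)
        return DATA_JOURNAL;
    return mount_mode;
}
```

With this sketch, an encrypted regular file on a data=journal mount resolves to ordered mode, while an unencrypted file on the same mount keeps data journaling.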
      
      Rationale:
      The intent is to allow seamless and proper filesystem operation when
      journaling and encryption have both been enabled, and have these two
      conflicting features gracefully resolved by the filesystem.
      
      Fixes: 44614711
      Signed-off-by: Sergey Karamov <skaramov@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
    • ext4: return -ENOMEM instead of success · 578620f4
      Dan Carpenter authored
      We should set the error code if kzalloc() fails.
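
The pattern being fixed can be shown with a small userspace sketch. The function and the injectable allocator are hypothetical and exist only so the failure path can be exercised; the kernel code uses kzalloc() directly.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook, used here in place of kzalloc(). */
typedef void *(*alloc_fn)(size_t nmemb, size_t size);

/* An allocator that always fails, to demonstrate the error path. */
static void *failing_alloc(size_t nmemb, size_t size)
{
    (void)nmemb;
    (void)size;
    return NULL;
}

/* Allocate a zeroed buffer; report -ENOMEM when the allocation fails. */
static int prepare_value(alloc_fn alloc, size_t len, void **out)
{
    int error = 0;
    void *value = alloc(1, len);

    if (!value) {
        error = -ENOMEM; /* without this assignment the function returns 0 */
        goto out;
    }
    *out = value;
out:
    return error;
}
```

Calling prepare_value(failing_alloc, 16, &p) returns -ENOMEM, whereas a version that jumped to the exit label without setting the error code would report success to its caller.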
      
      Fixes: 67cf5b09 ("ext4: add the basic function for inline data support")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
    • ext4: reject inodes with negative size · 7e6e1ef4
      Darrick J. Wong authored
      Don't load an inode with a negative size; this causes integer overflow
      problems in the VFS.
      
      [ Added EXT4_ERROR_INODE() to mark file system as corrupted. -TYT]
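
A minimal sketch of the check, assuming the on-disk size is stored as two 32-bit halves that are combined into a signed 64-bit file size. The function and parameter names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Combine the two on-disk halves and report whether the result is
 * representable as a non-negative signed 64-bit size.
 */
static bool inode_size_is_valid(uint32_t size_lo, uint32_t size_high)
{
    uint64_t raw = ((uint64_t)size_high << 32) | size_lo;

    /* Any value above INT64_MAX would read back as a negative size. */
    return raw <= (uint64_t)INT64_MAX;
}
```

An inode whose high half has its top bit set, for example size_high = 0x80000000, fails this check and would be rejected at load time.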
      
      Fixes: a48380f7 (ext4: rename i_dir_acl to i_size_high)
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
  2. 03 Dec, 2016 8 commits
  3. 02 Dec, 2016 1 commit
    • ext4: fix reading new encrypted symlinks on no-journal file systems · 4db0d88e
      Theodore Ts'o authored
      On a filesystem with no journal, a symlink longer than about 32
      characters (exact length depending on padding for encryption) could not
      be followed or read immediately after being created in an encrypted
      directory.  This happened because when the symlink data went through the
      delayed allocation path instead of the journaling path, the symlink was
      incorrectly detected as a "fast" symlink rather than a "slow" symlink
      until its data was written out.
      
      To fix this, disable delayed allocation for symlinks, since there is
      no benefit for delayed allocation anyway.
      Reported-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  4. 01 Dec, 2016 8 commits
    • ext4: validate s_first_meta_bg at mount time · 3a4b77cd
      Eryu Guan authored
      Ralf Spenneberg reported a kernel crash when mounting a modified ext4
      image. The kernel crashed while calculating the fs overhead
      (ext4_calculate_overhead()) because the image has a very large
      s_first_meta_bg (debug code shows it's 842150400), so ext4 overruns
      the bitmap buffer, which is only PAGE_SIZE, when setting bits in
      count_overhead().
      
      ext4_calculate_overhead():
        buf = get_zeroed_page(GFP_NOFS);  <=== PAGE_SIZE buffer
        blks = count_overhead(sb, i, buf);
      
      count_overhead():
        for (j = ext4_bg_num_gdb(sb, grp); j > 0; j--) { <=== j = 842150400
                ext4_set_bit(EXT4_B2C(sbi, s++), buf);   <=== buffer overrun
                count++;
        }
      
      I can reproduce this easily with the following script:
      
        #!/bin/bash
        rm -f fs.img
        mkdir -p /mnt/ext4
        fallocate -l 16M fs.img
        mke2fs -t ext4 -O bigalloc,meta_bg,^resize_inode -F fs.img
        debugfs -w -R "ssv first_meta_bg 842150400" fs.img
        mount -o loop fs.img /mnt/ext4
      
      Fix it by validating s_first_meta_bg first at mount time, and
      refusing to mount if its value exceeds the largest possible meta_bg
      number.
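
The mount-time validation can be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, desc_per_block is assumed to be nonzero, and the boundary (rejecting only values strictly greater than the descriptor block count) is read from the word "exceeds" above rather than from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of group descriptor blocks needed for `groups` block groups. */
static uint32_t desc_block_count(uint32_t groups, uint32_t desc_per_block)
{
    /* Round up; widen first so the addition cannot overflow. */
    return (uint32_t)(((uint64_t)groups + desc_per_block - 1) / desc_per_block);
}

/* Refuse values larger than the largest possible meta_bg number. */
static bool first_meta_bg_is_valid(uint32_t first_meta_bg, uint32_t groups,
                                   uint32_t desc_per_block)
{
    return first_meta_bg <= desc_block_count(groups, desc_per_block);
}
```

For a small image with a single block group, the value 842150400 from the report is far above the one descriptor block available, so the mount would be refused before count_overhead() ever runs.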
      Reported-by: Ralf Spenneberg <ralf@os-t.de>
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
    • ext4: correctly detect when an xattr value has an invalid size · d7614cc1
      Eric Biggers authored
      It was possible for an xattr value to have a very large size, which
      would then pass validation on 32-bit architectures due to a pointer
      wraparound.  Fix this by validating the size in a way which avoids
      pointer wraparound.
      
      It was also possible that a value's size would fit in the available
      space but its padded size would not.  This would cause an out-of-bounds
      memory write in ext4_xattr_set_entry when replacing the xattr value.
      For example, if an xattr value of unpadded size 253 bytes went until the
      very end of the inode or block, then using setxattr(2) to replace this
      xattr's value with 256 bytes would cause a write to the 3 bytes past the
      end of the inode or buffer, and the new xattr value would be incorrectly
      truncated.  Fix this by requiring that the padded size fit in the
      available space rather than the unpadded size.
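
Both checks can be sketched with sizes and offsets instead of pointers, which is one way to avoid wraparound. The names are hypothetical and the 4-byte padding is an assumption inferred from the 253 to 256 example above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define XATTR_PAD 4u /* assumed padding granularity for xattr values */

/*
 * region_size: size of the inode body or xattr block holding the value
 * value_offs:  offset of the value within that region
 * value_size:  unpadded size claimed by the xattr entry
 */
static bool xattr_value_fits(size_t region_size, size_t value_offs,
                             uint32_t value_size)
{
    size_t avail, pad;

    if (value_offs > region_size)
        return false;
    avail = region_size - value_offs;

    /* Compare sizes, never "start + size" pointers, so nothing wraps. */
    if (value_size > avail)
        return false;

    /* The padded size must fit too, not only the unpadded size. */
    pad = (XATTR_PAD - value_size % XATTR_PAD) % XATTR_PAD;
    return pad <= avail - value_size;
}
```

A 253-byte value that ends exactly at the end of the region is rejected here, because its 3 bytes of padding would fall outside the region.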
      
      This patch shouldn't have any noticeable effect on
      non-corrupted/non-malicious filesystems.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: don't read out of bounds when checking for in-inode xattrs · 290ab230
      Eric Biggers authored
      With i_extra_isize equal to or close to the available space, it was
      possible for us to read past the end of the inode when trying to detect
      or validate in-inode xattrs.  Fix this by checking for the needed extra
      space first.
      
      This patch shouldn't have any noticeable effect on
      non-corrupted/non-malicious filesystems.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
    • ext4: forbid i_extra_isize not divisible by 4 · 2dc8d9e1
      Eric Biggers authored
      i_extra_isize not divisible by 4 is problematic for several reasons:
      
      - It causes the in-inode xattr space to be misaligned, but the xattr
        header and entries are not declared __packed to express this
        possibility.  This may cause poor performance or incorrect code
        generation on some platforms.
      - When validating the xattr entries we can read past the end of the
        inode if the size available for xattrs is not a multiple of 4.
      - It allows the nonsensical i_extra_isize=1, which doesn't even leave
        enough room for i_extra_isize itself.
      
      Therefore, update ext4_iget() to consider i_extra_isize not divisible by
      4 to be an error, like the case where i_extra_isize is too large.
      
      This also matches the rule recently added to e2fsck for determining
      whether an inode has valid i_extra_isize.
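
The resulting rule can be sketched as a standalone predicate. The function name is hypothetical, and the 128-byte base inode size is an assumption of this sketch; the real check lives in ext4_iget().

```c
#include <stdbool.h>
#include <stdint.h>

#define GOOD_OLD_INODE_SIZE 128u /* assumed size of the base on-disk inode */

/* Valid only if the extra space fits in the inode and is a multiple of 4. */
static bool extra_isize_is_valid(uint16_t extra_isize, uint32_t inode_size)
{
    if (inode_size < GOOD_OLD_INODE_SIZE)
        return false;
    if (extra_isize > inode_size - GOOD_OLD_INODE_SIZE)
        return false;           /* too large: existing rule */
    return (extra_isize & 3) == 0; /* not divisible by 4: new rule */
}
```

With a 256-byte inode, an i_extra_isize of 32 passes, while 1 fails the alignment rule and 130 fails the size rule.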
      
      This patch shouldn't have any noticeable effect on
      non-corrupted/non-malicious filesystems, since the size of ext4_inode
      has always been a multiple of 4.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
    • ext4: disable pwsalt ioctl when encryption disabled by config · ba679017
      Eric Biggers authored
      On a CONFIG_EXT4_FS_ENCRYPTION=n kernel, the ioctls to get and set
      encryption policies were disabled but EXT4_IOC_GET_ENCRYPTION_PWSALT was
      not.  But there's no good reason to expose the pwsalt ioctl if the
      kernel doesn't support encryption.  The pwsalt ioctl was also disabled
      pre-4.8 (via ext4_sb_has_crypto() previously returning 0 when encryption
      was disabled by config) and seems to have been enabled by mistake when
      ext4 encryption was refactored to use fs/crypto/.  So let's disable it
      again.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: get rid of ext4_sb_has_crypto() · 35997d1c
      Eric Biggers authored
      ext4_sb_has_crypto() just called through to ext4_has_feature_encrypt(),
      and all callers except one were already using the latter.  So remove it
      and switch its one caller to ext4_has_feature_encrypt().
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix inode checksum calculation problem if i_extra_size is small · 05ac5aa1
      Daeho Jeong authored
      We fixed the race condition in calculating the ext4 checksum value in
      commit b47820ed ("ext4: avoid modifying checksum fields directly
      during checksum verification"). However, after that change, the
      checksum of an inode whose i_extra_size is less than 4 was no longer
      calculated properly. This problem was found and reported by Nix;
      thank you.
      Reported-by: Nix <nix@esperi.org.uk>
      Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
      Signed-off-by: Youngjin Gil <youngjin.gil@samsung.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: warn when page is dirtied without buffers · 6dcc693b
      Jan Kara authored
      Warn when a page is dirtied without buffers (as that will likely lead to
      a crash in ext4_writepages()) or when it gets newly dirtied without the
      page being locked (as there is nothing that prevents buffers from being
      stripped just before calling set_page_dirty() under memory pressure).
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  5. 29 Nov, 2016 2 commits
    • ext4: be more strict when verifying flags set via SETFLAGS ioctls · d14e7683
      Jan Kara authored
      Currently we just silently ignore flags that we don't understand (or
      that cannot be manipulated) through EXT4_IOC_SETFLAGS and
      EXT4_IOC_FSSETXATTR ioctls. This makes it problematic for the unused
      flags to be used in the future (some app may be inadvertently setting
      them and we won't notice until the flag gets used). This is also
      inconsistent with other filesystems like XFS or BTRFS, which return
      EOPNOTSUPP when they see a flag they cannot set.
      
      ext4 has the additional problem that there are flags which are returned
      by EXT4_IOC_GETFLAGS ioctl but which cannot be modified via
      EXT4_IOC_SETFLAGS. So we have to be careful to ignore the value of these
      flags and not fail the ioctl when they are set (as e.g. chattr(1) passes
      flags returned from EXT4_IOC_GETFLAGS to EXT4_IOC_SETFLAGS without any
      masking and thus we'd break this utility).
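
The two-tier behaviour described above can be sketched as follows. The mask values and the function name are hypothetical; the real masks are EXT4_FL_USER_VISIBLE and EXT4_FL_USER_MODIFIABLE, whose values are not reproduced here.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical masks: low 4 bits are settable, low 8 bits are visible. */
#define FL_USER_MODIFIABLE 0x0000000Fu
#define FL_USER_VISIBLE    0x000000FFu

/*
 * Reject flags we do not know at all, but silently ignore flags that
 * GETFLAGS reports and SETFLAGS cannot change, so that a caller which
 * passes GETFLAGS output straight back keeps working.
 */
static int check_setflags(uint32_t requested, uint32_t cur_flags,
                          uint32_t *new_flags)
{
    if (requested & ~FL_USER_VISIBLE)
        return -EOPNOTSUPP;

    /* Keep the current value of every non-modifiable flag. */
    *new_flags = (cur_flags & ~FL_USER_MODIFIABLE) |
                 (requested & FL_USER_MODIFIABLE);
    return 0;
}
```

In this sketch a request containing bit 0x100 fails with -EOPNOTSUPP, while a request that merely echoes back visible but read-only bits succeeds and leaves those bits unchanged.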
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: add EXT4_JOURNAL_DATA_FL and EXT4_EXTENTS_FL to modifiable mask · f8011d93
      Jan Kara authored
      Add EXT4_JOURNAL_DATA_FL and EXT4_EXTENTS_FL to EXT4_FL_USER_MODIFIABLE
      to recognize that they are modifiable by userspace. So far we got away
      without having them there because ext4_ioctl_setflags() treats them in a
      special way. But it was really confusing like that.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  6. 26 Nov, 2016 1 commit
  7. 23 Nov, 2016 1 commit
  8. 21 Nov, 2016 4 commits
    • ext4: avoid lockdep warning when inheriting encryption context · 2f8f5e76
      Eric Biggers authored
      On a lockdep-enabled kernel, xfstests generic/027 fails due to a lockdep
      warning when run on ext4 mounted with -o test_dummy_encryption:
      
          xfs_io/4594 is trying to acquire lock:
           (jbd2_handle){++++.+}, at:
          [<ffffffff813096ef>] jbd2_log_wait_commit+0x5/0x11b

          but task is already holding lock:
           (jbd2_handle){++++.+}, at:
          [<ffffffff813000de>] start_this_handle+0x354/0x3d8
      
      The abbreviated call stack is:
      
       [<ffffffff813096ef>] ? jbd2_log_wait_commit+0x5/0x11b
       [<ffffffff8130972a>] jbd2_log_wait_commit+0x40/0x11b
       [<ffffffff813096ef>] ? jbd2_log_wait_commit+0x5/0x11b
       [<ffffffff8130987b>] ? __jbd2_journal_force_commit+0x76/0xa6
       [<ffffffff81309896>] __jbd2_journal_force_commit+0x91/0xa6
       [<ffffffff813098b9>] jbd2_journal_force_commit_nested+0xe/0x18
       [<ffffffff812a6049>] ext4_should_retry_alloc+0x72/0x79
       [<ffffffff812f0c1f>] ext4_xattr_set+0xef/0x11f
       [<ffffffff812cc35b>] ext4_set_context+0x3a/0x16b
       [<ffffffff81258123>] fscrypt_inherit_context+0xe3/0x103
       [<ffffffff812ab611>] __ext4_new_inode+0x12dc/0x153a
       [<ffffffff812bd371>] ext4_create+0xb7/0x161
      
      When a file is created in an encrypted directory, ext4_set_context() is
      called to set an encryption context on the new file.  This calls
      ext4_xattr_set(), which contains a retry loop where the journal is
      forced to commit if an ENOSPC error is encountered.
      
      If the task actually were to wait for the journal to commit in this
      case, then it would deadlock because a handle remains open from
      __ext4_new_inode(), so the running transaction can't be committed yet.
      Fortunately, __jbd2_journal_force_commit() avoids the deadlock by not
      allowing the running transaction to be committed while the current task
      has it open.  However, the above lockdep warning is still triggered.
      
      This false positive was introduced by commit 1eaa566d ("jbd2: track
      more dependencies on transaction commit").
      
      Fix the problem by passing the handle through the 'fs_data' argument to
      ext4_set_context(), then using ext4_xattr_set_handle() instead of
      ext4_xattr_set().  And in the case where no journal handle is specified
      and ext4_set_context() has to open one, add an ENOSPC retry loop since
      in that case it is the outermost transaction.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
    • ext4: remove unused function ext4_aligned_io() · d086630e
      Ross Zwisler authored
      The last user of ext4_aligned_io() was the DAX path in
      ext4_direct_IO_write().  This usage was removed by Jan Kara's patch
      entitled "ext4: Rip out DAX handling from direct IO path".
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • dax: rip out get_block based IO support · dd936e43
      Jan Kara authored
      No one uses functions using the get_block callback anymore. Rip them
      out and update documentation.
      Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext2: use iomap_zero_range() for zeroing truncated page in DAX path · 00697eed
      Jan Kara authored
      Currently the last user of ext2_get_blocks() for DAX inodes is
      dax_truncate_page(). Convert that to iomap_zero_range() so that all DAX
      IO uses the iomap path.
      Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  9. 20 Nov, 2016 8 commits
  10. 18 Nov, 2016 4 commits