1. 25 May, 2019 1 commit
  2. 24 May, 2019 2 commits
  3. 22 May, 2019 1 commit
    • ext4: don't perform block validity checks on the journal inode · 0a944e8a
      Theodore Ts'o authored
      Since the journal inode is already checked when we add it to the
      block validity's system zone, checking it again just triggers a
      failure.
      
      This was causing failures like this:
      
      [   53.897001] EXT4-fs error (device sda): ext4_find_extent:909: inode
      #8: comm jbd2/sda-8: pblk 121667583 bad header/extent: invalid extent entries - magic f30a, entries 8, max 340(340), depth 0(0)
      [   53.931430] jbd2_journal_bmap: journal block not found at offset 49 on sda-8
      [   53.938480] Aborting journal on device sda-8.
      
      ... but only if the system was under enough memory pressure that the
      logical->physical mapping for the journal inode got pushed out of the
      extent cache.  (This is why it wasn't noticed earlier.)
      
      Fixes: 345c0dbf ("ext4: protect journal inode's blocks using block_validity")
      Reported-by: Dan Rue <dan.rue@linaro.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
  4. 17 May, 2019 1 commit
  5. 15 May, 2019 1 commit
  6. 12 May, 2019 3 commits
  7. 11 May, 2019 4 commits
    • ext4: unsigned int compared against zero · fbbbbd2f
      Colin Ian King authored
      There are two cases where the u32 variables n and err are checked
      for negative error values; these checks are always false because
      the variables are unsigned. Fix this by making the variables ints.
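
      The effect can be reproduced outside the kernel. The sketch below is
      only an illustration: it uses Python's ctypes fixed-width integers to
      mimic storing a negative return value in a u32 versus an int. The
      function names are hypothetical and do not come from the patch.

```python
import ctypes


def check_fires_with_u32(ret: int) -> bool:
    """Mimic `u32 err = ret; if (err < 0)`: the value wraps, so this is never true."""
    stored = ctypes.c_uint32(ret).value  # a negative value wraps to a large positive one
    return stored < 0


def check_fires_with_int(ret: int) -> bool:
    """Mimic `int err = ret; if (err < 0)`: the sign survives, so the check works."""
    stored = ctypes.c_int32(ret).value
    return stored < 0
```

      With ret = -5, the first helper returns False (the error is silently
      missed) and the second returns True.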
      
      Addresses-Coverity: ("Unsigned compared against 0")
      Fixes: 345c0dbf ("ext4: protect journal inode's blocks using block_validity")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix use-after-free in dx_release() · 08fc98a4
      Sahitya Tummala authored
      The buffer_head (frames[0].bh) and its corresponding page can
      potentially be freed once brelse() is done inside the for loop
      but before the for loop exits in dx_release(). They can be freed
      in another context, when the page cache is flushed via
      drop_caches_sysctl_handler(). This results in the data abort below
      when accessing info->indirect_levels in dx_release().
      
      Unable to handle kernel paging request at virtual address ffffffc17ac3e01e
      Call trace:
       dx_release+0x70/0x90
       ext4_htree_fill_tree+0x2d4/0x300
       ext4_readdir+0x244/0x6f8
       iterate_dir+0xbc/0x160
       SyS_getdents64+0x94/0x174
      Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Cc: stable@kernel.org
    • ext4: fix data corruption caused by overlapping unaligned and aligned IO · 57a0da28
      Lukas Czerner authored
      Unaligned AIO must be serialized because the zeroing of partial blocks
      of unaligned AIO can result in data corruption in case it's overlapping
      another in flight IO.
      
      Currently we wait for all unwritten extents before we submit unaligned
      AIO, which protects data when the unaligned AIO follows overlapping
      IO. However, if an unaligned AIO is followed by overlapping aligned AIO,
      we can still end up corrupting data.
      
      To fix this, we must make sure that the unaligned AIO is the only IO in
      flight by waiting for unwritten extents conversion not just before the
      IO submission, but right after it as well.
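
      To make the overlap concrete, here is a small sketch of which blocks
      an I/O covers only in part, and whether another I/O touches one of
      them. It is an illustration only: the 4096-byte block size, the
      (offset, length) tuples and all function names are assumptions, and
      it does not model ext4's actual serialization logic.

```python
def touched_blocks(offset: int, length: int, block_size: int = 4096) -> range:
    """Block numbers covered, fully or partly, by a non-empty I/O."""
    return range(offset // block_size, (offset + length - 1) // block_size + 1)


def partial_blocks(offset: int, length: int, block_size: int = 4096) -> set:
    """Blocks the I/O covers only in part (empty for block-aligned I/O)."""
    end = offset + length
    blocks = set()
    if offset % block_size:
        blocks.add(offset // block_size)  # I/O starts inside this block
    if end % block_size:
        blocks.add(end // block_size)     # I/O ends inside this block
    return blocks


def may_conflict(unaligned: tuple, other: tuple, block_size: int = 4096) -> bool:
    """True when `other` touches a block that `unaligned` covers only in part."""
    partial = partial_blocks(*unaligned, block_size)
    return any(b in partial for b in touched_blocks(*other, block_size))
```

      For example, an unaligned write of 1024 bytes at offset 512 covers
      block 0 only in part, so an aligned write of block 0 submitted
      either before or after it touches the same block.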
      
      This problem can be reproduced by xfstest generic/538
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
    • jbd2: fix potential double free · 0d52154b
      Chengguang Xu authored
      When creation of the jbd2_inode_cache cache fails, we destroy the
      previously created jbd2_handle_cache cache twice.  This patch fixes
      this by moving each cache's initialization/destruction to its own
      separate function.
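
      The general shape of such a fix can be sketched as follows. This is
      a simplified model, not the jbd2 code: the Cache class, the registry
      dict and every function name are hypothetical. The point is that
      each cache has one destroy helper that forgets the cache after
      freeing it, so the error path can call it more than once safely.

```python
class Cache:
    """Stand-in for a slab cache; destroying it twice models a double free."""

    def __init__(self, name: str):
        self.name = name
        self.destroyed = False

    def destroy(self) -> None:
        if self.destroyed:
            raise RuntimeError(f"double free of cache {self.name!r}")
        self.destroyed = True


caches = {}  # name -> live Cache


def init_cache(name: str, fail: bool = False) -> bool:
    """Create one cache; `fail` simulates an allocation failure."""
    if fail:
        return False
    caches[name] = Cache(name)
    return True


def destroy_cache(name: str) -> None:
    """Destroy one cache and forget it, so a repeated call is a no-op."""
    cache = caches.pop(name, None)
    if cache is not None:
        cache.destroy()


def init_all(fail_on=None) -> bool:
    """Create every cache; on failure, unwind all of them exactly once."""
    names = ["handle", "inode"]
    for name in names:
        if not init_cache(name, fail=(name == fail_on)):
            for other in names:
                destroy_cache(other)
            return False
    return True
```

      Calling init_all(fail_on="inode") returns False and leaves no cache
      behind; a later destroy_cache("handle") does nothing.
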
      Signed-off-by: Chengguang Xu <cgxu519@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
  8. 10 May, 2019 1 commit
  9. 06 May, 2019 1 commit
  10. 01 May, 2019 1 commit
  11. 28 Apr, 2019 1 commit
    • unicode: refactor the rule for regenerating utf8data.h · 28ba53c0
      Masahiro Yamada authored
      scripts/mkutf8data is used only when regenerating utf8data.h,
      which never happens in the normal kernel build. However, it is
      built regardless whenever CONFIG_UNICODE is enabled.
      
      Moreover, there is no good reason for it to reside in the scripts/
      directory since it is only used in fs/unicode/.
      
      Hence, move it from scripts/ to fs/unicode/.
      
      In some cases, we bypass build artifacts in the normal build. The
      conventional way to do so is to surround the code with ifdef REGENERATE_*.
      
      For example,
      
       - 7373f4f8 ("kbuild: add implicit rules for parser generation")
       - 6aaf49b4 ("crypto: arm,arm64 - Fix random regeneration of S_shipped")
      
      I rewrote the rule in a more kbuild'ish style.
      
      In the normal build, utf8data.h is just shipped from the check-in file.
      
      $ make
        [ snip ]
        SHIPPED fs/unicode/utf8data.h
        CC      fs/unicode/utf8-norm.o
        CC      fs/unicode/utf8-core.o
        CC      fs/unicode/utf8-selftest.o
        AR      fs/unicode/built-in.a
      
      If you want to generate utf8data.h based on UCD, put *.txt files into
      fs/unicode/, then pass REGENERATE_UTF8DATA=1 from the command line.
      The mkutf8data tool will be automatically compiled to generate the
      utf8data.h from the *.txt files.
      
      $ make REGENERATE_UTF8DATA=1
        [ snip ]
        HOSTCC  fs/unicode/mkutf8data
        GEN     fs/unicode/utf8data.h
        CC      fs/unicode/utf8-norm.o
        CC      fs/unicode/utf8-core.o
        CC      fs/unicode/utf8-selftest.o
        AR      fs/unicode/built-in.a
      
      I renamed the check-in utf8data.h to utf8data.h_shipped so that this
      will work for the out-of-tree build.
      
      You can update it based on the latest UCD like this:
      
      $ make REGENERATE_UTF8DATA=1 fs/unicode/
      $ cp fs/unicode/utf8data.h fs/unicode/utf8data.h_shipped
      
      Also, I added entries to .gitignore and dontdiff.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  12. 25 Apr, 2019 14 commits
    • docs: ext4.rst: document case-insensitive directories · 0a790fe4
      Gabriel Krisman Bertazi authored
      Introduce the case-insensitive features on ext4 for system
      administrators.  Explain the minimum of design decisions that are
      important for sysadmins wanting to enable this feature.
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: Support case-insensitive file name lookups · b886ee3e
      Gabriel Krisman Bertazi authored
      This patch implements the actual support for case-insensitive file name
      lookups in ext4, based on the feature bit and the encoding stored in the
      superblock.
      
      A filesystem that has the casefold feature set is able to configure
      directories with the +F (EXT4_CASEFOLD_FL) attribute, enabling lookups
      to succeed in that directory in a case-insensitive fashion, i.e., match
      a directory entry even if the name used by userspace is not a byte per
      byte match with the disk name, but is an equivalent case-insensitive
      version of the Unicode string.  This operation is called a
      case-insensitive file name lookup.
      
      The feature is configured as an inode attribute applied to directories
      and inherited by its children.  This attribute can only be enabled on
      empty directories for filesystems that support the encoding feature,
      thus preventing collision of file names that only differ by case.
      
      * dcache handling:
      
      For a +F directory, ext4 only stores the first equivalent name dentry
      used in the dcache. This is done to prevent unintentional duplication of
      dentries in the dcache, while also allowing the VFS code to quickly find
      the right entry in the cache regardless of which equivalent string was
      used in a previous lookup, without having to resort to ->lookup().
      
      d_hash() of casefolded directories is implemented as the hash of the
      casefolded string, such that we always have a well-known bucket for all
      the equivalencies of the same string. d_compare() uses the
      utf8_strncasecmp() infrastructure, which handles the comparison of
      equivalent, same case, names as well.
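
      As a rough userspace illustration of these two hooks, the sketch
      below hashes and compares names through a single folded key. It is
      an approximation, not the kernel code: fold_key(), name_bucket() and
      names_equal() are hypothetical names, the folding uses Python's
      unicodedata (NFD plus full case folding, without removing ignorable
      code points), and CRC32 stands in for the real dcache hash.

```python
import unicodedata
import zlib


def fold_key(name: str) -> str:
    """Approximate folded form: NFD, full case folding, then NFD again."""
    folded = unicodedata.normalize("NFD", name).casefold()
    return unicodedata.normalize("NFD", folded)


def name_bucket(name: str, buckets: int = 64) -> int:
    """Hash the folded name so every equivalent spelling lands in one bucket."""
    return zlib.crc32(fold_key(name).encode("utf-8")) % buckets


def names_equal(a: str, b: str) -> bool:
    """Compare two names for equivalence under the folded form."""
    return fold_key(a) == fold_key(b)
```

      Under these assumptions, "README.txt" and "readme.TXT" share a
      bucket and compare equal, as do the precomposed and decomposed
      spellings of "Café".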
      
      For now, negative lookups are not inserted in the dcache, since they
      would need to be invalidated anyway, because we can't trust missing file
      dentries.  This is bad for performance but requires some leveraging of
      the vfs layer to fix.  We can live without that for now, and so does
      everyone else.
      
      * on-disk data:
      
      Despite using a specific version of the name as the internal
      representation within the dcache, the name stored and fetched from the
      disk is a byte-per-byte match with what the user requested, making this
      implementation 'name-preserving'. i.e. no actual information is lost
      when writing to storage.
      
      DX is supported by modifying the hashes used in +F directories to make
      them case/encoding-aware.  The new disk hashes are calculated as the
      hash of the full casefolded string, instead of the string directly.
      This allows us to efficiently search for file names in the htree without
      requiring the user to provide an exact name.
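
      A toy model of a name-preserving directory may help here. In the
      sketch below, entries are indexed by a folded key while the stored
      name stays exactly as it was created. CasefoldDir and fold() are
      hypothetical, and the folding only approximates the kernel's
      algorithm using Python's unicodedata.

```python
import unicodedata


def fold(name: str) -> str:
    """Approximate folded form: NFD, full case folding, then NFD again."""
    folded = unicodedata.normalize("NFD", name).casefold()
    return unicodedata.normalize("NFD", folded)


class CasefoldDir:
    """Directory indexed by folded key; the stored name is never altered."""

    def __init__(self):
        self._entries = {}  # folded key -> name exactly as created

    def create(self, name: str) -> None:
        key = fold(name)
        if key in self._entries:
            raise FileExistsError(name)  # an equivalent name already exists
        self._entries[key] = name

    def lookup(self, name: str):
        """Return the stored name for any equivalent spelling, else None."""
        return self._entries.get(fold(name))
```

      After create("Makefile"), lookup("MAKEFILE") returns "Makefile",
      and creating "makefile" fails because an equivalent entry exists.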
      
      * Dealing with invalid sequences:
      
      By default, when an invalid UTF-8 sequence is identified, ext4 will treat
      it as an opaque byte sequence, ignoring the encoding and reverting to
      the old behavior for that unique file.  This means that case-insensitive
      file name lookup will not work for that file only.  An optional bit can
      be set in the superblock telling the filesystem code and userspace tools
      to enforce the encoding.  When that optional bit is set, any attempt to
      create a file name using an invalid UTF-8 sequence will fail and return
      an error to userspace.
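
      The two policies can be sketched as a single key-derivation step.
      This is an illustration under assumptions: lookup_key() and its
      strict parameter are hypothetical, Python's UTF-8 decoder stands in
      for the kernel's validity check, and the folding is approximate.

```python
import unicodedata


def lookup_key(raw: bytes, strict: bool) -> bytes:
    """Return the key used to match a file name under the given policy."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        if strict:
            # Strict mode: refuse names that are not valid UTF-8.
            raise ValueError("invalid UTF-8 sequence in file name")
        # Default mode: treat the name as opaque bytes, matched exactly.
        return raw
    folded = unicodedata.normalize("NFD", text).casefold()
    return unicodedata.normalize("NFD", folded).encode("utf-8")
```

      In the default mode an invalid name such as b"\xff\xfe" is its own
      key, so only a byte-exact lookup finds it; in strict mode the same
      name is rejected.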
      
      * Normalization algorithm:
      
      The UTF-8 algorithm used to compare strings in ext4 lives in
      fs/unicode, and is based on a previous version developed by
      SGI.  It implements the Canonical decomposition (NFD) algorithm
      described by the Unicode specification 12.1, or higher, combined with
      the elimination of ignorable code points (NFDi) and full
      case-folding (CF) as documented in fs/unicode/utf8_norm.c.
      
      NFD seems to be the best normalization method for EXT4 because:
      
        - It has a lower cost than NFC/NFKC (which requires
          decomposing to NFD as an intermediary step)
        - It doesn't eliminate important semantic meaning like
          compatibility decompositions.
      
      Although:
      
        - This implementation is not completely linguistically accurate,
          because different languages have conflicting rules, which would
          require the specialization of the filesystem to a given locale,
          which brings all sorts of problems for removable media and for
          users who use more than one language.
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: include charset encoding information in the superblock · c83ad55e
      Gabriel Krisman Bertazi authored
      Support for encoding is considered an incompatible feature, since it has
      potential to create collisions of file names in existing filesystems.
      If the feature flag is not enabled, the entire filesystem will operate
      on opaque byte sequences, respecting the original behavior.
      
      The s_encoding field stores a magic number indicating the encoding
      format and version used globally by file and directory names in the
      filesystem.  The s_encoding_flags defines policies for using the charset
      encoding, like how to handle invalid sequences.  The magic number is
      mapped to the exact charset table, but the mapping is specific to ext4.
      Since we don't have any commitment to support old encodings, the only
      encoding I am supporting right now is utf8-12.1.0.
      
      The current implementation prevents the user from enabling encoding and
      per-directory encryption on the same filesystem at the same time.  The
      incompatibility between these features lies in how we do efficient
      directory searches when we cannot be sure the encryption of the user
      provided fname will match the actual hash stored in the disk without
      decrypting every directory entry, because of normalization cases.  My
      quickest solution is to simply block the concurrent use of these
      features for now, and enable it later, once we have a better solution.
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • Gabriel Krisman Bertazi's avatar
    • unicode: update unicode database unicode version 12.1.0 · 1215d239
      Gabriel Krisman Bertazi authored
      Regenerate utf8data.h based on the latest UCD files and run tests
      against the latest version.
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • unicode: introduce test module for normalized utf8 implementation · f0d6cc00
      Gabriel Krisman Bertazi authored
      This implements an in-kernel sanity test module for the utf8
      normalization core.  At probe time, it will run basic sequences through
      the utf8n core, to identify problems with equivalent sequences and
      normalization/casefold code.  This is supposed to be useful for
      regression testing when adding support for a new version of utf8 to
      linux.
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • unicode: implement higher level API for string handling · 9d53690f
      Gabriel Krisman Bertazi authored
      This patch integrates the utf8n patches with some higher level API to
      perform UTF-8 string comparison, normalization and casefolding
      operations.  Implemented is a variation of NFD, and casefold is
      performed by doing full casefold on top of NFD.  These algorithms are
      based on the core implemented by Olaf Weber from SGI.
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • unicode: reduce the size of utf8data[] · a8384c68
      Olaf Weber authored
      Remove the Hangul decompositions from the utf8data trie, and do
      algorithmic decomposition to calculate them on the fly. To store the
      decomposition the caller of utf8lookup()/utf8nlookup() must provide a
      12-byte buffer, which is used to synthesize a leaf with the
      decomposition. This significantly reduces the size of the utf8data[]
      array.
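
      For reference, the arithmetic Hangul syllable decomposition defined
      by the Unicode Standard can be sketched as below. This is a
      standalone illustration, not the utf8lookup() code; the function
      name is hypothetical and the constants are the ones given in the
      standard.

```python
# Constants from the Unicode Standard's Hangul syllable decomposition.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT  # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT  # 11172 precomposed syllables in total


def decompose_hangul(ch: str) -> str:
    """Return the jamo sequence for one Hangul syllable, or ch unchanged."""
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return ch  # not a precomposed Hangul syllable
    lead = chr(L_BASE + s_index // N_COUNT)
    vowel = chr(V_BASE + (s_index % N_COUNT) // T_COUNT)
    t_index = s_index % T_COUNT
    # t_index 0 means the syllable has no trailing consonant.
    trail = chr(T_BASE + t_index) if t_index else ""
    return lead + vowel + trail
```

      A syllable decomposes into at most three jamo, each of which takes
      three bytes in UTF-8, so the result needs at most nine bytes.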
      
      Changes made by Gabriel:
        Rebase to mainline
        Fix checkpatch errors
        Extract robustness fixes and merge back to original mkutf8data.c patch
        Regenerate utf8data.h
      Signed-off-by: Olaf Weber <olaf@sgi.com>
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • unicode: introduce code for UTF-8 normalization · 44594c2f
      Olaf Weber authored
      Supporting functions for UTF-8 normalization are in utf8norm.c with the
      header utf8norm.h. Two normalization forms are supported: nfdi and
      nfdicf.
      
        nfdi:
         - Apply unicode normalization form NFD.
         - Remove any Default_Ignorable_Code_Point.
      
        nfdicf:
         - Apply unicode normalization form NFD.
         - Remove any Default_Ignorable_Code_Point.
         - Apply a full casefold (C + F).
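
      The two forms can be approximated in userspace as follows. This is
      a sketch under stated assumptions, not the utf8norm.c code:
      IGNORABLE_SUBSET is a small hand-picked subset of
      Default_Ignorable_Code_Point (Python's unicodedata does not expose
      that property), and str.casefold() stands in for the full casefold.

```python
import unicodedata

# Assumption: a few well-known Default_Ignorable_Code_Point members only
# (soft hyphen, combining grapheme joiner, zero-width characters, BOM).
IGNORABLE_SUBSET = {0x00AD, 0x034F, 0x200B, 0x200C, 0x200D, 0xFEFF}


def nfdi(s: str) -> str:
    """NFD, then drop the (subset of) ignorable code points."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if ord(c) not in IGNORABLE_SUBSET)


def nfdicf(s: str) -> str:
    """nfdi plus full case folding, renormalized to NFD afterwards."""
    return unicodedata.normalize("NFD", nfdi(s).casefold())
```

      With these definitions, nfdi() maps both spellings of "é" to the
      same decomposed string, and nfdicf() maps "Straße" and "STRASSE"
      to "strasse".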
      
      For the purposes of the code, a string is valid UTF-8 if:
      
       - The values encoded are 0x1..0x10FFFF.
       - The surrogate codepoints 0xD800..0xDFFF are not encoded.
       - The shortest possible encoding is used for all values.
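
      These three rules can be checked with a small hand-written decoder.
      The sketch below is an illustration only and is not derived from
      the kernel sources; decode_strict() is a hypothetical name, and the
      surrogate range it rejects is U+D800..U+DFFF.

```python
def decode_strict(data: bytes):
    """Return the decoded code points, or None if any rule is violated."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        # cp: payload bits of the lead byte; n: continuation bytes expected;
        # lowest: smallest value allowed for this length (shortest-form rule).
        if b < 0x80:
            cp, n, lowest = b, 0, 0x1  # 0x1 excludes an encoded NUL
        elif 0xC0 <= b < 0xE0:
            cp, n, lowest = b & 0x1F, 1, 0x80
        elif 0xE0 <= b < 0xF0:
            cp, n, lowest = b & 0x0F, 2, 0x800
        elif 0xF0 <= b < 0xF8:
            cp, n, lowest = b & 0x07, 3, 0x10000
        else:
            return None  # stray continuation byte or invalid lead byte
        if i + n >= len(data):
            return None  # sequence is truncated
        for k in range(1, n + 1):
            c = data[i + k]
            if c & 0xC0 != 0x80:
                return None  # not a continuation byte
            cp = (cp << 6) | (c & 0x3F)
        if cp < lowest or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return None  # overlong, out of range, or surrogate
        out.append(cp)
        i += n + 1
    return out
```

      An overlong encoding such as b"\xc0\x80", an encoded surrogate such
      as b"\xed\xa0\x80", and a value above 0x10FFFF such as
      b"\xf4\x90\x80\x80" are all rejected.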
      
      The supporting functions work on null-terminated strings (utf8 prefix)
      and on length-limited strings (utf8n prefix).
      
      From the original SGI patch and for conformity with coding standards,
      the utf8data_t typedef was dropped, since it was just masking the struct
      keyword.  On other occasions, namely utf8leaf_t and utf8trie_t, I
      decided to keep it, since they are simple pointers to memory buffers,
      and using uchars here wouldn't provide any more meaningful information.
      
      From the original submission, we also converted from the compatibility
      form to canonical.
      
      Changes made by Gabriel:
        Rebase to Mainline
        Fix up checkpatch.pl warnings
        Drop typedefs
        move out of libxfs
        Convert from NFKD to NFD
      Signed-off-by: Olaf Weber <olaf@sgi.com>
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • unicode: introduce UTF-8 character database · 955405d1
      Gabriel Krisman Bertazi authored
      The decomposition and casefolding of UTF-8 characters are described in a
      prefix tree in utf8data.h, which is generated from the Unicode
      Character Database (UCD), published by the Unicode Consortium, and
      should not be edited by hand.  The structures in utf8data.h are meant to
      be used for lookup operations by the unicode subsystem, when decoding a
      utf-8 string.
      
      mkutf8data.c is the source for a program that generates utf8data.h. It
      was written by Olaf Weber from SGI and originally proposed to be merged
      into Linux in 2014.  The original proposal performed the compatibility
      decomposition, NFKD, but the current version was modified by me to do
      canonical decomposition, NFD, as suggested by the community.  The
      changes from the original submission are:
      
        * Rebase to mainline.
        * Fix out-of-tree-build.
        * Update makefile to build 11.0.0 ucd files.
        * drop references to xfs.
        * Convert NFKD to NFD.
        * Merge back robustness fixes from original patch. Requested by
          Dave Chinner.
      
      The original submission is archived at:
      
      <https://linux-xfs.oss.sgi.narkive.com/Xx10wjVY/rfc-unicode-utf-8-support-for-xfs>
      
      The utf8data.h file can be regenerated using the instructions in
      fs/unicode/README.utf8data.
      
      - Notes on the update from 8.0.0 to 11.0:
      
      The structure of the ucd files and special cases have not experienced
      any changes between versions 8.0.0 and 11.0.0.  8.0.0 saw the addition
      of Cherokee LC characters, which is an interesting case for
      case-folding.  The update is accompanied by new tests on the test_ucd
      module to catch specific cases.  No changes to mkutf8data script were
      required for the updates.
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: actually request zeroing of inode table after grow · 310a997f
      Kirill Tkhai authored
      The number of block groups can never decrease, since only online
      grow is supported.

      But after a grow has occurred, we have to zero the inode tables
      of the newly created block groups.
      
      Fixes: 19c5246d ("ext4: add new online resize interface")
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@kernel.org
    • Khazhismel Kumykov's avatar
    • ext4: fix use-after-free race with debug_want_extra_isize · 7bc04c5c
      Barret Rhoden authored
      When remounting with debug_want_extra_isize, we were not performing the
      same checks that we do during a normal mount.  That allowed us to set a
      value for s_want_extra_isize that reached outside the s_inode_size.
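
      The idea of running one bounds check on both paths can be sketched
      as below. This is a simplified model, not the ext4 code: the
      Superblock class and the function names are hypothetical, and the
      check assumes only that the 128-byte classic inode plus the
      requested extra size must fit inside the on-disk inode size.

```python
GOOD_OLD_INODE_SIZE = 128  # bytes in the original fixed-size on-disk inode


def check_want_extra_isize(want_extra_isize: int, inode_size: int) -> int:
    """Validate the requested extra inode size against the inode size."""
    if want_extra_isize < 0 or GOOD_OLD_INODE_SIZE + want_extra_isize > inode_size:
        raise ValueError("want_extra_isize does not fit in the inode")
    return want_extra_isize


class Superblock:
    """Hypothetical mount state that validates on mount and on remount."""

    def __init__(self, inode_size: int, want_extra_isize: int = 0):
        self.inode_size = inode_size
        self.want_extra_isize = check_want_extra_isize(want_extra_isize, inode_size)

    def remount(self, want_extra_isize: int) -> None:
        # Same check as the mount path; state changes only if it passes.
        self.want_extra_isize = check_want_extra_isize(
            want_extra_isize, self.inode_size
        )
```

      With a 256-byte inode, remounting with 128 is accepted while 129 is
      rejected and the previous value is kept.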
      
      Fixes: e2b911c5 ("ext4: clean up feature test macros with predicate functions")
      Reported-by: syzbot+f584efa0ac7213c226b7@syzkaller.appspotmail.com
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Barret Rhoden <brho@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
    • ext4: avoid drop reference to iloc.bh twice · 8c380ab4
      Pan Bian authored
      The reference to iloc.bh has been dropped in ext4_mark_iloc_dirty.
      However, the reference is dropped again if an error occurs during
      ext4_handle_dirty_metadata, which may result in use-after-free bugs.
      
      Fixes: fb265c9c ("ext4: add ext4_sb_bread() to disambiguate ENOMEM cases")
      Signed-off-by: Pan Bian <bianpan2016@163.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@kernel.org
  13. 10 Apr, 2019 2 commits
  14. 07 Apr, 2019 2 commits
    • ext4: use BUG() instead of BUG_ON(1) · 1e83bc81
      Arnd Bergmann authored
      BUG_ON(1) leads to bogus warnings from clang when
      CONFIG_PROFILE_ANNOTATED_BRANCHES is set:
      
       fs/ext4/inode.c:544:4: error: variable 'retval' is used uninitialized whenever 'if' condition is false
            [-Werror,-Wsometimes-uninitialized]
                              BUG_ON(1);
                              ^~~~~~~~~
       include/asm-generic/bug.h:61:36: note: expanded from macro 'BUG_ON'
                                         ^~~~~~~~~~~~~~~~~~~
       include/linux/compiler.h:48:23: note: expanded from macro 'unlikely'
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       fs/ext4/inode.c:591:6: note: uninitialized use occurs here
              if (retval > 0 && map->m_flags & EXT4_MAP_MAPPED) {
                  ^~~~~~
       fs/ext4/inode.c:544:4: note: remove the 'if' if its condition is always true
                              BUG_ON(1);
                              ^
       include/asm-generic/bug.h:61:32: note: expanded from macro 'BUG_ON'
                                     ^
       fs/ext4/inode.c:502:12: note: initialize the variable 'retval' to silence this warning
      
      Change it to BUG() so clang can see that this code path can never
      continue.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
    • ext4: fix prefetchw of NULL page · d454a273
      Liu Xiang authored
      In ext4_mpage_readpages(), if the parameter pages is not NULL, the
      other parameter, page, is NULL. The first time through the loop,
      prefetchw(&page->flags) operates on NULL. From the second iteration on,
      prefetchw(&page->flags) always operates on the last consumed page, which
      does little to help with handling the current page. So prefetchw()
      should be called right after the page pointer has been updated.
      Signed-off-by: Liu Xiang <liu.xiang6@zte.com.cn>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  15. 06 Apr, 2019 4 commits
  16. 31 Mar, 2019 1 commit