- 19 Oct, 2018 1 commit
-
-
Theodore Ts'o authored
BugLink: https://bugs.launchpad.net/bugs/1798617 commit 4274f516 upstream. When mounting the superblock, ext4_fill_super() calculates the free blocks and free inodes and stores them in the superblock. It's not strictly necessary, since we don't use them any more, but it's nice to keep them roughly aligned to reality. Since it's not critical for file system correctness, the code doesn't call ext4_commit_super(). The problem is that it's in ext4_commit_super() that we recalculate the superblock checksum. So if we're not going to call ext4_commit_super(), we need to call ext4_superblock_csum_set() to make sure the superblock checksum is consistent. Most of the time, this doesn't matter, since we end up calling ext4_commit_super() very soon thereafter, and definitely by the time the file system is unmounted. However, it doesn't work in this sequence: mke2fs -Fq -t ext4 /dev/vdc 128M mount /dev/vdc /vdc cp xfstests/git-versions /vdc godown /vdc umount /vdc mount /dev/vdc tune2fs -l /dev/vdc With this commit, the "tune2fs -l" no longer fails. Reported-by:
Chengguang Xu <cgxu519@gmx.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Stefan Bader <stefan.bader@canonical.com> Signed-off-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
- 02 Oct, 2018 1 commit
-
-
Theodore Ts'o authored
BugLink: https://bugs.launchpad.net/bugs/1792174 commit 50122847 upstream. Commit 8844618d: "ext4: only look at the bg_flags field if it is valid" will complain if block group zero does not have the EXT4_BG_INODE_ZEROED flag set. Unfortunately, this is not correct, since a freshly created file system has this flag cleared. It gets almost immediately after the file system is mounted read-write --- but the following somewhat unlikely sequence will end up triggering a false positive report of a corrupted file system: mkfs.ext4 /dev/vdc mount -o ro /dev/vdc /vdc mount -o remount,rw /dev/vdc Instead, when initializing the inode table for block group zero, test to make sure that itable_unused count is not too large, since that is the case that will result in some or all of the reserved inodes getting cleared. This fixes the failures reported by Eric Whiteney when running generic/230 and generic/231 in the the nojournal test case. Fixes: 8844618d ("ext4: only look at the bg_flags field if it is valid") Reported-by:
Eric Whitney <enwlinux@gmail.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Stefan Bader <stefan.bader@canonical.com> Signed-off-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
- 05 Sep, 2018 1 commit
-
-
Theodore Ts'o authored
BugLink: https://bugs.launchpad.net/bugs/1789653 Ext4_check_descriptors() was getting called before s_gdb_count was initialized. So for file systems w/o the meta_bg feature, allocation bitmaps could overlap the block group descriptors and ext4 wouldn't notice. For file systems with the meta_bg feature enabled, there was a fencepost error which would cause the ext4_check_descriptors() to incorrectly believe that the block allocation bitmap overlaps with the block group descriptor blocks, and it would reject the mount. Fix both of these problems. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org (backported from commit 44de022c ) Signed-off-by:
Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by:
Stefan Bader <stefan.bader@canonical.com> Acked-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
- 14 Aug, 2018 5 commits
-
-
Jon Derrick authored
BugLink: https://bugs.launchpad.net/bugs/1784409 commit a17712c8 upstream. This patch attempts to close a hole leading to a BUG seen with hot removals during writes [1]. A block device (NVME namespace in this test case) is formatted to EXT4 without partitions. It's mounted and write I/O is run to a file, then the device is hot removed from the slot. The superblock attempts to be written to the drive which is no longer present. The typical chain of events leading to the BUG: ext4_commit_super() __sync_dirty_buffer() submit_bh() submit_bh_wbc() BUG_ON(!buffer_mapped(bh)); This fix checks for the superblock's buffer head being mapped prior to syncing. [1] https://www.spinics.net/lists/linux-ext4/msg56527.html Signed-off-by:
Jon Derrick <jonathan.derrick@intel.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by:
Khalid Elmously <khalid.elmously@canonical.com>
-
Theodore Ts'o authored
BugLink: https://bugs.launchpad.net/bugs/1784409 commit bfe0a5f4 upstream. The kernel's ext4 mount-time checks were more permissive than e2fsprogs's libext2fs checks when opening a file system. The superblock is considered too insane for debugfs or e2fsck to operate on it, the kernel has no business trying to mount it. This will make file system fuzzing tools work harder, but the failure cases that they find will be more useful and be easier to evaluate. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by:
Khalid Elmously <khalid.elmously@canonical.com>
-
Theodore Ts'o authored
BugLink: https://bugs.launchpad.net/bugs/1784409 commit c37e9e01 upstream. If there is a directory entry pointing to a system inode (such as a journal inode), complain and declare the file system to be corrupted. Also, if the superblock's first inode number field is too small, refuse to mount the file system. This addresses CVE-2018-10882. https://bugzilla.kernel.org/show_bug.cgi?id=200069 Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by:
Khalid Elmously <khalid.elmously@canonical.com>
-
Theodore Ts'o authored
BugLink: https://bugs.launchpad.net/bugs/1784409 commit 8844618d upstream. The bg_flags field in the block group descripts is only valid if the uninit_bg or metadata_csum feature is enabled. We were not consistently looking at this field; fix this. Also block group #0 must never have uninitialized allocation bitmaps, or need to be zeroed, since that's where the root inode, and other special inodes are set up. Check for these conditions and mark the file system as corrupted if they are detected. This addresses CVE-2018-10876. https://bugzilla.kernel.org/show_bug.cgi?id=199403 Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by:
Khalid Elmously <khalid.elmously@canonical.com>
-
Theodore Ts'o authored
BugLink: https://bugs.launchpad.net/bugs/1784409 commit 77260807 upstream. It's really bad when the allocation bitmaps and the inode table overlap with the block group descriptors, since it causes random corruption of the bg descriptors. So we really want to head those off at the pass. https://bugzilla.kernel.org/show_bug.cgi?id=199865 Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by:
Khalid Elmously <khalid.elmously@canonical.com>
-
- 23 May, 2018 1 commit
-
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1768429 commit 18db4b4e upstream. If some metadata block, such as an allocation bitmap, overlaps the superblock, it's very likely that if the file system is mounted read/write, the results will not be pretty. So disallow r/w mounts for file systems corrupted in this particular way. Backport notes: 3.18.y is missing bc98a42c ("VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)") and e462ec50 ("VFS: Differentiate mount flags (MS_*) from internal superblock flags") so we simply use the sb MS_RDONLY check from pre bc98a42c in place of the sb_rdonly function used in the upstream variant of the patch. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by:
Harsh Shandilya <harsh@prjkt.io> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Juerg Haefliger <juergh@canonical.com> Signed-off-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
- 04 Apr, 2018 1 commit
-
-
Zhouyi Zhou authored
BugLink: http://bugs.launchpad.net/bugs/1756860 commit 06f29cc8 upstream. In the function __ext4_grp_locked_error(), __save_error_info() is called to save error info in super block block, but does not sync that information to disk to info the subsequence fsck after reboot. This patch writes the error information to disk. After this patch, I think there is no obvious EXT4 error handle branches which leads to "Remounting filesystem read-only" will leave the disk partition miss the subsequence fsck. Signed-off-by:
Zhouyi Zhou <zhouzhouyi@gmail.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Juerg Haefliger <juergh@canonical.com> Signed-off-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com>
-
- 23 Jan, 2018 2 commits
-
-
Andreas Gruenbacher authored
This variable, introduced in commit 9c191f70 , is unnecessary: it is set once the module has been initialized correctly, and ext4_fill_super cannot run unless the module has been initialized correctly. Signed-off-by:
Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> (cherry picked from commit 2335d05f ) CVE-2015-8952 Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by:
Stefan Bader <stefan.bader@canonical.com> Acked-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
-
Jan Kara authored
The conversion is generally straightforward. The only tricky part is that xattr block corresponding to found mbcache entry can get freed before we get buffer lock for that block. So we have to check whether the entry is still valid after getting buffer lock. Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> (backported from commit 82939d79 ) CVE-2015-8952 Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by:
Stefan Bader <stefan.bader@canonical.com> Acked-by:
Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
-
- 16 Nov, 2017 1 commit
-
-
Jan Kara authored
BugLink: http://bugs.launchpad.net/bugs/1731915 [ Upstream commit 5469d7c3 ] Avoid using stripe_width for sbi->s_stripe value if it is not actually set. It prevents using the stride for sbi->s_stripe. Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Stefan Bader <stefan.bader@canonical.com> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
-
- 05 Oct, 2017 2 commits
-
-
zhangyi (F) authored
BugLink: http://bugs.launchpad.net/bugs/1721477 commit 95f1fda4 upstream. Quota does not get enabled for read-only mounts if filesystem has quota feature, so that quotas cannot updated during orphan cleanup, which will lead to quota inconsistency. This patch turn on quotas during orphan cleanup for this case, make sure quotas can be updated correctly. Reported-by:
Jan Kara <jack@suse.cz> Signed-off-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Stefan Bader <stefan.bader@canonical.com> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
-
zhangyi (F) authored
BugLink: http://bugs.launchpad.net/bugs/1721477 commit b0a5a958 upstream. Current ext4 quota should always "usage enabled" if the quota feautre is enabled. But in ext4_orphan_cleanup(), it turn quotas off directly (used for the older journaled quota), so we cannot turn it on again via "quotaon" unless umount and remount ext4. Simple reproduce: mkfs.ext4 -O project,quota /dev/vdb1 mount -o prjquota /dev/vdb1 /mnt chattr -p 123 /mnt chattr +P /mnt touch /mnt/aa /mnt/bb exec 100<>/mnt/aa rm -f /mnt/aa sync echo c > /proc/sysrq-trigger #reboot and mount mount -o prjquota /dev/vdb1 /mnt #query status quotaon -Ppv /dev/vdb1 #output quotaon: Cannot find mountpoint for device /dev/vdb1 quotaon: No correct mountpoint specified. This patch add check for journaled quotas to avoid incorrect quotaoff when ext4 has quota feautre. Signed-off-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Stefan Bader <stefan.bader@canonical.com> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
-
- 06 Apr, 2017 3 commits
-
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1676424 commit 2ba3e6e8 upstream. It is OK for s_first_meta_bg to be equal to the number of block group descriptor blocks. (It rarely happens, but it shouldn't cause any problems.) https://bugzilla.kernel.org/show_bug.cgi?id=194567 Fixes: 3a4b77cd Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1673538 commit 4753d8a2 upstream. If the file system requires journal recovery, and the device is read-ony, return EROFS to the mount system call. This allows xfstests generic/050 to pass. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1673538 commit 97abd7d4 upstream. If the journal is aborted, the needs_recovery feature flag should not be removed. Otherwise, it's the journal might not get replayed and this could lead to more data getting lost. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
-
- 08 Mar, 2017 1 commit
-
-
Eryu Guan authored
BugLink: http://bugs.launchpad.net/bugs/1663657 commit 3a4b77cd upstream. Ralf Spenneberg reported that he hit a kernel crash when mounting a modified ext4 image. And it turns out that kernel crashed when calculating fs overhead (ext4_calculate_overhead()), this is because the image has very large s_first_meta_bg (debug code shows it's 842150400), and ext4 overruns the memory in count_overhead() when setting bitmap buffer, which is PAGE_SIZE. ext4_calculate_overhead(): buf = get_zeroed_page(GFP_NOFS); <=== PAGE_SIZE buffer blks = count_overhead(sb, i, buf); count_overhead(): for (j = ext4_bg_num_gdb(sb, grp); j > 0; j--) { <=== j = 842150400 ext4_set_bit(EXT4_B2C(sbi, s++), buf); <=== buffer overrun count++; } This can be reproduced easily for me by this script: #!/bin/bash rm -f fs.img mkdir -p /mnt/ext4 fallocate -l 16M fs.img mke2fs -t ext4 -O bigalloc,meta_bg,^resize_inode -F fs.img debugfs -w -R "ssv first_meta_bg 842150400" fs.img mount -o loop fs.img /mnt/ext4 Fix it by validating s_first_meta_bg first at mount time, and refusing to mount if its value exceeds the largest possible meta_bg number. Reported-by:
Ralf Spenneberg <ralf@os-t.de> Signed-off-by:
Eryu Guan <guaneryu@gmail.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Andreas Dilger <adilger@dilger.ca> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
-
- 10 Jan, 2017 4 commits
-
-
Sergey Karamov authored
BugLink: http://bugs.launchpad.net/bugs/1654602 commit 73b92a2a upstream. Currently data journalling is incompatible with encryption: enabling both at the same time has never been supported by design, and would result in unpredictable behavior. However, users are not precluded from turning on both features simultaneously. This change programmatically replaces data journaling for encrypted regular files with ordered data journaling mode. Background: Journaling encrypted data has not been supported because it operates on buffer heads of the page in the page cache. Namely, when the commit happens, which could be up to five seconds after caching, the commit thread uses the buffer heads attached to the page to copy the contents of the page to the journal. With encryption, it would have been required to keep the bounce buffer with ciphertext for up to the aforementioned five seconds, since the page cache can only hold plaintext and could not be used for journaling. Alternatively, it would be required to setup the journal to initiate a callback at the commit time to perform deferred encryption - in this case, not only would the data have to be written twice, but it would also have to be encrypted twice. This level of complexity was not justified for a mode that in practice is very rarely used because of the overhead from the data journalling. Solution: If data=journaled has been set as a mount option for a filesystem, or if journaling is enabled on a regular file, do not perform journaling if the file is also encrypted, instead fall back to the data=ordered mode for the file. Rationale: The intent is to allow seamless and proper filesystem operation when journaling and encryption have both been enabled, and have these two conflicting features gracefully resolved by the filesystem. Fixes: 44614711 Signed-off-by:
Sergey Karamov <skaramov@google.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Luis Henriques <luis.henriques@canonical.com>
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1654602 commit c48ae41b upstream. The commit "ext4: sanity check the block and cluster size at mount time" should prevent any problems, but in case the superblock is modified while the file system is mounted, add an extra safety check to make sure we won't overrun the allocated buffer. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Luis Henriques <luis.henriques@canonical.com>
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1654602 commit 5aee0f8a upstream. Fix a large number of problems with how we handle mount options in the superblock. For one, if the string in the superblock is long enough that it is not null terminated, we could run off the end of the string and try to interpret superblocks fields as characters. It's unlikely this will cause a security problem, but it could result in an invalid parse. Also, parse_options is destructive to the string, so in some cases if there is a comma-separated string, it would be modified in the superblock. (Fortunately it only happens on file systems with a 1k block size.) Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Conflicts: fs/ext4/super.c Signed-off-by:
Luis Henriques <luis.henriques@canonical.com>
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1654602 commit cd6bb35b upstream. Centralize the checks for inodes_per_block and be more strict to make sure the inodes_per_block_group can't end up being zero. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Andreas Dilger <adilger@dilger.ca> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Luis Henriques <luis.henriques@canonical.com>
-
- 06 Dec, 2016 1 commit
-
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1645453 commit 8cdf3372 upstream. If the block size or cluster size is insane, reject the mount. This is important for security reasons (although we shouldn't be just depending on this check). Ref: http://www.securityfocus.com/archive/1/539661 Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506 Reported-by:
Borislav Petkov <bp@alien8.de> Reported-by:
Nikolay Borisov <kernel@kyup.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Luis Henriques <luis.henriques@canonical.com>
-
- 15 Sep, 2016 2 commits
-
-
Daeho Jeong authored
BugLink: http://bugs.launchpad.net/bugs/1624037 commit b47820ed upstream. We temporally change checksum fields in buffers of some types of metadata into '0' for verifying the checksum values. By doing this without locking the buffer, some metadata's checksums, which are being committed or written back to the storage, could be damaged. In our test, several metadata blocks were found with damaged metadata checksum value during recovery process. When we only verify the checksum value, we have to avoid modifying checksum fields directly. Signed-off-by:
Daeho Jeong <daeho.jeong@samsung.com> Signed-off-by:
Youngjin Gil <youngjin.gil@samsung.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Cc: Török Edwin <edwin@etorok.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Kamal Mostafa <kamal@canonical.com>
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1624037 commit 829fa70d upstream. A number of fuzzing failures seem to be caused by allocation bitmaps or other metadata blocks being pointed at the superblock. This can cause kernel BUG or WARNings once the superblock is overwritten, so validate the group descriptor blocks to make sure this doesn't happen. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Kamal Mostafa <kamal@canonical.com>
-
- 18 Aug, 2016 2 commits
-
-
Vegard Nossum authored
BugLink: http://bugs.launchpad.net/bugs/1614560 commit c65d5c6c upstream. If we encounter a filesystem error during orphan cleanup, we should stop. Otherwise, we may end up in an infinite loop where the same inode is processed again and again. EXT4-fs (loop0): warning: checktime reached, running e2fsck is recommended EXT4-fs error (device loop0): ext4_mb_generate_buddy:758: group 2, block bitmap and bg descriptor inconsistent: 6117 vs 0 free clusters Aborting journal on device loop0-8. EXT4-fs (loop0): Remounting filesystem read-only EXT4-fs error (device loop0) in ext4_free_blocks:4895: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs error (device loop0) in ext4_ext_remove_space:3068: IO failure EXT4-fs error (device loop0) in ext4_ext_truncate:4667: Journal has aborted EXT4-fs error (device loop0) in ext4_orphan_del:2927: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs (loop0): Inode 16 (00000000618192a0): orphan list check failed! [...] EXT4-fs (loop0): Inode 16 (0000000061819748): orphan list check failed! [...] EXT4-fs (loop0): Inode 16 (0000000061819bf0): orphan list check failed! [...] See-also: c9eb13a9 ("ext4: fix hang when processing corrupted orphaned inode list") Cc: Jan Kara <jack@suse.cz> Signed-off-by:
Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Kamal Mostafa <kamal@canonical.com>
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1614560 commit 5b9554dc upstream. If s_reserved_gdt_blocks is extremely large, it's possible for ext4_init_block_bitmap(), which is called when ext4 sets up an uninitialized block bitmap, to corrupt random kernel memory. Add the same checks which e2fsck has --- it must never be larger than blocksize / sizeof(__u32) --- and then add a backup check in ext4_init_block_bitmap() in case the superblock gets modified after the file system is mounted. Reported-by:
Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com> Signed-off-by:
Kamal Mostafa <kamal@canonical.com>
-
- 16 May, 2016 1 commit
-
-
Jan Kara authored
BugLink: http://bugs.launchpad.net/bugs/1578798 commit ea3d7209 upstream. Currently, page faults and hole punching are completely unsynchronized. This can result in page fault faulting in a page into a range that we are punching after truncate_pagecache_range() has been called and thus we can end up with a page mapped to disk blocks that will be shortly freed. Filesystem corruption will shortly follow. Note that the same race is avoided for truncate by checking page fault offset against i_size but there isn't similar mechanism available for punching holes. Fix the problem by creating new rw semaphore i_mmap_sem in inode and grab it for writing over truncate, hole punching, and other functions removing blocks from extent tree and for read over page faults. We cannot easily use i_data_sem for this since that ranks below transaction start and we need something ranking above it so that it can be held over the whole truncate / hole punching operation. Also remove various workarounds we had in the code to reduce race window when page fault could have created pages with stale mapping information. Signed-off-by:
Jan Kara <jack@suse.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Kamal Mostafa <kamal@canonical.com>
-
- 21 Apr, 2016 2 commits
-
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1573034 commit c325a67c upstream. Previously, ext4 would fail the mount if the file system had the quota feature enabled and quota mount options (used for the older quota setups) were present. This broke xfstests, since xfs silently ignores the usrquote and grpquota mount options if they are specified. This commit changes things so that we are consistent with xfs; having the mount options specified is harmless, so no sense break users by forbidding them. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com>
-
Theodore Ts'o authored
BugLink: http://bugs.launchpad.net/bugs/1573034 commit daf647d2 upstream. With the internal Quota feature, mke2fs creates empty quota inodes and quota usage tracking is enabled as soon as the file system is mounted. Since quotacheck is no longer preallocating all of the blocks in the quota inode that are likely needed to be written to, we are now seeing a lockdep false positive caused by needing to allocate a quota block from inside ext4_map_blocks(), while holding i_data_sem for a data inode. This results in this complaint: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ei->i_data_sem); lock(&s->s_dquot.dqio_mutex); lock(&ei->i_data_sem); lock(&s->s_dquot.dqio_mutex); Google-Bug-Id: 27907753 Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com>
-
- 06 Apr, 2016 2 commits
-
-
Seth Forshee authored
This is still an experimental feature, so disable it by default and allow it only when the system administrator supplies the userns_mounts=true module parameter. Signed-off-by:
Seth Forshee <seth.forshee@canonical.com> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Support unprivileged mounting of ext4 volumes from user namespaces. This requires the following changes: - Perform all uid and gid conversions to/from disk relative to s_user_ns. In many cases this will already be handled by the vfs helper functions. This also requires updates to handle cases where ids may not map into s_user_ns. - Update most capability checks to check for capabilities in s_user_ns rather than init_user_ns. These mostly reflect changes to the filesystem that a user in s_user_ns could already make externally by virtue of having write access to the backing device. - Restrict unsafe options in either the mount options or the ext4 superblock. Currently the only concerning option is errors=panic, and this is made to require CAP_SYS_ADMIN in init_user_ns. - Verify that unprivileged users have the required access to the journal device at the path passed via the journal_path mount option. Note that for the journal_path and the journal_dev mount options, and for external journal devices specified in the ext4 superblock, devcgroup restrictions will be enforced by __blkdev_get(), (via blkdev_get_by_dev()), ensuring that the user has been granted appropriate access to the block device. - Set the FS_USERNS_MOUNT flag on the filesystem types supported by ext4. sysfs attributes for ext4 mounts remain writable only by real root. Signed-off-by:
Seth Forshee <seth.forshee@canonical.com> Signed-off-by:
Tim Gardner <tim.gardner@canonical.com>
-
- 16 Nov, 2015 1 commit
-
-
Dan Williams authored
Similar to XFS warn when mounting DAX while it is still considered under development. Also, aspects of the DAX implementation, for example synchronization against multiple faults and faults causing block allocation, depend on the correct implementation in the filesystem. The maturity of a given DAX implementation is filesystem specific. Cc: <stable@vger.kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: linux-ext4@vger.kernel.org Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by:
Dave Chinner <david@fromorbit.com> Acked-by:
Jan Kara <jack@suse.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- 07 Nov, 2015 1 commit
-
-
Mel Gorman authored
mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for backgrou...
-
- 19 Oct, 2015 2 commits
-
-
Dmitry Monakhov authored
It is appeared that we can pass journal related mount options and such options be shown in /proc/mounts Example: #mkfs.ext4 -F /dev/vdb #tune2fs -O ^has_journal /dev/vdb #mount /dev/vdb /mnt/ -ocommit=20,journal_async_commit #cat /proc/mounts | grep /mnt /dev/vdb /mnt ext4 rw,relatime,journal_checksum,journal_async_commit,commit=20,data=ordered 0 0 But options:"journal_checksum,journal_async_commit,commit=20,data=ordered" has nothing with reality because there is no journal at all. This patch disallow following options for journalless configurations: - journal_checksum - journal_async_commit - commit=%ld - data={writeback,ordered,journal} Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Andreas Dilger <adilger@dilger.ca>
-
Dmitry Monakhov authored
Currently MOPT_EXPLICIT treated as EXPLICIT_DELALLOC which may be changed in future. Let's fix it now. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
Theodore Ts'o <tytso@mit.edu>
-
- 18 Oct, 2015 1 commit
-
-
Daeho Jeong authored
If a EXT4 filesystem utilizes JBD2 journaling and an error occurs, the journaling will be aborted first and the error number will be recorded into JBD2 superblock and, finally, the system will enter into the panic state in "errors=panic" option. But, in the rare case, this sequence is little twisted like the below figure and it will happen that the system enters into panic state, which means the system reset in mobile environment, before completion of recording an error in the journal superblock. In this case, e2fsck cannot recognize that the filesystem failure occurred in the previous run and the corruption wouldn't be fixed. Task A Task B ext4_handle_error() -> jbd2_journal_abort() -> __journal_abort_soft() -> __jbd2_journal_abort_hard() | -> journal->j_flags |= JBD2_ABORT; | | __ext4_abort() | -> jbd2_journal_abort() | | -> __journal_abort_soft() | | -> if (journal->j_flags & JBD2_ABORT) | | return; | -> panic() | -> jbd2_journal_update_sb_errno() Tested-by:
Hobin Woo <hobin.woo@samsung.com> Signed-off-by:
Daeho Jeong <daeho.jeong@samsung.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 17 Oct, 2015 2 commits
-
-
Darrick J. Wong authored
Create separate predicate functions to test/set/clear feature flags, thereby replacing the wordy old macros. Furthermore, clean out the places where we open-coded feature tests. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com>
-
Darrick J. Wong authored
Instead of overloading EIO for CRC errors and corrupt structures, return the same error codes that XFS returns for the same issues. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu>
-