1. 05 Nov, 2014 15 commits
    • Dmitry Monakhov's avatar
      ext4: move error report out of atomic context in ext4_init_block_bitmap() · f6770a16
      Dmitry Monakhov authored
      commit aef4885a upstream.
      
      Error report likely result in IO so it is bad idea to do it from
      atomic context.
      
      This patch should fix following issue:
      
      BUG: sleeping function called from invalid context at include/linux/buffer_head.h:349
      in_atomic(): 1, irqs_disabled(): 0, pid: 137, name: kworker/u128:1
      5 locks held by kworker/u128:1/137:
       #0:  ("writeback"){......}, at: [<ffffffff81085618>] process_one_work+0x228/0x4d0
       #1:  ((&(&wb->dwork)->work)){......}, at: [<ffffffff81085618>] process_one_work+0x228/0x4d0
       #2:  (jbd2_handle){......}, at: [<ffffffff81242622>] start_this_handle+0x712/0x7b0
       #3:  (&ei->i_data_sem){......}, at: [<ffffffff811fa387>] ext4_map_blocks+0x297/0x430
       #4:  (&(&bgl->locks[i].lock)->rlock){......}, at: [<ffffffff811f3180>] ext4_read_block_bitmap_nowait+0x5d0/0x630
      CPU: 3 PID: 137 Comm: kworker/u128:1 Not tainted 3.17.0-rc2-00184-g82752e4 #165
      Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
      Workqueue: writeback bdi_writeback_workfn (flush-1:0)
       0000000000000411 ffff880813777288 ffffffff815c7fdc ffff880813777288
       ffff880813a8bba0 ffff8808137772a8 ffffffff8108fb30 ffff880803e01e38
       ffff880803e01e38 ffff8808137772c8 ffffffff811a8d53 ffff88080ecc6000
      Call Trace:
       [<ffffffff815c7fdc>] dump_stack+0x51/0x6d
       [<ffffffff8108fb30>] __might_sleep+0xf0/0x100
       [<ffffffff811a8d53>] __sync_dirty_buffer+0x43/0xe0
       [<ffffffff811a8e03>] sync_dirty_buffer+0x13/0x20
       [<ffffffff8120f581>] ext4_commit_super+0x1d1/0x230
       [<ffffffff8120fa03>] save_error_info+0x23/0x30
       [<ffffffff8120fd06>] __ext4_error+0xb6/0xd0
       [<ffffffff8120f260>] ? ext4_group_desc_csum+0x140/0x190
       [<ffffffff811f2d8c>] ext4_read_block_bitmap_nowait+0x1dc/0x630
       [<ffffffff8122e23a>] ext4_mb_init_cache+0x21a/0x8f0
       [<ffffffff8113ae95>] ? lru_cache_add+0x55/0x60
       [<ffffffff8112e16c>] ? add_to_page_cache_lru+0x6c/0x80
       [<ffffffff8122eaa0>] ext4_mb_init_group+0x190/0x280
       [<ffffffff8122ec51>] ext4_mb_good_group+0xc1/0x190
       [<ffffffff8123309a>] ext4_mb_regular_allocator+0x17a/0x410
       [<ffffffff8122c821>] ? ext4_mb_use_preallocated+0x31/0x380
       [<ffffffff81233535>] ? ext4_mb_new_blocks+0x205/0x8e0
       [<ffffffff8116ed5c>] ? kmem_cache_alloc+0xfc/0x180
       [<ffffffff812335b0>] ext4_mb_new_blocks+0x280/0x8e0
       [<ffffffff8116f2c4>] ? __kmalloc+0x144/0x1c0
       [<ffffffff81221797>] ? ext4_find_extent+0x97/0x320
       [<ffffffff812257f4>] ext4_ext_map_blocks+0xbc4/0x1050
       [<ffffffff811fa387>] ? ext4_map_blocks+0x297/0x430
       [<ffffffff811fa3ab>] ext4_map_blocks+0x2bb/0x430
       [<ffffffff81200e43>] ? ext4_init_io_end+0x23/0x50
       [<ffffffff811feb44>] ext4_writepages+0x564/0xaf0
       [<ffffffff815cde3b>] ? _raw_spin_unlock+0x2b/0x40
       [<ffffffff810ac7bd>] ? lock_release_non_nested+0x2fd/0x3c0
       [<ffffffff811a009e>] ? writeback_sb_inodes+0x10e/0x490
       [<ffffffff811a009e>] ? writeback_sb_inodes+0x10e/0x490
       [<ffffffff811377e3>] do_writepages+0x23/0x40
       [<ffffffff8119c8ce>] __writeback_single_inode+0x9e/0x280
       [<ffffffff811a026b>] writeback_sb_inodes+0x2db/0x490
       [<ffffffff811a0664>] wb_writeback+0x174/0x2d0
       [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
       [<ffffffff811a0863>] wb_do_writeback+0xa3/0x200
       [<ffffffff811a0a40>] bdi_writeback_workfn+0x80/0x230
       [<ffffffff81085618>] ? process_one_work+0x228/0x4d0
       [<ffffffff810856cd>] process_one_work+0x2dd/0x4d0
       [<ffffffff81085618>] ? process_one_work+0x228/0x4d0
       [<ffffffff81085c1d>] worker_thread+0x35d/0x460
       [<ffffffff810858c0>] ? process_one_work+0x4d0/0x4d0
       [<ffffffff810858c0>] ? process_one_work+0x4d0/0x4d0
       [<ffffffff8108a885>] kthread+0xf5/0x100
       [<ffffffff810990e5>] ? local_clock+0x25/0x30
       [<ffffffff8108a790>] ? __init_kthread_worker+0x70/0x70
       [<ffffffff815ce2ac>] ret_from_fork+0x7c/0xb0
       [<ffffffff8108a790>] ? __init_kthread_work
      Signed-off-by: default avatarDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      f6770a16
    • Dmitry Monakhov's avatar
      ext4: Replace open coded mdata csum feature to helper function · 10e55cd4
      Dmitry Monakhov authored
      commit 9aa5d32b upstream.
      
      Besides the fact that this replacement improves code readability
      it also protects from errors caused direct EXT4_S(sb)->s_es manipulation
      which may result attempt to use uninitialized  csum machinery.
      
      #Testcase_BEGIN
      IMG=/dev/ram0
      MNT=/mnt
      mkfs.ext4 $IMG
      mount $IMG $MNT
      #Enable feature directly on disk, on mounted fs
      tune2fs -O metadata_csum  $IMG
      # Provoke metadata update, likey result in OOPS
      touch $MNT/test
      umount $MNT
      #Testcase_END
      
      # Replacement script
      @@
      expression E;
      @@
      - EXT4_HAS_RO_COMPAT_FEATURE(E, EXT4_FEATURE_RO_COMPAT_METADATA_CSUM)
      + ext4_has_metadata_csum(E)
      
      https://bugzilla.kernel.org/show_bug.cgi?id=82201Signed-off-by: default avatarDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      10e55cd4
    • Eric Sandeen's avatar
      ext4: fix reservation overflow in ext4_da_write_begin · 8035ac91
      Eric Sandeen authored
      commit 0ff8947f upstream.
      
      Delalloc write journal reservations only reserve 1 credit,
      to update the inode if necessary.  However, it may happen
      once in a filesystem's lifetime that a file will cross
      the 2G threshold, and require the LARGE_FILE feature to
      be set in the superblock as well, if it was not set already.
      
      This overruns the transaction reservation, and can be
      demonstrated simply on any ext4 filesystem without the LARGE_FILE
      feature already set:
      
      dd if=/dev/zero of=testfile bs=1 seek=2147483646 count=1 \
      	conv=notrunc of=testfile
      sync
      dd if=/dev/zero of=testfile bs=1 seek=2147483647 count=1 \
      	conv=notrunc of=testfile
      
      leads to:
      
      EXT4-fs: ext4_do_update_inode:4296: aborting transaction: error 28 in __ext4_handle_dirty_super
      EXT4-fs error (device loop0) in ext4_do_update_inode:4301: error 28
      EXT4-fs error (device loop0) in ext4_reserve_inode_write:4757: Readonly filesystem
      EXT4-fs error (device loop0) in ext4_dirty_inode:4876: error 28
      EXT4-fs error (device loop0) in ext4_da_write_end:2685: error 28
      
      Adjust the number of credits based on whether the flag is
      already set, and whether the current write may extend past the
      LARGE_FILE limit.
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarAndreas Dilger <adilger@dilger.ca>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      8035ac91
    • Theodore Ts'o's avatar
      ext4: add ext4_iget_normal() which is to be used for dir tree lookups · ba55c5e8
      Theodore Ts'o authored
      commit f4bb2981 upstream.
      
      If there is a corrupted file system which has directory entries that
      point at reserved, metadata inodes, prohibit them from being used by
      treating them the same way we treat Boot Loader inodes --- that is,
      mark them to be bad inodes.  This prohibits them from being opened,
      deleted, or modified via chmod, chown, utimes, etc.
      
      In particular, this prevents a corrupted file system which has a
      directory entry which points at the journal inode from being deleted
      and its blocks released, after which point Much Hilarity Ensues.
      Reported-by: default avatarSami Liedes <sami.liedes@iki.fi>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      ba55c5e8
    • Theodore Ts'o's avatar
      ext4: don't orphan or truncate the boot loader inode · e1956e85
      Theodore Ts'o authored
      commit e2bfb088 upstream.
      
      The boot loader inode (inode #5) should never be visible in the
      directory hierarchy, but it's possible if the file system is corrupted
      that there will be a directory entry that points at inode #5.  In
      order to avoid accidentally trashing it, when such a directory inode
      is opened, the inode will be marked as a bad inode, so that it's not
      possible to modify (or read) the inode from userspace.
      
      Unfortunately, when we unlink this (invalid/illegal) directory entry,
      we will put the bad inode on the ophan list, and then when try to
      unlink the directory, we don't actually remove the bad inode from the
      orphan list before freeing in-memory inode structure.  This means the
      in-memory orphan list is corrupted, leading to a kernel oops.
      
      In addition, avoid truncating a bad inode in ext4_destroy_inode(),
      since truncating the boot loader inode is not a smart thing to do.
      Reported-by: default avatarSami Liedes <sami.liedes@iki.fi>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      e1956e85
    • Nicholas Bellinger's avatar
      iser-target: Disable TX completion interrupt coalescing · faed75ef
      Nicholas Bellinger authored
      commit 0d0f660d upstream.
      
      This patch explicitly disables TX completion interrupt coalescing logic
      in isert_put_response() and isert_put_datain() that was originally added
      as an efficiency optimization in commit 95b60f07.
      
      It has been reported that this change can trigger ABORT_TASK timeouts
      under certain small block workloads, where disabling coalescing was
      required for stability.  According to Sagi, this doesn't impact
      overall performance, so go ahead and disable it for now.
      Reported-by: default avatarMoussa Ba <moussaba@micron.com>
      Reported-by: default avatarSagi Grimberg <sagig@dev.mellanox.co.il>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      faed75ef
    • Nicholas Bellinger's avatar
      target: Fix APTPL metadata handling for dynamic MappedLUNs · 7968b755
      Nicholas Bellinger authored
      commit e2480563 upstream.
      
      This patch fixes a bug in handling of SPC-3 PR Activate Persistence
      across Target Power Loss (APTPL) logic where re-creation of state for
      MappedLUNs from dynamically generated NodeACLs did not occur during
      I_T Nexus establishment.
      
      It adds the missing core_scsi3_check_aptpl_registration() call during
      core_tpg_check_initiator_node_acl() -> core_tpg_add_node_to_devs() in
      order to replay any pre-loaded APTPL metadata state associated with
      the newly connected SCSI Initiator Port.
      
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      7968b755
    • Joern Engel's avatar
      qla_target: don't delete changed nacls · 53f79633
      Joern Engel authored
      commit f4c24db1 upstream.
      
      The code is currently riddled with "drop the hardware_lock to avoid a
      deadlock" bugs that expose races.  One of those races seems to expose a
      valid warning in tcm_qla2xxx_clear_nacl_from_fcport_map.  Add some
      bandaid to it.
      Signed-off-by: default avatarJoern Engel <joern@logfs.org>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      53f79633
    • Dmitry Monakhov's avatar
      ext4: grab missed write_count for EXT4_IOC_SWAP_BOOT · 6f3df827
      Dmitry Monakhov authored
      commit 3e67cfad upstream.
      
      Otherwise this provokes complain like follows:
      WARNING: CPU: 12 PID: 5795 at fs/ext4/ext4_jbd2.c:48 ext4_journal_check_start+0x4e/0xa0()
      Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
      CPU: 12 PID: 5795 Comm: python Not tainted 3.17.0-rc2-00175-gae5344f #158
      Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
       0000000000000030 ffff8808116cfd28 ffffffff815c7dfc 0000000000000030
       0000000000000000 ffff8808116cfd68 ffffffff8106ce8c ffff8808116cfdc8
       ffff880813b16000 ffff880806ad6ae8 ffffffff81202008 0000000000000000
      Call Trace:
       [<ffffffff815c7dfc>] dump_stack+0x51/0x6d
       [<ffffffff8106ce8c>] warn_slowpath_common+0x8c/0xc0
       [<ffffffff81202008>] ? ext4_ioctl+0x9e8/0xeb0
       [<ffffffff8106ceda>] warn_slowpath_null+0x1a/0x20
       [<ffffffff8122867e>] ext4_journal_check_start+0x4e/0xa0
       [<ffffffff81228c10>] __ext4_journal_start_sb+0x90/0x110
       [<ffffffff81202008>] ext4_ioctl+0x9e8/0xeb0
       [<ffffffff8107b0bd>] ? ptrace_stop+0x24d/0x2f0
       [<ffffffff81088530>] ? alloc_pid+0x480/0x480
       [<ffffffff8107b1f2>] ? ptrace_do_notify+0x92/0xb0
       [<ffffffff81186545>] do_vfs_ioctl+0x4e5/0x550
       [<ffffffff815cdbcb>] ? _raw_spin_unlock_irq+0x2b/0x40
       [<ffffffff81186603>] SyS_ioctl+0x53/0x80
       [<ffffffff815ce2ce>] tracesys+0xd0/0xd5
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      6f3df827
    • Jan Kara's avatar
      ext4: fix mmap data corruption when blocksize < pagesize · 6c49ff6f
      Jan Kara authored
      commit d6320cbf upstream.
      
      Use truncate_isize_extended() when hole is being created in a file so that
      ->page_mkwrite() will get called for the partial tail page if it is
      mmaped (see the first patch in the series for details).
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      6c49ff6f
    • Jan Kara's avatar
      vfs: fix data corruption when blocksize < pagesize for mmaped data · c0552c08
      Jan Kara authored
      commit 90a80202 upstream.
      
      ->page_mkwrite() is used by filesystems to allocate blocks under a page
      which is becoming writeably mmapped in some process' address space. This
      allows a filesystem to return a page fault if there is not enough space
      available, user exceeds quota or similar problem happens, rather than
      silently discarding data later when writepage is called.
      
      However VFS fails to call ->page_mkwrite() in all the cases where
      filesystems need it when blocksize < pagesize. For example when
      blocksize = 1024, pagesize = 4096 the following is problematic:
        ftruncate(fd, 0);
        pwrite(fd, buf, 1024, 0);
        map = mmap(NULL, 1024, PROT_WRITE, MAP_SHARED, fd, 0);
        map[0] = 'a';       ----> page_mkwrite() for index 0 is called
        ftruncate(fd, 10000); /* or even pwrite(fd, buf, 1, 10000) */
        mremap(map, 1024, 10000, 0);
        map[4095] = 'a';    ----> no page_mkwrite() called
      
      At the moment ->page_mkwrite() is called, filesystem can allocate only
      one block for the page because i_size == 1024. Otherwise it would create
      blocks beyond i_size which is generally undesirable. But later at
      ->writepage() time, we also need to store data at offset 4095 but we
      don't have block allocated for it.
      
      This patch introduces a helper function filesystems can use to have
      ->page_mkwrite() called at all the necessary moments.
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      c0552c08
    • Quinn Tran's avatar
      target: Fix queue full status NULL pointer for SCF_TRANSPORT_TASK_SENSE · c1ab3b96
      Quinn Tran authored
      commit 082f58ac upstream.
      
      During temporary resource starvation at lower transport layer, command
      is placed on queue full retry path, which expose this problem.  The TCM
      queue full handling of SCF_TRANSPORT_TASK_SENSE currently sends the same
      cmd twice to lower layer.  The 1st time led to cmd normal free path.
      The 2nd time cause Null pointer access.
      
      This regression bug was originally introduced v3.1-rc code in the
      following commit:
      
      commit e057f533
      Author: Christoph Hellwig <hch@infradead.org>
      Date:   Mon Oct 17 13:56:41 2011 -0400
      
          target: remove the transport_qf_callback se_cmd callback
      Signed-off-by: default avatarQuinn Tran <quinn.tran@qlogic.com>
      Signed-off-by: default avatarSaurav Kashyap <saurav.kashyap@qlogic.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      c1ab3b96
    • Jan Kara's avatar
      ext4: don't check quota format when there are no quota files · a3b3f68d
      Jan Kara authored
      commit 279bf6d3 upstream.
      
      The check whether quota format is set even though there are no
      quota files with journalled quota is pointless and it actually
      makes it impossible to turn off journalled quotas (as there's
      no way to unset journalled quota format). Just remove the check.
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      a3b3f68d
    • Darrick J. Wong's avatar
      jbd2: free bh when descriptor block checksum fails · 817de83c
      Darrick J. Wong authored
      commit 064d8389 upstream.
      
      Free the buffer head if the journal descriptor block fails checksum
      verification.
      
      This is the jbd2 port of the e2fsprogs patch "e2fsck: free bh on csum
      verify error in do_one_pass".
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarEric Sandeen <sandeen@redhat.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      817de83c
    • Darrick J. Wong's avatar
      ext4: check EA value offset when loading · a43416db
      Darrick J. Wong authored
      commit a0626e75 upstream.
      
      When loading extended attributes, check each entry's value offset to
      make sure it doesn't collide with the entries.
      
      Without this check it is easy to crash the kernel by mounting a
      malicious FS containing a file with an EA wherein e_value_offs = 0 and
      e_value_size > 0 and then deleting the EA, which corrupts the name
      list.
      
      (See the f_ea_value_crash test's FS image in e2fsprogs for an example.)
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      a43416db
  2. 03 Nov, 2014 11 commits
  3. 30 Oct, 2014 14 commits