1. 19 May, 2014 6 commits
    • Merge branch 'xfs-misc-fixes-2-for-3.16' into for-next · 0d907a3b
      Dave Chinner authored
      Conflicts:
      	fs/xfs/xfs_ialloc.c
    • xfs: fix compile error when libxfs header used in C++ code · 376c2f3a
      Roger Willcocks authored
      xfs_ialloc.h:102: error: expected ',' or '...' before 'delete'
      
      Simple parameter rename, no changes to behaviour.
      Signed-off-by: Roger Willcocks <roger@filmlight.ltd.uk>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
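
The compile error above comes from `delete` being an ordinary identifier in C but a reserved keyword in C++, so a header that names a parameter `delete` only compiles as C. A minimal sketch of the rename pattern follows. The function `demo_free_inode` and the parameter name `deleted` are hypothetical and are not taken from the libxfs header:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical prototype for illustration only; not the libxfs declaration.
 *
 * A prototype such as
 *     int demo_free_inode(unsigned long ino, int *delete);
 * is valid C but is rejected by a C++ compiler, where `delete` is a keyword.
 * Renaming the parameter changes no behaviour.
 */
int demo_free_inode(unsigned long ino, int *deleted);

/* Toy body: reports "deleted" for every 64th inode number. */
int demo_free_inode(unsigned long ino, int *deleted)
{
	assert(deleted != NULL);
	*deleted = (ino % 64 == 0);
	return 0;
}
```

Because parameter names in a C prototype are not part of the function's type, callers are unaffected by such a rename.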
      
    • xfs: fix infinite loop at xfs_vm_writepage on 32bit system · 8695d27e
      Jie Liu authored
      Writing to a file at an offset greater than 16TB on a 32-bit system and
      then triggering page write-back via sync(1) causes the task to hang.
      
      # block_size=4096
      # offset=$(((2**32 - 1) * $block_size))
      # xfs_io -f -c "pwrite $offset $block_size" /storage/test_file
      # sync
      
      INFO: task sync:2590 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      sync            D c1064a28     0  2590   2097 0x00000000
      .....
      Call Trace:
      [<c1064a28>] ? ttwu_do_wakeup+0x18/0x130
      [<c1066d0e>] ? try_to_wake_up+0x1ce/0x220
      [<c1066dbf>] ? wake_up_process+0x1f/0x40
      [<c104fc2e>] ? wake_up_worker+0x1e/0x30
      [<c15b6083>] schedule+0x23/0x60
      [<c15b3c2d>] schedule_timeout+0x18d/0x1f0
      [<c12a143e>] ? do_raw_spin_unlock+0x4e/0x90
      [<c10515f1>] ? __queue_delayed_work+0x91/0x150
      [<c12a12ef>] ? do_raw_spin_lock+0x3f/0x100
      [<c12a143e>] ? do_raw_spin_unlock+0x4e/0x90
      [<c15b5b5d>] wait_for_completion+0x7d/0xc0
      [<c1066d60>] ? try_to_wake_up+0x220/0x220
      [<c116a4d2>] sync_inodes_sb+0x92/0x180
      [<c116fb05>] sync_inodes_one_sb+0x15/0x20
      [<c114a8f8>] iterate_supers+0xb8/0xc0
      [<c116faf0>] ? fdatawrite_one_bdev+0x20/0x20
      [<c116fc21>] sys_sync+0x31/0x80
      [<c15be18d>] sysenter_do_call+0x12/0x28
      
      This issue can be triggered via xfstests/generic/308.
      
      The reason is that end_index is an unsigned long with a maximum value
      of '2^32-1=4294967295' on 32-bit platforms, and the given offset causes
      it to wrap to 0, so the following code repeats again and again until
      the task times out:
      
      end_index = offset >> PAGE_CACHE_SHIFT;
      last_index = (offset - 1) >> PAGE_CACHE_SHIFT;
      if (page->index >= end_index) {
      	unsigned offset_into_page = offset & (PAGE_CACHE_SIZE - 1);
      	/*
      	 * Just skip the page if it is fully outside i_size, e.g. due
      	 * to a truncate operation that is in progress.
      	 */
      	if (page->index >= end_index + 1 || offset_into_page == 0) {
      	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      		unlock_page(page);
      		return 0;
      	}
      
      To check whether a page is fully outside i_size, the logic can be
      fixed as follows:
      	if (page->index > end_index ||
      	    (page->index == end_index && offset_into_page == 0))
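
The difference between the two checks can be sketched in user space. This is only an illustration under stated assumptions: `uint32_t` stands in for a 32-bit `unsigned long`, pages are assumed to be 4096 bytes, the file size is assumed to land in the last addressable page so that `end_index` is at its maximum and `end_index + 1` wraps to 0, and all `demo_*` names are hypothetical rather than kernel symbols:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumed 4096-byte pages */
#define DEMO_PAGE_MASK  ((UINT64_C(1) << DEMO_PAGE_SHIFT) - 1)

/* uint32_t stands in for a 32-bit unsigned long page index. */
typedef uint32_t demo_pgoff_t;

/* Original test: "end_index + 1" is computed in 32 bits and can wrap to 0. */
static bool demo_skip_old(demo_pgoff_t index, uint64_t isize)
{
	demo_pgoff_t end_index = (demo_pgoff_t)(isize >> DEMO_PAGE_SHIFT);
	unsigned int offset_into_page = (unsigned int)(isize & DEMO_PAGE_MASK);

	if (index < end_index)
		return false;
	return index >= (demo_pgoff_t)(end_index + 1) || offset_into_page == 0;
}

/* Rewritten test from the message: no "+ 1", so nothing can wrap. */
static bool demo_skip_fixed(demo_pgoff_t index, uint64_t isize)
{
	demo_pgoff_t end_index = (demo_pgoff_t)(isize >> DEMO_PAGE_SHIFT);
	unsigned int offset_into_page = (unsigned int)(isize & DEMO_PAGE_MASK);

	return index > end_index ||
	       (index == end_index && offset_into_page == 0);
}

/* Self-check: the last page straddles EOF and must not be skipped. */
void demo_selfcheck(void)
{
	uint64_t isize = (UINT64_C(1) << 44) - 1;	/* just under 16TB */

	assert(demo_skip_old(UINT32_MAX, isize));	/* wrongly skipped */
	assert(!demo_skip_fixed(UINT32_MAX, isize));	/* written out */
}
```

For sizes where nothing wraps, both helpers agree: a page wholly beyond the size is skipped, and a page that contains EOF is not.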
      
      Secondly, there is a similar issue when calculating the end offset
      for mapping the filesystem blocks to the file blocks for delalloc.
      With the same test as above, running umount(8) causes a kernel
      panic if CONFIG_XFS_DEBUG is enabled:
      
      XFS: Assertion failed: XFS_FORCED_SHUTDOWN(ip->i_mount) || \
      	ip->i_delayed_blks == 0, file: fs/xfs/xfs_super.c, line: 964
      
      kernel BUG at fs/xfs/xfs_message.c:108!
      invalid opcode: 0000 [#1] SMP
      task: edddc100 ti: ec6ee000 task.ti: ec6ee000
      EIP: 0060:[<f83d87cb>] EFLAGS: 00010296 CPU: 1
      EIP is at assfail+0x2b/0x30 [xfs]
      ..............
      Call Trace:
      [<f83d9cd4>] xfs_fs_destroy_inode+0x74/0x120 [xfs]
      [<c115ddf1>] destroy_inode+0x31/0x50
      [<c115deff>] evict+0xef/0x170
      [<c115dfb2>] dispose_list+0x32/0x40
      [<c115ea3a>] evict_inodes+0xca/0xe0
      [<c1149706>] generic_shutdown_super+0x46/0xd0
      [<c11497b9>] kill_block_super+0x29/0x70
      [<c1149a14>] deactivate_locked_super+0x44/0x70
      [<c114a427>] deactivate_super+0x47/0x60
      [<c1161c3d>] mntput_no_expire+0xcd/0x120
      [<c1162ae8>] SyS_umount+0xa8/0x370
      [<c1162dce>] SyS_oldumount+0x1e/0x20
      [<c15be18d>] sysenter_do_call+0x12/0x28
      
      This is because end_offset evaluates to 0 for the same reason as
      above, so the mapping and conversion of delalloc file blocks to
      filesystem blocks did not happen.
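
How an end offset can evaluate to 0 is sketched below, assuming it is derived from "page index + 1" computed in 32 bits before being shifted into a byte offset. The message does not show that expression, so this is an assumption; `uint32_t` again stands in for a 32-bit `unsigned long`, and the `demo_*` helpers are hypothetical. Widening before the addition is one way to avoid the wrap and is not necessarily what the patch does:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumed 4096-byte pages */

static uint64_t demo_min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* "index + 1" wraps to 0 for the last 32-bit index, so the result is 0. */
static uint64_t demo_end_offset_wrapping(uint32_t index, uint64_t isize)
{
	uint32_t next = (uint32_t)(index + 1);

	return demo_min_u64((uint64_t)next << DEMO_PAGE_SHIFT, isize);
}

/* Widening to 64 bits before adding keeps the page end intact. */
static uint64_t demo_end_offset_widened(uint32_t index, uint64_t isize)
{
	uint64_t next = (uint64_t)index + 1;

	return demo_min_u64(next << DEMO_PAGE_SHIFT, isize);
}

/* Self-check for the last addressable page of a file just under 16TB. */
void demo_end_offset_selfcheck(void)
{
	uint64_t isize = (UINT64_C(1) << 44) - 1;

	assert(demo_end_offset_wrapping(UINT32_MAX, isize) == 0);
	assert(demo_end_offset_widened(UINT32_MAX, isize) == isize);
}
```

With an end offset of 0 there is no range left to map, which is consistent with the delayed-allocation blocks still being present at unmount.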
      
      This patch fixes both issues.
      Reported-by: Michael L. Semon <mlsemon35@gmail.com>
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
    • xfs: remove redundant checks from xfs_da_read_buf · 7c166350
      Dave Chinner authored
      All of the verification checks of magic numbers are now done by
      verifiers, so there is no need to check them again once the buffer
      has been successfully read. If the magic number is bad, it won't
      even get to that code to verify it so it really serves no purpose at
      all anymore. Remove it.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
    • xfs: log vector rounding leaks log space · 110dc24a
      Dave Chinner authored
      The addition of direct formatting of log items into the CIL
      linear buffer added alignment restrictions that the start of each
      vector needed to be 64 bit aligned. Hence padding was added in
      xlog_finish_iovec() to round up the vector length to ensure the next
      vector started with the correct alignment.
      
      This adds a small number of bytes to the size of
      the linear buffer that is otherwise unused. The issue is that we
      then use the linear buffer size to determine the log space used by
      the log item, and this includes the unused space. Hence when we
      account for space used by the log item, it's more than is actually
      written into the iclogs, and hence we slowly leak this space.
      
      This results in log hangs when reserving space, with threads getting
      stuck with these stack traces:
      
      Call Trace:
      [<ffffffff81d15989>] schedule+0x29/0x70
      [<ffffffff8150d3a2>] xlog_grant_head_wait+0xa2/0x1a0
      [<ffffffff8150d55d>] xlog_grant_head_check+0xbd/0x140
      [<ffffffff8150ee33>] xfs_log_reserve+0x103/0x220
      [<ffffffff814b7f05>] xfs_trans_reserve+0x2f5/0x310
      .....
      
      The 4 bytes is significant. Brian Foster did all the hard work in
      tracking down a reproducible leak to inode chunk allocation (it went
      away with the ikeep mount option). His rough numbers were that
      creating 50,000 inodes leaked 11 log blocks. This turns out to be
      roughly 800 inode chunks or 1600 inode cluster buffers. That
      works out at roughly 4 bytes per cluster buffer logged, and at that
      I started looking for a 4 byte leak in the buffer logging code.
      
      What I found was that a struct xfs_buf_log_format structure for an
      inode cluster buffer is 28 bytes in length. This gets rounded up to
      32 bytes, but the vector length remains 28 bytes. Hence the CIL
      ticket reservation is decremented by 32 bytes (via lv->lv_buf_len)
      for that vector rather than 28 bytes which are written into the log.
      
      The fix for this problem is to separately track the bytes used by
      the log vectors in the item and use that instead of the buffer
      length when accounting for the log space that will be used by the
      formatted log item.
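
The accounting change described above can be sketched as follows. This is an illustration, not the kernel implementation: `struct demo_log_vec`, its fields, and `demo_finish_iovec` are hypothetical stand-ins, and 8-byte (64-bit) alignment is assumed from the description:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ALIGN 8	/* 64-bit alignment for the start of each vector */

/* Hypothetical stand-in for a log vector's bookkeeping. */
struct demo_log_vec {
	size_t buf_len;	/* linear buffer space consumed, padding included */
	size_t bytes;	/* payload bytes that actually reach the log */
};

static size_t demo_round_up(size_t len)
{
	return (len + DEMO_ALIGN - 1) & ~(size_t)(DEMO_ALIGN - 1);
}

/* Record one formatted region: pad the buffer, count the payload apart. */
static void demo_finish_iovec(struct demo_log_vec *lv, size_t len)
{
	assert(lv != NULL);
	lv->buf_len += demo_round_up(len);	/* keeps next vector aligned */
	lv->bytes += len;			/* what log accounting should use */
}

/* Bytes leaked if accounting used buf_len instead of bytes. */
size_t demo_leak(size_t region_len, size_t count)
{
	struct demo_log_vec lv = { 0, 0 };
	size_t i;

	for (i = 0; i < count; i++)
		demo_finish_iovec(&lv, region_len);
	return lv.buf_len - lv.bytes;
}
```

With a 28-byte region, `demo_leak(28, 1)` is 4, matching the 4-byte discrepancy per cluster buffer described in the message. Charging `bytes` rather than `buf_len` keeps the reservation in step with what is written.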
      
      Again, thanks to Brian Foster for doing all the hard work and long
      hours to isolate this leak and make finding the bug relatively
      simple.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
    • xfs: remove XFS_TRANS_RESERVE in collapse range · ce576f1c
      Namjae Jeon authored
      There is no need to dip into the reserve pool. The reserve pool is used
      for much more important things. And xfs_trans_reserve will never return
      ENOSPC because punch hole is already done. If we get ENOSPC, collapse
      range will simply fail.
      
      Cc: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
      Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
  2. 14 May, 2014 14 commits
  3. 13 May, 2014 5 commits
  4. 09 May, 2014 6 commits
  5. 08 May, 2014 9 commits