1. 21 Jun, 2019 2 commits
    • Eric Biggers's avatar
      f2fs: separate f2fs i_flags from fs_flags and ext4 i_flags · 36098557
      Eric Biggers authored
      f2fs copied all the on-disk i_flags from ext4, and along with it the
      assumption that the on-disk i_flags are the same as the bits used by
      FS_IOC_GETFLAGS and FS_IOC_SETFLAGS.  This is problematic because
      reserving an on-disk inode flag in either filesystem's i_flags or in
      these ioctls effectively reserves it in all the other places too.  In
      fact, most of the "f2fs i_flags" are not used by f2fs at all.
      
      Fix this by separating f2fs's i_flags from the ioctl bits and ext4's
      i_flags.
      
      In the process, un-reserve all "f2fs i_flags" that aren't actually
      supported by f2fs.  This included various flags that were not settable
      at all, as well as various flags that were settable by FS_IOC_SETFLAGS
      but didn't actually do anything.
      
      There's a slight chance we'll need to add some flag(s) back to
      FS_IOC_SETFLAGS in order to avoid breaking users who expect f2fs to
      accept some random flag(s).  But hopefully such users don't exist.
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      36098557
    • Kimberly Brown's avatar
      f2fs: replace ktype default_attrs with default_groups · 176ef3c4
      Kimberly Brown authored
      The kobj_type default_attrs field is being replaced by the
      default_groups field. Replace the default_attrs fields in f2fs_sb_ktype
      and f2fs_feat_ktype with default_groups. Use the ATTRIBUTE_GROUPS macro
      to create f2fs_groups and f2fs_feat_groups.
      
      Fixes: fef4129e ("f2fs: fix to be aware discard/preflush/dio command in is_idle()")
      Signed-off-by: default avatarKimberly Brown <kimbrownkd@gmail.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      176ef3c4
  2. 03 Jun, 2019 4 commits
    • Daniel Rosenberg's avatar
      f2fs: Add option to limit required GC for checkpoint=disable · 4d3aed70
      Daniel Rosenberg authored
      This extends the checkpoint option to allow checkpoint=disable:%u[%]
      This allows you to specify what how much of the disk you are willing
      to lose access to while mounting with checkpoint=disable. If the amount
      lost would be higher, the mount will return -EAGAIN. This can be given
      as a percent of total space, or in blocks.
      
      Currently, we need to run garbage collection until the amount of holes
      is smaller than the OVP space. With the new option, f2fs can mark
      space as unusable up front instead of requiring garbage collection until
      the number of holes is small enough.
      Signed-off-by: default avatarDaniel Rosenberg <drosen@google.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      4d3aed70
    • Daniel Rosenberg's avatar
      f2fs: Fix accounting for unusable blocks · a4c3ecaa
      Daniel Rosenberg authored
      Fixes possible underflows when dealing with unusable blocks.
      Signed-off-by: default avatarDaniel Rosenberg <drosen@google.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      a4c3ecaa
    • Daniel Rosenberg's avatar
      f2fs: Fix root reserved on remount · 9a9aecaa
      Daniel Rosenberg authored
      On a remount, you can currently set root reserved if it was not
      previously set. This can cause an underflow if reserved has been set to
      a very high value, since then root reserved + current reserved could be
      greater than user_block_count. inc_valid_block_count later subtracts out
      these values from user_block_count, causing an underflow.
      Signed-off-by: default avatarDaniel Rosenberg <drosen@google.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      9a9aecaa
    • Daniel Rosenberg's avatar
      f2fs: Lower threshold for disable_cp_again · ae4ad7ea
      Daniel Rosenberg authored
      The existing threshold for allowable holes at checkpoint=disable time is
      too high. The OVP space contains reserved segments, which are always in
      the form of free segments. These must be subtracted from the OVP value.
      
      The current threshold is meant to be the maximum value of holes of a
      single type we can have and still guarantee that we can fill the disk
      without failing to find space for a block of a given type.
      
      If the disk is full, ignoring current reserved, which only helps us,
      the amount of unused blocks is equal to the OVP area. Of that, there
      are reserved segments, which must be free segments, and the rest of the
      ovp area, which can come from either free segments or holes. The maximum
      possible amount of holes is OVP-reserved.
      
      Now, consider the disk when mounting with checkpoint=disable.
      We must be able to fill all available free space with either data or
      node blocks. When we start with checkpoint=disable, holes are locked to
      their current type. Say we have H of one type of hole, and H+X of the
      other. We can fill H of that space with arbitrary typed blocks via SSR.
      For the remaining H+X blocks, we may not have any of a given block type
      left at all. For instance, if we were to fill the disk entirely with
      blocks of the type with fewer holes, the H+X blocks of the opposite type
      would not be used. If H+X > OVP-reserved, there would be more holes than
      could possibly exist, and we would have failed to find a suitable block
      earlier on, leading to a crash in update_sit_entry.
      
      If H+X <= OVP-reserved, then the holes end up effectively masked by the OVP
      region in this case.
      Signed-off-by: default avatarDaniel Rosenberg <drosen@google.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      ae4ad7ea
  3. 30 May, 2019 5 commits
  4. 23 May, 2019 6 commits
    • Chao Yu's avatar
      f2fs: fix to avoid deadloop if data_flush is on · 040d2bb3
      Chao Yu authored
      As Hagbard Celine reported:
      
      [  615.697824] INFO: task kworker/u16:5:344 blocked for more than 120 seconds.
      [  615.697825]       Not tainted 5.0.15-gentoo-f2fslog #4
      [  615.697826] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
      disables this message.
      [  615.697827] kworker/u16:5   D    0   344      2 0x80000000
      [  615.697831] Workqueue: writeback wb_workfn (flush-259:0)
      [  615.697832] Call Trace:
      [  615.697836]  ? __schedule+0x2c5/0x8b0
      [  615.697839]  schedule+0x32/0x80
      [  615.697841]  schedule_preempt_disabled+0x14/0x20
      [  615.697842]  __mutex_lock.isra.8+0x2ba/0x4d0
      [  615.697845]  ? log_store+0xf5/0x260
      [  615.697848]  f2fs_write_data_pages+0x133/0x320
      [  615.697851]  ? trace_hardirqs_on+0x2c/0xe0
      [  615.697854]  do_writepages+0x41/0xd0
      [  615.697857]  __filemap_fdatawrite_range+0x81/0xb0
      [  615.697859]  f2fs_sync_dirty_inodes+0x1dd/0x200
      [  615.697861]  f2fs_balance_fs_bg+0x2a7/0x2c0
      [  615.697863]  ? up_read+0x5/0x20
      [  615.697865]  ? f2fs_do_write_data_page+0x2cb/0x940
      [  615.697867]  f2fs_balance_fs+0xe5/0x2c0
      [  615.697869]  __write_data_page+0x1c8/0x6e0
      [  615.697873]  f2fs_write_cache_pages+0x1e0/0x450
      [  615.697878]  f2fs_write_data_pages+0x14b/0x320
      [  615.697880]  ? trace_hardirqs_on+0x2c/0xe0
      [  615.697883]  do_writepages+0x41/0xd0
      [  615.697885]  __filemap_fdatawrite_range+0x81/0xb0
      [  615.697887]  f2fs_sync_dirty_inodes+0x1dd/0x200
      [  615.697889]  f2fs_balance_fs_bg+0x2a7/0x2c0
      [  615.697891]  f2fs_write_node_pages+0x51/0x220
      [  615.697894]  do_writepages+0x41/0xd0
      [  615.697897]  __writeback_single_inode+0x3d/0x3d0
      [  615.697899]  writeback_sb_inodes+0x1e8/0x410
      [  615.697902]  __writeback_inodes_wb+0x5d/0xb0
      [  615.697904]  wb_writeback+0x28f/0x340
      [  615.697906]  ? cpumask_next+0x16/0x20
      [  615.697908]  wb_workfn+0x33e/0x420
      [  615.697911]  process_one_work+0x1a1/0x3d0
      [  615.697913]  worker_thread+0x30/0x380
      [  615.697915]  ? process_one_work+0x3d0/0x3d0
      [  615.697916]  kthread+0x116/0x130
      [  615.697918]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  615.697921]  ret_from_fork+0x3a/0x50
      
      There is still deadloop in below condition:
      
      d A
      - do_writepages
       - f2fs_write_node_pages
        - f2fs_balance_fs_bg
         - f2fs_sync_dirty_inodes
          - f2fs_write_cache_pages
           - mutex_lock(&sbi->writepages)	-- lock once
           - __write_data_page
            - f2fs_balance_fs_bg
             - f2fs_sync_dirty_inodes
              - f2fs_write_data_pages
               - mutex_lock(&sbi->writepages)	-- lock again
      
      Thread A			Thread B
      - do_writepages
       - f2fs_write_node_pages
        - f2fs_balance_fs_bg
         - f2fs_sync_dirty_inodes
          - .cp_task = current
      				- f2fs_sync_dirty_inodes
      				 - .cp_task = current
      				 - filemap_fdatawrite
      				 - .cp_task = NULL
          - filemap_fdatawrite
           - f2fs_write_cache_pages
            - enter f2fs_balance_fs_bg since .cp_task is NULL
          - .cp_task = NULL
      
      Change as below to avoid this:
      - add condition to avoid holding .writepages mutex lock in path
      of data flush
      - introduce mutex lock sbi.flush_lock to exclude concurrent data
      flush in background.
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      040d2bb3
    • Park Ju Hyung's avatar
      f2fs: always assume that the device is idle under gc_urgent · f7dfd9f3
      Park Ju Hyung authored
      This allows more aggressive discards and balancing job to be done
      under gc_urgent.
      Signed-off-by: default avatarPark Ju Hyung <qkrwngud825@gmail.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      f7dfd9f3
    • Chao Yu's avatar
      f2fs: add bio cache for IPU · 8648de2c
      Chao Yu authored
      SQLite in Wal mode may trigger sequential IPU write in db-wal file, after
      commit d1b3e72d ("f2fs: submit bio of in-place-update pages"), we
      lost the chance of merging page in inner managed bio cache, result in
      submitting more small-sized IO.
      
      So let's add temporary bio in writepages() to cache mergeable write IO as
      much as possible.
      
      Test case:
      1. xfs_io -f /mnt/f2fs/file -c "pwrite 0 65536" -c "fsync"
      2. xfs_io -f /mnt/f2fs/file -c "pwrite 0 65536" -c "fsync"
      
      Before:
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65544, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65552, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65560, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65568, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65576, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65584, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65592, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65600, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65608, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65616, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65624, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65632, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65640, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65648, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65656, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65664, size = 4096
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), NODE, sector = 57352, size = 4096
      
      After:
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65544, size = 65536
      f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), NODE, sector = 57368, size = 4096
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      8648de2c
    • Jaegeuk Kim's avatar
      f2fs: allow ssr block allocation during checkpoint=disable period · 49dd883c
      Jaegeuk Kim authored
      This patch allows to use ssr during checkpoint is disabled.
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      49dd883c
    • Chao Yu's avatar
      f2fs: fix to check layout on last valid checkpoint park · 5dae2d39
      Chao Yu authored
      As Ju Hyung reported:
      
      "
      I was semi-forced today to use the new kernel and test f2fs.
      
      My Ubuntu initramfs got a bit wonky and I had to boot into live CD and
      fix some stuffs. The live CD was using 4.15 kernel, and just mounting
      the f2fs partition there corrupted f2fs and my 4.19(with 5.1-rc1-4.19
      f2fs-stable merged) refused to mount with "SIT is corrupted node"
      message.
      
      I used the latest f2fs-tools sent by Chao including "fsck.f2fs: fix to
      repair cp_loads blocks at correct position"
      
      It spit out 140M worth of output, but at least I didn't have to run it
      twice. Everything returned "Ok" in the 2nd run.
      The new log is at
      http://arter97.com/f2fs/final
      
      After fixing the image, I used my 4.19 kernel with 5.2-rc1-4.19
      f2fs-stable merged and it mounted.
      
      But, I got this:
      [    1.047791] F2FS-fs (nvme0n1p3): layout of large_nat_bitmap is
      deprecated, run fsck to repair, chksum_offset: 4092
      [    1.081307] F2FS-fs (nvme0n1p3): Found nat_bits in checkpoint
      [    1.161520] F2FS-fs (nvme0n1p3): recover fsync data on readonly fs
      [    1.162418] F2FS-fs (nvme0n1p3): Mounted with checkpoint version = 761c7e00
      
      But after doing a reboot, the message is gone:
      [    1.098423] F2FS-fs (nvme0n1p3): Found nat_bits in checkpoint
      [    1.177771] F2FS-fs (nvme0n1p3): recover fsync data on readonly fs
      [    1.178365] F2FS-fs (nvme0n1p3): Mounted with checkpoint version = 761c7eda
      
      I'm not exactly sure why the kernel detected that I'm still using the
      old layout on the first boot. Maybe fsck didn't fix it properly, or
      the check from the kernel is improper.
      "
      
      Although we have rebuild the old deprecated checkpoint with new layout
      during repair, we only repair last checkpoint park, the other old one is
      remained.
      
      Once the image was mounted, we will 1) sanity check layout and 2) decide
      which checkpoint park to use according to cp_ver. So that we will print
      reported message unnecessarily at step 1), to avoid it, we simply move
      layout check into f2fs_sanity_check_ckpt() after step 2).
      Reported-by: default avatarPark Ju Hyung <qkrwngud825@gmail.com>
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      5dae2d39
    • Jaegeuk Kim's avatar
      f2fs: link f2fs quota ops for sysfile · bc88ac96
      Jaegeuk Kim authored
      This patch reverts:
      commit fb40d618 ("f2fs: don't clear CP_QUOTA_NEED_FSCK_FLAG").
      
      We were missing error handlers used in f2fs quota ops.
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      bc88ac96
  5. 14 May, 2019 23 commits