1. 27 Feb, 2017 4 commits
  2. 24 Feb, 2017 6 commits
  3. 23 Feb, 2017 30 commits
    • f2fs: add ovp valid_blocks check for bg gc victim to fg_gc · e93b9865
      Hou Pengyang authored
      Foreground GC should adopt the greedy algorithm, which makes this
      formula work well:
      
      	(2 * (100 / config.overprovision + 1) + 6)
      
      Currently, however, fg_gc gives priority to bg_gc victim segments and
      collects them first. Those victims are selected by the cost-benefit algorithm,
      so we can't guarantee that they hold few valid blocks. This may break the f2fs
      rule and, in the worst case, consume all the free segments.
      
      This patch fixes this by adding a filter in check_bg_victims: if a segment's
      number of valid blocks exceeds the overprovision ratio, the segment is skipped.
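
As a worked example of the formula above, the sketch below evaluates it for one overprovision value and applies a simple valid-block filter. The names `reserved_units` and `skip_bg_victim` and the `threshold` parameter are hypothetical, and treating `config.overprovision` as a percentage held in a double is an assumption; this is not the code from the patch.

```c
#include <stdbool.h>

/* Evaluates (2 * (100 / overprovision + 1) + 6) from the commit message.
 * Assumption: overprovision is a percentage such as 5.0. */
static unsigned int reserved_units(double overprovision)
{
	return (unsigned int)(2 * (100 / overprovision + 1) + 6);
}

/* Hypothetical filter: a background-GC victim is skipped by foreground GC
 * when it holds more valid blocks than the given threshold. */
static bool skip_bg_victim(unsigned int valid_blocks, unsigned int threshold)
{
	return valid_blocks > threshold;
}
```

With an overprovision of 5.0, the formula gives 2 * (20 + 1) + 6 = 48.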
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: do not wait for writeback in write_begin · 86d54795
      Jaegeuk Kim authored
      Otherwise we can hit a livelock like the one below.
      
      [79880.428136] dbench          D    0 18405  18404 0x00000000
      [79880.428139] Call Trace:
      [79880.428142]  __schedule+0x219/0x6b0
      [79880.428144]  schedule+0x36/0x80
      [79880.428147]  schedule_timeout+0x243/0x2e0
      [79880.428152]  ? update_sd_lb_stats+0x16b/0x5f0
      [79880.428155]  ? ktime_get+0x3c/0xb0
      [79880.428157]  io_schedule_timeout+0xa6/0x110
      [79880.428161]  __lock_page+0xf7/0x130
      [79880.428164]  ? unlock_page+0x30/0x30
      [79880.428167]  pagecache_get_page+0x16b/0x250
      [79880.428171]  grab_cache_page_write_begin+0x20/0x40
      [79880.428182]  f2fs_write_begin+0xa2/0xdb0 [f2fs]
      [79880.428192]  ? f2fs_mark_inode_dirty_sync+0x16/0x30 [f2fs]
      [79880.428197]  ? kmem_cache_free+0x79/0x200
      [79880.428203]  ? __mark_inode_dirty+0x17f/0x360
      [79880.428206]  generic_perform_write+0xbb/0x190
      [79880.428213]  ? file_update_time+0xa4/0xf0
      [79880.428217]  __generic_file_write_iter+0x19b/0x1e0
      [79880.428226]  f2fs_file_write_iter+0x9c/0x180 [f2fs]
      [79880.428231]  __vfs_write+0xc5/0x140
      [79880.428235]  vfs_write+0xb2/0x1b0
      [79880.428238]  SyS_write+0x46/0xa0
      [79880.428242]  entry_SYSCALL_64_fastpath+0x1e/0xad
      
      Fixes: cae96a5c8ab6 ("f2fs: check io submission more precisely")
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: replace __get_victim by dirty_segments in FG_GC · 05eeb118
      Yunlei He authored
      The FG_GC process searches for a victim section twice. As a result,
      some dirty sections with fewer valid blocks are skipped by garbage
      collection.
      
      section # 26425 : valid blocks # 3
      142.037567: get_victim_by_default: victim 26425 : valid blocks # 3
      142.037585: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26425 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 244
      142.039494: f2fs_get_victim: dev = (259,30), type = Hot DATA, policy = (Background GC, SSR-mode, Greedy), victim = 19022 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 24
      142.070247: new_curseg: Debug: alloc new segment 26746
      142.244341: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26054 ofs_unit = 1, pre_victim_secno = 26054, prefree = 0, free = 243
      142.254475: do_garbage_collect: Debug: FG_GC, seg_freed = 1
      142.293131: f2fs_get_victim: dev = (259,30), type = Warm DATA, policy = (Background GC, SSR-mode, Greedy), victim = 23466 ofs_unit = 1, pre_victim_secno = -1, prefree = 0, free = 244
      142.319001: f2fs_get_victim: dev = (259,30), type = Warm DATA, policy = (Background GC, SSR-mode, Greedy), victim = 23467 ofs_unit = 1, pre_victim_secno = -1, prefree = 0, free = 244
      142.368879: get_victim_by_default: victim 26425 : valid blocks # 3
      142.368894: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26425 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 244
      142.378127: f2fs_get_victim: dev = (259,30), type = Hot DATA, policy = (Background GC, SSR-mode, Greedy), victim = 19612 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 24
      142.416917: new_curseg: Debug: alloc new segment 26054
      142.656794: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 25404 ofs_unit = 1, pre_victim_secno = 25404, prefree = 0, free = 243
      142.662139: do_garbage_collect: Debug: FG_GC, seg_freed = 1
      142.684159: new_curseg: Debug: alloc new segment 25197
      142.685059: get_victim_by_default: victim 26425 : valid blocks # 3
      142.685079: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26425 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 243
      142.701427: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26238 ofs_unit = 1, pre_victim_secno = 26238, prefree = 0, free = 243
      142.707105: do_garbage_collect: Debug: FG_GC, seg_freed = 1
      142.802444: f2fs_get_victim: dev = (259,30), type = Warm DATA, policy = (Background GC, SSR-mode, Greedy), victim = 23473 ofs_unit = 1, pre_victim_secno = -1, prefree = 0, free = 244
      142.804422: get_victim_by_default: victim 26425 : valid blocks # 3
      142.804443: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26425 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 244
      142.851567: f2fs_get_victim: dev = (259,30), type = Hot DATA, policy = (Background GC, SSR-mode, Greedy), victim = 19092 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 24
      142.865014: new_curseg: Debug: alloc new segment 26238
      143.082245: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26307 ofs_unit = 1, pre_victim_secno = 26307, prefree = 0, free = 244
      143.088252: do_garbage_collect: Debug: FG_GC, seg_freed = 1
      143.128307: new_curseg: Debug: alloc new segment 25404
      143.181846: get_victim_by_default: victim 26425 : valid blocks # 3
      143.181872: f2fs_get_victim: dev = (259,30), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 26425 ofs_unit = 1, pre_victim_secno = 26425, prefree = 0, free = 244
      
      Signed-off-by: Yunlei He <heyunlei@huawei.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: trace victim's cost selectecd by f2fs_gc · 5012de20
      Jaegeuk Kim authored
      This patch adds the min_cost of each victim to the trace output.
      
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix multiple f2fs_add_link() calls having same name · 88c5c13a
      Jaegeuk Kim authored
      It turns out that a stackable filesystem like sdcardfs in AOSP can trigger multiple
      vfs_create() calls to the lower filesystem. In that case, f2fs adds multiple dentries
      with the same name, which breaks filesystem consistency.
      
      Until the upper layer is fixed, let's work around this in f2fs; the workaround
      shows no significant performance regression.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: show actual device info in tracepoints · d50aaeec
      Jaegeuk Kim authored
      This patch shows actual device information in the tracepoints.
      
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: use SSR for warm node as well · 5b6c6be2
      Jaegeuk Kim authored
      We have had node chains but have not used them so far because of stale node blocks.
      Now that we have crc|cp_ver in the node footer and assign a random cp_ver at format
      time, we can start to use them again.
      
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: enable inline_xattr by default · 39133a50
      Chao Yu authored
      On Android, since SELinux is enabled, a security policy is applied to
      each file and stored in the inode as an xattr entry, so each file takes
      one additional 4k node block.
      
      Let's enable inline_xattr by default in order to save storage space.
      
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce noinline_xattr mount option · 23cf7212
      Chao Yu authored
      This patch introduces a new mount option, 'noinline_xattr', so that we can disable
      the inline xattr functionality, which is already set as a default mount option.
      
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid reading NAT page by get_node_info · 25cc5d3b
      Jaegeuk Kim authored
      We've not seen this buggy case for a long time, so it's time to avoid this
      unnecessary get_node_info() call, which reads a NAT page to cache the nat entry.
      
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: remove build_free_nids() during checkpoint · 9b064f7d
      Jaegeuk Kim authored
      Let's avoid build_free_nids() in the checkpoint path.
      
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: change recovery policy of xattr node block · d260081c
      Chao Yu authored
      Currently, if we call fsync after updating xattr data belonging to a
      file, f2fs needs to trigger a checkpoint to keep the xattr data consistent.
      However, this policy performs poorly, since a checkpoint blocks most foreground
      operations and causes unneeded and unrelated IOs around the checkpoint.
      
      This patch reuses the regular file recovery policy for the xattr node block:
      we write an xattr node block tagged with the fsync flag to the warm
      area instead of the cold area, and during recovery we search the warm node
      chain for fsynced xattr blocks and recover them.
      
      So, for the application IO pattern below, performance improves
      noticeably:
      - touch file
      - create/update/delete xattr entry in file
      - fsync file
      
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: super: constify fscrypt_operations structure · 2ad0ef84
      Bhumika Goyal authored
      Declare the fscrypt_operations structure as const, as it is only stored in
      the s_cop field of a super_block structure. This field is of type const,
      so an fscrypt_operations structure with this property can be made const
      too.
      
      File size before: fs/f2fs/super.o
         text	   data	    bss	    dec	    hex	filename
        54131	  31355	    184	  85670	  14ea6	fs/f2fs/super.o
      
      File size after: fs/f2fs/super.o
         text	   data	    bss	    dec	    hex	filename
        54227	  31259	    184	  85670	  14ea6	fs/f2fs/super.o
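
The idea can be illustrated with a small stand-alone sketch: an operations table that is only ever reached through a pointer-to-const can itself be declared const, which lets the toolchain place it in read-only data. The `demo_ops` and `demo_sb` types below are hypothetical stand-ins, not the kernel's `fscrypt_operations` or `super_block`.

```c
/* Hypothetical operations table, standing in for fscrypt_operations. */
struct demo_ops {
	int (*max_namelen)(void);
};

/* Hypothetical holder, standing in for super_block; like s_cop, the
 * field is a pointer to const. */
struct demo_sb {
	const struct demo_ops *s_cop;
};

static int demo_max_namelen(void)
{
	return 255;
}

/* Never modified after initialization, so it can be declared const. */
static const struct demo_ops demo_cryptops = {
	.max_namelen = demo_max_namelen,
};

/* Callers only read through the const pointer. */
static int query_namelen(const struct demo_sb *sb)
{
	return sb->s_cop->max_namelen();
}
```

In the size listing above, the 96 bytes that leave `data` show up in `text`, which is consistent with the structure moving to a read-only section.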
      
      Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: show checkpoint version at mount time · 1200abb2
      Jaegeuk Kim authored
      If f2fs is mounted successfully, let's show the current checkpoint version.
      
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix a typo in f2fs.txt · 6de3f12e
      Tiezhu Yang authored
      There is a typo "f2f2" in f2fs.txt; this patch fixes it.
      
      Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: remove preflush for nobarrier case · 7f54f51f
      Jaegeuk Kim authored
      This patch removes REQ_PREFLUSH in the nobarrier case.
      
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: check last page index in cached bio to decide submission · 942fd319
      Jaegeuk Kim authored
      If the cached bio has the last page's index, then we need to submit it.
      Otherwise, we don't need to submit it and can wait for further IO merges.
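
The decision described above can be sketched as a predicate over the cached bio. The `cached_bio` structure and the `should_submit` helper are hypothetical simplifications (a bio is modeled as a plain array of page indices), not f2fs code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a cached bio: just the indices of its pages. */
struct cached_bio {
	const unsigned long *page_index;
	size_t nr_pages;
};

/* Submit only if the cached bio holds the page with the last index;
 * otherwise keep it cached so that later IOs can still be merged. */
static bool should_submit(const struct cached_bio *bio, unsigned long last_index)
{
	for (size_t i = 0; i < bio->nr_pages; i++) {
		if (bio->page_index[i] == last_index)
			return true;
	}
	return false;
}
```

A bio caching pages 4, 5 and 6 would be submitted when the last page index is 6 and left cached when it is 7.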
      
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: check io submission more precisely · d68f735b
      Jaegeuk Kim authored
      This patch checks IO submission more precisely than the previous rough check.
      
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: call internal __write_data_page directly · f566bae8
      Jaegeuk Kim authored
      This patch introduces __write_data_page so that f2fs_write_cache_pages can
      call it directly.
      
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid out-of-order execution of atomic writes · e7c75ab0
      Jaegeuk Kim authored
      We need to flush data writes before flushing the last node block write by using
      FUA with PREFLUSH. We don't need to guarantee the preceding node writes, since if
      those are not written, we can't reach the last node block when scanning the
      node block chain during roll-forward recovery.
      Afterwards, f2fs_wait_on_page_writeback guarantees that all the IO is submitted
      to disk, which builds a valid node block chain.
      
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: move write_node_page above fsync_node_pages · faa24895
      Jaegeuk Kim authored
      This patch just moves write_node_page and introduces an inner function.
      
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: move flush tracepoint · c1b22107
      Jaegeuk Kim authored
      This patch moves the tracepoint location for the flush command.
      
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: show # of APPEND and UPDATE inodes · a00861db
      Jaegeuk Kim authored
      This patch shows the cached # of APPEND and UPDATE inode entries.
      
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix 446 coding style warnings in f2fs.h · cac5a3d8
      DongOh Shin authored
      1) Nine coding style warnings below have been resolved:
      "Missing a blank line after declarations"
      
      2) 435 coding style warnings below have been resolved:
      "function definition argument 'x' should also have an identifier name"
      
      3) Two coding style warnings below have been resolved:
      "macros should not use a trailing semicolon"
      
      Signed-off-by: DongOh Shin <doscode.kr@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix 3 coding style errors in f2fs.h · c64ab12e
      DongOh Shin authored
      Two coding style errors below have been resolved:
      "Macros with complex values should be enclosed in parentheses"
      
      And a coding style error below has been resolved:
      "space prohibited before that ',' (ctx:WxW)"
      
      Signed-off-by: DongOh Shin <doscode.kr@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: declare missing static function · 8ed59745
      Jaegeuk Kim authored
      We missed declaring two functions as static.
      
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: show the fault injection mount option · 0cc0dec2
      Kaixu Xia authored
      This patch shows the fault injection mount option in
      f2fs_show_options().
      
      Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix null pointer dereference when issuing flush in ->fsync · 73545817
      Chao Yu authored
      We only allocate the flush merge control structure sbi::sm_info::fcc_info when
      the flush_merge option is on, but f2fs_issue_flush still tries to access a
      member of the control structure without that option, which causes the panic
      shown below. Fix it.
      
      Call Trace:
       __remove_ino_entry+0xa9/0xc0 [f2fs]
       f2fs_do_sync_file.isra.27+0x214/0x6d0 [f2fs]
       f2fs_sync_file+0x18/0x20 [f2fs]
       vfs_fsync_range+0x3d/0xb0
       __do_page_fault+0x261/0x4d0
       do_fsync+0x3d/0x70
       SyS_fsync+0x10/0x20
       do_syscall_64+0x6e/0x180
       entry_SYSCALL64_slow_path+0x25/0x25
      RIP: 0033:0x7f18ce260de0
      RSP: 002b:00007ffdd4589258 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
      RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f18ce260de0
      RDX: 0000000000000006 RSI: 00000000016c0360 RDI: 0000000000000003
      RBP: 00000000016c0360 R08: 000000000000ffff R09: 000000000000001f
      R10: 00007ffdd4589020 R11: 0000000000000246 R12: 00000000016c0100
      R13: 0000000000000000 R14: 00000000016c1f00 R15: 00000000016c0100
      Code: fb 81 e3 00 08 00 00 48 89 45 a0 0f 1f 44 00 00 31 c0 85 db 75 27 41 81 e7 00 04 00 00 74 0c 41 8b 45 20 85 c0 0f 85 81 00 00 00 <f0> 41 ff 45 20 4c 89 e7 e8 f8 e9 ff ff f0 41 ff 4d 20 48 83 c4
      RIP: f2fs_issue_flush+0x5b/0x170 [f2fs] RSP: ffffc90003b5fd78
      CR2: 0000000000000020
      ---[ end trace a09314c24f037648 ]---
      
      Reported-by: Shuoran Liu <liushuoran@huawei.com>
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to avoid overflow when left shifting page offset · dba79f38
      Chao Yu authored
      We use the following method to calculate the size from the current page index:
      size = index << PAGE_SHIFT
      If the type of index is only 32 bits wide, the left shift overflows,
      which makes the result incorrect.
      
      So let's cast index to a 64-bit type to avoid this issue.
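
A minimal stand-alone sketch of the problem and the fix follows. The helper names are hypothetical, and `PAGE_SHIFT` is assumed to be 12 (4 KiB pages) purely for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Buggy variant: the shift is evaluated in 32 bits (assuming a 32-bit
 * unsigned int), so high bits are lost before the result is widened. */
static uint64_t size_shift_32(uint32_t index)
{
	return index << PAGE_SHIFT;
}

/* Fixed variant: widen the index to 64 bits first, then shift. */
static uint64_t size_shift_64(uint32_t index)
{
	return (uint64_t)index << PAGE_SHIFT;
}
```

For a page index of 0x100000 (2^20), the 32-bit variant wraps to 0, while the 64-bit variant yields 4294967296 (4 GiB).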
      
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: enhance lookup xattr · ba38c27e
      Chao Yu authored
      Previously, getxattr loaded all entries from both the inline xattr area and
      the xattr node block, and then looked up the name among all of them. This
      flow is inefficient: if we look up the inline xattr in the inode page cache
      first and hit, we don't need to load and search the xattr node block, which
      saves CPU time and IO latency.
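
The two-stage lookup can be sketched as follows. The `xattr_entry` type and both helpers are hypothetical models with plain arrays standing in for the inline area and the node block; the `node_loads` counter exists only to make visible how often the more expensive second stage is reached.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical xattr entry: a name/value pair. */
struct xattr_entry {
	const char *name;
	const char *value;
};

/* Linear search within one xattr area; returns NULL on a miss. */
static const char *find_entry(const struct xattr_entry *e, size_t n,
			      const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(e[i].name, name) == 0)
			return e[i].value;
	}
	return NULL;
}

/* Search the inline area first; consult the node block only on a miss. */
static const char *lookup_xattr(const struct xattr_entry *inline_area,
				size_t n_inline,
				const struct xattr_entry *node_block,
				size_t n_node,
				const char *name, unsigned int *node_loads)
{
	const char *value = find_entry(inline_area, n_inline, name);

	if (value)
		return value;
	(*node_loads)++;
	return find_entry(node_block, n_node, name);
}
```

A hit in the inline area leaves `node_loads` unchanged, which is the saving the commit message describes.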
      
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: initialize NULL to avoid warning]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>