Commits · 4086d3f61b6573f65ddc13fc375c0c7b0ac482a0 · nexedi / linux

24 Apr, 2017 7 commits

f2fs: skip encrypted inode in ASYNC IPU policy · 4086d3f6

Hou Pengyang authored Apr 21, 2017

Async requests may be throttled in the block layer, so a page under async write
may stay in WRITE_BACK state for a long time.

For an encrypted inode, we need to wait on page writeback regardless of whether
the device supports BDI_CAP_STABLE_WRITES. This may result in a longer page
writeback wait for async encrypted inode pages.

This patch skips IPU for update writes to encrypted inodes.
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
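
As a rough illustration of the rule described by this commit and by 04485987 ("f2fs: introduce async IPU policy"), the decision can be sketched in user-space C. The names IPU_ASYNC_BIT, struct write_req and want_async_ipu are hypothetical and are not the kernel's identifiers; only the rule itself (async writes use IPU, encrypted inodes are skipped) comes from the commit messages.

```c
#include <stdbool.h>

/* Hypothetical policy bit; the real f2fs bit layout is not shown here. */
enum { IPU_ASYNC_BIT = 1 << 0 };

/* Hypothetical, simplified description of one data write. */
struct write_req {
	bool is_sync;		/* caller waits for completion (latency sensitive) */
	bool is_encrypted;	/* inode uses encryption */
};

/*
 * Return true when the write should be done in place (IPU).
 * Async writes qualify because they are not latency sensitive.
 * Encrypted inodes are skipped because they must wait on page
 * writeback, which a throttled async request can prolong.
 */
static bool want_async_ipu(unsigned int policy, const struct write_req *req)
{
	if (!(policy & IPU_ASYNC_BIT))
		return false;
	if (req->is_sync)
		return false;
	return !req->is_encrypted;
}
```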

f2fs: fix out-of free segments · a7881893

Jaegeuk Kim authored Apr 20, 2017

This patch also reverts d0db7703 ("f2fs: do SSR in higher priority").

This patch fixes running out of free segments when many small files are
created, reproduced by
1) mkfs -s 1 2G
2) mount
3) untar
 - produce 60000 small files in a burst
4) sync
 - flush node pages
 - flush imeta

Here, when we do f2fs_balance_fs, we missed the # of imeta blocks, so the
has_not_enough_free_secs check was skipped.

Another test is done by
1) mkfs -s 12 2G
2) mount
3) untar
 - produce 60000 small files in a burst
4) sync
 - flush node pages
 - flush imeta

In this case, this patch also fixes wrong block allocation under large section
size.
Reported-by: William Brana <wbrana@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add parentheses for macro variables more · 29fa6c56

Jaegeuk Kim authored Apr 19, 2017

This patch adds parentheses around more macro variables in include/linux/f2fs_fs.h.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: improve definition of statistic macros · d66450e7

Arnd Bergmann authored Apr 19, 2017

With a recent addition of f2fs_lookup_extent_tree(), we get a warning about
the use of empty macros:

fs/f2fs/extent_cache.c: In function 'f2fs_lookup_extent_tree':
fs/f2fs/extent_cache.c:358:32: error: suggest braces around empty body in an 'else' statement [-Werror=empty-body]
   stat_inc_rbtree_node_hit(sbi);

A good way to avoid the warning and make the code more robust is to define
all no-op macros as 'do { } while (0)'.

Fixes: 54c2258c ("f2fs: extract rb-tree operation infrastructure")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
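
The idiom named above can be illustrated with a small stand-alone sketch. ENABLE_STATS, stat_inc_hit and lookup are hypothetical names chosen for the example, not the f2fs ones; the point is only that the no-op form remains a real statement, so an 'else' branch never ends up with an empty body.

```c
/* Hypothetical switch standing in for a kernel config option. */
#ifdef ENABLE_STATS
static unsigned long hits;
#define stat_inc_hit()	do { hits++; } while (0)
#else
/* No-op form: still one complete statement once the ';' is added. */
#define stat_inc_hit()	do { } while (0)
#endif

/* Return 1 on a hit and 0 on a miss. */
static int lookup(int key, int wanted)
{
	if (key != wanted)
		return 0;
	else
		stat_inc_hit();	/* an empty macro would leave "else ;" here */
	return 1;
}
```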

f2fs: assign allocation hint for warm/cold data · d5793249

Jaegeuk Kim authored Apr 18, 2017

This patch assigns the slower device region to the warm/cold data area more eagerly.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix _IOW usage · d07efb50

Jaegeuk Kim authored Apr 18, 2017

This patch fixes wrong _IOW usage.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: add ioctl to flush data from faster device to cold area · e066b83c

Jaegeuk Kim authored Apr 13, 2017

This patch adds an ioctl to flush data from the faster device to the cold area.
The user can give the device number and the number of segments to move. Nothing
is moved if there is only one device.

The parameter looks like:

struct f2fs_flush_device {
	u32 dev_num;		/* device number to flush */
	u32 segments;		/* # of segments to flush */
};
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
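
A user-space sketch of how such a parameter could be prepared is given below. The struct layout is the one quoted in the message; the magic number, the command number and the EXAMPLE_ names are assumptions made for illustration and should be checked against the kernel headers before use. Note that _IOW takes the argument type itself, not its size.

```c
#include <linux/ioctl.h>
#include <stdint.h>

/* Mirrors the parameter layout quoted in the commit message. */
struct f2fs_flush_device {
	uint32_t dev_num;	/* device number to flush */
	uint32_t segments;	/* # of segments to flush */
};

/* Assumed values for illustration only; verify against the kernel headers. */
#define EXAMPLE_F2FS_MAGIC	0xf5
#define EXAMPLE_FLUSH_DEVICE	\
	_IOW(EXAMPLE_F2FS_MAGIC, 10, struct f2fs_flush_device)

/* Build a request asking to move `segments` segments off device `dev_num`. */
static struct f2fs_flush_device make_flush_request(uint32_t dev_num,
						   uint32_t segments)
{
	struct f2fs_flush_device req = {
		.dev_num = dev_num,
		.segments = segments,
	};
	return req;
}
```

A caller would then pass the address of the request to ioctl(fd, EXAMPLE_FLUSH_DEVICE, &req) on a file descriptor that belongs to the mounted file system.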

19 Apr, 2017 8 commits

f2fs: introduce async IPU policy · 04485987

Hou Pengyang authored Apr 18, 2017

This patch introduces an ASYNC IPU policy.

Under a scenario with a large # of async updates (e.g. log writing in Android),
the disk would be seriously fragmented, and gc would be triggered more frequently.

This patch uses IPU for async update writes, since async is
NOT sensitive to io latency.
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>

f2fs: add undiscard blocks stat · d84d1cbd

Chao Yu authored Apr 18, 2017

This patch adds accounting of undiscarded blocks.
Signed-off-by: Chao Yu <yuchao0@huawei.com>

f2fs: unlock cp_rwsem early for IPU writes · 001c584c

Chao Yu authored Apr 18, 2017

For IPU writes, there won't be any updates in the dnode page since we
will reuse the old block address instead of allocating a new one, so we
don't need to lock cp_rwsem while submitting IPU IO.
Signed-off-by: Chao Yu <yuchao0@huawei.com>

f2fs: introduce __check_rb_tree_consistence · df0f6b44

Chao Yu authored Apr 17, 2017

Introduce __check_rb_tree_consistence to check the consistency of the rb-tree
based discard cache at runtime.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: trace __submit_discard_cmd · 0243a5f9

Chao Yu authored Apr 15, 2017

Add an event class f2fs_discard for introducing f2fs_queue_discard, then
use f2fs_{queue,issue}_discard to trace __{queue,submit}_discard_cmd.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: in prior to issue big discard · ba48a33e

Chao Yu authored Apr 15, 2017

Issue big discards in preference to ones of random size, in the expectation
that this will help to:
- quickly recycle unused large space in the flash storage device.
- give a chance to
  a) wait for small discards to be merged into bigger ones, or
  b) avoid issuing discards for space that has been reallocated by SSR.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: clean up discard_cmd_control structure · 46f84c2c

Chao Yu authored Apr 15, 2017

Avoid long variable names in the discard_cmd_control structure; no logic
change.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use rb-tree to track pending discard commands · 004b6862

Chao Yu authored Apr 14, 2017

Introduce rb-tree based discard cache infrastructure to speed up lookup and
merge operation of discard entry.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: initialize dc to avoid build warning]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

18 Apr, 2017 1 commit

f2fs: avoid dirty node pages in check_only recovery · d40d30c5

Jaegeuk Kim authored Apr 14, 2017

In the check_only mode, we should not make any dirty node pages. Otherwise,
we can get this panic:

F2FS-fs (nvme0n1p1): Need to recover fsync data
------------[ cut here ]------------
kernel BUG at fs/f2fs/node.c:2204!
CPU: 7 PID: 19923 Comm: mount Tainted: G           OE   4.9.8 #2
RIP: 0010:[<ffffffffc0979c0b>]  [<ffffffffc0979c0b>] flush_nat_entries+0x43b/0x7d0 [f2fs]
Call Trace:
 [<ffffffffc096ddaa>] ? __f2fs_submit_merged_bio+0x5a/0xd0 [f2fs]
 [<ffffffffc096ddaa>] ? __f2fs_submit_merged_bio+0x5a/0xd0 [f2fs]
 [<ffffffffc096dddb>] ? __f2fs_submit_merged_bio+0x8b/0xd0 [f2fs]
 [<ffffffff860e450f>] ? up_write+0x1f/0x40
 [<ffffffffc096dddb>] ? __f2fs_submit_merged_bio+0x8b/0xd0 [f2fs]
 [<ffffffffc0969f04>] write_checkpoint+0x2f4/0xf20 [f2fs]
 [<ffffffff860e938d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffffc0960bc9>] ? f2fs_sync_fs+0x79/0x190 [f2fs]
 [<ffffffffc0960bc9>] ? f2fs_sync_fs+0x79/0x190 [f2fs]
 [<ffffffffc0960bd5>] f2fs_sync_fs+0x85/0x190 [f2fs]
 [<ffffffffc097b6de>] f2fs_balance_fs_bg+0x7e/0x1c0 [f2fs]
 [<ffffffffc0977b64>] f2fs_write_node_pages+0x34/0x350 [f2fs]
 [<ffffffff860e5f42>] ? __lock_is_held+0x52/0x70
 [<ffffffff861d9b31>] do_writepages+0x21/0x30
 [<ffffffff86298ce1>] __writeback_single_inode+0x61/0x760
 [<ffffffff86909127>] ? _raw_spin_unlock+0x27/0x40
 [<ffffffff8629a735>] writeback_single_inode+0xd5/0x190
 [<ffffffff8629a889>] write_inode_now+0x99/0xc0
 [<ffffffff86283876>] iput+0x1f6/0x2c0
 [<ffffffffc0964b52>] f2fs_fill_super+0xc32/0x10c0 [f2fs]
 [<ffffffff86266462>] mount_bdev+0x182/0x1b0
 [<ffffffffc0963f20>] ? f2fs_commit_super+0x100/0x100 [f2fs]
 [<ffffffffc0960da5>] f2fs_mount+0x15/0x20 [f2fs]
 [<ffffffff86266e08>] mount_fs+0x38/0x170
 [<ffffffff86288bab>] vfs_kern_mount+0x6b/0x160
 [<ffffffff8628bcfe>] do_mount+0x1be/0xd60
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

12 Apr, 2017 5 commits

f2fs: fix not to set fsync/dentry mark · d29fd172

Jaegeuk Kim authored Apr 12, 2017

Otherwise, we can see stale fsync/dentry mark given by previous calls, resulting
in giving up roll-forward recovery due to wrong dentry mark.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: allocate hot_data for atomic writes · 6c3acd97

Jaegeuk Kim authored Apr 12, 2017

We'd better allocate atomic writes to hot_data zone.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: give time to flush dirty pages for checkpoint · 30973883

Jaegeuk Kim authored Apr 11, 2017

If all the threads are waiting for checkpoint, we have no chance to flush
required dirty pages.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix fs corruption due to zero inode page · 9bb02c36

Jaegeuk Kim authored Apr 11, 2017

This patch fixes the following scenario.

- f2fs_create/f2fs_mkdir             - write_checkpoint
 - f2fs_mark_inode_dirty_sync         - block_operations
                                       - f2fs_lock_all
                                       - f2fs_sync_inode_meta
                                        - f2fs_unlock_all
                                        - sync_inode_metadata
 - f2fs_lock_op
                                         - f2fs_write_inode
                                          - update_inode_page
                                           - get_node_page
                                             return -ENOENT
 - new_inode_page
  - fill_node_footer
 - f2fs_mark_inode_dirty_sync
 - ...
 - f2fs_unlock_op
                                          - f2fs_inode_synced
                                       - f2fs_lock_all
                                       - do_checkpoint

In this checkpoint, we can get an inode page that is all zeros except for a
valid node footer.

Cc: <stable@vger.kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: shrink blk plug region · a54455f5

Chao Yu authored Mar 27, 2017

Don't let the blk plug cover regions where no IOs will be issued.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

11 Apr, 2017 11 commits

f2fs: extract rb-tree operation infrastructure · 54c2258c

Chao Yu authored Apr 11, 2017

The rb-tree lookup/update functions are tightly coupled to the extent cache
code, which makes these basic functions very hard to reuse. This patch
extracts a common rb-tree operation infrastructure for later reuse.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid frequent checkpoint during f2fs_gc · 8fd5a37e

Jaegeuk Kim authored Apr 07, 2017

Now we're doing SSR more aggressively than ever before, so once we reach
the reserved_segment, f2fs_balance_fs will call f2fs_gc, which triggers a
checkpoint every time. We must avoid that.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: clean up some macros in terms of GET_SEGNO · 4ddb1a4d

Jaegeuk Kim authored Apr 07, 2017

This patch cleans up several macros by introducing:
- BLKS_PER_SEC
- GET_SEC_FROM_SEG
- GET_SEG_FROM_SEC
- GET_ZONE_FROM_SEC
- GET_ZONE_FROM_SEG
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
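
For readers unfamiliar with the segment/section/zone hierarchy, the conversions these macro names suggest can be sketched as follows. The macro names are taken from the list above, but struct geometry, its field names and the macro bodies are a simplified assumption for illustration, not a copy of the kernel definitions.

```c
/* Hypothetical, cut-down geometry description. */
struct geometry {
	unsigned int blocks_per_seg;	/* blocks in one segment */
	unsigned int segs_per_sec;	/* segments in one section */
	unsigned int secs_per_zone;	/* sections in one zone */
};

#define BLKS_PER_SEC(g)			((g)->segs_per_sec * (g)->blocks_per_seg)
#define GET_SEC_FROM_SEG(g, segno)	((segno) / (g)->segs_per_sec)
#define GET_SEG_FROM_SEC(g, secno)	((secno) * (g)->segs_per_sec)
#define GET_ZONE_FROM_SEC(g, secno)	((secno) / (g)->secs_per_zone)
#define GET_ZONE_FROM_SEG(g, segno)	\
	GET_ZONE_FROM_SEC(g, GET_SEC_FROM_SEG(g, segno))
```

With 512 blocks per segment, 12 segments per section and 2 sections per zone, segment 25 falls in section 2 and zone 1, and one section holds 6144 blocks.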

f2fs: clean up get_valid_blocks with consistent parameter · 302bd348

Jaegeuk Kim authored Apr 07, 2017

This patch cleans up get_valid_blocks, with no functional change.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: use segment number for get_valid_blocks · 63fcf8e8

Jaegeuk Kim authored Apr 07, 2017

This patch fixes the code to pass a segment number to get_valid_blocks.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: guard macro variables with braces · 68afcf2d

Tomohiro Kusumi authored Apr 09, 2017

Add braces around variables used within macros where it makes sense
to do so. Many of the macros in f2fs already do this. This commit
deliberately avoids any change that alters line# as a result of adding
braces, since that usually affects the binary via __LINE__.

Confirmed no diff in fs/f2fs/f2fs.ko before/after this commit on x86_64,
to make sure this has no functional change and that there has been no
unexpected side effect from callers' arithmetic within the existing
code.
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
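
Why guarding macro variables matters can be seen in a minimal sketch. The macro and function names below are hypothetical and are not taken from f2fs; the example only shows how an unguarded argument interacts with operator precedence.

```c
/* Unguarded: the argument is pasted in as-is, so precedence can bite. */
#define SEGS_TO_BLKS_BAD(segs)	(segs * 512)
/* Guarded: the argument is evaluated as one unit before multiplying. */
#define SEGS_TO_BLKS(segs)	((segs) * 512)

/* Expands to (a + b * 512): only b is scaled. */
static unsigned int blks_bad(unsigned int a, unsigned int b)
{
	return SEGS_TO_BLKS_BAD(a + b);
}

/* Expands to ((a + b) * 512): the sum is scaled, as intended. */
static unsigned int blks_ok(unsigned int a, unsigned int b)
{
	return SEGS_TO_BLKS(a + b);
}
```

If every existing caller already passes a simple expression, adding the guards changes nothing in the generated code, which is consistent with the unchanged f2fs.ko reported above.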

f2fs: fix comment on f2fs_flush_merged_bios() after · 771a9a71

Tomohiro Kusumi authored Apr 05, 2017

Callers are to unlock the page on failure after 86531d6b.
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: prevent waiter encountering incorrect discard states · fa64a003

Chao Yu authored Apr 05, 2017

In f2fs_submit_discard_endio, we wake up the waiter before setting the
discard command state, so the waiter may see an incorrect state. Swap
the order of complete() and the state update to fix this issue.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_wait_discard_bios · d431413f

Chao Yu authored Apr 05, 2017

Split f2fs_wait_discard_bios from f2fs_wait_discard_bio, just for cleanup,
no logic change.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: split discard_cmd_list · 22d375dd

Chao Yu authored Apr 05, 2017

Split discard_cmd_list to discard_{pend,wait}_list, so while sending/waiting
discard command, we can avoid traversing unneeded entries in original list.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Revert "f2fs: put allocate_segment after refresh_sit_entry" · c6f82fe9

Jaegeuk Kim authored Apr 04, 2017

This reverts commit 3436c4bd.

The reverted commit causes dirty segments to go unregistered. I reproduced the
issue with a modified postmark that injects a lot of file create/delete/update
operations and finally triggers a huge number of SSR allocations.

Cc: <stable@vger.kernel.org> # v4.10+
[Jaegeuk Kim: Change missing incorrect comment]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

05 Apr, 2017 8 commits

f2fs: split make_dentry_ptr() into block and inline versions · 64c24ecb

Tomohiro Kusumi authored Apr 04, 2017

Since callers statically know which type to use, make_dentry_ptr()
can simply be split into two inline functions. This way, the code
has less inlined code, fewer arguments, and no cast.
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: submit bio of in-place-update pages · d1b3e72d

Jaegeuk Kim authored Mar 30, 2017

This patch tries to split in-place-update bios from sequential bios.
Suggested-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove the redundant variable definition · fc2e2875

Kaixu Xia authored Apr 02, 2017

The variable 'i' has been defined before, so here we can
use it directly.
Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid IO split due to mixed WB_SYNC_ALL and WB_SYNC_NONE · 687de7f1

Jaegeuk Kim authored Mar 28, 2017

If two threads try to flush dirty pages in different inodes respectively,
f2fs_write_data_pages() will produce WRITE and WRITE_SYNC one at a time,
resulting in a lot of separate 4KB IOs.

So, this patch gives higher priority to WB_SYNC_ALL IOs and gathers write
IOs with a big WRITE_SYNC'ed bio.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: write small sized IO to hot log · ef095d19

Jaegeuk Kim authored Mar 24, 2017

It would be better to split small and large IOs in order to get more
consecutive big writes.

The default threshold is set to 64KB, but configurable by sysfs/min_hot_blocks.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
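
The threshold rule can be sketched as below. The 4KB block size, the enum and function names, and the use of "less than or equal" at the boundary are assumptions made for illustration; only the 64KB default and the min_hot_blocks tunable come from the message above.

```c
enum log_type { LOG_HOT_DATA, LOG_WARM_DATA };

#define BLOCK_SIZE_BYTES	4096u	/* assumed block size */
/* 64KB expressed in blocks, i.e. 16 blocks of 4KB. */
#define DEF_MIN_HOT_BLOCKS	((64u * 1024u) / BLOCK_SIZE_BYTES)

/*
 * Pick the log for a write of `dirty_blocks` blocks: small writes go
 * to the hot log so the warm log keeps longer consecutive writes.
 */
static enum log_type pick_log(unsigned int dirty_blocks,
			      unsigned int min_hot_blocks)
{
	return dirty_blocks <= min_hot_blocks ? LOG_HOT_DATA : LOG_WARM_DATA;
}
```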

f2fs: use bitmap in discard_entry · a7eeb823

Chao Yu authored Mar 28, 2017

This patch changes struct discard_entry to use a bitmap instead of an extent
to indicate the discard range in one segment. For fragmented space, this
implementation can reduce the memory footprint.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
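
The memory argument can be illustrated with a stand-alone sketch: one fixed-size bitmap per segment costs the same no matter how many separate ranges it records. The 512-block segment size, struct discard_map and the helper names are assumptions for the example and do not reproduce the kernel structure.

```c
#include <limits.h>

#define BLOCKS_PER_SEG	512u	/* assumed: 2MB segment of 4KB blocks */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define MAP_WORDS	((BLOCKS_PER_SEG + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical entry: one bit per block of a single segment. */
struct discard_map {
	unsigned long bits[MAP_WORDS];
};

/* Mark blocks [start, start + len) as discardable, clamped to the segment. */
static void mark_discard(struct discard_map *m, unsigned int start,
			 unsigned int len)
{
	for (unsigned int i = start; i < start + len && i < BLOCKS_PER_SEG; i++)
		m->bits[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

/* Count how many blocks in the segment are marked. */
static unsigned int count_discard(const struct discard_map *m)
{
	unsigned int n = 0;

	for (unsigned int i = 0; i < BLOCKS_PER_SEG; i++)
		n += (m->bits[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL;
	return n;
}
```

Under these assumptions the map occupies 64 bytes per segment, whereas an extent-based record needs one entry per fragment, so the bitmap pays off once a segment holds many small discard ranges.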

f2fs: clean up destroy_discard_cmd_control · f099405f

Chao Yu authored Mar 27, 2017

Remove an unneeded parameter and simplify the flow in
destroy_discard_cmd_control.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: count discard command entry · 5f32366a

Chao Yu authored Mar 25, 2017

Count discard command entries and show the number in debugfs; also
add the cost of the discard command cache to the total consumed
memory footprint.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
