Commits · a2ee0a300344a6da76186129b078113354fe13d2 · Kirill Smelkov / linux

08 Jul, 2016 14 commits

f2fs: move i_size_write in f2fs_write_end · a2ee0a30

Jaegeuk Kim authored Jul 07, 2016

We don't need to do i_size_write under page lock.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to avoid redundant discard during fstrim · c24a0fd6

Chao Yu authored Jul 07, 2016

With the test steps below, f2fs issues redundant discards during fstrim.
The reason is that we issue discards both for prefree segments and for the
consecutive freed region the user wants to trim, and the regions they cover
partially overlap. Change this so that no discards are issued for prefree
segments inside the trimmed range.

1. mount -t f2fs -o discard /dev/zram0 /mnt/f2fs
2. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/
3. dd if=/dev/zero  of=/mnt/f2fs/a bs=2M count=1
4. dd if=/dev/zero  of=/mnt/f2fs/b bs=1M count=1
5. sync
6. rm /mnt/f2fs/a /mnt/f2fs/b
7. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/

Before:
<...>-5428  [001] ...1  9511.052125: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x200
<...>-5428  [001] ...1  9511.052787: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300

After:
<...>-6764  [000] ...1  9720.382504: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
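
The two "Before" trace lines can be read as overlapping block ranges. The following is a minimal userspace sketch, not f2fs code: the struct and helper names are hypothetical, and a 4 KiB block size is assumed. It computes how many blocks two discard requests have in common.

```c
#include <assert.h>
#include <stdint.h>

/* A discard request as printed by the f2fs_issue_discard tracepoint. */
struct blk_range {
    uint32_t start; /* first block (blkstart) */
    uint32_t len;   /* number of blocks (blklen) */
};

/* Number of blocks covered by both a and b (0 if they do not overlap). */
static uint32_t overlap_blocks(struct blk_range a, struct blk_range b)
{
    uint32_t lo = a.start > b.start ? a.start : b.start;
    uint32_t a_end = a.start + a.len;
    uint32_t b_end = b.start + b.len;
    uint32_t hi = a_end < b_end ? a_end : b_end;

    return hi > lo ? hi - lo : 0;
}

/* Convert a byte count to 4 KiB blocks; bytes must be block-aligned. */
static uint32_t bytes_to_blocks(uint64_t bytes)
{
    assert(bytes % 4096 == 0);
    return (uint32_t)(bytes / 4096);
}
```

With the values from the trace, the ranges (0x2200, 0x200) and (0x2200, 0x300) share 0x200 blocks, so that many blocks were discarded twice before the fix. Under the 4 KiB assumption, 0x200 blocks is 2 MiB, which matches the -m value used in the steps.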

f2fs: avoid mismatching block range for discard · c7b41e16

Yunlei He authored Jul 07, 2016

This patch skips discarding a block range that is smaller than trim_minlen
and cannot be merged with its neighbour.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
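
One way to picture the rule is as a scan over a map of freed blocks. This is a hedged userspace sketch, not the f2fs implementation: the function name and the byte-per-block map are assumptions made for illustration. Adjacent freed blocks are merged into a run, and a run shorter than the minimum length is not discarded.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the discard ranges that would be issued for a map of freed blocks
 * (freed[i] != 0 means block i can be discarded). Adjacent freed blocks are
 * merged into one run; a run shorter than minlen is skipped.
 */
static size_t count_discards(const unsigned char *freed, size_t nblocks,
                             size_t minlen)
{
    size_t issued = 0;
    size_t i = 0;

    assert(freed != NULL || nblocks == 0);

    while (i < nblocks) {
        size_t start;

        if (!freed[i]) {
            i++;
            continue;
        }
        start = i;
        while (i < nblocks && freed[i])
            i++;            /* extend the run over neighbouring blocks */
        if (i - start >= minlen)
            issued++;       /* long enough: this run would be discarded */
    }
    return issued;
}
```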

f2fs: fix incorrect f_bfree calculation in ->statfs · 3e6d0b4d

Chao Yu authored Jul 06, 2016

As the manual describes, f_bfree indicates the total free blocks in the fs.
In f2fs, it includes two parts: visible free blocks and over-provision
blocks. This patch corrects the calculation.

fsblkcnt_t   f_bfree;   /* free blocks in fs */
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
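
The relationship can be sketched as follows. This is an illustration under assumptions, not the real ->statfs code: the struct and its field names are hypothetical, and mapping the "visible free blocks" to f_bavail is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical block counters; not the real f2fs_sb_info layout. */
struct fs_counts {
    uint64_t user_blocks;   /* blocks usable by users */
    uint64_t valid_blocks;  /* blocks currently holding valid user data */
    uint64_t ovp_blocks;    /* over-provision blocks */
};

/* Free blocks visible to users (assumed to correspond to f_bavail). */
static uint64_t calc_bavail(const struct fs_counts *c)
{
    assert(c->valid_blocks <= c->user_blocks);
    return c->user_blocks - c->valid_blocks;
}

/* Total free blocks in the fs (f_bfree): visible free + over-provision. */
static uint64_t calc_bfree(const struct fs_counts *c)
{
    return calc_bavail(c) + c->ovp_blocks;
}
```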

f2fs: use percpu_rw_semaphore · ec795418

Jaegeuk Kim authored Jun 30, 2016

This patch replaces rw_semaphore with percpu_rw_semaphore for:
sbi->cp_rwsem
nm_i->nat_tree_lock
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: skip to check the block address of node page · 3bdad3c7

Jaegeuk Kim authored Jun 30, 2016

If the node page is up-to-date, it should be alive.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: shrink critical region in spin_lock · 2555a2d5

Jaegeuk Kim authored Jun 30, 2016

This patch shrinks the critical region in spin_lock.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: call SetPageUptodate if needed · 237c0790

Jaegeuk Kim authored Jun 30, 2016

SetPageUptodate() issues a memory barrier, resulting in performance degradation.
Let's avoid that.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce f2fs_set_page_dirty_nobuffer · fe76b796

Jaegeuk Kim authored Jun 30, 2016

This patch adds f2fs_set_page_dirty_nobuffer() copied from __set_page_dirty_buffer.
When appending 4KB blocks in f2fs on pmem with multiple cores, this improves the
overall performance.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove unnecessary goto statement · a0995af6

Tiezhu Yang authored Jun 28, 2016

When base_addr is NULL, there is no need to call kzfree,
it should return -ENOMEM directly. Additionally, it is
better to initialize variable 'error' with 0.
Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
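
The control-flow change can be illustrated with a small userspace sketch. It is not the patched f2fs function: fill_buffer, alloc_fn and failing_alloc are hypothetical names, and malloc/free stand in for the kernel allocator calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Allocator hook so that an allocation failure can be simulated. */
typedef void *(*alloc_fn)(size_t);

static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

static int fill_buffer(alloc_fn alloc, size_t size)
{
    int error = 0;      /* initialized, so the success path returns 0 */
    char *base_addr;

    assert(alloc != NULL);

    base_addr = alloc(size);
    if (!base_addr)
        return -ENOMEM; /* nothing allocated yet: no cleanup label needed */

    memset(base_addr, 0, size);
    free(base_addr);
    return error;
}
```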

f2fs: add nodiscard mount option · 64058be9

Chao Yu authored Jul 03, 2016

This patch adds 'nodiscard' mount option.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to redirty page if fail to gc data page · 72e1c797

Chao Yu authored Jul 03, 2016

If we fail to move data page during foreground GC, we should give another
chance to writeback that page which was set dirty previously by writer.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to detect truncation prior rather than EIO during read · 1563ac75

Chao Yu authored Jul 03, 2016

In the synchronous read path, after sending out the read request, the reader
tries to lock the page in order to wait for the device to finish the read
and unlock the page. Meanwhile, a truncater can race with the reader, so
after the reader gets the page lock, it should check the page's mapping to
detect whether someone has truncated the page in the meantime. The reader
then has the chance to retry if truncation happened; otherwise the read can
fail due to the previous condition check.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
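
The ordering of the two checks is the point of the change. A minimal userspace sketch, assuming a mock page structure and using -EAGAIN as a hypothetical "retry" signal (both are illustrative and not taken from the patch):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of struct page used here. */
struct mock_page {
    const void *mapping;  /* NULL or different once truncated */
    bool uptodate;        /* set when the read completed successfully */
};

/*
 * Decide what to do after the reader has acquired the page lock.
 * Returns 0 when the page is usable, -EAGAIN when the caller should retry
 * the lookup (page was truncated), and -EIO when the read really failed.
 */
static int check_locked_page(const struct mock_page *page,
                             const void *mapping)
{
    assert(page != NULL);

    /* Truncation check first: a truncated page may also be !uptodate. */
    if (page->mapping != mapping)
        return -EAGAIN;
    if (!page->uptodate)
        return -EIO;
    return 0;
}
```

If the uptodate check came first, a truncated page would be reported as -EIO even though a retry could succeed.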

f2fs: fix to avoid reading out encrypted data in page cache · 78682f79

Chao Yu authored Jul 03, 2016

For an encrypted inode, if the user overwrites data of the inode, f2fs will
read encrypted data into the page cache and then do the decryption.

However, a reader can race with the overwriter and see encrypted data that
has not been decrypted by the overwriter yet. Fix it by moving the decryption
work to the background and keeping the page non-uptodate until the data is
decrypted.

Thread A				Thread B
- f2fs_file_write_iter
 - __generic_file_write_iter
  - generic_perform_write
   - f2fs_write_begin
    - f2fs_submit_page_bio
					- generic_file_read_iter
					 - do_generic_file_read
					  - lock_page_killable
					  - unlock_page
					  - copy_page_to_iter
					  hit the encrypted data in updated page
    - lock_page
    - fscrypt_decrypt_page
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

06 Jul, 2016 6 commits

f2fs: avoid latency-critical readahead of node pages · ac6f1999

Jaegeuk Kim authored Jun 16, 2016

f2fs_map_blocks is performance-critical, so let's avoid any latency from
reading ahead node pages.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid writing node/metapages during writes · 2c237eba

Jaegeuk Kim authored Jun 16, 2016

Let's keep more node/meta pages in run time.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: produce more nids and reduce readahead nats · ad4edb83

Jaegeuk Kim authored Jun 16, 2016

The readahead nat pages are likely to be reclaimed quickly, so it is better
to gather more free nids in advance.

Also, let's keep as many free nids as possible.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: detect host-managed SMR by feature flag · 52763a4b

Jaegeuk Kim authored Jun 13, 2016

If mkfs.f2fs gives a feature flag for host-managed SMR, we can set mode=lfs
by default.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: call update_inode_page for orphan inodes · 67c3758d

Jaegeuk Kim authored Jun 13, 2016

Let's store orphan inode pages right away.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: report error for f2fs_parent_dir · 3e19886e

Jaegeuk Kim authored Jun 09, 2016

If there is no dentry, we can report its error correctly.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

15 Jun, 2016 1 commit

f2fs: find parent dentry correctly · 8be0fea9

Sheng Yong authored Jun 04, 2016

If the dotdot directory is corrupted, its slot may be occupied by another
file. In this case, dentry[1] is not the parent directory. Rename and
cross-rename will update the inode in dentry[1] incorrectly. This
patch finds the dotdot dentry by name.
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
[Jaegeuk Kim: remove wrong bug_on]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

13 Jun, 2016 2 commits

f2fs: fix deadlock in add_link failure · c92737ce

Jaegeuk Kim authored Jun 07, 2016

mkdir                        sync_dirty_inode
 - init_inode_metadata
   - lock_page(node)
   - make_empty_dir
                             - filemap_fdatawrite()
                              - do_writepages
                              - lock_page(data)
                              - write_page(data)
                               - lock_page(node)
   - f2fs_init_acl
    - error
   - truncate_inode_pages
    - lock_page(data)

So, we don't need to truncate data pages in this error case, which will
be done by f2fs_evict_inode.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: introduce mode=lfs mount option · 36abef4e

Jaegeuk Kim authored Jun 03, 2016

This mount option forcibly enables the original log-structured filesystem
behaviour, so there should be no random writes to the main area.

In particular, this supports host-managed SMR devices.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

08 Jun, 2016 4 commits

f2fs: skip clean segment for gc · aa987273

Jaegeuk Kim authored Jun 06, 2016

If a segment in a section is clean or prefreed, we don't need to get its summary
and do gc.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
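
The skip can be pictured as a filter over the segments of a section. This is a hedged userspace sketch, not the f2fs GC loop: the function name and the per-segment counter array are hypothetical, and it assumes that a clean or prefree segment is one with zero valid blocks.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the segments of a section that still need garbage collection.
 * valid_blocks[i] is the number of valid blocks in segment i; a segment
 * with no valid blocks (clean or prefree) is skipped without further work.
 */
static size_t segments_to_gc(const unsigned int *valid_blocks, size_t nsegs)
{
    size_t todo = 0;

    assert(valid_blocks != NULL || nsegs == 0);

    for (size_t i = 0; i < nsegs; i++) {
        if (valid_blocks[i] == 0)
            continue; /* nothing to migrate: skip summary read and gc */
        todo++;
    }
    return todo;
}
```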

f2fs: drop any block plugging · 19a5f5e2

Jaegeuk Kim authored Jun 04, 2016

In f2fs, we don't need to keep block plugging for NODE and DATA writes, since
we already merged bios as much as possible.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid reverse IO order for NODE and DATA · 7dfeaa32

Jaegeuk Kim authored Jun 04, 2016

There is a data race between allocate_data_block() and f2fs_submit_page_mbio(),
which incurs unnecessary reversed bio submission.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: set mapping error for EIO · 7f319975

Jaegeuk Kim authored Jun 03, 2016

If EIO occurred, we need to set all the mapping to avoid any further IOs.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

07 Jun, 2016 6 commits

f2fs: control not to exceed # of cached nat entries · e589c2c4

Jaegeuk Kim authored Jun 02, 2016

This is to avoid cache entry management overhead including radix tree.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix wrong percentage · 29710bcf

Jaegeuk Kim authored Jun 02, 2016

This should be 1%, 10MB / 1GB.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid data race between FI_DIRTY_INODE flag and update_inode · 1e7c48fa

Jaegeuk Kim authored Jun 02, 2016

FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset
at any time like below.

Thread #1                        Thread #2
- lock_page(ipage)
- update i_fields
                                 - update i_size/i_blocks/and so on
                                 - set FI_DIRTY_INODE
- reset FI_DIRTY_INODE
- set_page_dirty(ipage)

In this case, we can lose the latest i_field information.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove obsolete parameter in f2fs_truncate · 9a449e9c

Jaegeuk Kim authored Jun 02, 2016

We don't need lock parameter, which is always true.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: avoid wrong count on dirty inodes · 338bbfa0

Jaegeuk Kim authored Jun 02, 2016

The number should be covered by spin_lock. Otherwise we can see wrong count
in f2fs_stat.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: remove deprecated parameter · 9f7c45cc

Jaegeuk Kim authored Jun 01, 2016

Remove deprecated parameter.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

03 Jun, 2016 7 commits

f2fs: handle writepage correctly · b230e6ca

Jaegeuk Kim authored May 29, 2016

Previously, f2fs_write_data_pages() calls __f2fs_writepage() which calls
f2fs_write_data_page().
If f2fs_write_data_page() returns AOP_WRITEPAGE_ACTIVATE, __f2fs_writepage()
calls mapping_set_error(). But this should not happen every time, since
sometimes f2fs_write_data_page() tries to skip writing pages without error.
For example, volatile_write() gives EIO all the time, as Shuoran Liu pointed
out.
Reported-by: Shuoran Liu <liushuoran@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: return error of f2fs_lookup · eb4246dc

Jaegeuk Kim authored May 27, 2016

Now we can report an error to f2fs_lookup given by f2fs_find_entry.
Suggested-by: He YunLei <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: return the errno to the caller to avoid using a wrong page · 0c9df7fb

Yunlong Song authored May 26, 2016

Commit aaf96075 ("f2fs: check node page
contents all the time") pointed out that "sometimes it was reported that
its contents was missing", so it checks the page's mapping and contents.
When "nid != nid_of_node(page)", ERR_PTR(-EIO) will be returned to the
caller. However, commit e1c51b9f ("f2fs:
clean up node page updating flow") moves "nid != nid_of_node(page)" test
to "f2fs_bug_on(sbi, nid != nid_of_node(page))", this will return a
wrong page to the caller when F2FS_CHECK_FS is off when "sometimes it
was reported that its contents was missing" happens.

This patch restores checking the node page contents all the time and
returns the errno so that the caller knows something is wrong and avoids
using the page. This patch also moves f2fs_bug_on to its proper location.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
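
The difference between an assertion-only check and a returned error can be sketched as follows. This is a userspace illustration with hypothetical names (mock_node_page, check_node_page), not the f2fs node-page code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached node page. */
struct mock_node_page {
    uint32_t nid;  /* node id recorded in the page */
};

/*
 * Validate a node page fetched for node id `nid`.
 * Returns 0 if the page belongs to nid, or -EIO so that the caller never
 * uses a page whose contents do not match, even when debug checks are off.
 */
static int check_node_page(const struct mock_node_page *page, uint32_t nid)
{
    assert(page != NULL);

    if (page->nid != nid)
        return -EIO;
    return 0;
}
```

A check that exists only inside a debug assertion disappears when the debug option is disabled, which is the situation the commit message describes for F2FS_CHECK_FS being off.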

f2fs: remove two steps to flush dirty data pages · 46ae957f

Jaegeuk Kim authored May 25, 2016

If there is no cold page, we don't need to do a loop to flush dirty
data pages.

On /dev/pmem0,

1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
 Before : 1.1 GB/s
 After  : 1.2 GB/s

2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
 Before : 2.2 GB/s
 After  : 2.3 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: do not skip writing data pages · 28ea6162

Jaegeuk Kim authored May 25, 2016

For data pages, let's try to flush as much as possible in background.

On /dev/pmem0,

1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
 Before : 800 MB/s
 After  : 1.1 GB/s

2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
 Before : 1.3 GB/s
 After  : 2.2 GB/s
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: inject to produce some orphan inodes · 53aa6bbf

Jaegeuk Kim authored May 25, 2016

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: propagate error given by f2fs_find_entry · 42d96401

Jaegeuk Kim authored May 25, 2016

If we get ENOMEM or EIO in f2fs_find_entry, we should stop right away.
Otherwise, for example, we can get duplicate directory entry by ->chash and
->clevel.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
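
Why a lookup error must not be treated as "entry not found" can be shown with a userspace re-creation of the kernel's error-pointer idiom. All names here (err_ptr, ptr_err, is_err, may_create) are local to this sketch, and how f2fs_find_entry actually reports its error is not shown in this log, so the encoding is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel's error-pointer helpers. */
static void *err_ptr(long err)
{
    assert(err < 0 && err >= -MAX_ERRNO);
    return (void *)(intptr_t)err;
}

static long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical caller: `found` is the result of a directory lookup that
 * returns an entry, NULL for "no such entry", or an encoded error.
 * Returns 0 if a new entry may be created, -EEXIST if one exists, or the
 * propagated lookup error.
 */
static int may_create(const void *found)
{
    if (is_err(found))
        return (int)ptr_err(found); /* stop: the lookup itself failed */
    if (found != NULL)
        return -EEXIST;
    return 0;
}
```

If the error case were folded into the NULL case, the caller would go on to create an entry that may already exist, which is the duplicate directory entry the message warns about.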
