• Chao Yu's avatar
    f2fs: fix potential recursive call when enabling data_flush · 186857c5
    Chao Yu authored
    As Hagbard Celine reported:
    
    Hi, this is a long standing bug that I've hit before on older kernels,
    but I was not able to get the syslog saved because of the nature of
    the bug. This time I had booted form a pen-drive, and was able to save
    the log to it's efi-partition.
    What i did to trigger it was to create a partition and format it f2fs,
    then mount it with options:
    "rw,relatime,lazytime,background_gc=on,disable_ext_identify,discard,heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,data_flush,extent_cache,mode=adaptive,active_logs=6,whint_mode=fs-based,alloc_mode=default,fsync_mode=strict".
    Then I unpacked a big .tar.xz to the partition (I used a
    gentoo-stage3-tarball as I was in process of installing Gentoo).
    
    Same options just without data_flush gives no problems.
    
    Mar 20 20:54:01 usbgentoo kernel: FAT-fs (nvme0n1p4): Volume was not
    properly unmounted. Some data may be corrupt. Please run fsck.
    Mar 20 21:05:23 usbgentoo kernel: kworker/dying (1588) used greatest
    stack depth: 12064 bytes left
    Mar 20 21:06:40 usbgentoo kernel: BUG: stack guard page was hit at
    00000000a4b0733c (stack is 0000000056016422..0000000096e7463f)
    Mar 20 21:06:40 usbgentoo kernel: kernel stack overflow
    
    ......
    
    Mar 20 21:06:40 usbgentoo kernel: Call Trace:
    Mar 20 21:06:40 usbgentoo kernel:  read_node_page+0x71/0xf0
    Mar 20 21:06:40 usbgentoo kernel:  ? xas_load+0x8/0x50
    Mar 20 21:06:40 usbgentoo kernel:  __get_node_page+0x73/0x2a0
    Mar 20 21:06:40 usbgentoo kernel:  f2fs_get_dnode_of_data+0x34e/0x580
    Mar 20 21:06:40 usbgentoo kernel:  f2fs_write_inline_data+0x5e/0x2a0
    Mar 20 21:06:40 usbgentoo kernel:  __write_data_page+0x421/0x690
    Mar 20 21:06:40 usbgentoo kernel:  f2fs_write_cache_pages+0x1cf/0x460
    Mar 20 21:06:40 usbgentoo kernel:  f2fs_write_data_pages+0x2b3/0x2e0
    Mar 20 21:06:40 usbgentoo kernel:  ? f2fs_inode_chksum_verify+0x1d/0xc0
    Mar 20 21:06:40 usbgentoo kernel:  ? read_node_page+0x71/0xf0
    Mar 20 21:06:40 usbgentoo kernel:  do_writepages+0x3c/0xd0
    Mar 20 21:06:40 usbgentoo kernel:  __filemap_fdatawrite_range+0x7c/0xb0
    Mar 20 21:06:40 usbgentoo kernel:  f2fs_sync_dirty_inodes+0xf2/0x200
    Mar 20 21:06:40 usbgentoo kernel:  f2fs_balance_fs_bg+0x2a3/0x2c0
    Mar 20 21:06:40 usbgentoo kernel:  ? f2fs_inode_dirtied+0x21/0xc0
    Mar 20 21:06:40 usbgentoo kernel:  f2fs_balance_fs+0xd6/0x2b0
    Mar 20 21:06:40 usbgentoo kernel:  __write_data_page+0x4fb/0x690
    
    ......
    
    Mar 20 21:06:40 usbgentoo kernel:  __writeback_single_inode+0x2a1/0x340
    Mar 20 21:06:40 usbgentoo kernel:  ? soft_cursor+0x1b4/0x220
    Mar 20 21:06:40 usbgentoo kernel:  writeback_sb_inodes+0x1d5/0x3e0
    Mar 20 21:06:40 usbgentoo kernel:  __writeback_inodes_wb+0x58/0xa0
    Mar 20 21:06:40 usbgentoo kernel:  wb_writeback+0x250/0x2e0
    Mar 20 21:06:40 usbgentoo kernel:  ? 0xffffffff8c000000
    Mar 20 21:06:40 usbgentoo kernel:  ? cpumask_next+0x16/0x20
    Mar 20 21:06:40 usbgentoo kernel:  wb_workfn+0x2f6/0x3b0
    Mar 20 21:06:40 usbgentoo kernel:  ? __switch_to_asm+0x40/0x70
    Mar 20 21:06:40 usbgentoo kernel:  process_one_work+0x1f5/0x3f0
    Mar 20 21:06:40 usbgentoo kernel:  worker_thread+0x28/0x3c0
    Mar 20 21:06:40 usbgentoo kernel:  ? rescuer_thread+0x330/0x330
    Mar 20 21:06:40 usbgentoo kernel:  kthread+0x10e/0x130
    Mar 20 21:06:40 usbgentoo kernel:  ? kthread_create_on_node+0x60/0x60
    Mar 20 21:06:40 usbgentoo kernel:  ret_from_fork+0x35/0x40
    
    The root cause is that we run into an infinite recursive calling in
    between f2fs_balance_fs_bg and writepage() as described below:
    
    - f2fs_write_data_pages		--- A
     - __write_data_page
      - f2fs_balance_fs
       - f2fs_balance_fs_bg		--- B
        - f2fs_sync_dirty_inodes
         - filemap_fdatawrite
          - f2fs_write_data_pages	--- A
    ...
              - f2fs_balance_fs_bg	--- B
    ...
    
    In order to fix this issue, let's detect such condition in __write_data_page()
    and just skip calling f2fs_balance_fs() recursively.
    Reported-by: default avatarHagbard Celine <hagbardcelin@gmail.com>
    Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
    Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
    186857c5
checkpoint.c 39.4 KB