    ubifs: Queue up space reservation tasks if retrying many times · 556c19f5
    Zhihao Cheng authored
    Recently we caught ENOSPC returned by make_reservation() while running
    fsstress on UBIFS. The following information was printed when it
    occurred (see the Link below for details):
    
     UBIFS error (ubi0:0 pid 3640152): make_reservation [ubifs]: cannot
     reserve 112 bytes in jhead 2, error -28
     CPU: 2 PID: 3640152 Comm: kworker/u16:2 Tainted: G    B   W
     Hardware name: Hisilicon PhosphorHi1230 EMU (DT)
     Workqueue: writeback wb_workfn (flush-ubifs_0_0)
     Call trace:
      dump_stack+0x114/0x198
      make_reservation+0x564/0x610 [ubifs]
      ubifs_jnl_write_data+0x328/0x48c [ubifs]
      do_writepage+0x2a8/0x3e4 [ubifs]
      ubifs_writepage+0x16c/0x374 [ubifs]
      generic_writepages+0xb4/0x114
      do_writepages+0xcc/0x11c
      writeback_sb_inodes+0x2d0/0x564
      wb_writeback+0x20c/0x2b4
      wb_workfn+0x404/0x510
      process_one_work+0x304/0x4ac
      worker_thread+0x31c/0x4e4
      kthread+0x23c/0x290
      Budgeting info: data budget sum 17576, total budget sum 17768
    	budg_data_growth 4144, budg_dd_growth 13432, budg_idx_growth 192
    	min_idx_lebs 13, old_idx_sz 988640, uncommitted_idx 0
    	page_budget 4144, inode_budget 160, dent_budget 312
    	nospace 0, nospace_rp 0
    	dark_wm 8192, dead_wm 4096, max_idx_node_sz 192
    	freeable_cnt 0, calc_idx_sz 988640, idx_gc_cnt 0
    	dirty_pg_cnt 4, dirty_zn_cnt 0, clean_zn_cnt 4811
    	gc_lnum 21, ihead_lnum 14
    	jhead 0 (GC)	 LEB 16
    	jhead 1 (base)	 LEB 34
    	jhead 2 (data)	 LEB 23
    	bud LEB 16
    	bud LEB 23
    	bud LEB 34
    	old bud LEB 33
    	old bud LEB 31
    	old bud LEB 15
    	commit state 4
     Budgeting predictions:
    	available: 33832, outstanding 17576, free 15356
     (pid 3640152) start dumping LEB properties
     (pid 3640152) Lprops statistics: empty_lebs 3, idx_lebs  11
    	taken_empty_lebs 1, total_free 1253376, total_dirty 2445736
    	total_used 3438712, total_dark 65536, total_dead 17248
     LEB 15 free 0      dirty 248000   used 5952   (taken)
     LEB 16 free 110592 dirty 896      used 142464 (taken, jhead 0 (GC))
     LEB 21 free 253952 dirty 0        used 0      (taken, GC LEB)
     LEB 23 free 0      dirty 248104   used 5848   (taken, jhead 2 (data))
     LEB 29 free 253952 dirty 0        used 0      (empty)
     LEB 33 free 0      dirty 253952   used 0      (taken)
     LEB 34 free 217088 dirty 36544    used 320    (taken, jhead 1 (base))
     LEB 37 free 253952 dirty 0        used 0      (empty)
     OTHERS: index lebs, zero-available non-index lebs
    
    According to the budgeting algorithm, 5 LEBs are reserved for the budget:
    three journal heads (16, 23, 34), 1 GC LEB (21) and 1 deletion LEB (which
    can be used in make_reservation()). 2 empty LEBs are reserved for index
    nodes, calculated as min_idx_lebs - idx_lebs = 13 - 11 = 2. In theory,
    LEBs 15 and 33 should be reclaimed as free after committing, but they are
    still in the taken state. Looking at the implementation of
    reserve_space(), the following situation is possible:
    
    LEB 15: free 2000 dirty 248000 used 3952 (jhead 2)
    LEB 23: free 2000 dirty 248104 used 3848 (bud, taken)
    LEB 33: free 2000 dirty 251952 used 0    (bud, taken)
    
          wb_workfn          wb_workfn_2
    do_writepage // write 3000 bytes
     ubifs_jnl_write_data
      make_reservation
       reserve_space
        ubifs_garbage_collect
         ubifs_find_dirty_leb // ret ENOSPC, dirty LEBs are taken
       nospc_retries++  // 1
       ubifs_run_commit
        do_commit
    
    LEB 15: free 2000 dirty 248000 used 3952 (jhead 2)
    LEB 23: free 2000 dirty 248104 used 3848 (dirty)
    LEB 33: free 2000 dirty 251952 used 0    (dirty)
    
                       do_writepage // write 2000 bytes for 3 times
    		    ubifs_jnl_write_data
    		    // grabs 15\23\33
    
    LEB 15: free 0    dirty 248000 used 5952 (bud, taken)
    LEB 23: free 0    dirty 248104 used 5848 (jhead 2)
    LEB 33: free 0    dirty 253952 used 0    (bud, taken)
    
       reserve_space
        ubifs_garbage_collect
         ubifs_find_dirty_leb // ret ENOSPC, dirty LEBs are taken
       if (nospc_retries++ < 2) // false
     ubifs_ro_mode !
    
    A reproducer is available in the Link below.
    
    Dirty LEBs can be grabbed by other threads, in which case GC in the
    current thread fails to find any dirty LEB. make_reservation() may
    therefore need to retry GC and commit many times, but the current
    implementation limits the number of retries via 'nospc_retries' (twice).
    Fix it by adding a wait queue: start queuing up space reservation tasks
    once some task has retried GC + commit too many times. Then only one task
    makes a space reservation at any time, and it can always succeed provided
    that budgeting is correct.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=218164
    Fixes: 1e51764a ("UBIFS: add new flash file system")
    Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
    Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
    Signed-off-by: Richard Weinberger <richard@nod.at>