    jbd2: use WRITE_SYNC in journal checkpoint · d3ad8434
    Tao Ma authored
    In journal checkpoint, we write the buffer and wait for the write to finish.
    But in cfq, the async queue has a very low priority, and in our test,
    if there are too many sync queues and every queue is filled up with
    requests, the write request is delayed for quite a long time and
    all the tasks which are waiting for journal space end up with errors like:
    
    INFO: task attr_set:3816 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    attr_set      D ffff880028393480     0  3816      1 0x00000000
     ffff8802073fbae8 0000000000000086 ffff8802140847c8 ffff8800283934e8
     ffff8802073fb9d8 ffffffff8103e456 ffff8802140847b8 ffff8801ed728080
     ffff8801db4bc080 ffff8801ed728450 ffff880028393480 0000000000000002
    Call Trace:
     [<ffffffff8103e456>] ? __dequeue_entity+0x33/0x38
     [<ffffffff8103caad>] ? need_resched+0x23/0x2d
     [<ffffffff814006a6>] ? thread_return+0xa2/0xbc
     [<ffffffffa01f6224>] ? jbd2_journal_dirty_metadata+0x116/0x126 [jbd2]
     [<ffffffffa01f6224>] ? jbd2_journal_dirty_metadata+0x116/0x126 [jbd2]
     [<ffffffff81400d31>] __mutex_lock_common+0x14e/0x1a9
     [<ffffffffa021dbfb>] ? brelse+0x13/0x15 [ext4]
     [<ffffffff81400ddb>] __mutex_lock_slowpath+0x19/0x1b
     [<ffffffff81400b2d>] mutex_lock+0x1b/0x32
     [<ffffffffa01f927b>] __jbd2_journal_insert_checkpoint+0xe3/0x20c [jbd2]
     [<ffffffffa01f547b>] start_this_handle+0x438/0x527 [jbd2]
     [<ffffffff8106f491>] ? autoremove_wake_function+0x0/0x3e
     [<ffffffffa01f560b>] jbd2_journal_start+0xa1/0xcc [jbd2]
     [<ffffffffa02353be>] ext4_journal_start_sb+0x57/0x81 [ext4]
     [<ffffffffa024a314>] ext4_xattr_set+0x6c/0xe3 [ext4]
     [<ffffffffa024aaff>] ext4_xattr_user_set+0x42/0x4b [ext4]
     [<ffffffff81145adb>] generic_setxattr+0x6b/0x76
     [<ffffffff81146ac0>] __vfs_setxattr_noperm+0x47/0xc0
     [<ffffffff81146bb8>] vfs_setxattr+0x7f/0x9a
     [<ffffffff81146c88>] setxattr+0xb5/0xe8
     [<ffffffff81137467>] ? do_filp_open+0x571/0xa6e
     [<ffffffff81146d26>] sys_fsetxattr+0x6b/0x91
     [<ffffffff81002d32>] system_call_fastpath+0x16/0x1b
    
    So this patch uses WRITE_SYNC in __flush_batch so that the requests are
    moved into the sync queue and handled by cfq in a timely manner. We also use
    the new plug, so that all the WRITE_SYNC requests are submitted as a whole
    when we unplug it.
    Signed-off-by: Tao Ma <boyu.mt@taobao.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Cc: Jan Kara <jack@suse.cz>
    Reported-by: Robin Dong <sanbai@taobao.com>
checkpoint.c 22.8 KB
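The commit message describes two ideas: tagging each checkpoint write as sync, and collecting the writes under a plug so they reach the scheduler together. The following is only a user-space model of that pattern, not the kernel code. Every `model_*` name, the `MODEL_MAX_BATCH` capacity, and the counters are hypothetical stand-ins invented for illustration; the real patch operates on kernel buffer heads inside `__flush_batch`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's request flags (assumption). */
enum model_rw { MODEL_WRITE = 0, MODEL_WRITE_SYNC = 1 };

#define MODEL_MAX_BATCH 64  /* illustrative capacity, chosen for this sketch */

/* Models a plug: requests are held here until the plug is finished. */
struct model_plug {
    enum model_rw pending[MODEL_MAX_BATCH];
    size_t count;
};

/* Models the I/O scheduler's view of what it has been handed. */
struct model_queue {
    size_t sync_dispatched;   /* requests that landed in the sync queue */
    size_t async_dispatched;  /* requests that landed in the async queue */
    size_t dispatch_calls;    /* how many times a batch was handed down */
};

static void model_start_plug(struct model_plug *plug)
{
    plug->count = 0;
}

/* While plugged, a submitted request is only recorded, not dispatched. */
static void model_submit(struct model_plug *plug, enum model_rw flag)
{
    assert(plug->count < MODEL_MAX_BATCH);
    plug->pending[plug->count++] = flag;
}

/* Unplugging hands every held request to the queue in one go. */
static void model_finish_plug(struct model_plug *plug, struct model_queue *q)
{
    size_t i;

    for (i = 0; i < plug->count; i++) {
        if (plug->pending[i] == MODEL_WRITE_SYNC)
            q->sync_dispatched++;
        else
            q->async_dispatched++;
    }
    if (plug->count > 0)
        q->dispatch_calls++;
    plug->count = 0;
}

/* Models the flush step: submit the whole batch as sync writes under a plug. */
void model_flush_batch(struct model_queue *q, size_t *batch_count)
{
    struct model_plug plug;
    size_t i;

    model_start_plug(&plug);
    for (i = 0; i < *batch_count; i++)
        model_submit(&plug, MODEL_WRITE_SYNC);
    model_finish_plug(&plug, q);

    *batch_count = 0;
}
```

In this model, flushing a batch of five buffers results in five sync requests handed to the queue in a single dispatch and none in the async queue, which is the behaviour the commit message is aiming for.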