• Sandeep Dhavale's avatar
    erofs: Fix detection of atomic context · 12d0a24a
    Sandeep Dhavale authored
    Current check for atomic context is not sufficient as
    z_erofs_decompressqueue_endio can be called under rcu lock
    from blk_mq_flush_plug_list(). See the stacktrace [1]
    
    In such case we should hand off the decompression work for async
    processing rather than trying to do sync decompression in current
    context. Patch fixes the detection by checking for
    rcu_read_lock_any_held() and while at it use more appropriate
    !in_task() check than in_atomic().
    
    Background: Historically erofs would always schedule a kworker for
    decompression which would incur the scheduling cost regardless of
    the context. But z_erofs_decompressqueue_endio() may not always
    be in atomic context and we could actually benefit from doing the
    decompression in z_erofs_decompressqueue_endio() if we are in
    thread context, for example when running with dm-verity.
    This optimization was later added in patch [2] which has shown
    improvement in performance benchmarks.
    
    ==============================================
    [1] Problem stacktrace
    [name:core&]BUG: sleeping function called from invalid context at kernel/locking/mutex.c:291
    [name:core&]in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1615, name: CpuMonitorServi
    [name:core&]preempt_count: 0, expected: 0
    [name:core&]RCU nest depth: 1, expected: 0
    CPU: 7 PID: 1615 Comm: CpuMonitorServi Tainted: G S      W  OE      6.1.25-android14-5-maybe-dirty-mainline #1
    Hardware name: MT6897 (DT)
    Call trace:
     dump_backtrace+0x108/0x15c
     show_stack+0x20/0x30
     dump_stack_lvl+0x6c/0x8c
     dump_stack+0x20/0x48
     __might_resched+0x1fc/0x308
     __might_sleep+0x50/0x88
     mutex_lock+0x2c/0x110
     z_erofs_decompress_queue+0x11c/0xc10
     z_erofs_decompress_kickoff+0x110/0x1a4
     z_erofs_decompressqueue_endio+0x154/0x180
     bio_endio+0x1b0/0x1d8
     __dm_io_complete+0x22c/0x280
     clone_endio+0xe4/0x280
     bio_endio+0x1b0/0x1d8
     blk_update_request+0x138/0x3a4
     blk_mq_plug_issue_direct+0xd4/0x19c
     blk_mq_flush_plug_list+0x2b0/0x354
     __blk_flush_plug+0x110/0x160
     blk_finish_plug+0x30/0x4c
     read_pages+0x2fc/0x370
     page_cache_ra_unbounded+0xa4/0x23c
     page_cache_ra_order+0x290/0x320
     do_sync_mmap_readahead+0x108/0x2c0
     filemap_fault+0x19c/0x52c
     __do_fault+0xc4/0x114
     handle_mm_fault+0x5b4/0x1168
     do_page_fault+0x338/0x4b4
     do_translation_fault+0x40/0x60
     do_mem_abort+0x60/0xc8
     el0_da+0x4c/0xe0
     el0t_64_sync_handler+0xd4/0xfc
     el0t_64_sync+0x1a0/0x1a4
    
    [2] Link: https://lore.kernel.org/all/20210317035448.13921-1-huangjianan@oppo.com/Reported-by: default avatarWill Shiu <Will.Shiu@mediatek.com>
    Suggested-by: default avatarGao Xiang <xiang@kernel.org>
    Signed-off-by: default avatarSandeep Dhavale <dhavale@google.com>
    Reviewed-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: default avatarAlexandre Mergnat <amergnat@baylibre.com>
    Link: https://lore.kernel.org/r/20230621220848.3379029-1-dhavale@google.comSigned-off-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
    12d0a24a
zdata.c 49.3 KB