Commit 57bbeacd authored by Huang Jianan's avatar Huang Jianan Committed by Gao Xiang

erofs: fix deadlock when shrink erofs slab

We observed the following deadlock in the stress test under low
memory scenario:

Thread A                               Thread B
- erofs_shrink_scan
 - erofs_try_to_release_workgroup
  - erofs_workgroup_try_to_freeze -- A
                                       - z_erofs_do_read_page
                                        - z_erofs_collection_begin
                                         - z_erofs_register_collection
                                          - erofs_insert_workgroup
                                           - xa_lock(&sbi->managed_pslots) -- B
                                           - erofs_workgroup_get
                                            - erofs_wait_on_workgroup_freezed -- A
  - xa_erase
   - xa_lock(&sbi->managed_pslots) -- B

To fix this, it needs to hold xa_lock before freezing the workgroup
since xarray will be touched then. So let's hold the lock before
accessing each workgroup, just like what we did with the radix tree
before.

[ Gao Xiang: Jianhua Hao also reports this issue at
  https://lore.kernel.org/r/b10b85df30694bac8aadfe43537c897a@xiaomi.com ]

Link: https://lore.kernel.org/r/20211118135844.3559-1-huangjianan@oppo.com
Fixes: 64094a04 ("erofs: convert workstn to XArray")
Reviewed-by: default avatarChao Yu <chao@kernel.org>
Reviewed-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: default avatarHuang Jianan <huangjianan@oppo.com>
Reported-by: default avatarJianhua Hao <haojianhua1@xiaomi.com>
Signed-off-by: default avatarGao Xiang <xiang@kernel.org>
parent 13605725
...@@ -150,7 +150,7 @@ static bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi, ...@@ -150,7 +150,7 @@ static bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi,
* however in order to avoid some race conditions, add a * however in order to avoid some race conditions, add a
* DBG_BUGON to observe this in advance. * DBG_BUGON to observe this in advance.
*/ */
DBG_BUGON(xa_erase(&sbi->managed_pslots, grp->index) != grp); DBG_BUGON(__xa_erase(&sbi->managed_pslots, grp->index) != grp);
/* last refcount should be connected with its managed pslot. */ /* last refcount should be connected with its managed pslot. */
erofs_workgroup_unfreeze(grp, 0); erofs_workgroup_unfreeze(grp, 0);
...@@ -165,15 +165,19 @@ static unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi, ...@@ -165,15 +165,19 @@ static unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi,
unsigned int freed = 0; unsigned int freed = 0;
unsigned long index; unsigned long index;
xa_lock(&sbi->managed_pslots);
xa_for_each(&sbi->managed_pslots, index, grp) { xa_for_each(&sbi->managed_pslots, index, grp) {
/* try to shrink each valid workgroup */ /* try to shrink each valid workgroup */
if (!erofs_try_to_release_workgroup(sbi, grp)) if (!erofs_try_to_release_workgroup(sbi, grp))
continue; continue;
xa_unlock(&sbi->managed_pslots);
++freed; ++freed;
if (!--nr_shrink) if (!--nr_shrink)
break; return freed;
xa_lock(&sbi->managed_pslots);
} }
xa_unlock(&sbi->managed_pslots);
return freed; return freed;
} }
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment