Commit 2103cf9c authored by Peter Xu, committed by Linus Torvalds

hugetlb: dedup the code to add a new file_region

Patch series "mm/hugetlb: Early cow on fork, and a few cleanups", v5.

As reported by Gal [1], we are still missing the code to handle early COW
for the hugetlb case, which is true.  Again, it still feels odd to fork()
after using a few huge pages, especially if they're privately mapped to
me.  However, I do agree with Gal and Jason that we should still have
it, since that will at least complete the early-COW-on-fork effort, and
it will still fix issues where buffers are not well under control and it
is not easy to apply MADV_DONTFORK.

The first two patches (1-2) are cleanups I noticed when reading the
hugetlb reserve map code.  I think they're good to have, but they're not
necessary for fixing the fork issue.

The last two patches (3-4) are the real fix.

I tested this with a fork() after some vfio-pci assignment, so I'm pretty
sure the page copy path triggers correctly (the page is accounted right
after the fork()), but I didn't do a data check since the card I assigned
is just a random NIC.

  https://github.com/xzpeter/linux/tree/fork-cow-pin-huge

[1] https://lore.kernel.org/lkml/27564187-4a08-f187-5a84-3df50009f6ca@amazon.com/

Introduce a hugetlb_resv_map_add() helper to add a new file_region rather
than duplicating similar code twice in add_reservation_in_range().

Link: https://lkml.kernel.org/r/20210217233547.93892-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20210217233547.93892-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Gal Pressman <galpress@amazon.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Wei Zhang <wzam@amazon.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jann Horn <jannh@google.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Kirill Shutemov <kirill@shutemov.name>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: VMware Graphics <linux-graphics-maintainer@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 82e69a12
@@ -331,6 +331,24 @@ static void coalesce_file_region(struct resv_map *resv, struct file_region *rg)
 	}
 }

+static inline long
+hugetlb_resv_map_add(struct resv_map *map, struct file_region *rg, long from,
+		     long to, struct hstate *h, struct hugetlb_cgroup *cg,
+		     long *regions_needed)
+{
+	struct file_region *nrg;
+
+	if (!regions_needed) {
+		nrg = get_file_region_entry_from_cache(map, from, to);
+		record_hugetlb_cgroup_uncharge_info(cg, h, map, nrg);
+		list_add(&nrg->link, rg->link.prev);
+		coalesce_file_region(map, nrg);
+	} else
+		*regions_needed += 1;
+
+	return to - from;
+}
+
 /*
  * Must be called with resv->lock held.
  *
@@ -346,7 +364,7 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
 	long add = 0;
 	struct list_head *head = &resv->regions;
 	long last_accounted_offset = f;
-	struct file_region *rg = NULL, *trg = NULL, *nrg = NULL;
+	struct file_region *rg = NULL, *trg = NULL;

 	if (regions_needed)
 		*regions_needed = 0;
@@ -375,18 +393,11 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
 		/* Add an entry for last_accounted_offset -> rg->from, and
 		 * update last_accounted_offset.
 		 */
-		if (rg->from > last_accounted_offset) {
-			add += rg->from - last_accounted_offset;
-			if (!regions_needed) {
-				nrg = get_file_region_entry_from_cache(
-					resv, last_accounted_offset, rg->from);
-				record_hugetlb_cgroup_uncharge_info(h_cg, h,
-								    resv, nrg);
-				list_add(&nrg->link, rg->link.prev);
-				coalesce_file_region(resv, nrg);
-			} else
-				*regions_needed += 1;
-		}
+		if (rg->from > last_accounted_offset)
+			add += hugetlb_resv_map_add(resv, rg,
+						    last_accounted_offset,
+						    rg->from, h, h_cg,
+						    regions_needed);

 		last_accounted_offset = rg->to;
 	}
@@ -394,17 +405,9 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
 	/* Handle the case where our range extends beyond
 	 * last_accounted_offset.
 	 */
-	if (last_accounted_offset < t) {
-		add += t - last_accounted_offset;
-		if (!regions_needed) {
-			nrg = get_file_region_entry_from_cache(
-				resv, last_accounted_offset, t);
-			record_hugetlb_cgroup_uncharge_info(h_cg, h, resv, nrg);
-			list_add(&nrg->link, rg->link.prev);
-			coalesce_file_region(resv, nrg);
-		} else
-			*regions_needed += 1;
-	}
+	if (last_accounted_offset < t)
+		add += hugetlb_resv_map_add(resv, rg, last_accounted_offset,
+					    t, h, h_cg, regions_needed);

 	VM_BUG_ON(add < 0);
 	return add;
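
The count-or-insert pattern that the helper factors out can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct resv_model`, `insert_region()`, `model_resv_add()` and `model_add_range()` are hypothetical names, the region list is a fixed-size sorted array instead of a `list_head`, and the cgroup bookkeeping and region coalescing done by the real code are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model (not kernel code): a reservation map is a
 * sorted array of non-overlapping [from, to) regions. */
#define MAX_REGIONS 16

struct region { long from, to; };

struct resv_model {
	struct region r[MAX_REGIONS];
	size_t n;
};

/* Insert [from, to) at index pos, shifting later entries to the right. */
static void insert_region(struct resv_model *m, size_t pos, long from, long to)
{
	assert(m->n < MAX_REGIONS);
	for (size_t i = m->n; i > pos; i--)
		m->r[i] = m->r[i - 1];
	m->r[pos] = (struct region){ from, to };
	m->n++;
}

/* Same shape as the helper in the patch: with a non-NULL regions_needed
 * only count the entry; otherwise insert it before the region at *pos.
 * Either way, return the size of the gap. */
static long model_resv_add(struct resv_model *m, size_t *pos, long from,
			   long to, long *regions_needed)
{
	if (!regions_needed) {
		insert_region(m, *pos, from, to);
		(*pos)++;	/* keep *pos on the region being examined */
	} else {
		*regions_needed += 1;
	}
	return to - from;
}

/* Walk the map and fill (or just count) every gap inside [f, t). */
static long model_add_range(struct resv_model *m, long f, long t,
			    long *regions_needed)
{
	long add = 0, last = f;
	size_t i;

	if (regions_needed)
		*regions_needed = 0;

	for (i = 0; i < m->n; i++) {
		if (m->r[i].from < f) {		/* starts before our range */
			if (m->r[i].to > last)
				last = m->r[i].to;
			continue;
		}
		if (m->r[i].from >= t)		/* starts beyond our range */
			break;
		if (m->r[i].from > last)	/* gap before this region */
			add += model_resv_add(m, &i, last, m->r[i].from,
					      regions_needed);
		last = m->r[i].to;
	}
	if (last < t)				/* trailing gap up to t */
		add += model_resv_add(m, &i, last, t, regions_needed);
	return add;
}
```

In this model a caller would first pass a non-NULL `regions_needed` to learn how many entries the range requires without touching the map, then call again with `NULL` to perform the insertion. Both call sites share one helper, which is the deduplication the patch performs.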