Commit 3f04f62f authored by Andrea Arcangeli, committed by Linus Torvalds

thp: split_huge_page paging

Add paging logic that splits a huge page before it is unmapped and added
to swap, to ensure backwards compatibility with the legacy swap code.
Eventually swap should natively page out hugepages to increase performance
and decrease seeking and fragmentation of swap space.  swapoff can simply
skip over huge pmds, as they cannot be part of swap yet.  In add_to_swap,
be careful to split the page only after a valid swap entry has been
obtained, so we don't split hugepages when swap is already full.

In theory we could split pages before isolating them during the lru scan,
but for khugepaged to be safe I'm relying on either mmap_sem held in write
mode or the page lock being taken, so split_huge_page has to run with
mmap_sem held in read or write mode or with the page lock taken.  Calling
it from isolate_lru_page would make the locking more complicated; in
addition, split_huge_page would deadlock if called by __isolate_lru_page,
because it has to take the lru lock to add the tail pages.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent bae9c19b
@@ -386,6 +386,8 @@ static void collect_procs_anon(struct page *page, struct list_head *to_kill,
 	struct task_struct *tsk;
 	struct anon_vma *av;
+	if (unlikely(split_huge_page(page)))
+		return;
 	read_lock(&tasklist_lock);
 	av = page_lock_anon_vma(page);
 	if (av == NULL)	/* Not actually mapped anymore */
...
@@ -1400,6 +1400,7 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
 	int ret;
 	BUG_ON(!PageLocked(page));
+	BUG_ON(PageTransHuge(page));
 	if (unlikely(PageKsm(page)))
 		ret = try_to_unmap_ksm(page, flags);
...
@@ -157,6 +157,12 @@ int add_to_swap(struct page *page)
 	if (!entry.val)
 		return 0;
+	if (unlikely(PageTransHuge(page)))
+		if (unlikely(split_huge_page(page))) {
+			swapcache_free(entry, NULL);
+			return 0;
+		}
 	/*
 	 * Radix-tree node allocations from PF_MEMALLOC contexts could
 	 * completely exhaust the page allocator. __GFP_NOMEMALLOC
...
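
For readers following along outside the kernel tree, here is a minimal
userspace sketch of the ordering this hunk enforces: obtain a swap entry
first, split only if the page is huge, and return the entry to the swap
allocator if the split fails.  All of the types and helpers below
(struct page, get_swap_entry(), free_swap_entry(), split_huge_page_stub())
are hypothetical stand-ins for illustration, not the kernel's APIs.

```c
/* Standalone sketch of the add_to_swap() ordering; stubs only. */
#include <stdbool.h>
#include <stdio.h>

struct page { bool trans_huge; };
struct swp_entry { unsigned long val; };

/* Stub: pretend a swap slot is always available. */
static struct swp_entry get_swap_entry(void)
{
	return (struct swp_entry){ .val = 1 };
}

/* Stub: return the slot to the swap allocator. */
static void free_swap_entry(struct swp_entry entry)
{
	(void)entry;
}

/* Stub: splitting can in principle fail; here it always succeeds. */
static int split_huge_page_stub(struct page *page)
{
	page->trans_huge = false;
	return 0;
}

/* Mirrors the control flow added to add_to_swap() by this commit. */
static int add_to_swap_sketch(struct page *page)
{
	struct swp_entry entry = get_swap_entry();

	if (!entry.val)
		return 0;	/* swap is full: nothing allocated, nothing split */

	if (page->trans_huge) {
		if (split_huge_page_stub(page)) {
			free_swap_entry(entry);	/* give the slot back on failure */
			return 0;
		}
	}

	/* ... add the (now small) page to the swap cache ... */
	return 1;
}

int main(void)
{
	struct page p = { .trans_huge = true };

	printf("added to swap: %d\n", add_to_swap_sketch(&p));
	return 0;
}
```

The point of the ordering is visible in the sketch: if no swap entry can be
allocated, the function bails out before touching the hugepage at all.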
@@ -964,6 +964,8 @@ static inline int unuse_pmd_range(struct vm_area_struct *vma, pud_t *pud,
 	pmd = pmd_offset(pud, addr);
 	do {
 		next = pmd_addr_end(addr, end);
+		if (unlikely(pmd_trans_huge(*pmd)))
+			continue;
 		if (pmd_none_or_clear_bad(pmd))
 			continue;
 		ret = unuse_pte_range(vma, pmd, addr, next, entry, page);
...
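
Similarly, a minimal standalone sketch of the swapoff-side change: while
unuse_pmd_range() walks the pmds of a range, a transparent huge pmd cannot
hold swap entries yet, so the walk simply skips it.  The enum, the flat
array walk, and the stub predicates below are simplified stand-ins for the
kernel's page-table walk, used only to illustrate the control flow.

```c
/* Standalone sketch of skipping huge pmds during the swapoff scan. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

enum pmd_state { PMD_NONE, PMD_TRANS_HUGE, PMD_PTE_TABLE };

static bool pmd_trans_huge_stub(enum pmd_state pmd)
{
	return pmd == PMD_TRANS_HUGE;
}

static bool pmd_none_stub(enum pmd_state pmd)
{
	return pmd == PMD_NONE;
}

/* Stub for unuse_pte_range(): scan the pte table for the swap entry. */
static int unuse_pte_range_stub(enum pmd_state pmd)
{
	(void)pmd;
	return 0;
}

static int unuse_pmd_range_sketch(const enum pmd_state *pmds, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (pmd_trans_huge_stub(pmds[i]))
			continue;	/* huge pmd: cannot be part of swap yet */
		if (pmd_none_stub(pmds[i]))
			continue;	/* nothing mapped here */
		if (unuse_pte_range_stub(pmds[i]))
			return 1;	/* found and restored the entry */
	}
	return 0;
}

int main(void)
{
	enum pmd_state pmds[] = { PMD_TRANS_HUGE, PMD_NONE, PMD_PTE_TABLE };

	printf("unuse result: %d\n", unuse_pmd_range_sketch(pmds, 3));
	return 0;
}
```

The huge-pmd check has to come before the pmd_none_or_clear_bad-style
check, since a huge pmd is neither empty nor a pointer to a pte table.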