Commit 1c641e84 authored by Andrea Arcangeli, committed by Linus Torvalds

mm: thp: fix BUG on mm->nr_ptes

Dave Jones reports a few Fedora users hitting the BUG_ON(mm->nr_ptes...)
in exit_mmap() recently.
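
For context, the assertion being hit is the pagetable-accounting check at
the tail of exit_mmap(); in a 3.2-era tree it reads roughly as in this
abridged sketch (from mm/mmap.c of that era; exact surroundings vary by
release):

    /* Abridged tail of exit_mmap() in mm/mmap.c (3.2-era sketch). */
    void exit_mmap(struct mm_struct *mm)
    {
            ...
            /*
             * mm_users is already zero here, so no mmap_sem is taken:
             * every pagetable should have been freed and un-accounted.
             */
            while (vma)
                    vma = remove_vma(vma);

            BUG_ON(mm->nr_ptes > (FIRST_USER_ADDRESS+PMD_SIZE-1)>>PMD_SHIFT);
    }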

Quoting Hugh's discovery and explanation of the SMP race condition:

  "mm->nr_ptes had unusual locking: down_read mmap_sem plus
   page_table_lock when incrementing, down_write mmap_sem (or mm_users
   0) when decrementing; whereas THP is careful to increment and
   decrement it under page_table_lock.

   Now most of those paths in THP also hold mmap_sem for read or write
   (with appropriate checks on mm_users), but two do not: when
   split_huge_page() is called by hwpoison_user_mappings(), and when
   called by add_to_swap().

   It's conceivable that the latter case is responsible for the
   exit_mmap() BUG_ON mm->nr_ptes that has been reported on Fedora."
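
The second of those paths comes from vmscan: add_to_swap() can split a
hugepage while holding only the page lock and page_table_lock, with no
mmap_sem at all.  A condensed, heavily abridged sketch of that call path
(3.2-era names; __split_huge_page_map() lives in mm/huge_memory.c):

    /*
     * add_to_swap(page)                        <- vmscan, no mmap_sem held
     *   split_huge_page(page)
     *     __split_huge_page(page, anon_vma)
     *       __split_huge_page_map(page, vma, address)  <- for each mapping
     */
    static int __split_huge_page_map(struct page *page,
                                     struct vm_area_struct *vma,
                                     unsigned long address)
    {
            struct mm_struct *mm = vma->vm_mm;
            ...
            spin_lock(&mm->page_table_lock);
            ...
            mm->nr_ptes++;   /* increment not covered by mmap_sem */
            ...
            spin_unlock(&mm->page_table_lock);
            ...
    }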

The simplest way to fix it without having to alter the locking is to make
split_huge_page() a no-op in nr_ptes terms: count the preallocated
pagetables that exist for every mapped hugepage.  Not counting them was an
arbitrary choice, and neither way is wrong or right, because the
preallocated pagetables are unused but still allocated.
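
Concretely, the convention after this patch (matching the diff below) is
that nr_ptes includes the pagetable deposited for a huge pmd for the pmd's
whole lifetime, so split and collapse become nr_ptes no-ops:

    /*
     * nr_ptes lifecycle of a huge pmd after this patch (sketch):
     *
     *   fault/copy: prepare_pmd_huge_pte(pgtable, mm);   mm->nr_ptes++;
     *   zap:        pte_free(tlb->mm, pgtable);          tlb->mm->nr_ptes--;
     *   split:      pmd_populate() reuses the deposit    (nr_ptes unchanged)
     *   collapse:   the old pte page becomes the deposit (nr_ptes unchanged)
     */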
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: <stable@vger.kernel.org>	[3.0.x, 3.1.x, 3.2.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 62aca403
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -671,6 +671,7 @@ static int __do_huge_pmd_anonymous_page(struct mm_struct *mm,
 		set_pmd_at(mm, haddr, pmd, entry);
 		prepare_pmd_huge_pte(pgtable, mm);
 		add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);
+		mm->nr_ptes++;
 		spin_unlock(&mm->page_table_lock);
 	}
 
@@ -789,6 +790,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	pmd = pmd_mkold(pmd_wrprotect(pmd));
 	set_pmd_at(dst_mm, addr, dst_pmd, pmd);
 	prepare_pmd_huge_pte(pgtable, dst_mm);
+	dst_mm->nr_ptes++;
 
 	ret = 0;
 out_unlock:
@@ -887,7 +889,6 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
 	}
 	kfree(pages);
 
-	mm->nr_ptes++;
 	smp_wmb(); /* make pte visible before pmd */
 	pmd_populate(mm, pmd, pgtable);
 	page_remove_rmap(page);
@@ -1047,6 +1048,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			VM_BUG_ON(page_mapcount(page) < 0);
 			add_mm_counter(tlb->mm, MM_ANONPAGES, -HPAGE_PMD_NR);
 			VM_BUG_ON(!PageHead(page));
+			tlb->mm->nr_ptes--;
 			spin_unlock(&tlb->mm->page_table_lock);
 			tlb_remove_page(tlb, page);
 			pte_free(tlb->mm, pgtable);
@@ -1375,7 +1377,6 @@ static int __split_huge_page_map(struct page *page,
 			pte_unmap(pte);
 		}
 
-		mm->nr_ptes++;
 		smp_wmb(); /* make pte visible before pmd */
 		/*
 		 * Up to this point the pmd is present and huge and
@@ -1988,7 +1989,6 @@ static void collapse_huge_page(struct mm_struct *mm,
 	set_pmd_at(mm, address, pmd, _pmd);
 	update_mmu_cache(vma, address, _pmd);
 	prepare_pmd_huge_pte(pgtable, mm);
-	mm->nr_ptes--;
 	spin_unlock(&mm->page_table_lock);
 
 #ifndef CONFIG_NUMA