    mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling · ead07f6a
    Naoya Horiguchi authored
    memory_failure() can run in two different modes (selected by
    MF_COUNT_INCREASED) with respect to page refcounting.  When
    MF_COUNT_INCREASED is set, memory_failure() assumes that the caller
    has already taken a refcount on the target page.  If it is cleared,
    memory_failure() takes the refcount on its own.
    
    In the current code, however, refcounting is done differently in each
    caller.  For example, madvise_hwpoison() uses get_user_pages_fast() and
    hwpoison_inject() uses get_page_unless_zero().  This inconsistent
    refcounting causes refcount failures, especially for thp tail pages.
    Typical user-visible effects are memory leaks or
    VM_BUG_ON_PAGE(!page_count(page)) in isolate_lru_page().
    
    To fix this refcounting issue, this patch introduces get_hwpoison_page()
    to handle thp tail pages in the same manner for each caller of hwpoison
    code.
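
A minimal user-space sketch of the idea, not the kernel implementation: every
caller goes through one helper that pins the compound head first and then the
tail page, and the entry path skips that step when the caller already holds a
reference.  All toy_* names, the struct layout, and TOY_MF_COUNT_INCREASED are
hypothetical stand-ins; the model deliberately ignores how the kernel really
accounts references on thp tail pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel's layout. */
struct toy_page {
	int refcount;           /* simplified reference count */
	struct toy_page *head;  /* compound head, or NULL if not a tail page */
};

/* Hypothetical flag: the caller already holds a reference. */
#define TOY_MF_COUNT_INCREASED 0x1

static struct toy_page *toy_compound_head(struct toy_page *p)
{
	return p->head ? p->head : p;
}

/* Take a reference only if the page is still in use (refcount > 0). */
static bool toy_get_page_unless_zero(struct toy_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

/* Single entry point: pin the head first, then the tail if given one. */
static bool toy_get_hwpoison_page(struct toy_page *p)
{
	struct toy_page *head = toy_compound_head(p);

	if (!toy_get_page_unless_zero(head))
		return false;   /* page is already free, nothing to pin */
	if (p != head)
		p->refcount++;  /* tail page: record the pin on the tail too */
	return true;
}

/* Mirror image of toy_get_hwpoison_page(). */
static void toy_put_hwpoison_page(struct toy_page *p)
{
	struct toy_page *head = toy_compound_head(p);

	if (p != head) {
		assert(p->refcount > 0);
		p->refcount--;
	}
	assert(head->refcount > 0);
	head->refcount--;
}

/* Entry path: take a reference only when the caller has not done so. */
static bool toy_memory_failure_pin(struct toy_page *p, int flags)
{
	if (flags & TOY_MF_COUNT_INCREASED)
		return true;
	return toy_get_hwpoison_page(p);
}
```

Because the get and put helpers are symmetric, a caller that pins a tail page
and later drops it leaves both the head and the tail counts where they
started, which is the property the inconsistent per-caller code lacked.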
    
    memory_failure() might fail to split a thp, in which case it returns
    without completing page isolation.  This is not good because PageHWPoison
    on the thp is still set and there's no easy way to unpoison such thps.  So
    this patch tries to roll back any action on the thp in the "non anonymous
    thp" case and the "thp split failed" case, expecting that an MCE(SRAR)
    generated by a later access will properly free such thps.
    
    [akpm@linux-foundation.org: fix CONFIG_HWPOISON_INJECT=m]
    Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Andi Kleen <andi@firstfloor.org>
    Cc: Tony Luck <tony.luck@intel.com>
    Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memory-failure.c 49.2 KB