    mm: thp: move pmd check inside ptl for freeze_page() · 33f4751e
    Naoya Horiguchi authored
    I found a race condition that triggers VM_BUG_ON() in freeze_page() when
    running a testcase with 3 processes:
      - process 1: keeps writing to a thp,
      - process 2: keeps clearing soft-dirty bits from the virtual addresses of process 1,
      - process 3: calls migratepages for process 1.
    
    The kernel message looks like this:
    
      kernel BUG at /src/linux-dev/mm/huge_memory.c:3096!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: cfg80211 rfkill crc32c_intel ppdev serio_raw pcspkr virtio_balloon virtio_console parport_pc parport pvpanic acpi_cpufreq tpm_tis tpm i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi floppy virtio_pci virtio_ring virtio
      CPU: 0 PID: 28863 Comm: migratepages Not tainted 4.6.0-v4.6-160602-0827-+ #2
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      task: ffff880037320000 ti: ffff88007cdd0000 task.ti: ffff88007cdd0000
      RIP: 0010:[<ffffffff811f8e06>]  [<ffffffff811f8e06>] split_huge_page_to_list+0x496/0x590
      RSP: 0018:ffff88007cdd3b70  EFLAGS: 00010202
      RAX: 0000000000000001 RBX: ffff88007c7b88c0 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000000700000200 RDI: ffffea0003188000
      RBP: ffff88007cdd3bb8 R08: 0000000000000001 R09: 00003ffffffff000
      R10: ffff880000000000 R11: ffffc000001fffff R12: ffffea0003188000
      R13: ffffea0003188000 R14: 0000000000000000 R15: 0400000000000080
      FS:  00007f8ec241d740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f8ec1f3ed20 CR3: 000000003707b000 CR4: 00000000000006f0
      Call Trace:
        ? list_del+0xd/0x30
        queue_pages_pte_range+0x4d1/0x590
        __walk_page_range+0x204/0x4e0
        walk_page_range+0x71/0xf0
        queue_pages_range+0x75/0x90
        ? queue_pages_hugetlb+0x190/0x190
        ? new_node_page+0xc0/0xc0
        ? change_prot_numa+0x40/0x40
        migrate_to_node+0x71/0xd0
        do_migrate_pages+0x1c3/0x210
        SyS_migrate_pages+0x261/0x290
        entry_SYSCALL_64_fastpath+0x1a/0xa4
      Code: e8 b0 87 fb ff 0f 0b 48 c7 c6 30 32 9f 81 e8 a2 87 fb ff 0f 0b 48 c7 c6 b8 46 9f 81 e8 94 87 fb ff 0f 0b 85 c0 0f 84 3e fd ff ff <0f> 0b 85 c0 0f 85 a6 00 00 00 48 8b 75 c0 4c 89 f7 41 be f0 ff
      RIP   split_huge_page_to_list+0x496/0x590
    
    I'm not sure of the full scenario of the reproduction, but my debugging
    showed that split_huge_pmd_address(freeze=true) returned without running
    the main pmd splitting code, because pmd_present(*pmd) in the precheck
    somehow returned 0.  If this happens, the subsequent try_to_unmap() fails
    and returns non-zero (because page_mapcount() is still > 0), and finally
    VM_BUG_ON() fires.  This patch tries to fix it by moving the precheck of
    the pmd state inside ptl.
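
    The difference between checking before and after taking the lock can be
    sketched in user space.  This is only a model, not the kernel code: struct
    model_pmd, split_check_outside(), split_check_inside() and the demo_*()
    helpers are hypothetical names, a pthread mutex stands in for ptl, and it
    assumes (the full scenario above is not confirmed) that some other path
    makes the entry look non-present for a short time while holding the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a pmd entry and its page table lock. */
struct model_pmd {
    pthread_mutex_t ptl;  /* models the page table lock */
    bool present;         /* models pmd_present(*pmd) */
    int mapcount;         /* models page_mapcount() */
};

/* Pre-patch shape: the state is tested before the lock is taken. */
static bool split_check_outside(struct model_pmd *pmd)
{
    if (!pmd->present)    /* unlocked read, may observe a transient value */
        return false;     /* split is skipped, mapcount stays > 0 */
    pthread_mutex_lock(&pmd->ptl);
    pmd->mapcount = 0;
    pthread_mutex_unlock(&pmd->ptl);
    return true;
}

/* Post-patch shape: the state is tested only while holding the lock. */
static bool split_check_inside(struct model_pmd *pmd)
{
    bool done = false;

    pthread_mutex_lock(&pmd->ptl);
    if (pmd->present) {
        pmd->mapcount = 0;
        done = true;
    }
    pthread_mutex_unlock(&pmd->ptl);
    return done;
}

static void *worker(void *arg)
{
    split_check_inside(arg);
    return NULL;
}

/* The caller plays the "other path": it holds the lock, makes the entry
 * transiently non-present (an assumption of this model), then restores it. */
static int demo_outside(void)
{
    struct model_pmd pmd = { .present = true, .mapcount = 1 };

    pthread_mutex_init(&pmd.ptl, NULL);
    pthread_mutex_lock(&pmd.ptl);
    pmd.present = false;            /* transient state under the lock */
    split_check_outside(&pmd);      /* sees "not present" and bails out */
    pmd.present = true;             /* state restored before unlocking */
    pthread_mutex_unlock(&pmd.ptl);
    pthread_mutex_destroy(&pmd.ptl);
    return pmd.mapcount;            /* still mapped */
}

static int demo_inside(void)
{
    struct model_pmd pmd = { .present = true, .mapcount = 1 };
    pthread_t t;
    int rc;

    pthread_mutex_init(&pmd.ptl, NULL);
    pthread_mutex_lock(&pmd.ptl);
    pmd.present = false;            /* same transient state as above */
    rc = pthread_create(&t, NULL, worker, &pmd);
    assert(rc == 0);
    (void)rc;
    pmd.present = true;             /* restored before the worker can look */
    pthread_mutex_unlock(&pmd.ptl);
    pthread_join(t, NULL);
    pthread_mutex_destroy(&pmd.ptl);
    return pmd.mapcount;            /* worker performed the split */
}
```

    In this model demo_outside() leaves mapcount at 1, which corresponds to
    the try_to_unmap() failure described above, while demo_inside() returns 0
    because the check waits for the lock and sees the restored state.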
    
    Link: http://lkml.kernel.org/r/1466990929-7452-1-git-send-email-n-horiguchi@ah.jp.nec.com
    Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>