    KVM: x86: Allocate new rmap and large page tracking when moving memslot · edd4fa37
    Sean Christopherson authored
    Reallocate a rmap array and recalculate large page compatibility when
    moving an existing memslot to correctly handle the alignment properties
    of the new memslot.  The number of rmap entries required at each level
    is dependent on the alignment of the memslot's base gfn with respect to
    that level, e.g. moving a large-page aligned memslot so that it becomes
    unaligned will increase the number of rmap entries needed at the now
    unaligned level.
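
    As a rough illustration of the alignment dependency described above,
    the following user-space sketch counts the entries needed at a given
    level.  It is modeled on KVM's per-level index calculation, but the
    helper names, the gfn_t typedef, and the fixed 9-bits-per-level shift
    are assumptions made for this example, not a copy of the kernel code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gfn_t;	/* stand-in for the kernel's gfn_t (assumption) */

/* Assumed x86-64 layout: each level covers 9 more gfn bits than the last. */
static unsigned int hpage_gfn_shift(int level)
{
	assert(level >= 1);
	return (unsigned int)(level - 1) * 9;
}

/* Index of @gfn in a per-level array for a slot that starts at @base_gfn. */
static uint64_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level)
{
	return (gfn >> hpage_gfn_shift(level)) -
	       (base_gfn >> hpage_gfn_shift(level));
}

/* Entries needed at @level: index of the slot's last gfn, plus one. */
static uint64_t rmap_entries(gfn_t base_gfn, uint64_t npages, int level)
{
	assert(npages > 0);
	return gfn_to_index(base_gfn + npages - 1, base_gfn, level) + 1;
}
```

    With these definitions, a 512-page slot at gfn 0x200 needs one entry
    at level 2, while the same slot moved to gfn 0x201 straddles two
    2M regions and needs two.  The level 1 count is 512 in both cases.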
    
    Not updating the rmap array is the most obvious bug, as KVM accesses
    garbage data beyond the end of the rmap.  KVM interprets the bad data as
    pointers, leading to non-canonical #GPs, unexpected #PFs, etc.
    
      general protection fault: 0000 [#1] SMP
      CPU: 0 PID: 1909 Comm: move_memory_reg Not tainted 5.4.0-rc7+ #139
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
      RIP: 0010:rmap_get_first+0x37/0x50 [kvm]
      Code: <48> 8b 3b 48 85 ff 74 ec e8 6c f4 ff ff 85 c0 74 e3 48 89 d8 5b c3
      RSP: 0018:ffffc9000021bbc8 EFLAGS: 00010246
      RAX: ffff00617461642e RBX: ffff00617461642e RCX: 0000000000000012
      RDX: ffff88827400f568 RSI: ffffc9000021bbe0 RDI: ffff88827400f570
      RBP: 0010000000000000 R08: ffffc9000021bd00 R09: ffffc9000021bda8
      R10: ffffc9000021bc48 R11: 0000000000000000 R12: 0030000000000000
      R13: 0000000000000000 R14: ffff88827427d700 R15: ffffc9000021bce8
      FS:  00007f7eda014700(0000) GS:ffff888277a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f7ed9216ff8 CR3: 0000000274391003 CR4: 0000000000162eb0
      Call Trace:
       kvm_mmu_slot_set_dirty+0xa1/0x150 [kvm]
       __kvm_set_memory_region.part.64+0x559/0x960 [kvm]
       kvm_set_memory_region+0x45/0x60 [kvm]
       kvm_vm_ioctl+0x30f/0x920 [kvm]
       do_vfs_ioctl+0xa1/0x620
       ksys_ioctl+0x66/0x70
       __x64_sys_ioctl+0x16/0x20
       do_syscall_64+0x4c/0x170
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x7f7ed9911f47
      Code: <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 21 6f 2c 00 f7 d8 64 89 01 48
      RSP: 002b:00007ffc00937498 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000001ab0010 RCX: 00007f7ed9911f47
      RDX: 0000000001ab1350 RSI: 000000004020ae46 RDI: 0000000000000004
      RBP: 000000000000000a R08: 0000000000000000 R09: 00007f7ed9214700
      R10: 00007f7ed92149d0 R11: 0000000000000246 R12: 00000000bffff000
      R13: 0000000000000003 R14: 00007f7ed9215000 R15: 0000000000000000
      Modules linked in: kvm_intel kvm irqbypass
      ---[ end trace 0c5f570b3358ca89 ]---
    
    The disallow_lpage tracking is more subtle.  Failing to update it
    results in KVM creating large pages when it shouldn't, either due to
    stale data or again due to indexing beyond the end of the metadata
    arrays, which can lead to memory corruption and/or leaking data to
    guest/userspace.
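
    To make the stale-data case concrete, the sketch below checks whether
    the first or last large-page region of a slot is only partially
    covered, which is the kind of condition that has to be recomputed
    when the base gfn changes.  The helper names are hypothetical and the
    check is a simplification; the kernel's tracking may take additional
    factors into account.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;	/* stand-in for the kernel's gfn_t (assumption) */

/* Assumed x86-64 layout: 512 (2^9) times more 4K pages per level. */
static uint64_t pages_per_hpage(int level)
{
	assert(level >= 1);
	return 1ULL << ((level - 1) * 9);
}

/* True if the slot starts in the middle of a large page at @level. */
static bool head_disallows_lpage(gfn_t base_gfn, int level)
{
	return (base_gfn & (pages_per_hpage(level) - 1)) != 0;
}

/* True if the slot ends in the middle of a large page at @level. */
static bool tail_disallows_lpage(gfn_t base_gfn, uint64_t npages, int level)
{
	return ((base_gfn + npages) & (pages_per_hpage(level) - 1)) != 0;
}
```

    A 512-page slot at gfn 0x200 passes both checks at level 2.  Moving
    it to gfn 0x201 makes both the head and the tail partial, so results
    cached for the old base no longer describe the slot.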
    
    Note, the arrays for the old memslot are freed by the unconditional call
    to kvm_free_memslot() in __kvm_set_memory_region().
    
    Fixes: 05da4558 ("KVM: MMU: large page support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
    Reviewed-by: Peter Xu <peterx@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>