1. 15 Jun, 2017 1 commit
  2. 08 Jun, 2017 9 commits
  3. 06 Jun, 2017 5 commits
    • Marc Zyngier's avatar
      arm: KVM: Allow unaligned accesses at HYP · 33b5c388
      Marc Zyngier authored
      We currently have the HSCTLR.A bit set, trapping unaligned accesses
      at HYP, but we're not really prepared to deal with it.
      
      Since the rest of the kernel is pretty happy about that, let's follow
      its example and set HSCTLR.A to zero. Modern CPUs don't really care.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      33b5c388
    • Marc Zyngier's avatar
      arm64: KVM: Allow unaligned accesses at EL2 · 78fd6dcf
      Marc Zyngier authored
      We currently have the SCTLR_EL2.A bit set, trapping unaligned accesses
      at EL2, but we're not really prepared to deal with it. So far, this
      has been unnoticed, until GCC 7 started emitting those (in particular
      64bit writes on a 32bit boundary).
      
      Since the rest of the kernel is pretty happy about that, let's follow
      its example and set SCTLR_EL2.A to zero. Modern CPUs don't really
      care.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarAlexander Graf <agraf@suse.de>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      78fd6dcf
    • Marc Zyngier's avatar
      arm64: KVM: Preserve RES1 bits in SCTLR_EL2 · d68c1f7f
      Marc Zyngier authored
      __do_hyp_init has the rather bad habit of ignoring RES1 bits and
      writing them back as zero. On a v8.0-8.2 CPU, this doesn't do anything
      bad, but may end-up being pretty nasty on future revisions of the
      architecture.
      
      Let's preserve those bits so that we don't have to fix this later on.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      d68c1f7f
    • Marc Zyngier's avatar
      KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages · d6dbdd3c
      Marc Zyngier authored
      Under memory pressure, we start ageing pages, which amounts to parsing
      the page tables. Since we don't want to allocate any extra level,
      we pass NULL for our private allocation cache. Which means that
      stage2_get_pud() is allowed to fail. This results in the following
      splat:
      
      [ 1520.409577] Unable to handle kernel NULL pointer dereference at virtual address 00000008
      [ 1520.417741] pgd = ffff810f52fef000
      [ 1520.421201] [00000008] *pgd=0000010f636c5003, *pud=0000010f56f48003, *pmd=0000000000000000
      [ 1520.429546] Internal error: Oops: 96000006 [#1] PREEMPT SMP
      [ 1520.435156] Modules linked in:
      [ 1520.438246] CPU: 15 PID: 53550 Comm: qemu-system-aar Tainted: G        W       4.12.0-rc4-00027-g1885c397eaec #7205
      [ 1520.448705] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB12A 10/26/2016
      [ 1520.463726] task: ffff800ac5fb4e00 task.stack: ffff800ce04e0000
      [ 1520.469666] PC is at stage2_get_pmd+0x34/0x110
      [ 1520.474119] LR is at kvm_age_hva_handler+0x44/0xf0
      [ 1520.478917] pc : [<ffff0000080b137c>] lr : [<ffff0000080b149c>] pstate: 40000145
      [ 1520.486325] sp : ffff800ce04e33d0
      [ 1520.489644] x29: ffff800ce04e33d0 x28: 0000000ffff40064
      [ 1520.494967] x27: 0000ffff27e00000 x26: 0000000000000000
      [ 1520.500289] x25: ffff81051ba65008 x24: 0000ffff40065000
      [ 1520.505618] x23: 0000ffff40064000 x22: 0000000000000000
      [ 1520.510947] x21: ffff810f52b20000 x20: 0000000000000000
      [ 1520.516274] x19: 0000000058264000 x18: 0000000000000000
      [ 1520.521603] x17: 0000ffffa6fe7438 x16: ffff000008278b70
      [ 1520.526940] x15: 000028ccd8000000 x14: 0000000000000008
      [ 1520.532264] x13: ffff7e0018298000 x12: 0000000000000002
      [ 1520.537582] x11: ffff000009241b93 x10: 0000000000000940
      [ 1520.542908] x9 : ffff0000092ef800 x8 : 0000000000000200
      [ 1520.548229] x7 : ffff800ce04e36a8 x6 : 0000000000000000
      [ 1520.553552] x5 : 0000000000000001 x4 : 0000000000000000
      [ 1520.558873] x3 : 0000000000000000 x2 : 0000000000000008
      [ 1520.571696] x1 : ffff000008fd5000 x0 : ffff0000080b149c
      [ 1520.577039] Process qemu-system-aar (pid: 53550, stack limit = 0xffff800ce04e0000)
      [...]
      [ 1521.510735] [<ffff0000080b137c>] stage2_get_pmd+0x34/0x110
      [ 1521.516221] [<ffff0000080b149c>] kvm_age_hva_handler+0x44/0xf0
      [ 1521.522054] [<ffff0000080b0610>] handle_hva_to_gpa+0xb8/0xe8
      [ 1521.527716] [<ffff0000080b3434>] kvm_age_hva+0x44/0xf0
      [ 1521.532854] [<ffff0000080a58b0>] kvm_mmu_notifier_clear_flush_young+0x70/0xc0
      [ 1521.539992] [<ffff000008238378>] __mmu_notifier_clear_flush_young+0x88/0xd0
      [ 1521.546958] [<ffff00000821eca0>] page_referenced_one+0xf0/0x188
      [ 1521.552881] [<ffff00000821f36c>] rmap_walk_anon+0xec/0x250
      [ 1521.558370] [<ffff000008220f78>] rmap_walk+0x78/0xa0
      [ 1521.563337] [<ffff000008221104>] page_referenced+0x164/0x180
      [ 1521.569002] [<ffff0000081f1af0>] shrink_active_list+0x178/0x3b8
      [ 1521.574922] [<ffff0000081f2058>] shrink_node_memcg+0x328/0x600
      [ 1521.580758] [<ffff0000081f23f4>] shrink_node+0xc4/0x328
      [ 1521.585986] [<ffff0000081f2718>] do_try_to_free_pages+0xc0/0x340
      [ 1521.592000] [<ffff0000081f2a64>] try_to_free_pages+0xcc/0x240
      [...]
      
      The trivial fix is to handle this NULL pud value early, rather than
      dereferencing it blindly.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: default avatarChristoffer Dall <cdall@linaro.org>
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      d6dbdd3c
    • Christoffer Dall's avatar
      KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction · d68356cc
      Christoffer Dall authored
      We used to extract PRIbits from the ICH_VT_EL2 which was the upper field
      in the register word, so a mask wasn't necessary, but as we switched to
      looking at PREbits, which is bits 26 through 28 with the PRIbits field
      being potentially non-zero, we really need to mask off the field value,
      otherwise fun things may happen.
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      d68356cc
  4. 04 Jun, 2017 12 commits
  5. 24 May, 2017 1 commit
  6. 23 May, 2017 3 commits
  7. 18 May, 2017 2 commits
    • Christoffer Dall's avatar
      KVM: arm/arm64: Hold slots_lock when unregistering kvm io bus devices · fa472fa9
      Christoffer Dall authored
      We were not holding the kvm->slots_lock as required when calling
      kvm_io_bus_unregister_dev() as required.
      
      This only affects the error path, but still, let's do our due
      diligence.
      
      Reported by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      fa472fa9
    • Christoffer Dall's avatar
      KVM: arm/arm64: Fix bug when registering redist iodevs · 552c9f47
      Christoffer Dall authored
      If userspace creates the VCPUs after initializing the VGIC, then we end
      up in a situation where we trigger a bug in kvm_vcpu_get_idx(), because
      it is called prior to adding the VCPU into the vcpus array on the VM.
      
      There is no tight coupling between the VCPU index and the area of the
      redistributor region used for the VCPU, so we can simply ensure that all
      creations of redistributors are serialized per VM, and increment an
      offset when we successfully add a redistributor.
      
      The vgic_register_redist_iodev() function can be called from two paths:
      vgic_redister_all_redist_iodev() which is called via the kvm_vgic_addr()
      device attribute handler.  This patch already holds the kvm->lock mutex.
      
      The other path is via kvm_vgic_vcpu_init, which is called through a
      longer chain from kvm_vm_ioctl_create_vcpu(), which releases the
      kvm->lock mutex just before calling kvm_arch_vcpu_create(), so we can
      simply take this mutex again later for our purposes.
      
      Fixes: ab6f468c10 ("KVM: arm/arm64: Register iodevs when setting redist base and creating VCPUs")
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      Tested-by: default avatarJean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      552c9f47
  8. 16 May, 2017 4 commits
  9. 15 May, 2017 3 commits
    • Zhichao Huang's avatar
      KVM: arm: rename pm_fake handler to trap_raz_wi · 9b619a8f
      Zhichao Huang authored
      pm_fake doesn't quite describe what the handler does (ignoring writes
      and returning 0 for reads).
      
      As we're about to use it (a lot) in a different context, rename it
      with a (admitedly cryptic) name that make sense for all users.
      Signed-off-by: default avatarZhichao Huang <zhichao.huang@linaro.org>
      Reviewed-by: default avatarAlex Bennee <alex.bennee@linaro.org>
      Acked-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarAlex Bennée <alex.bennee@linaro.org>
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      9b619a8f
    • Zhichao Huang's avatar
      KVM: arm: plug potential guest hardware debug leakage · 661e6b02
      Zhichao Huang authored
      Hardware debugging in guests is not intercepted currently, it means
      that a malicious guest can bring down the entire machine by writing
      to the debug registers.
      
      This patch enable trapping of all debug registers, preventing the
      guests to access the debug registers. This includes access to the
      debug mode(DBGDSCR) in the guest world all the time which could
      otherwise mess with the host state. Reads return 0 and writes are
      ignored (RAZ_WI).
      
      The result is the guest cannot detect any working hardware based debug
      support. As debug exceptions are still routed to the guest normal
      debug using software based breakpoints still works.
      
      To support debugging using hardware registers we need to implement a
      debug register aware world switch as well as special trapping for
      registers that may affect the host state.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarZhichao Huang <zhichao.huang@linaro.org>
      Signed-off-by: default avatarAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: default avatarChristoffer Dall <cdall@linaro.org>
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      661e6b02
    • Suzuki K Poulose's avatar
      kvm: arm/arm64: Fix race in resetting stage2 PGD · 6c0d706b
      Suzuki K Poulose authored
      In kvm_free_stage2_pgd() we check the stage2 PGD before holding
      the lock and proceed to take the lock if it is valid. And we unmap
      the page tables, followed by releasing the lock. We reset the PGD
      only after dropping this lock, which could cause a race condition
      where another thread waiting on or even holding the lock, could
      potentially see that the PGD is still valid and proceed to perform
      a stage2 operation and later encounter a NULL PGD.
      
      [223090.242280] Unable to handle kernel NULL pointer dereference at
      virtual address 00000040
      [223090.262330] PC is at unmap_stage2_range+0x8c/0x428
      [223090.262332] LR is at kvm_unmap_hva_handler+0x2c/0x3c
      [223090.262531] Call trace:
      [223090.262533] [<ffff0000080adb78>] unmap_stage2_range+0x8c/0x428
      [223090.262535] [<ffff0000080adf40>] kvm_unmap_hva_handler+0x2c/0x3c
      [223090.262537] [<ffff0000080ace2c>] handle_hva_to_gpa+0xb0/0x104
      [223090.262539] [<ffff0000080af988>] kvm_unmap_hva+0x5c/0xbc
      [223090.262543] [<ffff0000080a2478>]
      kvm_mmu_notifier_invalidate_page+0x50/0x8c
      [223090.262547] [<ffff0000082274f8>]
      __mmu_notifier_invalidate_page+0x5c/0x84
      [223090.262551] [<ffff00000820b700>] try_to_unmap_one+0x1d0/0x4a0
      [223090.262553] [<ffff00000820c5c8>] rmap_walk+0x1cc/0x2e0
      [223090.262555] [<ffff00000820c90c>] try_to_unmap+0x74/0xa4
      [223090.262557] [<ffff000008230ce4>] migrate_pages+0x31c/0x5ac
      [223090.262561] [<ffff0000081f869c>] compact_zone+0x3fc/0x7ac
      [223090.262563] [<ffff0000081f8ae0>] compact_zone_order+0x94/0xb0
      [223090.262564] [<ffff0000081f91c0>] try_to_compact_pages+0x108/0x290
      [223090.262569] [<ffff0000081d5108>] __alloc_pages_direct_compact+0x70/0x1ac
      [223090.262571] [<ffff0000081d64a0>] __alloc_pages_nodemask+0x434/0x9f4
      [223090.262572] [<ffff0000082256f0>] alloc_pages_vma+0x230/0x254
      [223090.262574] [<ffff000008235e5c>] do_huge_pmd_anonymous_page+0x114/0x538
      [223090.262576] [<ffff000008201bec>] handle_mm_fault+0xd40/0x17a4
      [223090.262577] [<ffff0000081fb324>] __get_user_pages+0x12c/0x36c
      [223090.262578] [<ffff0000081fb804>] get_user_pages_unlocked+0xa4/0x1b8
      [223090.262579] [<ffff0000080a3ce8>] __gfn_to_pfn_memslot+0x280/0x31c
      [223090.262580] [<ffff0000080a3dd0>] gfn_to_pfn_prot+0x4c/0x5c
      [223090.262582] [<ffff0000080af3f8>] kvm_handle_guest_abort+0x240/0x774
      [223090.262584] [<ffff0000080b2bac>] handle_exit+0x11c/0x1ac
      [223090.262586] [<ffff0000080ab99c>] kvm_arch_vcpu_ioctl_run+0x31c/0x648
      [223090.262587] [<ffff0000080a1d78>] kvm_vcpu_ioctl+0x378/0x768
      [223090.262590] [<ffff00000825df5c>] do_vfs_ioctl+0x324/0x5a4
      [223090.262591] [<ffff00000825e26c>] SyS_ioctl+0x90/0xa4
      [223090.262595] [<ffff000008085d84>] el0_svc_naked+0x38/0x3c
      
      This patch moves the stage2 PGD manipulation under the lock.
      Reported-by: default avatarAlexander Graf <agraf@suse.de>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Reviewed-by: default avatarChristoffer Dall <cdall@linaro.org>
      Reviewed-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      6c0d706b