- 15 Dec, 2014 2 commits
-
-
Christoffer Dall authored
It is curently possible to run a VM with architected timers support without creating an in-kernel VGIC, which will result in interrupts from the virtual timer going nowhere. To address this issue, move the architected timers initialization to the time when we run a VCPU for the first time, and then only initialize (and enable) the architected timers if we have a properly created and initialized in-kernel VGIC. When injecting interrupts from the virtual timer to the vgic, the current setup should ensure that this never calls an on-demand init of the VGIC, which is the only call path that could return an error from kvm_vgic_inject_irq(), so capture the return value and raise a warning if there's an error there. We also change the kvm_timer_init() function from returning an int to be a void function, since the function always succeeds. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
Userspace assumes that it can wire up IRQ injections after having created all VCPUs and after having created the VGIC, but potentially before starting the first VCPU. This can currently lead to lost IRQs because the state of that IRQ injection is not stored anywhere and we don't return an error to userspace. We haven't seen this problem manifest itself yet, presumably because guests reset the devices on boot, but this could cause issues with migration and other non-standard startup configurations. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
- 13 Dec, 2014 10 commits
-
-
Christoffer Dall authored
When the vgic initializes its internal state it does so based on the number of VCPUs available at the time. If we allow KVM to create more VCPUs after the VGIC has been initialized, we are likely to error out in unfortunate ways later, perform buffer overflows etc. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
Some code paths will need to check to see if the internal state of the vgic has been initialized (such as when creating new VCPUs), so introduce such a macro that checks the nr_cpus field which is set when the vgic has been initialized. Also set nr_cpus = 0 in kvm_vgic_destroy, because the error path in vgic_init() will call this function, and code should never errornously assume the vgic to be properly initialized after an error. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
The vgic_initialized() macro currently returns the state of the vgic->ready flag, which indicates if the vgic is ready to be used when running a VM, not specifically if its internal state has been initialized. Rename the macro accordingly in preparation for a more nuanced initialization flow. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Peter Maydell authored
VGIC initialization currently happens in three phases: (1) kvm_vgic_create() (triggered by userspace GIC creation) (2) vgic_init_maps() (triggered by userspace GIC register read/write requests, or from kvm_vgic_init() if not already run) (3) kvm_vgic_init() (triggered by first VM run) We were doing initialization of some state to correspond with the state of a freshly-reset GIC in kvm_vgic_init(); this is too late, since it will overwrite changes made by userspace using the register access APIs before the VM is run. Move this initialization earlier, into the vgic_init_maps() phase. This fixes a bug where QEMU could successfully restore a saved VM state snapshot into a VM that had already been run, but could not restore it "from cold" using the -loadvm command line option (the symptoms being that the restored VM would run but interrupts were ignored). Finally rename vgic_init_maps to vgic_init and renamed kvm_vgic_init to kvm_vgic_map_resources. [ This patch is originally written by Peter Maydell, but I have modified it somewhat heavily, renaming various bits and moving code around. If something is broken, I am to be blamed. - Christoffer ] Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
Introduce a new function to unmap user RAM regions in the stage2 page tables. This is needed on reboot (or when the guest turns off the MMU) to ensure we fault in pages again and make the dcache, RAM, and icache coherent. Using unmap_stage2_range for the whole guest physical range does not work, because that unmaps IO regions (such as the GIC) which will not be recreated or in the best case faulted in on a page-by-page basis. Call this function on secondary and subsequent calls to the KVM_ARM_VCPU_INIT ioctl so that a reset VCPU will detect the guest Stage-1 MMU is off when faulting in pages and make the caches coherent. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus should really be turned off for the VM adhering to the suggestions in the PSCI spec, and it's the sane thing to do. Also, clarify the behavior and expectations for exits to user space with the KVM_EXIT_SYSTEM_EVENT case. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
It is not clear that this ioctl can be called multiple times for a given vcpu. Userspace already does this, so clarify the ABI. Also specify that userspace is expected to always make secondary and subsequent calls to the ioctl with the same parameters for the VCPU as the initial call (which userspace also already does). Add code to check that userspace doesn't violate that ABI in the future, and move the kvm_vcpu_set_target() function which is currently duplicated between the 32-bit and 64-bit versions in guest.c to a common static function in arm.c, shared between both architectures. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
When userspace resets the vcpu using KVM_ARM_VCPU_INIT, we should also reset the HCR, because we now modify the HCR dynamically to enable/disable trapping of guest accesses to the VM registers. This is crucial for reboot of VMs working since otherwise we will not be doing the necessary cache maintenance operations when faulting in pages with the guest MMU off. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
The implementation of KVM_ARM_VCPU_INIT is currently not doing what userspace expects, namely making sure that a vcpu which may have been turned off using PSCI is returned to its initial state, which would be powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag. Implement the expected functionality and clarify the ABI. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
If a VCPU was originally started with power off (typically to be brought up by PSCI in SMP configurations), there is no need to clear the POWER_OFF flag in the kernel, as this flag is only tested during the init ioctl itself. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
- 26 Nov, 2014 1 commit
-
-
Shannon Zhao authored
When call kvm_vgic_inject_irq to inject interrupt, we can known which vcpu the interrupt for by the irq_num and the cpuid. So we should just kick this vcpu to avoid iterating through all. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 25 Nov, 2014 8 commits
-
-
Christoffer Dall authored
When 'injecting' an edge-triggered interrupt with a falling edge we shouldn't clear the pending state on the distributor. In fact, we don't, because the check in vgic_validate_injection would prevent us from ever reaching this bit of code. Remove the unreachable snippet. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andre Przywara authored
Currently we mangle the endianness of the guest's register even on an MMIO _read_, where it is completely useless, because we will not use the value of that register. Rework the io_mem_abort() function to clearly separate between reads and writes and only do the endianness mangling on MMIO writes. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Ard Biesheuvel authored
Readonly memslots are often used to implement emulation of ROMs and NOR flashes, in which case the guest may legally map these regions as uncached. To deal with the incoherency associated with uncached guest mappings, treat all readonly memslots as incoherent, and ensure that pages that belong to regions tagged as such are flushed to DRAM before being passed to the guest. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Laszlo Ersek authored
To allow handling of incoherent memslots in a subsequent patch, this patch adds a paramater 'ipa_uncached' to cache_coherent_guest_page() so that we can instruct it to flush the page's contents to DRAM even if the guest has caching globally enabled. Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Ard Biesheuvel authored
Memory regions may be incoherent with the caches, typically when the guest has mapped a host system RAM backed memory region as uncached. Add a flag KVM_MEMSLOT_INCOHERENT so that we can tag these memslots and handle them appropriately when mapping them. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Ard Biesheuvel authored
This reverts commit 85c8555f ("KVM: check for !is_zero_pfn() in kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn. The problem being addressed by the patch above was that some ARM code based the memory mapping attributes of a pfn on the return value of kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should be mapped as device memory. However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin, and the existing non-ARM users were already using it in a way which suggests that its name should probably have been 'kvm_is_reserved_pfn' from the beginning, e.g., whether or not to call get_page/put_page on it etc. This means that returning false for the zero page is a mistake and the patch above should be reverted. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Ard Biesheuvel authored
Instead of using kvm_is_mmio_pfn() to decide whether a host region should be stage 2 mapped with device attributes, add a new static function kvm_is_device_pfn() that disregards RAM pages with the reserved bit set, as those should usually not be mapped as device memory. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
wanghaibin authored
When vgic_update_irq_pending with level-sensitive false, it is need to deactivates an interrupt, and, it can go to out directly. Here return a false value, because it will be not need to kick. Signed-off-by: wanghaibin <wanghaibin.wang@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 29 Oct, 2014 5 commits
-
-
Paolo Bonzini authored
Merge tag 'kvm-s390-next-20141028' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes and cleanups 1. A small fix regarding program check handling (cc stable as it overwrites the wrong guest memory) 2. Improve the ipte interlock scalability for older hardware 3. current->mm to mm cleanup (currently a no-op) 4. several SIGP rework patches (more to come)
-
Jan Kiszka authored
In order to access the shadow VMCS, we need to load it. At this point, vmx->loaded_vmcs->vmcs and the actually loaded one start to differ. If we now get preempted by Linux, vmx_vcpu_put and, on return, the vmx_vcpu_load will work against the wrong vmcs. That can cause copy_shadow_to_vmcs12 to corrupt the vmcs12 state. Fix the issue by disabling preemption during the copy operation. copy_vmcs12_to_shadow is safe from this issue as it is executed by vmx_vcpu_run when preemption is already disabled before vmentry. This bug is exposed by running Jailhouse within KVM on CPUs with shadow VMCS support. Jailhouse never expects an interrupt pending vmexit, but the bug can cause it if, after copy_shadow_to_vmcs12 is preempted, the active VMCS happens to have the virtual interrupt pending flag set in the CPU-based execution controls. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Nadav Amit authored
Commit d1442d85 ("KVM: x86: Handle errors when RIP is set during far jumps") introduced a bug that caused the fix to be incomplete. Due to incorrect evaluation, far jump to segment with L bit cleared (i.e., 32-bit segment) and RIP with any of the high bits set (i.e, RIP[63:32] != 0) set may not trigger #GP. As we know, this imposes a security problem. In addition, the condition for two warnings was incorrect. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> [Add #ifdef CONFIG_X86_64 to avoid complaints of undefined behavior. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Emulation of code that is 14 bytes to the segment limit or closer (e.g. RIP = 0xFFFFFFF2 after reset) is broken because we try to read as many as 15 bytes from the beginning of the instruction, and __linearize fails when the passed (address, size) pair reaches out of the segment. To fix this, let __linearize return the maximum accessible size (clamped to 2^32-1) for usage in __do_insn_fetch_bytes, and avoid the limit check by passing zero for the desired size. For expand-down segments, __linearize is performing a redundant check. (u32)(addr.ea + size - 1) <= lim can only happen if addr.ea is close to 4GB; in this case, addr.ea + size - 1 will also fail the check against the upper bound of the segment (which is provided by the D/B bit). After eliminating the redundant check, it is simple to compute the *max_size for expand-down segments too. Now that the limit check is done in __do_insn_fetch_bytes, we want to inject a general protection fault there if size < op_size (like __linearize would have done), instead of just aborting. This fixes booting Tiano Core from emulated flash with EPT disabled. Cc: stable@vger.kernel.org Fixes: 719d5a9bReported-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
The error code for #GP and #SS is zero when the segment is used to access an operand or an instruction. It is only non-zero when a segment register is being loaded; for limit checks this means cases such as: * for #GP, when RIP is beyond the limit on a far call (before the first instruction is executed). We do not implement this check, but it would be in em_jmp_far/em_call_far. * for #SS, if the new stack overflows during an inter-privilege-level call to a non-conforming code segment. We do not implement stack switching at all. So use an error code of zero. Reviewed-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 28 Oct, 2014 9 commits
-
-
David Hildenbrand authored
In preparation for further code changes (e.g. getting rid of action_flags), this patch splits the handling of the two sigp orders SIGP STOP and SIGP STOP AND STORE STATUS by introducing a separate handler function for SIGP STOP AND STORE STATUS. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
David Hildenbrand authored
In preparation for further code changes, this patch moves the injection of emergency calls into a separate function and uses it for the processing of SIGP EMERGENCY CALL and SIGP CONDITIONAL EMERGENCY CALL. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
David Hildenbrand authored
This patch introduces instruction counters for all known sigp orders and also a separate one for unknown orders that are passed to user space. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
David Hildenbrand authored
This patch introduces in preparation for further code changes separate handler functions for: - SIGP (RE)START - will not be allowed to terminate pending orders - SIGP (INITIAL) CPU RESET - will be allowed to terminate certain pending orders - unknown sigp orders All sigp orders that require user space intervention are logged. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
David Hildenbrand authored
All sigp orders targeting one VCPU have to verify that the target is valid and available. Let's move the check from the single functions to the dispatcher. The destination VCPU is directly passed as a pointer - instead of the cpu address of the target. Please note that all SIGP orders except SIGP SET ARCHITECTURE - even unknown ones - will now check for the availability of the target VCPU. This is what the architecture documentation specifies. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
David Hildenbrand authored
All sigp orders except SIGP SET ARCHITECTURE target exactly one vcpu. Let's move the dispatch code for these orders into a separate function to prepare for cleaner target availability checks. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
Thomas Huth authored
The monitor-class number field is only 16 bits, so we have to use a u16 pointer to access it. Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> CC: stable@vger.kernel.org # v3.16+ Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
Jason J. Herne authored
In set_guest_storage_key, we really want to reference the mm struct given as a parameter to the function. So replace the current->mm reference with the mm struct passed in by the caller. Signed-off-by: Jason J. Herne <jjherne@us.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
Thomas Huth authored
The ipte-locking should be done for each VM seperately, not globally. This way we avoid possible congestions when the simple ipte-lock is used and multiple VMs are running. Suggested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
- 26 Oct, 2014 4 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-socLinus Torvalds authored
Pull ARM SoC fixes from Olof Johansson: "Another week, another small batch of fixes. Most of these make zynq, socfpga and sunxi platforms work a bit better: - due to new requirements for regulators, DWMMC on socfpga broke past v3.17 - SMP spinup fix for socfpga - a few DT fixes for zynq - another option (FIXED_REGULATOR) for sunxi is needed that used to be selected by other options but no longer is. - a couple of small DT fixes for at91 - ...and a couple for i.MX" * tag 'armsoc-for-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: dts: imx28-evk: Let i2c0 run at 100kHz ARM: i.MX6: Fix "emi" clock name typo ARM: multi_v7_defconfig: enable CONFIG_MMC_DW_ROCKCHIP ARM: sunxi_defconfig: enable CONFIG_REGULATOR_FIXED_VOLTAGE ARM: dts: socfpga: Add a 3.3V fixed regulator node ARM: dts: socfpga: Fix SD card detect ARM: dts: socfpga: rename gpio nodes ARM: at91/dt: sam9263: fix PLLB frequencies power: reset: at91-reset: fix power down register MAINTAINERS: add atmel ssc driver maintainer entry arm: socfpga: fix fetching cpu1start_addr for SMP ARM: zynq: DT: trivial: Fix mc node ARM: zynq: DT: Add cadence watchdog node ARM: zynq: DT: Add missing reference for memory-controller ARM: zynq: DT: Add missing reference for ADC ARM: zynq: DT: Add missing address for L2 pl310 ARM: zynq: DT: Remove 222 MHz OPP ARM: zynq: DT: Fix GEM register area size
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs updates from Al Viro: "overlayfs merge + leak fix for d_splice_alias() failure exits" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: overlayfs: embed middle into overlay_readdir_data overlayfs: embed root into overlay_readdir_data overlayfs: make ovl_cache_entry->name an array instead of pointer overlayfs: don't hold ->i_mutex over opening the real directory fix inode leaks on d_splice_alias() failure exits fs: limit filesystem stacking depth overlay: overlay filesystem documentation overlayfs: implement show_options overlayfs: add statfs support overlay filesystem shmem: support RENAME_WHITEOUT ext4: support RENAME_WHITEOUT vfs: add RENAME_WHITEOUT vfs: add whiteout support vfs: export check_sticky() vfs: introduce clone_private_mount() vfs: export __inode_permission() to modules vfs: export do_splice_direct() to modules vfs: add i_op->dentry_open()
-
Olof Johansson authored
Merge tag 'imx-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes Merge "ARM: imx: fixes for 3.18" from Shawn Guo: The i.MX fixes for 3.18: - Revert one patch which increases I2C bus frequency on imx28-evk - Fix a typo on imx6q EIM clock name * tag 'imx-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx28-evk: Let i2c0 run at 100kHz ARM: i.MX6: Fix "emi" clock name typo Signed-off-by: Olof Johansson <olof@lixom.net>
-
- 25 Oct, 2014 1 commit
-
-
Fabio Estevam authored
Commit 78b81f46 ("ARM: dts: imx28-evk: Run I2C0 at 400kHz") caused issues when doing the following sequence in loop: - Boot the kernel - Perform audio playback - Reboot the system via 'reboot' command In many times the audio card cannot be probed, which causes playback to fail. After restoring to the original i2c0 frequency of 100kHz there is no such problem anymore. This reverts commit 78b81f46. Cc: <stable@vger.kernel.org> # 3.16+ Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
-