- 12 Apr, 2016 33 commits
-
-
Thomas Gleixner authored
commit 551adc60 upstream. Harry reported, that he's able to trigger a system freeze with cpu hot unplug. The freeze turned out to be a live lock caused by recent changes in irq_force_complete_move(). When fixup_irqs() and from there irq_force_complete_move() is called on the dying cpu, then all other cpus are in stop machine an wait for the dying cpu to complete the teardown. If there is a move of an interrupt pending then irq_force_complete_move() sends the cleanup IPI to the cpus in the old_domain mask and waits for them to clear the mask. That's obviously impossible as those cpus are firmly stuck in stop machine with interrupts disabled. I should have known that, but I completely overlooked it being concentrated on the locking issues around the vectors. And the existance of the call to __irq_complete_move() in the code, which actually sends the cleanup IPI made it reasonable to wait for that cleanup to complete. That call was bogus even before the recent changes as it was just a pointless distraction. We have to look at two cases: 1) The move_in_progress flag of the interrupt is set This means the ioapic has been updated with the new vector, but it has not fired yet. In theory there is a race: set_ioapic(new_vector) <-- Interrupt is raised before update is effective, i.e. it's raised on the old vector. So if the target cpu cannot handle that interrupt before the old vector is cleaned up, we get a spurious interrupt and in the worst case the ioapic irq line becomes stale, but my experiments so far have only resulted in spurious interrupts. But in case of cpu hotplug this should be a non issue because if the affinity update happens right before all cpus rendevouz in stop machine, there is no way that the interrupt can be blocked on the target cpu because all cpus loops first with interrupts enabled in stop machine, so the old vector is not yet cleaned up when the interrupt fires. So the only way to run into this issue is if the delivery of the interrupt on the apic/system bus would be delayed beyond the point where the target cpu disables interrupts in stop machine. I doubt that it can happen, but at least there is a theroretical chance. Virtualization might be able to expose this, but AFAICT the IOAPIC emulation is not as stupid as the real hardware. I've spent quite some time over the weekend to enforce that situation, though I was not able to trigger the delayed case. 2) The move_in_progress flag is not set and the old_domain cpu mask is not empty. That means, that an interrupt was delivered after the change and the cleanup IPI has been sent to the cpus in old_domain, but not all CPUs have responded to it yet. In both cases we can assume that the next interrupt will arrive on the new vector, so we can cleanup the old vectors on the cpus in the old_domain cpu mask. Fixes: 98229aa3 "x86/irq: Plug vector cleanup race" Reported-by: Harry Junior <harryjr@outlook.fr> Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Joe Lawrence <joe.lawrence@stratus.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Ben Hutchings <ben@decadent.org.uk> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603140931430.3657@nanosSigned-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lorenzo Pieralisi authored
commit 4a2e7aab upstream. The [0 - 64k] ACPI PCI IO port resource boundary check in: acpi_dev_ioresource_flags() is currently applied blindly in the ACPI resource parsing to all architectures, but only x86 suffers from that IO space limitation. On arches (ie IA64 and ARM64) where IO space is memory mapped, the PCI root bridges IO resource windows are firstly initialized from the _CRS (in acpi_decode_space()) and contain the CPU physical address at which a root bridge decodes IO space in the CPU physical address space with the offset value representing the offset required to translate the PCI bus address into the CPU physical address. The IO resource windows are then parsed and updated in arch code before creating and enumerating PCI buses (eg IA64 add_io_space()) to map in an arch specific way the obtained CPU physical address range to a slice of virtual address space reserved to map PCI IO space, ending up with PCI bridges resource windows containing IO resources like the following on a working IA64 configuration: PCI host bridge to bus 0000:00 pci_bus 0000:00: root bus resource [io 0x1000000-0x100ffff window] (bus address [0x0000-0xffff]) pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000fffff window] pci_bus 0000:00: root bus resource [mem 0x80000000-0x8fffffff window] pci_bus 0000:00: root bus resource [mem 0x80004000000-0x800ffffffff window] pci_bus 0000:00: root bus resource [bus 00] This implies that the [0 - 64K] check in acpi_dev_ioresource_flags() leaves platforms with memory mapped IO space (ie IA64) broken (ie kernel can't claim IO resources since the host bridge IO resource is disabled and discarded by ACPI core code, see log on IA64 with missing root bridge IO resource, silently filtered by current [0 - 64k] check in acpi_dev_ioresource_flags()): PCI host bridge to bus 0000:00 pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000fffff window] pci_bus 0000:00: root bus resource [mem 0x80000000-0x8fffffff window] pci_bus 0000:00: root bus resource [mem 0x80004000000-0x800ffffffff window] pci_bus 0000:00: root bus resource [bus 00] [...] pci 0000:00:03.0: [1002:515e] type 00 class 0x030000 pci 0000:00:03.0: reg 0x10: [mem 0x80000000-0x87ffffff pref] pci 0000:00:03.0: reg 0x14: [io 0x1000-0x10ff] pci 0000:00:03.0: reg 0x18: [mem 0x88020000-0x8802ffff] pci 0000:00:03.0: reg 0x30: [mem 0x88000000-0x8801ffff pref] pci 0000:00:03.0: supports D1 D2 pci 0000:00:03.0: can't claim BAR 1 [io 0x1000-0x10ff]: no compatible bridge window For this reason, the IO port resources boundaries check in generic ACPI parsing code should be guarded with a CONFIG_X86 guard so that more arches (ie ARM64) can benefit from the generic ACPI resources parsing interface without incurring in unexpected resource filtering, fixing at the same time current breakage on IA64. This patch factors out IO ports boundary [0 - 64k] check in generic ACPI code and makes the IO space check X86 specific to make sure that IO space resources are usable on other arches too. Fixes: 3772aea7 (ia64/PCI/ACPI: Use common ACPI resource parsing interface for host bridge) Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bjorn Helgaas authored
commit b84106b4 upstream. The PCI config header (first 64 bytes of each device's config space) is defined by the PCI spec so generic software can identify the device and manage its usage of I/O, memory, and IRQ resources. Some non-spec-compliant devices put registers other than BARs where the BARs should be. When the PCI core sizes these "BARs", the reads and writes it does may have unwanted side effects, and the "BAR" may appear to describe non-sensical address space. Add a flag bit to mark non-compliant devices so we don't touch their BARs. Turn off IO/MEM decoding to prevent the devices from consuming address space, since we can't read the BARs to find out what that address space would be. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Phil Elwell authored
commit 2c7e3306 upstream. The DT bindings for pinctrl-bcm2835 allow both the function and pull to contain either one entry or one per pin. However, an error in the DT parsing can cause failures if the number of pulls differs from the number of functions. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Phil Elwell <phil@raspberrypi.org> Reviewed-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sebastian Ott authored
commit 80c544de upstream. The function measurement block must not cross a page boundary. Ensure that by raising the alignment requirement to the smallest power of 2 larger than the size of the fmb. Fixes: d0b08853 ("s390/pci: performance statistics and debug infrastructure") Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Heiko Carstens authored
commit 8f100bb1 upstream. Add the missing lpp magic initialization for cpu 0. Without this all samples on cpu 0 do not have the most significant bit set in the program parameter field, which we use to distinguish between guest and host samples if the pid is also 0. We did initialize the lpp magic in the absolute zero lowcore but forgot that when switching to the allocated lowcore on cpu 0 only. Reported-by: Shu Juan Zhang <zhshuj@cn.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: e22cf8ca ("s390/cpumf: rework program parameter setting to detect guest samples") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Martin Schwidefsky authored
commit e370e476 upstream. There is a tricky interaction between the machine check handler and the critical sections of load_fpu_regs and save_fpu_regs functions. If the machine check interrupts one of the two functions the critical section cleanup will complete the function before the machine check handler s390_do_machine_check is called. Trouble is that the machine check handler needs to validate the floating point registers *before* and not *after* the completion of load_fpu_regs/save_fpu_regs. The simplest solution is to rewind the PSW to the start of the load_fpu_regs/save_fpu_regs and retry the function after the return from the machine check handler. Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit 6f3508f6 upstream. dct_sel_base_off is declared as a u64 but we're only using the lower 32 bits because of a shift wrapping bug. This can possibly truncate the upper 16 bits of DctSelBaseOffset[47:26], causing us to misdecode the CS row. Fixes: c8e518d5 ('amd64_edac: Sanitize f10_get_base_addr_offset') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20160120095451.GB19898@mwandaSigned-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tony Luck authored
commit eb1af3b7 upstream. Large memory Haswell-EX systems with multiple DIMMs per channel were sometimes reporting the wrong DIMM. Found three problems: 1) Debug printouts for socket and channel interleave were not interpreting the register fields correctly. The socket interleave field is a 2^X value (0=1, 1=2, 2=4, 3=8). The channel interleave is X+1 (0=1, 1=2, 2=3. 3=4). 2) Actual use of the socket interleave value didn't interpret as 2^X 3) Conversion of address to channel address was complicated, and wrong. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Aristeu Rozanski <arozansk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David Hildenbrand authored
commit b15d53d0 upstream. kmap_coherent needs disabled preemption to not schedule in the critical section, just like kmap_coherent on mips and kmap_atomic in general. Fixes: 8222dbe2 "sched/preempt, mm/fault: Decouple preemption from the page fault logic" Reported-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Tested-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Rich Felker <dalias@libc.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chris Friesen authored
commit f9c904b7 upstream. The callers of steal_account_process_tick() expect it to return whether a jiffy should be considered stolen or not. Currently the return value of steal_account_process_tick() is in units of cputime, which vary between either jiffies or nsecs depending on CONFIG_VIRT_CPU_ACCOUNTING_GEN. If cputime has nsecs granularity and there is a tiny amount of stolen time (a few nsecs, say) then we will consider the entire tick stolen and will not account the tick on user/system/idle, causing /proc/stats to show invalid data. The fix is to change steal_account_process_tick() to accumulate the stolen time and only account it once it's worth a jiffy. (Thanks to Frederic Weisbecker for suggestions to fix a bug in my first version of the patch.) Signed-off-by: Chris Friesen <chris.friesen@windriver.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/56DBBDB8.40305@mail.usask.caSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Zhang Rui authored
commit 81ad4276 upstream. In some cases, platform thermal driver may report invalid trip points, thermal core should not take any action for these trip points. This fixed a regression that bogus trip point starts to screw up thermal control on some Lenovo laptops, after commit bb431ba2 Author: Zhang Rui <rui.zhang@intel.com> Date: Fri Oct 30 16:31:47 2015 +0800 Thermal: initialize thermal zone device correctly After thermal zone device registered, as we have not read any temperature before, thus tz->temperature should not be 0, which actually means 0C, and thermal trend is not available. In this case, we need specially handling for the first thermal_zone_device_update(). Both thermal core framework and step_wise governor is enhanced to handle this. And since the step_wise governor is the only one that uses trends, so it's the only thermal governor that needs to be updated. Tested-by: Manuel Krause <manuelkrause@netscape.net> Tested-by: szegad <szegadlo@poczta.onet.pl> Tested-by: prash <prash.n.rao@gmail.com> Tested-by: amish <ammdispose-arch@yahoo.com> Tested-by: Matthias <morpheusxyz123@yahoo.de> Reviewed-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1317190 Link: https://bugzilla.kernel.org/show_bug.cgi?id=114551Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jiri Olsa authored
commit 67d52689 upstream. The util/python-ext-sources file contains source files required to build the python extension relative to $(srctree)/tools/perf, Such a file path $(FILE).c is handed over to the python extension build system, which builds the final object in the $(PYTHON_EXTBUILD)/tmp/$(FILE).o path. After the build is done all files from $(PYTHON_EXTBUILD)lib/ are carried as the result binaries. Above system fails when we add source file relative to ../lib, which we do for: ../lib/bitmap.c ../lib/find_bit.c ../lib/hweight.c ../lib/rbtree.c All above objects will be built like: $(PYTHON_EXTBUILD)/tmp/../lib/bitmap.c $(PYTHON_EXTBUILD)/tmp/../lib/find_bit.c $(PYTHON_EXTBUILD)/tmp/../lib/hweight.c $(PYTHON_EXTBUILD)/tmp/../lib/rbtree.c which accidentally happens to be final library path: $(PYTHON_EXTBUILD)/lib/ Changing setup.py to pass full paths of source files to Extension build class and thus keep all built objects under $(PYTHON_EXTBUILD)tmp directory. Reported-by: Jeff Bastian <jbastian@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Josh Boyer <jwboyer@fedoraproject.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20160227201350.GB28494@krava.redhat.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Wang Nan authored
commit 26dee028 upstream. According to man pages, asprintf returns -1 when failure. This patch fixes two incorrect return value checker. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Fixes: ffeb883e ("perf tools: Show proper error message for wrong terms of hw/sw events") Link: http://lkml.kernel.org/r/1455882283-79592-5-git-send-email-wangnan0@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andi Kleen authored
commit 940db6dc upstream. When an error happens during alias parsing currently the complete parsing of all attributes of the PMU is stopped. This is breaks old perf on a newer kernel that may have not-yet-know alias attributes (such as .scale or .per-pkg). Continue when some attribute is unparseable. This is IMHO a stable candidate and should be backported to older versions to avoid problems with newer kernels. v2: Print warnings when something goes wrong. v3: Change warning to debug output Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1455749095-18358-1-git-send-email-andi@firstfloor.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alexander Shishkin authored
commit 927a5570 upstream. The error path in perf_event_open() is such that asking for a sampling event on a PMU that doesn't generate interrupts will end up in dropping the perf_sched_count even though it hasn't been incremented for this event yet. Given a sufficient amount of these calls, we'll end up disabling scheduler's jump label even though we'd still have active events in the system, thereby facilitating the arrival of the infernal regions upon us. I'm fixing this by moving account_event() inside perf_event_alloc(). Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1456917854-29427-1-git-send-email-alexander.shishkin@linux.intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paolo Bonzini authored
commit ef697a71 upstream. Old KVM guests invoke single-context invvpid without actually checking whether it is supported. This was fixed by commit 518c8aee ("KVM: VMX: Make sure single type invvpid is supported before issuing invvpid instruction", 2010-08-01) and the patch after, but pre-2.6.36 kernels lack it including RHEL 6. Reported-by: jmontleo@redhat.com Tested-by: jmontleo@redhat.com Fixes: 99b83ac8Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paolo Bonzini authored
commit f6870ee9 upstream. A guest executing an invalid invvpid instruction would hang because the instruction pointer was not updated. Reported-by: jmontleo@redhat.com Tested-by: jmontleo@redhat.com Fixes: 99b83ac8Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paolo Bonzini authored
commit 2849eb4f upstream. A guest executing an invalid invept instruction would hang because the instruction pointer was not updated. Fixes: bfd0a56bReviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paolo Bonzini authored
commit e9ad4ec8 upstream. Moving the initialization earlier is needed in 4.6 because kvm_arch_init_vm is now using mmu_lock, causing lockdep to complain: [ 284.440294] INFO: trying to register non-static key. [ 284.445259] the code is fine but needs lockdep annotation. [ 284.450736] turning off the locking correctness validator. ... [ 284.528318] [<ffffffff810aecc3>] lock_acquire+0xd3/0x240 [ 284.533733] [<ffffffffa0305aa0>] ? kvm_page_track_register_notifier+0x20/0x60 [kvm] [ 284.541467] [<ffffffff81715581>] _raw_spin_lock+0x41/0x80 [ 284.546960] [<ffffffffa0305aa0>] ? kvm_page_track_register_notifier+0x20/0x60 [kvm] [ 284.554707] [<ffffffffa0305aa0>] kvm_page_track_register_notifier+0x20/0x60 [kvm] [ 284.562281] [<ffffffffa02ece70>] kvm_mmu_init_vm+0x20/0x30 [kvm] [ 284.568381] [<ffffffffa02dbf7a>] kvm_arch_init_vm+0x1ea/0x200 [kvm] [ 284.574740] [<ffffffffa02bff3f>] kvm_dev_ioctl+0xbf/0x4d0 [kvm] However, it also helps fixing a preexisting problem, which is why this patch is also good for stable kernels: kvm_create_vm was incrementing current->mm->mm_count but not decrementing it at the out_err label (in case kvm_init_mmu_notifier failed). The new initialization order makes it possible to add the required mmdrop without adding a new error label. Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Radim Krčmář authored
commit 7dd0fdff upstream. Discard policy uses ack_notifiers to prevent injection of PIT interrupts before EOI from the last one. This patch changes the policy to always try to deliver the interrupt, which makes a difference when its vector is in ISR. Old implementation would drop the interrupt, but proposed one injects to IRR, like real hardware would. The old policy breaks legacy NMI watchdogs, where PIT is used through virtual wire (LVT0): PIT never sends an interrupt before receiving EOI, thus a guest deadlock with disabled interrupts will stop NMIs. Note that NMI doesn't do EOI, so PIT also had to send a normal interrupt through IOAPIC. (KVM's PIT is deeply rotten and luckily not used much in modern systems.) Even though there is a chance of regressions, I think we can fix the LVT0 NMI bug without introducing a new tick policy. Reported-by: Yuki Shibuya <shibuya.yk@ncos.nec.co.jp> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paolo Bonzini authored
commit 4e422bdd upstream. Sometimes when setting a breakpoint a process doesn't stop on it. This is because the debug registers are not loaded correctly on VCPU load. The following simple reproducer from Oleg Nesterov tries using debug registers in both the host and the guest, for example by running "./bp 0 1" on the host and "./bp 14 15" under QEMU. #include <unistd.h> #include <signal.h> #include <stdlib.h> #include <stdio.h> #include <sys/wait.h> #include <sys/ptrace.h> #include <sys/user.h> #include <asm/debugreg.h> #include <assert.h> #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER) unsigned long encode_dr7(int drnum, int enable, unsigned int type, unsigned int len) { unsigned long dr7; dr7 = ((len | type) & 0xf) << (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE); if (enable) dr7 |= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE)); return dr7; } int write_dr(int pid, int dr, unsigned long val) { return ptrace(PTRACE_POKEUSER, pid, offsetof (struct user, u_debugreg[dr]), val); } void set_bp(pid_t pid, void *addr) { unsigned long dr7; assert(write_dr(pid, 0, (long)addr) == 0); dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1); assert(write_dr(pid, 7, dr7) == 0); } void *get_rip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } void test(int nr) { void *bp_addr = &&label + nr, *bp_hit; int pid; printf("test bp %d\n", nr); assert(nr < 16); // see 16 asm nops below pid = fork(); if (!pid) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); kill(getpid(), SIGSTOP); for (;;) { label: asm ( "nop; nop; nop; nop;" "nop; nop; nop; nop;" "nop; nop; nop; nop;" "nop; nop; nop; nop;" ); } } assert(pid == wait(NULL)); set_bp(pid, bp_addr); for (;;) { assert(ptrace(PTRACE_CONT, pid, 0, 0) == 0); assert(pid == wait(NULL)); bp_hit = get_rip(pid); if (bp_hit != bp_addr) fprintf(stderr, "ERR!! hit wrong bp %ld != %d\n", bp_hit - &&label, nr); } } int main(int argc, const char *argv[]) { while (--argc) { int nr = atoi(*++argv); if (!fork()) test(nr); } while (wait(NULL) > 0) ; return 0; } Suggested-by: Nadadv Amit <namit@cs.technion.ac.il> Reported-by: Andrey Wagin <avagin@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bjorn Helgaas authored
commit b8941571 upstream. The Home Agent and PCU PCI devices in Broadwell-EP have a non-BAR register where a BAR should be. We don't know what the side effects of sizing the "BAR" would be, and we don't know what address space the "BAR" might appear to describe. Mark these devices as having non-compliant BARs so the PCI core doesn't touch them. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stephane Eranian authored
commit 5690ae28 upstream. This patch adds a definition for GLOBAL_OVFL_STATUS bit 55 which is used with the Processor Trace (PT) feature. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: adrian.hunter@intel.com Cc: kan.liang@intel.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1457034642-21837-2-git-send-email-eranian@google.comSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andy Lutomirski authored
commit 4e79e182 upstream. Signal delivery needs to know the sign of an interrupted syscall's return value in order to detect -ERESTART variants. Normally this works independently of bitness because syscalls internally return long. Under ptrace, however, this can break, and syscall_get_error is supposed to sign-extend regs->ax if needed. We were clearing TS_COMPAT too early, though, and this prevented sign extension, which subtly broke syscall restart under ptrace. Reported-by: Robert O'Callahan <robert@ocallahan.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: c5c46f59 ("x86/entry: Add new, comprehensible entry and exit handlers written in C") Link: http://lkml.kernel.org/r/cbce3cf545522f64eb37f5478cb59746230db3b5.1455142412.git.luto@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Borislav Petkov authored
commit 5f9c01aa upstream. Thomas Voegtle reported that doing oldconfig with a .config which has CONFIG_MICROCODE enabled but BLK_DEV_INITRD disabled prevents the microcode loading mechanism from being built. So untangle it from the BLK_DEV_INITRD dependency so that oldconfig doesn't turn it off and add an explanatory text to its Kconfig help what the supported methods for supplying microcode are. Reported-by: Thomas Voegtle <tv@lio96.de> Tested-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1454499225-21544-2-git-send-email-bp@alien8.deSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Borislav Petkov authored
commit 264285ac upstream. Set the initrd @start depending on the presence of an initrd. Otherwise, builtin microcode loading doesn't work as the start is wrong and we're using it to compute offset to the microcode blobs. Tested-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1454499225-21544-3-git-send-email-bp@alien8.deSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chris Paterson authored
commit a32ef81c upstream. Commit 27cbd7e8 ("mmc: sh_mmcif: rework dma channel handling") introduced a typo causing the TX DMA channel allocation to be overwritten by the requested RX DMA channel. Fixes: 27cbd7e8 ("mmc: sh_mmcif: rework dma channel handling") Signed-off-by: Chris Paterson <chris.paterson2@renesas.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
commit 27cbd7e8 upstream. When compiling the sh_mmcif driver for ARM64, we currently get a harmless build warning: ../drivers/mmc/host/sh_mmcif.c: In function 'sh_mmcif_request_dma_one': ../drivers/mmc/host/sh_mmcif.c:417:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] (void *)pdata->slave_id_tx : ^ ../drivers/mmc/host/sh_mmcif.c:418:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] (void *)pdata->slave_id_rx; This could be worked around by adding another cast to uintptr_t, but I decided to simplify the code a little more to avoid that. This splits out the platform data using code into a separate function and builds that only for CONFIG_SUPERH. This part still has a typecast but does not need a second one. The SH platform code could be further modified to pass a pointer directly as we do on other architectures when we have a filter function. The normal case is simplified further and now just calls dma_request_slave_channel() directly without going through the compat handling. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
commit b9a1a743 upstream. ARM64 allmodconfig produces a bunch of warnings when building the samsung ASoC code: sound/soc/samsung/dmaengine.c: In function 'samsung_asoc_init_dma_data': sound/soc/samsung/dmaengine.c:53:32: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] playback_data->filter_data = (void *)playback->channel; sound/soc/samsung/dmaengine.c:60:31: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] capture_data->filter_data = (void *)capture->channel; We could easily shut up the warning by adding an intermediate cast, but there is a bigger underlying problem: The use of IORESOURCE_DMA to pass data from platform code to device drivers is dubious to start with, as what we really want is a pointer that can be passed into a filter function. Note that on s3c64xx, the pl08x DMA data is already a pointer, but gets cast to resource_size_t so we can pass it as a resource, and it then gets converted back to a pointer. In contrast, the data we pass for s3c24xx is an index into a device specific table, and we artificially convert that into a pointer for the filter function. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thierry Reding authored
commit 70a7fb80 upstream. Commit fa731ac7 ("regulator: core: avoid unused variable warning") introduced a subtle change in how supplies are locked. Where previously code was always locking the regulator of the current iteration, the new implementation only locks the regulator if it has a supply. For any given power tree that means that the root will never get locked. On the other hand the regulator_unlock_supply() will still release all the locks, which in turn causes the lock debugging code to warn about a mutex being unlocked which wasn't locked. Cc: Mark Brown <broonie@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Fixes: Fixes: fa731ac7 ("regulator: core: avoid unused variable warning") Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
commit fa731ac7 upstream. The second argument of the mutex_lock_nested() helper is only evaluated if CONFIG_DEBUG_LOCK_ALLOC is set. Otherwise we get this build warning for the new regulator_lock_supply function: drivers/regulator/core.c: In function 'regulator_lock_supply': drivers/regulator/core.c:142:6: warning: unused variable 'i' [-Wunused-variable] To avoid the warning, this restructures the code to make it both simpler and to move the 'i++' outside of the mutex_lock_nested call, where it is now always used and the variable is not flagged as unused. We had some discussion about changing mutex_lock_nested to an inline function, which would make the code do the right thing here, but in the end decided against it, in order to guarantee that mutex_lock_nested() does not introduced overhead without CONFIG_DEBUG_LOCK_ALLOC. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 9f01cd4a ("regulator: core: introduce function to lock regulators and its supplies") Link: http://permalink.gmane.org/gmane.linux.kernel/2068900Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christian Borntraeger authored
commit 7a76aa95 upstream. we have to check bit 40 of the facility list before issuing LPP and not bit 48. Otherwise a guest running on a system with "The decimal-floating-point zoned-conversion facility" and without the "The set-program-parameters facility" might crash on an lpp instruction. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: e22cf8ca ("s390/cpumf: rework program parameter setting to detect guest samples") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 16 Mar, 2016 7 commits
-
-
Greg Kroah-Hartman authored
-
James Hogan authored
commit 4b7b1ef2 upstream. The ld-version.sh script fails on some versions of awk with the following error, resulting in build failures for MIPS: awk: scripts/ld-version.sh: line 4: regular expression compile failed (missing '(') This is due to the regular expression ".*)", meant to strip off the beginning of the ld version string up to the close bracket, however brackets have a meaning in regular expressions, so lets escape it so that awk doesn't expect a corresponding open bracket. Fixes: ccbef167 ("Kbuild, lto: add ld-version and ld-ifversion ...") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: James Hogan <james.hogan@imgtec.com> Tested-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Cc: Michal Marek <mmarek@suse.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: linux-mips@linux-mips.org Cc: linux-kbuild@vger.kernel.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/12838/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicholas Bellinger authored
commit 7f54ab5f upstream. This patch fixes a recent ABORT_TASK regression associated with commit febe562c, where a left-over target_put_sess_cmd() would still be called when __target_check_io_state() detected a command has already been completed, and explicit ABORT must be avoided. Note commit febe562c dropped the local kref_get_unless_zero() check in core_tmr_abort_task(), but did not drop this extra corresponding target_put_sess_cmd() in the failure path. So go ahead and drop this now bogus target_put_sess_cmd(), and avoid this potential use-after-free. Reported-by: Dan Lane <dracodan@gmail.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ming Lei authored
commit 90d0f0f1 upstream. For !BIO_CLONED bio, we can use .bi_vcnt safely, but it doesn't mean we can just simply return .bi_io_vec[.bi_vcnt - 1] because the start postion may have been moved in the middle of the bvec, such as splitting in the middle of bvec. Fixes: 7bcd79ac(block: bio: introduce helpers to get the 1st and last bvec) Reported-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
James Hogan authored
commit d825c06b upstream. When calculate_cpu_foreign_map() recalculates the cpu_foreign_map cpumask it uses the local variable temp_foreign_map without initialising it to zero. Since the calculation only ever sets bits in this cpumask any existing bits at that memory location will remain set and find their way into cpu_foreign_map too. This could potentially lead to cache operations suboptimally doing smp calls to multiple VPEs in the same core, even though the VPEs share primary caches. Therefore initialise temp_foreign_map using cpumask_clear() before use. Fixes: cccf34e9 ("MIPS: c-r4k: Fix cache flushing for MT cores") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/12759/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hauke Mehrtens authored
commit 7a50e468 upstream. The MIPS_GIC_IPI should only be selected when MIPS_GIC is also selected, otherwise it results in a compile error. smp-gic.c uses some functions from include/linux/irqchip/mips-gic.h like plat_ipi_call_int_xlate() which are only added to the header file when MIPS_GIC is set. The Lantiq SoC does not use the GIC, but supports SMP. The calls top the functions from smp-gic.c are already protected by some #ifdefs The first part of this was introduced in commit 72e20142 ("MIPS: Move GIC IPI functions out of smp-cmp.c") Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/12774/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rui Wang authored
commit ce9113bb upstream. ovl_remove_upper() should do d_drop() only after it successfully removes the dir, otherwise a subsequent getcwd() system call will fail, breaking userspace programs. This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491Signed-off-by: Rui Wang <rui.y.wang@intel.com> Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-