- 01 Jul, 2017 2 commits
-
-
Paul Mackerras authored
At present, interrupts are hard-disabled fairly late in the guest entry path, in the assembly code. Since we check for pending signals for the vCPU(s) task(s) earlier in the guest entry path, it is possible for a signal to be delivered before we enter the guest but not be noticed until after we exit the guest for some other reason. Similarly, it is possible for the scheduler to request a reschedule while we are in the guest entry path, and we won't notice until after we have run the guest, potentially for a whole timeslice. Furthermore, with a radix guest on POWER9, we can take the interrupt with the MMU on. In this case we end up leaving interrupts hard-disabled after the guest exit, and they are likely to stay hard-disabled until we exit to userspace or context-switch to another process. This was masking the fact that we were also not setting the RI (recoverable interrupt) bit in the MSR, meaning that if we had taken an interrupt, it would have crashed the host kernel with an unrecoverable interrupt message. To close these races, we need to check for signals and reschedule requests after hard-disabling interrupts, and then keep interrupts hard-disabled until we enter the guest. If there is a signal or a reschedule request from another CPU, it will send an IPI, which will cause a guest exit. This puts the interrupt disabling before we call kvmppc_start_thread() for all the secondary threads of this core that are going to run vCPUs. The reason for that is that once we have started the secondary threads there is no easy way to back out without going through at least part of the guest entry path. However, kvmppc_start_thread() includes some code for radix guests which needs to call smp_call_function(), which must be called with interrupts enabled. To solve this problem, this patch moves that code into a separate function that is called earlier. When the guest exit is caused by an external interrupt, a hypervisor doorbell or a hypervisor maintenance interrupt, we now handle these using the replay facility. __kvmppc_vcore_entry() now returns the trap number that caused the exit on this thread, and instead of the assembly code jumping to the handler entry, we return to C code with interrupts still hard-disabled and set the irq_happened flag in the PACA, so that when we do local_irq_enable() the appropriate handler gets called. With all this, we now have the interrupt soft-enable flag clear while we are in the guest. This is useful because code in the real-mode hypercall handlers that checks whether interrupts are enabled will now see that they are disabled, which is correct, since interrupts are hard-disabled in the real-mode code. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
Since commit b009031f ("KVM: PPC: Book3S HV: Take out virtual core piggybacking code", 2016-09-15), we only have at most one vcore per subcore. Previously, the fact that there might be more than one vcore per subcore meant that we had the notion of a "master vcore", which was the vcore that controlled thread 0 of the subcore. We also needed a list per subcore in the core_info struct to record which vcores belonged to each subcore. Now that there can only be one vcore in the subcore, we can replace the list with a simple pointer and get rid of the notion of the master vcore (and in fact treat every vcore as a master vcore). We can also get rid of the subcore_vm[] field in the core_info struct since it is never read. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 22 Jun, 2017 2 commits
-
-
Paul Mackerras authored
Now that userspace can set the virtual SMT mode by enabling the KVM_CAP_PPC_SMT capability, it is useful for userspace to be able to query the set of possible virtual SMT modes. This provides a new capability, KVM_CAP_PPC_SMT_POSSIBLE, to provide this information. The return value is a bitmap of possible modes, with bit N set if virtual SMT mode 2^N is available. That is, 1 indicates SMT1 is available, 2 indicates that SMT2 is available, 3 indicates that both SMT1 and SMT2 are available, and so on. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Aravinda Prasad authored
Enhance KVM to cause a guest exit with KVM_EXIT_NMI exit reason upon a machine check exception (MCE) in the guest address space if the KVM_CAP_PPC_FWNMI capability is enabled (instead of delivering a 0x200 interrupt to guest). This enables QEMU to build error log and deliver machine check exception to guest via guest registered machine check handler. This approach simplifies the delivery of machine check exception to guest OS compared to the earlier approach of KVM directly invoking 0x200 guest interrupt vector. This design/approach is based on the feedback for the QEMU patches to handle machine check exception. Details of earlier approach of handling machine check exception in QEMU and related discussions can be found at: https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html Note: This patch now directly invokes machine_check_print_event_info() from kvmppc_handle_exit_hv() to print the event to host console at the time of guest exit before the exception is passed on to the guest. Hence, the host-side handling which was performed earlier via machine_check_fwnmi is removed. The reasons for this approach is (i) it is not possible to distinguish whether the exception occurred in the guest or the host from the pt_regs passed on the machine_check_exception(). Hence machine_check_exception() calls panic, instead of passing on the exception to the guest, if the machine check exception is not recoverable. (ii) the approach introduced in this patch gives opportunity to the host kernel to perform actions in virtual mode before passing on the exception to the guest. This approach does not require complex tweaks to machine_check_fwnmi and friends. Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 21 Jun, 2017 2 commits
-
-
Mahesh Salgaonkar authored
It will be used in arch/powerpc/kvm/book3s_hv.c KVM module. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Aravinda Prasad authored
This introduces a new KVM capability to control how KVM behaves on machine check exception (MCE) in HV KVM guests. If this capability has not been enabled, KVM redirects machine check exceptions to guest's 0x200 vector, if the address in error belongs to the guest. With this capability enabled, KVM will cause a guest exit with the exit reason indicating an NMI. The new capability is required to avoid problems if a new kernel/KVM is used with an old QEMU, running a guest that doesn't issue "ibm,nmi-register". As old QEMU does not understand the NMI exit type, it treats it as a fatal error. However, the guest could have handled the machine check error if the exception was delivered to guest's 0x200 interrupt vector instead of NMI exit in case of old QEMU. [paulus@ozlabs.org - Reworded the commit message to be clearer, enable only on HV KVM.] Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 20 Jun, 2017 1 commit
-
-
Paul Mackerras authored
On a POWER9 system, it is possible for an interrupt to become pending for a VCPU when that VCPU is about to cede (execute a H_CEDE hypercall) and has already disabled interrupts, or in the H_CEDE processing up to the point where the XIVE context is pulled from the hardware. In such a case, the H_CEDE should not sleep, but should return immediately to the guest. However, the conditions tested in kvmppc_vcpu_woken() don't include the condition that a XIVE interrupt is pending, so the VCPU could sleep until the next decrementer interrupt. To fix this, we add a new xive_interrupt_pending() helper which looks in the XIVE context that was pulled from the hardware to see if the priority of any pending interrupt is higher (numerically lower than) the CPU priority. If so then kvmppc_vcpu_woken() will return true. If the XIVE context has never been used, then both the pipr and the cppr fields will be zero and the test will indicate that no interrupt is pending. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 19 Jun, 2017 5 commits
-
-
Paul Mackerras authored
On POWER9, we no longer have the restriction that we had on POWER8 where all threads in a core have to be in the same partition, so the CPU threads are now independent. However, we still want to be able to run guests with a virtual SMT topology, if only to allow migration of guests from POWER8 systems to POWER9. A guest that has a virtual SMT mode greater than 1 will expect to be able to use the doorbell facility; it will expect the msgsndp and msgclrp instructions to work appropriately and to be able to read sensible values from the TIR (thread identification register) and DPDES (directed privileged doorbell exception status) special-purpose registers. However, since each CPU thread is a separate sub-processor in POWER9, these instructions and registers can only be used within a single CPU thread. In order for these instructions to appear to act correctly according to the guest's virtual SMT mode, we have to trap and emulate them. We cause them to trap by clearing the HFSCR_MSGP bit in the HFSCR register. The emulation is triggered by the hypervisor facility unavailable interrupt that occurs when the guest uses them. To cause a doorbell interrupt to occur within the guest, we set the DPDES register to 1. If the guest has interrupts enabled, the CPU will generate a doorbell interrupt and clear the DPDES register in hardware. The DPDES hardware register for the guest is saved in the vcpu->arch.vcore->dpdes field. Since this gets written by the guest exit code, other VCPUs wishing to cause a doorbell interrupt don't write that field directly, but instead set a vcpu->arch.doorbell_request flag. This is consumed and set to 0 by the guest entry code, which then sets DPDES to 1. Emulating reads of the DPDES register is somewhat involved, because it requires reading the doorbell pending interrupt status of all of the VCPU threads in the virtual core, and if any of those VCPUs are running, their doorbell status is only up-to-date in the hardware DPDES registers of the CPUs where they are running. In order to get a reasonable approximation of the current doorbell status, we send those CPUs an IPI, causing an exit from the guest which will update the vcpu->arch.vcore->dpdes field. We then use that value in constructing the emulated DPDES register value. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
This allows userspace to set the desired virtual SMT (simultaneous multithreading) mode for a VM, that is, the number of VCPUs that get assigned to each virtual core. Previously, the virtual SMT mode was fixed to the number of threads per subcore, and if userspace wanted to have fewer vcpus per vcore, then it would achieve that by using a sparse CPU numbering. This had the disadvantage that the vcpu numbers can get quite large, particularly for SMT1 guests on a POWER8 with 8 threads per core. With this patch, userspace can set its desired virtual SMT mode and then use contiguous vcpu numbering. On POWER8, where the threading mode is "strict", the virtual SMT mode must be less than or equal to the number of threads per subcore. On POWER9, which implements a "loose" threading mode, the virtual SMT mode can be any power of 2 between 1 and 8, even though there is effectively one thread per subcore, since the threads are independent and can all be in different partitions. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
This adds code to allow us to use a different value for the HFSCR (Hypervisor Facilities Status and Control Register) when running the guest from that which applies in the host. The reason for doing this is to allow us to trap the msgsndp instruction and related operations in future so that they can be virtualized. We also save the value of HFSCR when a hypervisor facility unavailable interrupt occurs, because the high byte of HFSCR indicates which facility the guest attempted to access. We save and restore the host value on guest entry/exit because some bits of it affect host userspace execution. We only do all this on POWER9, not on POWER8, because we are not intending to virtualize any of the facilities controlled by HFSCR on POWER8. In particular, the HFSCR bit that controls execution of msgsndp and related operations does not exist on POWER8. The HFSCR doesn't exist at all on POWER7. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
It is possible, through a narrow race condition, for a VCPU to exit the guest with a H_CEDE hypercall while it has a doorbell interrupt pending. In this case, the H_CEDE should return immediately, but in fact it puts the VCPU to sleep until some other interrupt becomes pending or a prod is received (via another VCPU doing H_PROD). This fixes it by checking the DPDES (Directed Privileged Doorbell Exception Status) bit for the thread along with the other interrupt pending bits. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
This allows userspace (e.g. QEMU) to enable large decrementer mode for the guest when running on a POWER9 host, by setting the LPCR_LD bit in the guest LPCR value. With this, the guest exit code saves 64 bits of the guest DEC value on exit. Other places that use the guest DEC value check the LPCR_LD bit in the guest LPCR value, and if it is set, omit the 32-bit sign extension that would otherwise be done. This doesn't change the DEC emulation used by PR KVM because PR KVM is not supported on POWER9 yet. This is partly based on an earlier patch by Oliver O'Halloran. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 16 Jun, 2017 2 commits
-
-
Paul Mackerras authored
POWER9 DD1 has an erratum where writing to the TBU40 register, which is used to apply an offset to the timebase, can cause the timebase to lose counts. This results in the timebase on some CPUs getting out of sync with other CPUs, which then results in misbehaviour of the timekeeping code. To work around the problem, we make KVM ignore the timebase offset for all guests on POWER9 DD1 machines. This means that live migration cannot be supported on POWER9 DD1 machines. Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
At present, HV KVM on POWER8 and POWER9 machines loses any instruction or data breakpoint set in the host whenever a guest is run. Instruction breakpoints are currently only used by xmon, but ptrace and the perf_event subsystem can set data breakpoints as well as xmon. To fix this, we save the host values of the debug registers (CIABR, DAWR and DAWRX) before entering the guest and restore them on exit. To provide space to save them in the stack frame, we expand the stack frame allocated by kvmppc_hv_entry() from 112 to 144 bytes. Fixes: b005255e ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 15 Jun, 2017 2 commits
-
-
Paul Mackerras authored
If userspace attempts to call the KVM_RUN ioctl when it has hardware transactional memory (HTM) enabled, the values that it has put in the HTM-related SPRs TFHAR, TFIAR and TEXASR will get overwritten by guest values. To fix this, we detect this condition and save those SPR values in the thread struct, and disable HTM for the task. If userspace goes to access those SPRs or the HTM facility in future, a TM-unavailable interrupt will occur and the handler will reload those SPRs and re-enable HTM. If userspace has started a transaction and suspended it, we would currently lose the transactional state in the guest entry path and would almost certainly get a "TM Bad Thing" interrupt, which would cause the host to crash. To avoid this, we detect this case and return from the KVM_RUN ioctl with an EINVAL error, with the KVM exit reason set to KVM_EXIT_FAIL_ENTRY. Fixes: b005255e ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
This restores several special-purpose registers (SPRs) to sane values on guest exit that were missed before. TAR and VRSAVE are readable and writable by userspace, and we need to save and restore them to prevent the guest from potentially affecting userspace execution (not that TAR or VRSAVE are used by any known program that run uses the KVM_RUN ioctl). We save/restore these in kvmppc_vcpu_run_hv() rather than on every guest entry/exit. FSCR affects userspace execution in that it can prohibit access to certain facilities by userspace. We restore it to the normal value for the task on exit from the KVM_RUN ioctl. IAMR is normally 0, and is restored to 0 on guest exit. However, with a radix host on POWER9, it is set to a value that prevents the kernel from executing user-accessible memory. On POWER9, we save IAMR on guest entry and restore it on guest exit to the saved value rather than 0. On POWER8 we continue to set it to 0 on guest exit. PSPB is normally 0. We restore it to 0 on guest exit to prevent userspace taking advantage of the guest having set it non-zero (which would allow userspace to set its SMT priority to high). UAMOR is normally 0. We restore it to 0 on guest exit to prevent the AMR from being used as a covert channel between userspace processes, since the AMR is not context-switched at present. Fixes: b005255e ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 13 Jun, 2017 1 commit
-
-
Paul Mackerras authored
This adds code to save the values of three SPRs (special-purpose registers) used by userspace to control event-based branches (EBBs), which are essentially interrupts that get delivered directly to userspace. These registers are loaded up with guest values when entering the guest, and their values are saved when exiting the guest, but we were not saving the host values and restoring them before going back to userspace. On POWER8 this would only affect userspace programs which explicitly request the use of EBBs and also use the KVM_RUN ioctl, since the only source of EBBs on POWER8 is the PMU, and there is an explicit enable bit in the PMU registers (and those PMU registers do get properly context-switched between host and guest). On POWER9 there is provision for externally-generated EBBs, and these are not subject to the control in the PMU registers. Since these registers only affect userspace, we can save them when we first come in from userspace and restore them before returning to userspace, rather than saving/restoring the host values on every guest entry/exit. Similarly, we don't need to worry about their values on offline secondary threads since they execute in the context of the idle task, which never executes in userspace. Fixes: b005255e ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 29 May, 2017 1 commit
-
-
Paul Mackerras authored
POWER9 introduces a new mode for the decrementer register, called large decrementer mode, in which the decrementer counter is 56 bits wide rather than 32, and reads are sign-extended rather than zero-extended. For the decrementer, this new mode is optional and controlled by a bit in the LPCR. The hypervisor decrementer (HDEC) is 56 bits wide on POWER9 and has no mode control. Since KVM code reads and writes the decrementer and hypervisor decrementer registers in a few places, it needs to be aware of the need to treat the decrementer value as a 64-bit quantity, and only do a 32-bit sign extension when large decrementer mode is not in effect. Similarly, the HDEC should always be treated as a 64-bit quantity on POWER9. We define a new EXTEND_HDEC macro to encapsulate the feature test for POWER9 and the sign extension. To enable the sign extension to be removed in large decrementer mode, we test the LPCR_LD bit in the host LPCR image stored in the struct kvm for the guest. If is set then large decrementer mode is enabled and the sign extension should be skipped. This is partly based on an earlier patch by Oliver O'Halloran. Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 22 May, 2017 2 commits
-
-
Linus Torvalds authored
-
Linus Torvalds authored
The code to fetch a 64-bit value from user space was entirely buggered, and has been since the code was merged in early 2016 in commit b2f68038 ("x86/mm/32: Add support for 64-bit __get_user() on 32-bit kernels"). Happily the buggered routine is almost certainly entirely unused, since the normal way to access user space memory is just with the non-inlined "get_user()", and the inlined version didn't even historically exist. The normal "get_user()" case is handled by external hand-written asm in arch/x86/lib/getuser.S that doesn't have either of these issues. There were two independent bugs in __get_user_asm_u64(): - it still did the STAC/CLAC user space access marking, even though that is now done by the wrapper macros, see commit 11f1a4b9 ("x86: reorganize SMAP handling in user space accesses"). This didn't result in a semantic error, it just means that the inlined optimized version was hugely less efficient than the allegedly slower standard version, since the CLAC/STAC overhead is quite high on modern Intel CPU's. - the double register %eax/%edx was marked as an output, but the %eax part of it was touched early in the asm, and could thus clobber other inputs to the asm that gcc didn't expect it to touch. In particular, that meant that the generated code could look like this: mov (%eax),%eax mov 0x4(%eax),%edx where the load of %edx obviously was _supposed_ to be from the 32-bit word that followed the source of %eax, but because %eax was overwritten by the first instruction, the source of %edx was basically random garbage. The fixes are trivial: remove the extraneous STAC/CLAC entries, and mark the 64-bit output as early-clobber to let gcc know that no inputs should alias with the output register. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@kernel.org # v4.8+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 21 May, 2017 7 commits
-
-
Linus Torvalds authored
Al noticed that unsafe_put_user() had type problems, and fixed them in commit a7cc722f ("fix unsafe_put_user()"), which made me look more at those functions. It turns out that unsafe_get_user() had a type issue too: it limited the largest size of the type it could handle to "unsigned long". Which is fine with the current users, but doesn't match our existing normal get_user() semantics, which can also handle "u64" even when that does not fit in a long. While at it, also clean up the type cast in unsafe_put_user(). We actually want to just make it an assignment to the expected type of the pointer, because we actually do want warnings from types that don't convert silently. And it makes the code more readable by not having that one very long and complex line. [ This patch might become stable material if we ever end up back-porting any new users of the unsafe uaccess code, but as things stand now this doesn't matter for any current existing uses. ] Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull misc uaccess fixes from Al Viro: "Fix for unsafe_put_user() (no callers currently in mainline, but anyone starting to use it will step into that) + alpha osf_wait4() infoleak fix" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: osf_wait4(): fix infoleak fix unsafe_put_user()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler fix from Thomas Gleixner: "A single scheduler fix: Prevent idle task from ever being preempted. That makes sure that synchronize_rcu_tasks() which is ignoring idle task does not pretend that no task is stuck in preempted state. If that happens and idle was preempted on a ftrace trampoline the machine crashes due to inconsistent state" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Call __schedule() from do_idle() without enabling preemption
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq fixes from Thomas Gleixner: "A set of small fixes for the irq subsystem: - Cure a data ordering problem with chained interrupts - Three small fixlets for the mbigen irq chip" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Fix chained interrupt data ordering irqchip/mbigen: Fix the clear register offset calculation irqchip/mbigen: Fix potential NULL dereferencing irqchip/mbigen: Fix memory mapping code
-
Al Viro authored
failing sys_wait4() won't fill struct rusage... Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
__put_user_size() relies upon its first argument having the same type as what the second one points to; the only other user makes sure of that and unsafe_put_user() should do the same. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds authored
Pull tracing fixes from Steven Rostedt: - Fix a bug caused by not cleaning up the new instance unique triggers when deleting an instance. It also creates a selftest that triggers that bug. - Fix the delayed optimization happening after kprobes boot up self tests being removed by freeing of init memory. - Comment kprobes on why the delay optimization is not a problem for removal of modules, to keep other developers from searching that riddle. - Fix another case of rcu not watching in stack trace tracing. * tag 'trace-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Make sure RCU is watching before calling a stack trace kprobes: Document how optimized kprobes are removed from module unload selftests/ftrace: Add test to remove instance with active event triggers selftests/ftrace: Fix bashisms ftrace: Remove #ifdef from code and add clear_ftrace_function_probes() stub ftrace/instances: Clear function triggers when removing instances ftrace: Simplify glob handling in unregister_ftrace_function_probe_func() tracing/kprobes: Enforce kprobes teardown after testing tracing: Move postpone selftests to core from early_initcall
-
- 20 May, 2017 13 commits
-
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block fixes from Jens Axboe: "A small collection of fixes that should go into this cycle. - a pull request from Christoph for NVMe, which ended up being manually applied to avoid pulling in newer bits in master. Mostly fibre channel fixes from James, but also a few fixes from Jon and Vijay - a pull request from Konrad, with just a single fix for xen-blkback from Gustavo. - a fuseblk bdi fix from Jan, fixing a regression in this series with the dynamic backing devices. - a blktrace fix from Shaohua, replacing sscanf() with kstrtoull(). - a request leak fix for drbd from Lars, fixing a regression in the last series with the kref changes. This will go to stable as well" * 'for-linus' of git://git.kernel.dk/linux-block: nvmet: release the sq ref on rdma read errors nvmet-fc: remove target cpu scheduling flag nvme-fc: stop queues on error detection nvme-fc: require target or discovery role for fc-nvme targets nvme-fc: correct port role bits nvme: unmap CMB and remove sysfs file in reset path blktrace: fix integer parse fuseblk: Fix warning in super_setup_bdi_name() block: xen-blkback: add null check to avoid null pointer dereference drbd: fix request leak introduced by locking/atomic, kref: Kill kref_sub()
-
Vijay Immanuel authored
On rdma read errors, release the sq ref that was taken when the req was initialized. This avoids a hang in nvmet_sq_destroy() when the queue is being freed. Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
James Smart authored
Remove NVMET_FCTGTFEAT_NEEDS_CMD_CPUSCHED. It's unnecessary. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
James Smart authored
Per the recommendation by Sagi on: http://lists.infradead.org/pipermail/linux-nvme/2017-April/009261.html Rather than waiting for reset work thread to stop queues and abort the ios, immediately stop the queues on error detection. Reset thread will restop the queues (as it's called on other paths), but it does not appear to have a side effect. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
James Smart authored
In order to create an association, the remoteport must be serving either a target role or a discovery role. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
James Smart authored
FC Port roles is a bit mask, not individual values. Correct nvme definitions to unique bits. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jon Derrick authored
CMB doesn't get unmapped until removal while getting remapped on every reset. Add the unmapping and sysfs file removal to the reset path in nvme_pci_disable to match the mapping path in nvme_pci_enable. Fixes: 202021c1 ("nvme : Add sysfs entry for NVMe CMBs when appropriate") Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Acked-by: Keith Busch <keith.busch@intel.com> Reviewed-By: Stephen Bates <sbates@raithlin.com> Cc: <stable@vger.kernel.org> # 4.9+ Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/stagingLinus Torvalds authored
Pull staging driver fixes from Greg KH: "Here are a number of staging driver fixes for 4.12-rc2 Most of them are typec driver fixes found by reviewers and users of the code. There are also some removals of files no longer needed in the tree due to the ion driver rewrite in 4.12-rc1, as well as some wifi driver fixes. And to round it out, a MAINTAINERS file update. All have been in linux-next with no reported issues" * tag 'staging-4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (22 commits) MAINTAINERS: greybus-dev list is members-only staging: fsl-dpaa2/eth: add ETHERNET dependency staging: typec: fusb302: refactor resume retry mechanism staging: typec: fusb302: reset i2c_busy state in error staging: rtl8723bs: remove re-positioned call to kfree in os_dep/ioctl_cfg80211.c staging: rtl8192e: GetTs Fix invalid TID 7 warning. staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD. staging: rtl8192e: fix 2 byte alignment of register BSSIDR. staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory. staging: vc04_services: Fix bulk cache maintenance staging: ccree: remove extraneous spin_unlock_bh() in error handler staging: typec: Fix sparse warnings about incorrect types staging: typec: fusb302: do not free gpio from managed resource staging: typec: tcpm: Fix Port Power Role field in PS_RDY messages staging: typec: tcpm: Respond to Discover Identity commands staging: typec: tcpm: Set correct flags in PD request messages staging: typec: tcpm: Drop duplicate PD messages staging: typec: fusb302: Fix chip->vbus_present init value staging: typec: fusb302: Fix module autoload staging: typec: tcpci: declare private structure as static ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds authored
Pull USB fixes from Greg KH: "Here are a number of small USB fixes for 4.12-rc2 Most of them come from Johan, in his valiant quest to fix up all drivers that could be affected by "malicious" USB devices. There's also some fixes for more "obscure" drivers to handle some of the vmalloc stack fallout (which for USB drivers, was always the case, but very few people actually ran those systems...) Other than that, the normal set of xhci and gadget and musb driver fixes as well. All have been in linux-next with no reported issues" * tag 'usb-4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (42 commits) usb: musb: tusb6010_omap: Do not reset the other direction's packet size usb: musb: Fix trying to suspend while active for OTG configurations usb: host: xhci-plat: propagate return value of platform_get_irq() xhci: Fix command ring stop regression in 4.11 xhci: remove GFP_DMA flag from allocation USB: xhci: fix lock-inversion problem usb: host: xhci-ring: don't need to clear interrupt pending for MSI enabled hcd usb: host: xhci-mem: allocate zeroed Scratchpad Buffer xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton usb: xhci: trace URB before giving it back instead of after USB: serial: qcserial: add more Lenovo EM74xx device IDs USB: host: xhci: use max-port define USB: hub: fix SS max number of ports USB: hub: fix non-SS hub-descriptor handling USB: hub: fix SS hub-descriptor handling USB: usbip: fix nonconforming hub descriptor USB: gadget: dummy_hcd: fix hub-descriptor removable fields doc-rst: fixed kernel-doc directives in usb/typec.rst USB: core: of: document reference taken by companion helper USB: ehci-platform: fix companion-device leak ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-miscLinus Torvalds authored
Pull char/misc driver fixes from Greg KH: "Here are five small bugfixes for reported issues with 4.12-rc1 and earlier kernels. Nothing huge here, just a lp, mem, vpd, and uio driver fix, along with a Kconfig fixup for one of the misc drivers. All of these have been in linux-next with no reported issues" * tag 'char-misc-4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: firmware: Google VPD: Fix memory allocation error handling drivers: char: mem: Check for address space wraparound with mmap() uio: fix incorrect memory leak cleanup misc: pci_endpoint_test: select CRC32 char: lp: fix possible integer overflow in lp_setup()
-
git://www.linux-watchdog.org/linux-watchdogLinus Torvalds authored
Pull watchdog fixes from Wim Van Sebroeck: - orion_wdt compile-test dependencies - sama5d4_wdt: WDDIS handling and a race confition - pcwd_usb: fix NULL-deref at probe - cadence_wdt: fix timeout setting - wdt_pci: fix build error if SOFTWARE_REBOOT is defined - iTCO_wdt: all versions count down twice - zx2967: remove redundant dev_err call in zx2967_wdt_probe() - bcm281xx: Fix use of uninitialized spinlock * git://www.linux-watchdog.org/linux-watchdog: watchdog: bcm281xx: Fix use of uninitialized spinlock. watchdog: zx2967: remove redundant dev_err call in zx2967_wdt_probe() iTCO_wdt: all versions count down twice watchdog: wdt_pci: fix build error if define SOFTWARE_REBOOT watchdog: cadence_wdt: fix timeout setting watchdog: pcwd_usb: fix NULL-deref at probe watchdog: sama5d4: fix race condition watchdog: sama5d4: fix WDDIS handling watchdog: orion: fix compile-test dependencies
-
git://people.freedesktop.org/~airlied/linuxLinus Torvalds authored
Pull drm fixes from Dave Airlie: "Mostly nouveau and i915, fairly quiet as usual for rc2" * tag 'drm-fixes-for-v4.12-rc2' of git://people.freedesktop.org/~airlied/linux: drm/atmel-hlcdc: Fix output initialization gpu: host1x: select IOMMU_IOVA drm/nouveau/fifo/gk104-: Silence a locking warning drm/nouveau/secboot: plug memory leak in ls_ucode_img_load_gr() error path drm/nouveau: Fix drm poll_helper handling drm/i915: don't do allocate_va_range again on PIN_UPDATE drm/i915: Fix rawclk readout for g4x drm/i915: Fix runtime PM for LPE audio drm/i915/glk: Fix DSI "*ERROR* ULPS is still active" messages drm/i915/gvt: avoid unnecessary vgpu switch drm/i915/gvt: not to restore in-context mmio drm/etnaviv: don't put fence in case of submit failure drm/i915/gvt: fix typo: "supporte" -> "support" drm: hdlcd: Fix the calculation of the scanout start address
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull SCSI fixes from James Bottomley: "This is the first sweep of mostly minor fixes. There's one security one: the read past the end of a buffer in qedf, and a panic fix for lpfc SLI-3 adapters, but the rest are a set of include and build dependency tidy ups and assorted other small fixes and updates" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: pmcraid: remove redundant check to see if request_size is less than zero scsi: lpfc: ensure els_wq is being checked before destroying it scsi: cxlflash: Select IRQ_POLL scsi: qedf: Avoid reading past end of buffer scsi: qedf: Cleanup the type of io_log->op scsi: lpfc: double lock typo in lpfc_ns_rsp() scsi: qedf: properly update arguments position in function call scsi: scsi_lib: Add #include <scsi/scsi_transport.h> scsi: MAINTAINERS: update OSD entries scsi: Skip deleted devices in __scsi_device_lookup scsi: lpfc: Fix panic on BFS configuration scsi: libfc: do not flood console with messages 'libfc: queue full ...'
-