- 14 Jun, 2017 40 commits
-
-
Greg Kroah-Hartman authored
-
Vegard Nossum authored
commit b0f5a8f3 upstream. This fixes a regression in commit 4d6501dc where I didn't notice that MIPS and OpenRISC were reinitialising p->{set,clear}_child_tid to NULL after our initialisation in copy_process(). We can simply get rid of the arch-specific initialisation here since it is now always done in copy_process() before hitting copy_thread{,_tls}(). Review notes: - As far as I can tell, copy_process() is the only user of copy_thread_tls(), which is the only caller of copy_thread() for architectures that don't implement copy_thread_tls(). - After this patch, there is no arch-specific code touching p->set_child_tid or p->clear_child_tid whatsoever. - It may look like MIPS/OpenRISC wanted to always have these fields be NULL, but that's not true, as copy_process() would unconditionally set them again _after_ calling copy_thread_tls() before commit 4d6501dc. Fixes: 4d6501dc ("kthread: Fix use-after-free if kthread fork fails") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> # MIPS only Acked-by: Stafford Horne <shorne@gmail.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: openrisc@lists.librecores.org Cc: Jamie Iles <jamie.iles@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit d2df92e9 upstream. The existing code selects no next branch to be inspected when re-inserting an inactive element into the rb-tree, looping endlessly. This patch restricts the check for active elements to the EEXIST case only. Fixes: e701001e ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates") Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jani Nikula authored
commit bb1d1329 upstream. The main thing are the DDI ports. If there's a VBT that says there are no outputs, we should trust that, and not have semi-random defaults. Unfortunately, the defaults have resulted in some Chromebooks without VBT to rely on this behaviour, so we split out the defaults for the missing VBT case. Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/95c26079ff640d43f53b944f17e9fc356b36daec.1489152288.git.jani.nikula@intel.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jani Nikula authored
commit 66578857 upstream. We don't use the error return for anything other than reporting and logging that there is no VBT. We can pull the logging in the function, and remove the error status return. Moreover, if we needed the information for something later on, we'd probably be better off storing the bit in dev_priv, and using it where it's needed, instead of using the error return. While at it, improve the comments. Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/438ebbb0d5f0d321c625065b9cc78532a1dab24f.1489152288.git.jani.nikula@intel.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Moore authored
commit 48d0e023 upstream. Cong Wang correctly pointed out that the RCU read locking of the auditd_connection struct was wrong, this patch correct this by adopting a more traditional, and correct RCU locking model. This patch is heavily based on an earlier prototype by Cong Wang. Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
commit 90b4f30b upstream. The recent conversion to the hotplug state machine missed that the original hotplug notifiers did not execute in the frozen state, which is used on suspend on resume. This does not matter on single socket machines, but on multi socket systems this breaks when the device for a non-boot socket is removed when the last CPU of that socket is brought offline. The device removal locks up the machine hard w/o any debug output. Prevent executing the hotplug callbacks when cpuhp_tasks_frozen is true. Thanks to Tommi for providing debug information patiently while I failed to spot the obvious. Fixes: e00ca5df ("hwmon: (coretemp) Convert to hotplug state machine") Reported-by: Tommi Rantala <tt.rantala@gmail.com> Tested-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Cc: "Chen, Yu C" <yu.c.chen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chandan Rajendra authored
commit a008c31c upstream. On a ppc64 machine executing overlayfs/019 with xfs as the lower and upper filesystem causes the following call trace, WARNING: CPU: 2 PID: 8034 at /root/repos/linux/fs/iomap.c:765 .iomap_dio_actor+0xcc/0x420 Modules linked in: CPU: 2 PID: 8034 Comm: fsstress Tainted: G L 4.11.0-rc5-next-20170405 #100 task: c000000631314880 task.stack: c0000003915d4000 NIP: c00000000035a72c LR: c00000000035a6f4 CTR: c00000000035a660 REGS: c0000003915d7570 TRAP: 0700 Tainted: G L (4.11.0-rc5-next-20170405) MSR: 800000000282b032 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI> CR: 24004284 XER: 00000000 CFAR: c0000000006f7190 SOFTE: 1 GPR00: c00000000035a6f4 c0000003915d77f0 c0000000015a3f00 000000007c22f600 GPR04: 000000000022d000 0000000000002600 c0000003b2d56360 c0000003915d7960 GPR08: c0000003915d7cd0 0000000000000002 0000000000002600 c000000000521cc0 GPR12: 0000000024004284 c00000000fd80a00 000000004b04ae64 ffffffffffffffff GPR16: 000000001000ca70 0000000000000000 c0000003b2d56380 c00000000153d2b8 GPR20: 0000000000000010 c0000003bc87bac8 0000000000223000 000000000022f5ff GPR24: c0000003b2d56360 000000000000000c 0000000000002600 000000000022d000 GPR28: 0000000000000000 c0000003915d7960 c0000003b2d56360 00000000000001ff NIP [c00000000035a72c] .iomap_dio_actor+0xcc/0x420 LR [c00000000035a6f4] .iomap_dio_actor+0x94/0x420 Call Trace: [c0000003915d77f0] [c00000000035a6f4] .iomap_dio_actor+0x94/0x420 (unreliable) [c0000003915d78f0] [c00000000035b9f4] .iomap_apply+0xf4/0x1f0 [c0000003915d79d0] [c00000000035c320] .iomap_dio_rw+0x230/0x420 [c0000003915d7ae0] [c000000000512a14] .xfs_file_dio_aio_read+0x84/0x160 [c0000003915d7b80] [c000000000512d24] .xfs_file_read_iter+0x104/0x130 [c0000003915d7c10] [c0000000002d6234] .__vfs_read+0x114/0x1a0 [c0000003915d7cf0] [c0000000002d7a8c] .vfs_read+0xac/0x1a0 [c0000003915d7d90] [c0000000002d96b8] .SyS_read+0x58/0x100 [c0000003915d7e30] [c00000000000b8e0] system_call+0x38/0xfc Instruction dump: 78630020 7f831b78 7ffc07b4 7c7ce039 40820360 a13d0018 2f890003 419e0288 2f890004 419e00a0 2f890001 419e02a8 <0fe00000> 3b80fffb 38210100 7f83e378 The above problem can also be recreated on a regular xfs filesystem using the command, $ fsstress -d /mnt -l 1000 -n 1000 -p 1000 The reason for the call trace is, 1. When 'reserving' blocks for delayed allocation , XFS reserves more blocks (i.e. past file's current EOF) than required. This is done because XFS assumes that userspace might write more data and hence 'reserving' more blocks might lead to the file's new data being stored contiguously on disk. 2. The in-memory 'struct xfs_bmbt_irec' mapping the file's last extent would then cover the prealloc-ed EOF blocks in addition to the regular blocks. 3. When flushing the dirty blocks to disk, we only flush data till the file's EOF. But before writing out the dirty data, we allocate blocks on the disk for holding the file's new data. This allocation includes the blocks that are part of the 'prealloc EOF blocks'. 4. Later, when the last reference to the inode is being closed, XFS frees the unused 'prealloc EOF blocks' in xfs_inactive(). In step 3 above, When allocating space on disk for the delayed allocation range, the space allocator might sometimes allocate less blocks than required. If such an allocation ends right at the current EOF of the file, We will not be able to clear the "delayed allocation" flag for the 'prealloc EOF blocks', since we won't have dirty buffer heads associated with that range of the file. In such a situation if a Direct I/O read operation is performed on file range [X, Y] (where X < EOF and Y > EOF), we flush dirty data in the range [X, Y] and invalidate page cache for that range (Refer to iomap_dio_rw()). Later for performing the Direct I/O read, XFS obtains the extent items (which are still cached in memory) for the file range. When doing so we are not supposed to get an extent item with IOMAP_DELALLOC flag set, since the previous "flush" operation should have converted any delayed allocation data in the range [X, Y]. Hence we end up hitting a WARN_ON_ONCE(1) statement in iomap_dio_actor(). This commit fixes the bug by preventing the read operation from going beyond iomap_dio->i_size. Reported-by: Santhosh G <santhog4@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tejun Heo authored
commit 310b4816 upstream. a590b90d ("cgroup: fix spurious warnings on cgroup_is_dead() from cgroup_sk_alloc()") converted most cgroup_get() usages to cgroup_get_live() leaving cgroup_sk_alloc() the sole user of cgroup_get(). When !CONFIG_SOCK_CGROUP_DATA, this ends up triggering unused warning for cgroup_get(). Silence the warning by adding __maybe_unused to cgroup_get(). Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lkml.kernel.org/r/20170501145340.17e8ef86@canb.auug.org.auSigned-off-by: Tejun Heo <tj@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Wei Yongjun authored
commit a9de080b upstream. Make sure dmi_system_id tables are NULL terminated. Fixes: 70365027 ("pinctrl: cherryview: Add a quirk to make Acer Chromebook keyboard work again") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Cc: Jean Delvare <jdelvare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takatoshi Akiyama authored
commit 3c910176 upstream. This patch fixes an issue that kernel panic happens when DMA is enabled and we press enter key while the kernel booting on the serial console. * An interrupt may occur after sci_request_irq(). * DMA transfer area is initialized by setup_timer() in sci_request_dma() and used in interrupt. If an interrupt occurred between sci_request_irq() and setup_timer() in sci_request_dma(), DMA transfer area has not been initialized yet. So, this patch changes the order of sci_request_irq() and sci_request_dma(). Fixes: 73a19e4c ("serial: sh-sci: Add DMA support.") Signed-off-by: Takatoshi Akiyama <takatoshi.akiyama.kj@ps.hitachi-solutions.com> [Shimoda changes the commit log] Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michał Winiarski authored
commit ca7a45ba upstream. Used by production device: Intel(R) Iris(TM) Graphics P555 Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227112256.20060-1-michal.winiarski@intel.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ville Syrjälä authored
commit 668e3b01 upstream. Not calling pm_runtime_enable() means that runtime PM can't be enabled at all via sysfs. So we definitely need to call it from somewhere. Calling it from the driver seems like a bad idea because it would have to be paired with a pm_runtime_disable() at driver unload time, otherwise the core gets upset. Also if there's no LPE audio driver loaded then we couldn't runtime suspend i915 either. So it looks like a better plan is to call it from i915 when we register the platform device. That seems to match how pci generally does things. I cargo culted the pm_runtime_forbid() and pm_runtime_set_active() calls from pci as well. The exposed runtime PM API is massive an thorougly misleading, so I don't actually know if this is how you're supposed to use the API or not. But it seems to work. I can now runtime suspend i915 again with or without the LPE audio driver loaded, and reloading the LPE audio driver also seems to work. Note that powertop won't auto-tune runtime PM for platform devices, which is a little annoying. So I'm not sure that leaving runtime PM in "on" mode by default is the best choice here. But I've left it like that for now at least. Also remove the comment about there not being much benefit from LPE audio runtime PM. Not allowing runtime PM blocks i915 runtime PM, which will also block s0ix, and that could have a measurable impact on power consumption. Cc: Takashi Iwai <tiwai@suse.de> Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Fixes: 0b6b524f ("ALSA: x86: Don't enable runtime PM as default") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170427160231.13337-2-ville.syrjala@linux.intel.comReviewed-by: Takashi Iwai <tiwai@suse.de> (cherry picked from commit 183c0035) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Julius Werner authored
commit 32829da5 upstream. A recent fix to /dev/mem prevents mappings from wrapping around the end of physical address space. However, the check was written in a way that also prevents a mapping reaching just up to the end of physical address space, which may be a valid use case (especially on 32-bit systems). This patch fixes it by checking the last mapped address (instead of the first address behind that) for overflow. Fixes: b299cde2 ("drivers: char: mem: Check for address space wraparound with mmap()") Reported-by: Nico Huber <nico.h@gmx.de> Signed-off-by: Julius Werner <jwerner@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sebastian Andrzej Siewior authored
commit 40da1b11 upstream. If a custom CPU target is specified and that one is not available _or_ can't be interrupted then the code returns to userland without dropping a lock as notices by lockdep: |echo 133 > /sys/devices/system/cpu/cpu7/hotplug/target | ================================================ | [ BUG: lock held when returning to user space! ] | ------------------------------------------------ | bash/503 is leaving the kernel with locks still held! | 1 lock held by bash/503: | #0: (device_hotplug_lock){+.+...}, at: [<ffffffff815b5650>] lock_device_hotplug_sysfs+0x10/0x40 So release the lock then. Fixes: 757c989b ("cpu/hotplug: Make target state writeable") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170602142714.3ogo25f2wbq6fjpj@linutronix.deSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Iwai authored
commit 4efda5f2 upstream. soc_cleanup_card_resources() call snd_card_free() at the last of its procedure. This turned out to lead to a use-after-free. PCM runtimes have been already removed via soc_remove_pcm_runtimes(), while it's dereferenced later in soc_pcm_free() called via snd_card_free(). The fix is simple: just move the snd_card_free() call to the beginning of the whole procedure. This also gives another benefit: it guarantees that all operations have been shut down before actually releasing the resources, which was racy until now. Reported-and-tested-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Iwai authored
commit ba3021b2 upstream. snd_timer_user_tselect() reallocates the queue buffer dynamically, but it forgot to reset its indices. Since the read may happen concurrently with ioctl and snd_timer_user_tselect() allocates the buffer via kmalloc(), this may lead to the leak of uninitialized kernel-space data, as spotted via KMSAN: BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10 CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x143/0x1b0 lib/dump_stack.c:52 kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007 kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086 copy_to_user ./arch/x86/include/asm/uaccess.h:725 snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004 do_loop_readv_writev fs/read_write.c:716 __do_readv_writev+0x94c/0x1380 fs/read_write.c:864 do_readv_writev fs/read_write.c:894 vfs_readv fs/read_write.c:908 do_readv+0x52a/0x5d0 fs/read_write.c:934 SYSC_readv+0xb6/0xd0 fs/read_write.c:1021 SyS_readv+0x87/0xb0 fs/read_write.c:1018 This patch adds the missing reset of queue indices. Together with the previous fix for the ioctl/read race, we cover the whole problem. Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Iwai authored
commit d11662f4 upstream. The read from ALSA timer device, the function snd_timer_user_tread(), may access to an uninitialized struct snd_timer_user fields when the read is concurrently performed while the ioctl like snd_timer_user_tselect() is invoked. We have already fixed the races among ioctls via a mutex, but we seem to have forgotten the race between read vs ioctl. This patch simply applies (more exactly extends the already applied range of) tu->ioctl_lock in snd_timer_user_tread() for closing the race window. Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ben Skeggs authored
commit b4e382ca upstream. Reusing the list_head for both is a bad idea. Callback execution is done with the lock dropped so that alarms can be rescheduled from the callback, which means that with some unfortunate timing, lists can get corrupted. The execution list should not require its own locking, the single function that uses it can only be called from a single context. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dominik Brodowski authored
commit 5b0bc9ac upstream. During early boot, load_ucode_intel_ap() uses __load_ucode_intel() to obtain a pointer to the relevant microcode patch (embedded in the initrd), and stores this value in 'intel_ucode_patch' to speed up the microcode patch application for subsequent CPUs. On resuming from suspend-to-RAM, however, load_ucode_ap() calls load_ucode_intel_ap() for each non-boot-CPU. By then the initramfs is long gone so the pointer stored in 'intel_ucode_patch' no longer points to a valid microcode patch. Clear that pointer so that we effectively fall back to the CPU hotplug notifier callbacks to update the microcode. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> [ Edit and massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170607095819.9754-1-bp@alien8.deSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sinclair Yeh authored
commit 07678eca upstream. When vmw_gb_surface_define_ioctl() is called with an existing buffer, we end up returning an uninitialized variable in the backup_handle. The fix is to first initialize backup_handle to 0 just to be sure, and second, when a user-provided buffer is found, we will use the req->buffer_handle as the backup_handle. Reported-by: Murray McAllister <murray.mcallister@insomniasec.com> Signed-off-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vladis Dronov authored
commit ee9c4e68 upstream. The 'req->mip_levels' parameter in vmw_gb_surface_define_ioctl() is a user-controlled 'uint32_t' value which is used as a loop count limit. This can lead to a kernel lockup and DoS. Add check for 'req->mip_levels'. References: https://bugzilla.redhat.com/show_bug.cgi?id=1437431Signed-off-by: Vladis Dronov <vdronov@redhat.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit f0c62e98 upstream. If vmalloc() fails then we need to a bit of cleanup before returning. Fixes: fb1d9738 ("drm/vmwgfx: Add DRM driver for VMware Virtual GPU") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Timur Tabi authored
commit 24609669 upstream. Use software polling (PHY_POLL) to check for link state changes instead of relying on the EMAC's hardware polling feature. Some PHY drivers are unable to get a functioning link because the HW polling is not robust enough. The EMAC is able to poll the PHY on the MDIO bus looking for link state changes (via the Link Status bit in the Status Register at address 0x1). When the link state changes, the EMAC triggers an interrupt and tells the driver what the new state is. The feature eliminates the need for software to poll the MDIO bus. Unfortunately, this feature is incompatible with phylib, because it ignores everything that the PHY core and PHY drivers are trying to do. In particular: 1. It assumes a compatible register set, so PHYs with different registers may not work. 2. It doesn't allow for hardware errata that have work-arounds implemented in the PHY driver. 3. It doesn't support multiple register pages. If the PHY core switches the register set to another page, the EMAC won't know the page has changed and will still attempt to read the same PHY register. 4. It only checks the copper side of the link, not the SGMII side. Some PHY drivers (e.g. at803x) may also check the SGMII side, and report the link as not ready during autonegotiation if the SGMII link is still down. Phylib then waits for another interrupt to query the PHY again, but the EMAC won't send another interrupt because it thinks the link is up. Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paolo Bonzini authored
commit 1123a604 upstream. Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting down a guest running iperf on a VFIO assigned device. This happens because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt context, while a worker thread does the same inside kvm_set_irq(). If the interrupt happens while the worker thread is executing __srcu_read_lock(), updates to the Classic SRCU ->lock_count[] field or the Tree SRCU ->srcu_lock_count[] field can be lost. The docs say you are not supposed to call srcu_read_lock() and srcu_read_unlock() from irq context, but KVM interrupt injection happens from (host) interrupt context and it would be nice if SRCU supported the use case. KVM is using SRCU here not really for the "sleepable" part, but rather due to its IPI-free fast detection of grace periods. It is therefore not desirable to switch back to RCU, which would effectively revert commit 719d93cd ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING", 2014-01-16). However, the docs are overly conservative. You can have an SRCU instance only has users in irq context, and you can mix process and irq context as long as process context users disable interrupts. In addition, __srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and Classic SRCU. For those two implementations, only srcu_read_lock() is unsafe. When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(), in commit 5a41344a ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments. Therefore it kept __this_cpu_inc(), with preempt_disable/enable in the caller. Tree SRCU however only does one increment, so on most architectures it is more efficient for __srcu_read_lock() to use this_cpu_inc(), and any performance differences appear to be down in the noise. Fixes: 719d93cd ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING") Reported-by: Linu Cherian <linuc.decode@gmail.com> Suggested-by: Linu Cherian <linuc.decode@gmail.com> Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jin Yao authored
commit cc1582c2 upstream. When doing sampling, for example: perf record -e cycles:u ... On workloads that do a lot of kernel entry/exits we see kernel samples, even though :u is specified. This is due to skid existing. This might be a security issue because it can leak kernel addresses even though kernel sampling support is disabled. The patch drops the kernel samples if exclude_kernel is specified. For example, test on Haswell desktop: perf record -e cycles:u <mgen> perf report --stdio Before patch applied: 99.77% mgen mgen [.] buf_read 0.20% mgen mgen [.] rand_buf_init 0.01% mgen [kernel.vmlinux] [k] apic_timer_interrupt 0.00% mgen mgen [.] last_free_elem 0.00% mgen libc-2.23.so [.] __random_r 0.00% mgen libc-2.23.so [.] _int_malloc 0.00% mgen mgen [.] rand_array_init 0.00% mgen [kernel.vmlinux] [k] page_fault 0.00% mgen libc-2.23.so [.] __random 0.00% mgen libc-2.23.so [.] __strcasestr 0.00% mgen ld-2.23.so [.] strcmp 0.00% mgen ld-2.23.so [.] _dl_start 0.00% mgen libc-2.23.so [.] sched_setaffinity@@GLIBC_2.3.4 0.00% mgen ld-2.23.so [.] _start We can see kernel symbols apic_timer_interrupt and page_fault. After patch applied: 99.79% mgen mgen [.] buf_read 0.19% mgen mgen [.] rand_buf_init 0.00% mgen libc-2.23.so [.] __random_r 0.00% mgen mgen [.] rand_array_init 0.00% mgen mgen [.] last_free_elem 0.00% mgen libc-2.23.so [.] vfprintf 0.00% mgen libc-2.23.so [.] rand 0.00% mgen libc-2.23.so [.] __random 0.00% mgen libc-2.23.so [.] _int_malloc 0.00% mgen libc-2.23.so [.] _IO_doallocbuf 0.00% mgen ld-2.23.so [.] do_lookup_x 0.00% mgen ld-2.23.so [.] open_verify.constprop.7 0.00% mgen ld-2.23.so [.] _dl_important_hwcaps 0.00% mgen libc-2.23.so [.] sched_setaffinity@@GLIBC_2.3.4 0.00% mgen ld-2.23.so [.] _start There are only userspace symbols. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Cc: kan.liang@intel.com Cc: mark.rutland@arm.com Cc: will.deacon@arm.com Cc: yao.jin@intel.com Link: http://lkml.kernel.org/r/1495706947-3744-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andrew Lunn authored
commit 3e4240da upstream. This reverts commit 368e5fbd. devm_ioremap_resource() enforces that there are no overlapping resources, where as devm_ioremap() does not. The sata phy driver needs a subset of the sata IO address space, so maps some of the sata address space. As a result, sata_mv now fails to probe, reporting it cannot get its resources, and so we don't have any SATA disks. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Breno Leitao authored
commit 7f22ced4 upstream. Currently tsk->thread.load_tm is not initialized in the task creation and can contain garbage on a new task. This is an undesired behaviour, since it affects the timing to enable and disable the transactional memory laziness (disabling and enabling the MSR TM bit, which affects TM reclaim and recheckpoint in the scheduling process). Fixes: 5d176f75 ("powerpc: tm: Enable transactional memory (TM) lazily for userspace") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Breno Leitao authored
commit 1195892c upstream. Currently tsk->thread->load_vec and load_fp are not initialized during task creation, which can lead to garbage values in these variables (non-zero values). These variables will be checked later in restore_math() to validate if the FP and vector registers are being utilized. Since these values might be non-zero, the restore_math() will continue to save the FP and vectors even if they were never utilized by the userspace application. load_fp and load_vec counters will then overflow (they wrap at 255) and the FP and Altivec will be finally disabled, but before that condition is reached (counter overflow) several context switches will have restored FP and vector registers without need, causing a performance degradation. Fixes: 70fe3d98 ("powerpc: Restore FPU/VEC/VSX if previously used") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Gustavo Romero <gusbromero@gmail.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael Bringmann authored
commit dc421b20 upstream. When adding or removing memory, the aa_index (affinity value) for the memblock must also be converted to match the endianness of the rest of the 'ibm,dynamic-memory' property. Otherwise, subsequent retrieval of the attribute will likely lead to non-existent nodes, followed by using the default node in the code inappropriately. Fixes: 5f97b2a0 ("powerpc/pseries: Implement memory hotplug add in the kernel") Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael Ellerman authored
commit ba4a648f upstream. In commit 8c272261 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we switched to the generic implementation of cpu_to_node(), which uses a percpu variable to hold the NUMA node for each CPU. Unfortunately we neglected to notice that we use cpu_to_node() in the allocation of our percpu areas, leading to a chicken and egg problem. In practice what happens is when we are setting up the percpu areas, cpu_to_node() reports that all CPUs are on node 0, so we allocate all percpu areas on node 0. This is visible in the dmesg output, as all pcpu allocs being in group 0: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 To fix it we need an early_cpu_to_node() which can run prior to percpu being setup. We already have the numa_cpu_lookup_table we can use, so just plumb it in. With the patch dmesg output shows two groups, 0 and 1: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31 pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39 pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47 We can also check the data_offset in the paca of various CPUs, with the fix we see: CPU 0: data_offset = 0x0ffe8b0000 CPU 24: data_offset = 0x1ffe5b0000 And we can see from dmesg that CPU 24 has an allocation on node 1: node 0: [mem 0x0000000000000000-0x0000000fffffffff] node 1: [mem 0x0000001000000000-0x0000001fffffffff] Fixes: 8c272261 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christophe Leroy authored
commit 6f553912 upstream. of_mm_gpiochip_add_data() generates an oops for NULL pointer dereference. of_mm_gpiochip_add_data() calls mm_gc->save_regs() before setting the data, therefore ->save_regs() cannot use gpiochip_get_data() Fixes: 937daafc ("powerpc: simple-gpio: use gpiochip data pointer") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joe Carnuccio authored
commit 74939a0b upstream. Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joe Carnuccio authored
commit 1d634965 upstream. Set bit (BIT_15) to send right ECHO payload information for Diagnostic Echo Test command. Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joe Carnuccio authored
commit ce6c668b upstream. Firmware dump allows for debugging customer issues. This patch fixes start/end pointer calculation to capture T262 template entry for dump tool. Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Quinn Tran authored
commit 0ea88662 upstream. Remove redundant fc_host_port_name calls to prevent early access of scsi_host->shost_data buffer. This prevent null pointer access. Following stack trace is seen: BUG: unable to handle kernel NULL pointer dereference at 00000000000008 IP: qla24xx_report_id_acquisition+0x22d/0x3a0 [qla2xxx] Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sawan Chandak authored
commit b95b9452 upstream. when driver is loaded with Multi Queue enabled, it was noticed that there was one less queue pair created. Following message would indicate this: "No resources to create additional q pair." The result of one less queue pair means that system can crash, if the block mq layer thinks there is an extra hardware queue available, and the driver will use a NULL ptr qpair in that instance. Following stack trace is seen in one of the crash: irq_create_affinity_masks+0x98/0x530 irq_create_affinity_masks+0x98/0x530 __pci_enable_msix+0x321/0x4e0 mutex_lock+0x12/0x40 pci_alloc_irq_vectors_affinity+0xb5/0x140 qla24xx_enable_msix+0x79/0x530 [qla2xxx] qla2x00_request_irqs+0x61/0x2d0 [qla2xxx] qla2x00_probe_one+0xc73/0x2390 [qla2xxx] ida_simple_get+0x98/0x100 kernfs_next_descendant_post+0x40/0x50 local_pci_probe+0x45/0xa0 pci_device_probe+0xfc/0x140 driver_probe_device+0x2c5/0x470 __driver_attach+0xdd/0xe0 driver_probe_device+0x470/0x470 bus_for_each_dev+0x6c/0xc0 driver_attach+0x1e/0x20 bus_add_driver+0x45/0x270 driver_register+0x60/0xe0 __pci_register_driver+0x4c/0x50 qla2x00_module_init+0x1ce/0x21e [qla2xxx] Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
himanshu.madhani@cavium.com authored
scsi: qla2xxx: Fix recursive loop during target mode configuration for ISP25XX leaving system unresponsive commit cb590700 upstream. Following messages are seen into system logs qla2xxx [0000:09:00.0]-00af:9: Performing ISP error recovery - ha=ffff98315ee30000. qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware. qla2xxx [0000:09:00.0]-d009:9: Firmware has been previously dumped (ffffba488c001000) -- ignoring request. qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware. See Bugzilla for details https://bugzilla.kernel.org/show_bug.cgi?id=195285 Fixes: d7459527 ("scsi: qla2xxx: Add multiple queue pair functionality.") Reported-by: Laurence Oberman <loberman@redhat.com> Reported-by: Anthony Bloodoff <anthony.bloodoff@gmail.com> Tested-by: Laurence Oberman <loberman@redhat.com> Tested-by: Anthony Bloodoff <anthony.bloodoff@gmail.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Johannes Thumshirn authored
commit ddff7ed4 upstream. When pci_enable_device() or pci_enable_device_mem() fail in qla2x00_probe_one() we bail out but do a call to pci_disable_device(). This causes the dev_WARN_ON() in pci_disable_device() to trigger, as the device wasn't enabled previously. So instead of taking the 'probe_out' error path we can directly return *iff* one of the pci_enable_device() calls fails. Additionally rename the 'probe_out' goto label's name to the more descriptive 'disable_device'. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: e315cd28 ("[SCSI] qla2xxx: Code changes for qla data structure refactoring") Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit d6dbdd3c upstream. Under memory pressure, we start ageing pages, which amounts to parsing the page tables. Since we don't want to allocate any extra level, we pass NULL for our private allocation cache. Which means that stage2_get_pud() is allowed to fail. This results in the following splat: [ 1520.409577] Unable to handle kernel NULL pointer dereference at virtual address 00000008 [ 1520.417741] pgd = ffff810f52fef000 [ 1520.421201] [00000008] *pgd=0000010f636c5003, *pud=0000010f56f48003, *pmd=0000000000000000 [ 1520.429546] Internal error: Oops: 96000006 [#1] PREEMPT SMP [ 1520.435156] Modules linked in: [ 1520.438246] CPU: 15 PID: 53550 Comm: qemu-system-aar Tainted: G W 4.12.0-rc4-00027-g1885c397eaec #7205 [ 1520.448705] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB12A 10/26/2016 [ 1520.463726] task: ffff800ac5fb4e00 task.stack: ffff800ce04e0000 [ 1520.469666] PC is at stage2_get_pmd+0x34/0x110 [ 1520.474119] LR is at kvm_age_hva_handler+0x44/0xf0 [ 1520.478917] pc : [<ffff0000080b137c>] lr : [<ffff0000080b149c>] pstate: 40000145 [ 1520.486325] sp : ffff800ce04e33d0 [ 1520.489644] x29: ffff800ce04e33d0 x28: 0000000ffff40064 [ 1520.494967] x27: 0000ffff27e00000 x26: 0000000000000000 [ 1520.500289] x25: ffff81051ba65008 x24: 0000ffff40065000 [ 1520.505618] x23: 0000ffff40064000 x22: 0000000000000000 [ 1520.510947] x21: ffff810f52b20000 x20: 0000000000000000 [ 1520.516274] x19: 0000000058264000 x18: 0000000000000000 [ 1520.521603] x17: 0000ffffa6fe7438 x16: ffff000008278b70 [ 1520.526940] x15: 000028ccd8000000 x14: 0000000000000008 [ 1520.532264] x13: ffff7e0018298000 x12: 0000000000000002 [ 1520.537582] x11: ffff000009241b93 x10: 0000000000000940 [ 1520.542908] x9 : ffff0000092ef800 x8 : 0000000000000200 [ 1520.548229] x7 : ffff800ce04e36a8 x6 : 0000000000000000 [ 1520.553552] x5 : 0000000000000001 x4 : 0000000000000000 [ 1520.558873] x3 : 0000000000000000 x2 : 0000000000000008 [ 1520.571696] x1 : ffff000008fd5000 x0 : ffff0000080b149c [ 1520.577039] Process qemu-system-aar (pid: 53550, stack limit = 0xffff800ce04e0000) [...] [ 1521.510735] [<ffff0000080b137c>] stage2_get_pmd+0x34/0x110 [ 1521.516221] [<ffff0000080b149c>] kvm_age_hva_handler+0x44/0xf0 [ 1521.522054] [<ffff0000080b0610>] handle_hva_to_gpa+0xb8/0xe8 [ 1521.527716] [<ffff0000080b3434>] kvm_age_hva+0x44/0xf0 [ 1521.532854] [<ffff0000080a58b0>] kvm_mmu_notifier_clear_flush_young+0x70/0xc0 [ 1521.539992] [<ffff000008238378>] __mmu_notifier_clear_flush_young+0x88/0xd0 [ 1521.546958] [<ffff00000821eca0>] page_referenced_one+0xf0/0x188 [ 1521.552881] [<ffff00000821f36c>] rmap_walk_anon+0xec/0x250 [ 1521.558370] [<ffff000008220f78>] rmap_walk+0x78/0xa0 [ 1521.563337] [<ffff000008221104>] page_referenced+0x164/0x180 [ 1521.569002] [<ffff0000081f1af0>] shrink_active_list+0x178/0x3b8 [ 1521.574922] [<ffff0000081f2058>] shrink_node_memcg+0x328/0x600 [ 1521.580758] [<ffff0000081f23f4>] shrink_node+0xc4/0x328 [ 1521.585986] [<ffff0000081f2718>] do_try_to_free_pages+0xc0/0x340 [ 1521.592000] [<ffff0000081f2a64>] try_to_free_pages+0xcc/0x240 [...] The trivial fix is to handle this NULL pud value early, rather than dereferencing it blindly. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-