- 06 Jun, 2016 40 commits
-
-
Oleg Nesterov authored
[ Upstream commit bf959931 ] The following program (simplified version of generated by syzkaller) #include <pthread.h> #include <unistd.h> #include <sys/ptrace.h> #include <stdio.h> #include <signal.h> void *thread_func(void *arg) { ptrace(PTRACE_TRACEME, 0,0,0); return 0; } int main(void) { pthread_t thread; if (fork()) return 0; while (getppid() != 1) ; pthread_create(&thread, NULL, thread_func, NULL); pthread_join(thread, NULL); return 0; } creates an unreapable zombie if /sbin/init doesn't use __WALL. This is not a kernel bug, at least in a sense that everything works as expected: debugger should reap a traced sub-thread before it can reap the leader, but without __WALL/__WCLONE do_wait() ignores sub-threads. Unfortunately, it seems that /sbin/init in most (all?) distributions doesn't use it and we have to change the kernel to avoid the problem. Note also that most init's use sys_waitid() which doesn't allow __WALL, so the necessary user-space fix is not that trivial. This patch just adds the "ptrace" check into eligible_child(). To some degree this matches the "tsk->ptrace" in exit_notify(), ->exit_signal is mostly ignored when the tracee reports to debugger. Or WSTOPPED, the tracer doesn't need to set this flag to wait for the stopped tracee. This obviously means the user-visible change: __WCLONE and __WALL no longer have any meaning for debugger. And I can only hope that this won't break something, but at least strace/gdb won't suffer. We could make a more conservative change. Say, we can take __WCLONE into account, or !thread_group_leader(). But it would be nice to not complicate these historical/confusing checks. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Roland McGrath <roland@hack.frob.com> Cc: <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Tomáš Trnka authored
[ Upstream commit c0cb8bf3 ] The length of the GSS MIC token need not be a multiple of four bytes. It is then padded by XDR to a multiple of 4 B, but unwrap_integ_data() would previously only trim mic.len + 4 B. The remaining up to three bytes would then trigger a check in nfs4svc_decode_compoundargs(), leading to a "garbage args" error and mount failure: nfs4svc_decode_compoundargs: compound not properly padded! nfsd: failed to decode arguments! This would prevent older clients using the pre-RFC 4121 MIC format (37-byte MIC including a 9-byte OID) from mounting exports from v3.9+ servers using krb5i. The trimming was introduced by commit 4c190e2f ("sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer"). Fixes: 4c190e2f "unrpc: trim off trailing checksum..." Signed-off-by: Tomáš Trnka <ttrnka@mail.muni.cz> Cc: stable@vger.kernel.org Acked-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Adrian Hunter authored
[ Upstream commit 265984b3 ] The CMD19/CMD14 bus width test has been found to be unreliable in some cases. It is not essential, so simply remove it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Matt Gumbel authored
[ Upstream commit 32ecd320 ] 008GE0 Toshiba mmc in some Intel Baytrail tablets responds to MMC_SEND_EXT_CSD in 450-600ms. This patch will... () Increase the long read time quirk timeout from 300ms to 600ms. Original author of that quirk says 300ms was only a guess and that the number may need to be raised in the future. () Add this specific MMC to the quirk Signed-off-by: Matt Gumbel <matthew.k.gumbel@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Ville Syrjälä authored
[ Upstream commit 7045c368 ] When we read out the watermark state from the hardware we're supposed to transfer that into the active watermarks, but currently we fail to any part of the active watermarks that isn't explicitly written. Let's clear it all upfront. Looks like this has been like this since the beginning, when I added the readout. No idea why I didn't clear it up. Cc: Matt Roper <matthew.d.roper@intel.com> Fixes: 243e6a44 ("drm/i915: Init HSW watermark tracking in intel_modeset_setup_hw_state()") Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463151318-14719-2-git-send-email-ville.syrjala@linux.intel.com (cherry picked from commit 15606534) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Rafael J. Wysocki authored
[ Upstream commit 3a17fb32 ] Grygorii Strashko reports: The PM runtime will be left disabled for the device if its .suspend_late() callback fails and async suspend is not allowed for this device. In this case device will not be added in dpm_late_early_list and dpm_resume_early() will ignore this device, as result PM runtime will be disabled for it forever (side effect: after 8 subsequent failures for the same device the PM runtime will be reenabled due to disable_depth overflow). To fix this problem, add devices to dpm_late_early_list regardless of whether or not device_suspend_late() returns errors for them. That will ensure failures in there to be handled consistently for all devices regardless of their async suspend/resume status. Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Ricky Liang authored
[ Upstream commit affa80bd ] When running a 32-bit userspace on a 64-bit kernel, the UI_SET_PHYS ioctl needs to be treated with special care, as it has the pointer size encoded in the command. Signed-off-by: Ricky Liang <jcliang@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Matt Evans authored
[ Upstream commit e4fe9e7d ] The EC field of the constructed ESR is conditionally modified by ORing in ESR_ELx_EC_DABT_LOW for a data abort. However, ESR_ELx_EC_SHIFT is missing from this condition. Signed-off-by: Matt Evans <matt.evans@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Kai-Heng Feng authored
[ Upstream commit 423cd785 ] The headphone has noise when playing sound or switching microphone sources. It uses the same codec on XPS 13 9350, but with different subsystem ID. Applying the fixup can solve the issue. Also, changing the model name to better differentiate models. v2: Reorder by device ID. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
David Henningsson authored
[ Upstream commit c04017ea ] These laptops support both headphone, headset and mic modes for the 3.5mm jack. Cc: stable@vger.kernel.org BugLink: https://bugs.launchpad.net/bugs/1526330Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Sachin Prabhu authored
[ Upstream commit b74cb9a8 ] The session key is the default keyring set for request_key operations. This session key is revoked when the user owning the session logs out. Any long running daemon processes started by this session ends up with revoked session keyring which prevents these processes from using the request_key mechanism from obtaining the krb5 keys. The problem has been reported by a large number of autofs users. The problem is also seen with multiuser mounts where the share may be used by processes run by a user who has since logged out. A reproducer using automount is available on the Red Hat bz. The patch creates a new keyring which is used to cache cifs spnego upcalls. Red Hat bz: 1267754 Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reported-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Mark Brown authored
[ Upstream commit d3030d11 ] The ak4642 driver is using a regmap cache sync to restore the configuration of the chip on resume but (as Peter observed) does not actually define a register cache which means that the resume is never going to work and we trigger asserts in regmap. Fix this by enabling caching. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reported-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Axel Lin authored
[ Upstream commit f8ea6ceb ] The max_register setting for ak4642, ak4643 and ak4648 are wrong, fix it. According to the datasheet: the maximum valid register for ak4642 is 0x1f the maximum valid register for ak4643 is 0x24 the maximum valid register for ak4648 is 0x27 The default settings for ak4642 and ak4643 are the same for 0x0 ~ 0x1f registers, so it's fine to use the same reg_default table with differnt num_reg_defaults setting. Signed-off-by: Axel Lin <axel.lin@ingics.com> Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Dave Chinner authored
[ Upstream commit 7d3aa7fe ] We don't write back stale inodes so we should skip them in xfs_iflush_cluster, too. cc: <stable@vger.kernel.org> # 3.10.x- Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Dave Chinner authored
[ Upstream commit 51b07f30 ] Some careless idiot(*) wrote crap code in commit 1a3e8f3d ("xfs: convert inode cache lookups to use RCU locking") back in late 2010, and so xfs_iflush_cluster checks the wrong inode for whether it is still valid under RCU protection. Fix it to lock and check the correct inode. (*) Careless-idiot: Dave Chinner <dchinner@redhat.com> cc: <stable@vger.kernel.org> # 3.10.x- Discovered-by: Brain Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Dave Chinner authored
[ Upstream commit b1438f47 ] When a failure due to an inode buffer occurs, the error handling fails to abort the inode writeback correctly. This can result in the inode being reclaimed whilst still in the AIL, leading to use-after-free situations as well as filesystems that cannot be unmounted as the inode log items left in the AIL never get removed. Fix this by ensuring fatal errors from xfs_imap_to_bp() result in the inode flush being aborted correctly. cc: <stable@vger.kernel.org> # 3.10.x- Reported-by: Shyam Kaushik <shyam@zadarastorage.com> Diagnosed-by: Shyam Kaushik <shyam@zadarastorage.com> Tested-by: Shyam Kaushik <shyam@zadarastorage.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Daniel Lezcano authored
[ Upstream commit e7387da5 ] Commit 0b89e9aa (cpuidle: delay enabling interrupts until all coupled CPUs leave idle) rightfully fixed a regression by letting the coupled idle state framework to handle local interrupt enabling when the CPU is exiting an idle state. The current code checks if the idle state is coupled and, if so, it will let the coupled code to enable interrupts. This way, it can decrement the ready-count before handling the interrupt. This mechanism prevents the other CPUs from waiting for a CPU which is handling interrupts. But the check is done against the state index returned by the back end driver's ->enter functions which could be different from the initial index passed as parameter to the cpuidle_enter_state() function. entered_state = target_state->enter(dev, drv, index); [ ... ] if (!cpuidle_state_is_coupled(drv, entered_state)) local_irq_enable(); [ ... ] If the 'index' is referring to a coupled idle state but the 'entered_state' is *not* coupled, then the interrupts are enabled again. All CPUs blocked on the sync barrier may busy loop longer if the CPU has interrupts to handle before decrementing the ready-count. That's consuming more energy than saving. Fixes: 0b89e9aa (cpuidle: delay enabling interrupts until all coupled CPUs leave idle) Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Xunlei Pang authored
[ Upstream commit 4c1ed5a6 ] For cpuidle_state_is_coupled(), 'dev' is not used, so remove it. Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Steve French authored
[ Upstream commit 897fba11 ] Wrong return code was being returned on SMB3 rmdir of non-empty directory. For SMB3 (unlike for cifs), we attempt to delete a directory by set of delete on close flag on the open. Windows clients set this flag via a set info (SET_FILE_DISPOSITION to set this flag) which properly checks if the directory is empty. With this patch on smb3 mounts we correctly return "DIRECTORY NOT EMPTY" on attempts to remove a non-empty directory. Signed-off-by: Steve French <steve.french@primarydata.com> CC: Stable <stable@vger.kernel.org> Acked-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Stefan Metzmacher authored
[ Upstream commit 1a967d6c ] Only server which map unknown users to guest will allow access using a non-null NTLMv2_Response. For Samba it's the "map to guest = bad user" option. BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913Signed-off-by: Stefan Metzmacher <metze@samba.org> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Stefan Metzmacher authored
[ Upstream commit 777f69b8 ] Only server which map unknown users to guest will allow access using a non-null NTChallengeResponse. For Samba it's the "map to guest = bad user" option. BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913Signed-off-by: Stefan Metzmacher <metze@samba.org> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Stefan Metzmacher authored
[ Upstream commit fa8f3a35 ] Only server which map unknown users to guest will allow access using a non-null LMChallengeResponse. For Samba it's the "map to guest = bad user" option. BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913Signed-off-by: Stefan Metzmacher <metze@samba.org> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Stefan Metzmacher authored
[ Upstream commit cfda35d9 ] See [MS-NLMP] 3.2.5.1.2 Server Receives an AUTHENTICATE_MESSAGE from the Client: ... Set NullSession to FALSE If (AUTHENTICATE_MESSAGE.UserNameLen == 0 AND AUTHENTICATE_MESSAGE.NtChallengeResponse.Length == 0 AND (AUTHENTICATE_MESSAGE.LmChallengeResponse == Z(1) OR AUTHENTICATE_MESSAGE.LmChallengeResponse.Length == 0)) -- Special case: client requested anonymous authentication Set NullSession to TRUE ... Only server which map unknown users to guest will allow access using a non-null NTChallengeResponse. For Samba it's the "map to guest = bad user" option. BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913 CC: Stable <stable@vger.kernel.org> Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Lyude authored
[ Upstream commit 255f0e7c ] During boot, MST hotplugs are generally expected (even if no physical hotplugging occurs) and result in DRM's connector topology changing. This means that using num_connector from the current mode configuration can lead to the number of connectors changing under us. This can lead to some nasty scenarios in fbcon: - We allocate an array to the size of dev->mode_config.num_connectors. - MST hotplug occurs, dev->mode_config.num_connectors gets incremented. - We try to loop through each element in the array using the new value of dev->mode_config.num_connectors, and end up going out of bounds since dev->mode_config.num_connectors is now larger then the array we allocated. fb_helper->connector_count however, will always remain consistent while we do a modeset in fb_helper. Note: This is just polish for 4.7, Dave Airlie's drm_connector refcounting fixed these bugs for real. But it's good enough duct-tape for stable kernel backporting, since backporting the refcounting changes is way too invasive. Cc: stable@vger.kernel.org Signed-off-by: Lyude <cpaul@redhat.com> [danvet: Clarify why we need this. Also remove the now unused "dev" local variable to appease gcc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1463065021-18280-3-git-send-email-cpaul@redhat.comSigned-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Lyude authored
[ Upstream commit 14a3842a ] During boot time, MST devices usually send a ton of hotplug events irregardless of whether or not any physical hotplugs actually occurred. Hotplugs mean connectors being created/destroyed, and the number of DRM connectors changing under us. This isn't a problem if we use fb_helper->connector_count since we only set it once in the code, however if we use num_connector from struct drm_mode_config we risk it's value changing under us. On top of that, there's even a chance that dev->mode_config.num_connector != fb_helper->connector_count. If the number of connectors happens to increase under us, we'll end up using the wrong array size for memcpy and start writing beyond the actual length of the array, occasionally resulting in kernel panics. Note: This is just polish for 4.7, Dave Airlie's drm_connector refcounting fixed these bugs for real. But it's good enough duct-tape for stable kernel backporting, since backporting the refcounting changes is way too invasive. Cc: stable@vger.kernel.org Signed-off-by: Lyude <cpaul@redhat.com> [danvet: Clarify why we need this.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1463065021-18280-2-git-send-email-cpaul@redhat.comSigned-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Maciej W. Rozycki authored
[ Upstream commit e49d3848 ] Fix a build regression from commit c9017757 ("MIPS: init upper 64b of vector registers when MSA is first used"): arch/mips/built-in.o: In function `enable_restore_fp_context': traps.c:(.text+0xbb90): undefined reference to `_init_msa_upper' traps.c:(.text+0xbb90): relocation truncated to fit: R_MIPS_26 against `_init_msa_upper' traps.c:(.text+0xbef0): undefined reference to `_init_msa_upper' traps.c:(.text+0xbef0): relocation truncated to fit: R_MIPS_26 against `_init_msa_upper' to !CONFIG_CPU_HAS_MSA configurations with older GCC versions, which are unable to figure out that calls to `_init_msa_upper' are indeed dead. Of the many ways to tackle this failure choose the approach we have already taken in `thread_msa_context_live'. [ralf@linux-mips.org: Drop patch segment to junk file.] Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: stable@vger.kernel.org # v3.16+ Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13271/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Prarit Bhargava authored
[ Upstream commit ad67b437 ] b84106b4 ("PCI: Disable IO/MEM decoding for devices with non-compliant BARs") disabled BAR sizing for BARs 0-5 of devices that don't comply with the PCI spec. But it didn't do anything for expansion ROM BARs, so we still try to size them, resulting in warnings like this on Broadwell-EP: pci 0000:ff:12.0: BAR 6: failed to assign [mem size 0x00000001 pref] Move the non-compliant BAR check from __pci_read_base() up to pci_read_bases() so it applies to the expansion ROM BAR as well as to BARs 0-5. Note that direct callers of __pci_read_base(), like sriov_init(), will now bypass this check. We haven't had reports of devices with broken SR-IOV BARs yet. [bhelgaas: changelog] Fixes: b84106b4 ("PCI: Disable IO/MEM decoding for devices with non-compliant BARs") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Adrian Hunter authored
[ Upstream commit 1c447116 ] Some eMMCs set the partition switch timeout too low. Now typically eMMCs are considered a critical component (e.g. because they store the root file system) and consequently are expected to be reliable. Thus we can neglect the use case where eMMCs can't switch reliably and we might want a lower timeout to facilitate speedy recovery. Although we could employ a quirk for the cards that are affected (if we could identify them all), as described above, there is little benefit to having a low timeout, so instead simply set a minimum timeout. The minimum is set to 300ms somewhat arbitrarily - the examples that have been seen had a timeout of 10ms but were sometimes taking 60-70ms. Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Steven Rostedt (Red Hat) authored
[ Upstream commit 59643d15 ] If the size passed to ring_buffer_resize() is greater than MAX_LONG - BUF_PAGE_SIZE then the DIV_ROUND_UP() will return zero. Here's the details: # echo 18014398509481980 > /sys/kernel/debug/tracing/buffer_size_kb tracing_entries_write() processes this and converts kb to bytes. 18014398509481980 << 10 = 18446744073709547520 and this is passed to ring_buffer_resize() as unsigned long size. size = DIV_ROUND_UP(size, BUF_PAGE_SIZE); Where DIV_ROUND_UP(a, b) is (a + b - 1)/b BUF_PAGE_SIZE is 4080 and here 18446744073709547520 + 4080 - 1 = 18446744073709551599 where 18446744073709551599 is still smaller than 2^64 2^64 - 18446744073709551599 = 17 But now 18446744073709551599 / 4080 = 4521260802379792 and size = size * 4080 = 18446744073709551360 This is checked to make sure its still greater than 2 * 4080, which it is. Then we convert to the number of buffer pages needed. nr_page = DIV_ROUND_UP(size, BUF_PAGE_SIZE) but this time size is 18446744073709551360 and 2^64 - (18446744073709551360 + 4080 - 1) = -3823 Thus it overflows and the resulting number is less than 4080, which makes 3823 / 4080 = 0 an nr_pages is set to this. As we already checked against the minimum that nr_pages may be, this causes the logic to fail as well, and we crash the kernel. There's no reason to have the two DIV_ROUND_UP() (that's just result of historical code changes), clean up the code and fix this bug. Cc: stable@vger.kernel.org # 3.5+ Fixes: 83f40318 ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Steven Rostedt (Red Hat) authored
[ Upstream commit 9b94a8fb ] The size variable to change the ring buffer in ftrace is a long. The nr_pages used to update the ring buffer based on the size is int. On 64 bit machines this can cause an overflow problem. For example, the following will cause the ring buffer to crash: # cd /sys/kernel/debug/tracing # echo 10 > buffer_size_kb # echo 8556384240 > buffer_size_kb Then you get the warning of: WARNING: CPU: 1 PID: 318 at kernel/trace/ring_buffer.c:1527 rb_update_pages+0x22f/0x260 Which is: RB_WARN_ON(cpu_buffer, nr_removed); Note each ring buffer page holds 4080 bytes. This is because: 1) 10 causes the ring buffer to have 3 pages. (10kb requires 3 * 4080 pages to hold) 2) (2^31 / 2^10 + 1) * 4080 = 8556384240 The value written into buffer_size_kb is shifted by 10 and then passed to ring_buffer_resize(). 8556384240 * 2^10 = 8761737461760 3) The size passed to ring_buffer_resize() is then divided by BUF_PAGE_SIZE which is 4080. 8761737461760 / 4080 = 2147484672 4) nr_pages is subtracted from the current nr_pages (3) and we get: 2147484669. This value is saved in a signed integer nr_pages_to_update 5) 2147484669 is greater than 2^31 but smaller than 2^32, a signed int turns into the value of -2147482627 6) As the value is a negative number, in update_pages_handler() it is negated and passed to rb_remove_pages() and 2147482627 pages will be removed, which is much larger than 3 and it causes the warning because not all the pages asked to be removed were removed. Link: https://bugzilla.kernel.org/show_bug.cgi?id=118001 Cc: stable@vger.kernel.org # 2.6.28+ Fixes: 7a8e76a3 ("tracing: unified trace buffer") Reported-by: Hao Qin <QEver.cn@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Steven Rostedt (Red Hat) authored
[ Upstream commit 58a09ec6 ] Instead of using a global per_cpu variable to perform the recursive checks into the ring buffer, use the already existing per_cpu descriptor that is part of the ring buffer itself. Not only does this simplify the code, it also allows for one ring buffer to be used within the guts of the use of another ring buffer. For example trace_printk() can now be used within the ring buffer to record changes done by an instance into the main ring buffer. The recursion checks will prevent the trace_printk() itself from causing recursive issues with the main ring buffer (it is just ignored), but the recursive checks wont prevent the trace_printk() from recording other ring buffers. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Steven Rostedt (Red Hat) authored
[ Upstream commit 3205f806 ] I was running the trace_event benchmark and noticed that the times to record a trace_event was all over the place. I looked at the assembly of the ring_buffer_lock_reserver() and saw this: <ring_buffer_lock_reserve>: 31 c0 xor %eax,%eax 48 83 3d 76 47 bd 00 cmpq $0x1,0xbd4776(%rip) # ffffffff81d10d60 <ring_buffer_flags> 01 55 push %rbp 48 89 e5 mov %rsp,%rbp 75 1d jne ffffffff8113c60d <ring_buffer_lock_reserve+0x2d> 65 ff 05 69 e3 ec 7e incl %gs:0x7eece369(%rip) # a960 <__preempt_count> 8b 47 08 mov 0x8(%rdi),%eax 85 c0 test %eax,%eax +---- 74 12 je ffffffff8113c610 <ring_buffer_lock_reserve+0x30> | 65 ff 0d 5b e3 ec 7e decl %gs:0x7eece35b(%rip) # a960 <__preempt_count> | 0f 84 85 00 00 00 je ffffffff8113c690 <ring_buffer_lock_reserve+0xb0> | 31 c0 xor %eax,%eax | 5d pop %rbp | c3 retq | 90 nop +---> 65 44 8b 05 48 e3 ec mov %gs:0x7eece348(%rip),%r8d # a960 <__preempt_count> 7e 41 81 e0 ff ff ff 7f and $0x7fffffff,%r8d b0 08 mov $0x8,%al 65 8b 0d 58 36 ed 7e mov %gs:0x7eed3658(%rip),%ecx # fc80 <current_context> 41 f7 c0 00 ff 1f 00 test $0x1fff00,%r8d 74 1e je ffffffff8113c64f <ring_buffer_lock_reserve+0x6f> 41 f7 c0 00 00 10 00 test $0x100000,%r8d b0 01 mov $0x1,%al 75 13 jne ffffffff8113c64f <ring_buffer_lock_reserve+0x6f> 41 81 e0 00 00 0f 00 and $0xf0000,%r8d 49 83 f8 01 cmp $0x1,%r8 19 c0 sbb %eax,%eax 83 e0 02 and $0x2,%eax 83 c0 02 add $0x2,%eax 85 c8 test %ecx,%eax 75 ab jne ffffffff8113c5fe <ring_buffer_lock_reserve+0x1e> 09 c8 or %ecx,%eax 65 89 05 24 36 ed 7e mov %eax,%gs:0x7eed3624(%rip) # fc80 <current_context> The arrow is the fast path. After adding the unlikely's, the fast path looks a bit better: <ring_buffer_lock_reserve>: 31 c0 xor %eax,%eax 48 83 3d 76 47 bd 00 cmpq $0x1,0xbd4776(%rip) # ffffffff81d10d60 <ring_buffer_flags> 01 55 push %rbp 48 89 e5 mov %rsp,%rbp 75 7b jne ffffffff8113c66b <ring_buffer_lock_reserve+0x8b> 65 ff 05 69 e3 ec 7e incl %gs:0x7eece369(%rip) # a960 <__preempt_count> 8b 47 08 mov 0x8(%rdi),%eax 85 c0 test %eax,%eax 0f 85 9f 00 00 00 jne ffffffff8113c6a1 <ring_buffer_lock_reserve+0xc1> 65 8b 0d 57 e3 ec 7e mov %gs:0x7eece357(%rip),%ecx # a960 <__preempt_count> 81 e1 ff ff ff 7f and $0x7fffffff,%ecx b0 08 mov $0x8,%al 65 8b 15 68 36 ed 7e mov %gs:0x7eed3668(%rip),%edx # fc80 <current_context> f7 c1 00 ff 1f 00 test $0x1fff00,%ecx 75 50 jne ffffffff8113c670 <ring_buffer_lock_reserve+0x90> 85 d0 test %edx,%eax 75 7d jne ffffffff8113c6a1 <ring_buffer_lock_reserve+0xc1> 09 d0 or %edx,%eax 65 89 05 53 36 ed 7e mov %eax,%gs:0x7eed3653(%rip) # fc80 <current_context> 65 8b 05 fc da ec 7e mov %gs:0x7eecdafc(%rip),%eax # a130 <cpu_number> 89 c2 mov %eax,%edx Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Paul Burton authored
[ Upstream commit bd239f1e ] Whilst a PR_SET_FP_MODE prctl is performed there are decisions made based upon whether the task is executing on the current CPU. This may change if we're preempted, so disable preemption to avoid such changes for the lifetime of the mode switch. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: 9791554b ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS") Reviewed-by: Maciej W. Rozycki <macro@imgtec.com> Tested-by: Aurelien Jarno <aurelien@aurel32.net> Cc: Adam Buchbinder <adam.buchbinder@gmail.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: stable <stable@vger.kernel.org> # v4.0+ Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/13144/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Maciej W. Rozycki authored
[ Upstream commit abf378be ] Correct the cases missed with commit 9b26616c ("MIPS: Respect the ISA level in FCSR handling") and prevent writes to read-only FCSR bits there. This in particular applies to FP context initialisation where any IEEE 754-2008 bits preset by `mips_set_personality_nan' are cleared before the relevant ptrace(2) call takes effect and the PTRACE_POKEUSR request addressing FPC_CSR where no masking of read-only FCSR bits is done. Remove the FCSR clearing from FP context initialisation then and unify PTRACE_POKEUSR/FPC_CSR and PTRACE_SETFPREGS handling, by factoring out code from `ptrace_setfpregs' and calling it from both places. This mostly matters to soft float configurations where the emulator can be switched this way to a mode which should not be accessible and cannot be set with the CTC1 instruction. With hard float configurations any effect is transient anyway as read-only bits will retain their values at the time the FP context is restored. Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: stable@vger.kernel.org # v4.0+ Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13239/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Maciej W. Rozycki authored
[ Upstream commit 42495484 ] Fix a floating-point context restoration regression introduced with commit 9b26616c ("MIPS: Respect the ISA level in FCSR handling") that causes a Floating Point exception and consequently a kernel oops with hard float configurations when one or more FCSR Enable and their corresponding Cause bits are set both at a time via a ptrace(2) call. To do so reinstate Cause bit masking originally introduced with commit b1442d39 ("MIPS: Prevent user from setting FCSR cause bits") to address this exact problem and then inadvertently removed from the PTRACE_SETFPREGS request with the commit referred above. Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: stable@vger.kernel.org # v4.0+ Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13238/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Paul Burton authored
[ Upstream commit ab4a92e6 ] When emulating a jalr instruction with rd == $0, the code in isBranchInstr was incorrectly writing to GPR $0 which should actually always remain zeroed. This would lead to any further instructions emulated which use $0 operating on a bogus value until the task is next context switched, at which point the value of $0 in the task context would be restored to the correct zero by a store in SAVE_SOME. Fix this by not writing to rd if it is $0. Fixes: 102cedc3 ("MIPS: microMIPS: Floating point support.") Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: Maciej W. Rozycki <macro@imgtec.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: stable <stable@vger.kernel.org> # v3.10 Patchwork: https://patchwork.linux-mips.org/patch/13160/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
James Hogan authored
[ Upstream commit 987e5b83 ] Since commit 8cb48fe1 ("MIPS: Provide correct siginfo_t.si_stime"), MIPS' uapi/asm/siginfo.h has included uapi/asm-generic/siginfo.h directly before defining MIPS' struct siginfo, in order to get the necessary definitions needed for the siginfo struct without the generic copy_siginfo() hitting compiler errors due to struct siginfo not yet being defined. Now that the generic copy_siginfo() is moved out to linux/signal.h we can safely include asm-generic/siginfo.h before defining the MIPS specific struct siginfo, which avoids the uapi/ include as well as breakage due to generic copy_siginfo() being defined before struct siginfo. Reported-by: Christopher Ferris <cferris@google.com> Fixes: 8cb48fe1 ("MIPS: Provide correct siginfo_t.si_stime") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Petr Malat <oss@malat.biz> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.0- Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
James Hogan authored
[ Upstream commit ca9eb49a ] The generic copy_siginfo() is currently defined in asm-generic/siginfo.h, after including uapi/asm-generic/siginfo.h which defines the generic struct siginfo. However this makes it awkward for an architecture to use it if it has to define its own struct siginfo (e.g. MIPS and potentially IA64), since it means that asm-generic/siginfo.h can only be included after defining the arch-specific siginfo, which may be problematic if the arch-specific definition needs definitions from uapi/asm-generic/siginfo.h. It is possible to work around this by first including uapi/asm-generic/siginfo.h to get the constants before defining the arch-specific siginfo, and include asm-generic/siginfo.h after. However uapi headers can't be included by other uapi headers, so that first include has to be in an ifdef __kernel__, with the non __kernel__ case including the non-UAPI header instead. Instead of that mess, move the generic copy_siginfo() definition into linux/signal.h, which allows an arch-specific uapi/asm/siginfo.h to include asm-generic/siginfo.h and define the arch-specific siginfo, and for the generic copy_siginfo() to see that arch-specific definition. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Petr Malat <oss@malat.biz> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Christopher Ferris <cferris@google.com> Cc: linux-arch@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-ia64@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> # 4.0- Patchwork: https://patchwork.linux-mips.org/patch/12478/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Paul Burton authored
[ Upstream commit 37d22a0d ] It's possible for pages to become visible prior to update_mmu_cache running if a thread within the same address space preempts the current thread or runs simultaneously on another CPU. That is, the following scenario is possible: CPU0 CPU1 write to page flush_dcache_page flush_icache_page set_pte_at map page update_mmu_cache If CPU1 maps the page in between CPU0's set_pte_at, which marks it valid & visible, and update_mmu_cache where the dcache flush occurs then CPU1s icache will fill from stale data (unless it fills from the dcache, in which case all is good, but most MIPS CPUs don't have this property). Commit 4d46a67a ("MIPS: Fix race condition in lazy cache flushing.") attempted to fix that by performing the dcache flush in flush_icache_page such that it occurs before the set_pte_at call makes the page visible. However it has the problem that not all code that writes to pages exposed to userland call flush_icache_page. There are many callers of set_pte_at under mm/ and only 2 of them do call flush_icache_page. Thus the race window between a page becoming visible & being coherent between the icache & dcache remains open in some cases. To illustrate some of the cases, a WARN was added to __update_cache with this patch applied that triggered in cases where a page about to be flushed from the dcache was not the last page provided to flush_icache_page. That is, backtraces were obtained for cases in which the race window is left open without this patch. The 2 standout examples follow. When forking a process: [ 15.271842] [<80417630>] __update_cache+0xcc/0x188 [ 15.277274] [<80530394>] copy_page_range+0x56c/0x6ac [ 15.282861] [<8042936c>] copy_process.part.54+0xd40/0x17ac [ 15.289028] [<80429f80>] do_fork+0xe4/0x420 [ 15.293747] [<80413808>] handle_sys+0x128/0x14c When exec'ing an ELF binary: [ 14.445964] [<80417630>] __update_cache+0xcc/0x188 [ 14.451369] [<80538d88>] move_page_tables+0x414/0x498 [ 14.457075] [<8055d848>] setup_arg_pages+0x220/0x318 [ 14.462685] [<805b0f38>] load_elf_binary+0x530/0x12a0 [ 14.468374] [<8055ec3c>] search_binary_handler+0xbc/0x214 [ 14.474444] [<8055f6c0>] do_execveat_common+0x43c/0x67c [ 14.480324] [<8055f938>] do_execve+0x38/0x44 [ 14.485137] [<80413808>] handle_sys+0x128/0x14c These code paths write into a page, call flush_dcache_page then call set_pte_at without flush_icache_page inbetween. The end result is that the icache can become corrupted & userland processes may execute unexpected or invalid code, typically resulting in a reserved instruction exception, a trap or a segfault. Fix this race condition fully by performing any cache maintenance required to keep the icache & dcache in sync in set_pte_at, before the page is made valid. This has the added bonus of ensuring the cache maintenance always happens in one location, rather than being duplicated in flush_icache_page & update_mmu_cache. It also matches the way other architectures solve the same problem (see arm, ia64 & powerpc). Signed-off-by: Paul Burton <paul.burton@imgtec.com> Reported-by: Ionela Voinescu <ionela.voinescu@imgtec.com> Cc: Lars Persson <lars.persson@axis.com> Fixes: 4d46a67a ("MIPS: Fix race condition in lazy cache flushing.") Cc: Steven J. Hill <sjhill@realitydiluted.com> Cc: David Daney <david.daney@cavium.com> Cc: Huacai Chen <chenhc@lemote.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: stable <stable@vger.kernel.org> # v4.1+ Patchwork: https://patchwork.linux-mips.org/patch/12722/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-
Paul Burton authored
[ Upstream commit f4281bba ] The following patch will expose __update_cache to highmem pages. Handle them by mapping them in for the duration of the cache maintenance, just like in __flush_dcache_page. The code for that isn't shared because we need the page address in __update_cache so sharing became messy. Given that the entirity is an extra 5 lines, just duplicate it. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: Lars Persson <lars.persson@axis.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: stable <stable@vger.kernel.org> # v4.1+ Patchwork: https://patchwork.linux-mips.org/patch/12721/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
-