- 09 Sep, 2018 40 commits
-
-
Richard Weinberger authored
commit eef19816 upstream. Allocate the buffer after we return early. Otherwise memory is being leaked. Cc: <stable@vger.kernel.org> Fixes: 1e51764a ("UBIFS: add new flash file system") Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jann Horn authored
commit 5820f140 upstream. The old code would hold the userns_state_mutex indefinitely if memdup_user_nul stalled due to e.g. a userfault region. Prevent that by moving the memdup_user_nul in front of the mutex_lock(). Note: This changes the error precedence of invalid buf/count/*ppos vs map already written / capabilities missing. Fixes: 22d917d8 ("userns: Rework the user_namespace adding uid/gid...") Cc: stable@vger.kernel.org Signed-off-by:
Jann Horn <jannh@google.com> Acked-by:
Christian Brauner <christian@brauner.io> Acked-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jann Horn authored
commit 42a0cc34 upstream. Holding uts_sem as a writer while accessing userspace memory allows a namespace admin to stall all processes that attempt to take uts_sem. Instead, move data through stack buffers and don't access userspace memory while uts_sem is held. Cc: stable@vger.kernel.org Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Jann Horn <jannh@google.com> Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jacob Pan authored
commit 1c48db44 upstream. PFSID should be used in the invalidation descriptor for flushing device IOTLBs on SRIOV VFs. Signed-off-by:
Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: stable@vger.kernel.org Cc: "Ashok Raj" <ashok.raj@intel.com> Cc: "Lu Baolu" <baolu.lu@linux.intel.com> Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jacob Pan authored
commit 0f725561 upstream. When SRIOV VF device IOTLB is invalidated, we need to provide the PF source ID such that IOMMU hardware can gauge the depth of invalidation queue which is shared among VFs. This is needed when device invalidation throttle (DIT) capability is supported. This patch adds bit definitions for checking and tracking PFSID. Signed-off-by:
Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: stable@vger.kernel.org Cc: "Ashok Raj" <ashok.raj@intel.com> Cc: "Lu Baolu" <baolu.lu@linux.intel.com> Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit a6f57208 upstream. Will noted that only checking mm_users is incorrect; we should also check mm_count in order to cover CPUs that have a lazy reference to this mm (and could do speculative TLB operations). If removing this turns out to be a performance issue, we can re-instate a more complete check, but in tlb_table_flush() eliding the call_rcu_sched(). Fixes: 26723911 ("mm, powerpc: move the RCU page-table freeing into generic code") Reported-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Rik van Riel <riel@surriel.com> Acked-by:
Will Deacon <will.deacon@arm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: stable@kernel.org Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jon Hunter authored
commit 6e181190 upstream. On all versions of Tegra30 Cardhu, the reset signal to the NXP PCA9546 I2C mux is connected to the Tegra GPIO BB0. Currently, this pin on the Tegra is not configured as a GPIO but as a special-function IO (SFIO) that is multiplexing the pin to an I2S controller. On exiting system suspend, I2C commands sent to the PCA9546 are failing because there is no ACK. Although it is not possible to see exactly what is happening to the reset during suspend, by ensuring it is configured as a GPIO and driven high, to de-assert the reset, the failures are no longer seen. Please note that this GPIO is also used to drive the reset signal going to the camera connector on the board. However, given that there is no camera support currently for Cardhu, this should not have any impact. Fixes: 40431d16 ("ARM: tegra: enable PCA9546 on Cardhu") Cc: stable@vger.kernel.org Signed-off-by:
Jon Hunter <jonathanh@nvidia.com> Signed-off-by:
Thierry Reding <treding@nvidia.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Trond Myklebust authored
commit 8618289c upstream. We must drop the lock before we can sleep in referring_call_exists(). Reported-by:
Jia-Ju Bai <baijiaju1990@gmail.com> Fixes: 045d2a6d ("NFSv4.1: Delay callback processing...") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Trond Myklebust authored
commit d0fbb1d8 upstream. The use of the inode->i_lock was converted to a mutex, but we forgot to remove the old inode unlock/lock() pair that allowed the layout segment to be put inside the loop. Reported-by:
Jia-Ju Bai <baijiaju1990@gmail.com> Fixes: e824f99a ("NFSv4: Use a mutex to protect the per-inode commit...") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bill Baker authored
commit 0f90be13 upstream. After a live data migration event at the NFS server, the client may send I/O requests to the wrong server, causing a live hang due to repeated recovery events. On the wire, this will appear as an I/O request failing with NFS4ERR_BADSESSION, followed by successful CREATE_SESSION, repeatedly. NFS4ERR_BADSSESSION is returned because the session ID being used was issued by the other server and is not valid at the old server. The failure is caused by async worker threads having cached the transport (xprt) in the rpc_task structure. After the migration recovery completes, the task is redispatched and the task resends the request to the wrong server based on the old value still present in tk_xprt. The solution is to recompute the tk_xprt field of the rpc_task structure so that the request goes to the correct server. Signed-off-by:
Bill Baker <bill.baker@oracle.com> Reviewed-by:
Chuck Lever <chuck.lever@oracle.com> Tested-by:
Helen Chao <helen.chao@oracle.com> Fixes: fb43d172 ("SUNRPC: Use the multipath iterator to assign a ...") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit 0914bb96 upstream. "dev->nr_children" is the number of children which were parsed successfully in bl_parse_stripe(). It could be all of them and then, in that case, it is equal to v->stripe.volumes_count. Either way, the > should be >= so that we don't go beyond the end of what we're supposed to. Fixes: 5c83746a ("pnfs/blocklayout: in-kernel GETDEVICEINFO XDR parsing") Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org # 3.17+ Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Maciej S. Szmigiero authored
commit fc8ebd01 upstream. The value that struct cftype .write() method returns is then directly returned to userspace as the value returned by write() syscall, so it should be the number of bytes actually written (or consumed) and not zero. Returning zero from write() syscall makes programs like /bin/echo or bash spin. Signed-off-by:
Maciej S. Szmigiero <mail@maciej.szmigiero.name> Fixes: e21b7a0b ("block, bfq: add full hierarchical scheduling and cgroups support") Cc: stable@vger.kernel.org Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Max Filippov authored
commit fec3259c upstream. Cache invalidation macros use cache line size to iterate over invalidated cache lines, assuming that all cache ways are invalidated by single instruction, but xtensa ISA recommends to not assume that for future compatibility: In some implementations all ways at index Addry-1..z are invalidated regardless of the specified way, but for future compatibility this behavior should not be assumed. Iterate over all cache ways in ___invalidate_icache_all and ___invalidate_dcache_all. Cc: stable@vger.kernel.org Signed-off-by:
Max Filippov <jcmvbkbc@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Max Filippov authored
commit be75de25 upstream. When building kernel for xtensa cores with big cache lines (e.g. 128 bytes or more) __loop_cache_all and __loop_cache_page may generate assembly instructions with immediate fields that are too big. This results in the following build errors: arch/xtensa/mm/misc.S: Assembler messages: arch/xtensa/mm/misc.S:464: Error: operand 2 of 'diwbi' has invalid value '256' arch/xtensa/mm/misc.S:464: Error: operand 2 of 'diwbi' has invalid value '384' arch/xtensa/kernel/head.S: Assembler messages: arch/xtensa/kernel/head.S:172: Error: operand 2 of 'diu' has invalid value '256' arch/xtensa/kernel/head.S:172: Error: operand 2 of 'diu' has invalid value '384' arch/xtensa/kernel/head.S:176: Error: operand 2 of 'iiu' has invalid value '256' arch/xtensa/kernel/head.S:176: Error: operand 2 of 'iiu' has invalid value '384' arch/xtensa/kernel/head.S:255: Error: operand 2 of 'diwb' has invalid value '256' arch/xtensa/kernel/head.S:255: Error: operand 2 of 'diwb' has invalid value '384' Add parameter max_immed to these macros and use it to limit values of immediate operands. Extract common code of these macros into the new macro __loop_cache_unroll. Cc: stable@vger.kernel.org Signed-off-by:
Max Filippov <jcmvbkbc@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Mackerras authored
commit 8cfbdbdc upstream. Commit 76fa4975 ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page", 2018-07-17) added some checks to ensure that guest DMA mappings don't attempt to map more than the guest is entitled to access. However, errors in the logic mean that legitimate guest requests to map pages for DMA are being denied in some situations. Specifically, if the first page of the range passed to mm_iommu_get() is mapped with a normal page, and subsequent pages are mapped with transparent huge pages, we end up with mem->pageshift == 0. That means that the page size checks in mm_iommu_ua_to_hpa() and mm_iommu_up_to_hpa_rm() will always fail for every page in that region, and thus the guest can never map any memory in that region for DMA, typically leading to a flood of error messages like this: qemu-system-ppc64: VFIO_MAP_DMA: -22 qemu-system-ppc64: vfio_dma_map(0x10005f47780, 0x800000000000000, 0x10000, 0x7fff63ff0000) = -22 (Invalid argument) The logic errors in mm_iommu_get() are: (a) use of 'ua' not 'ua + (i << PAGE_SHIFT)' in the find_linux_pte() call (meaning that find_linux_pte() returns the pte for the first address in the range, not the address we are currently up to); (b) use of 'pageshift' as the variable to receive the hugepage shift returned by find_linux_pte() - for a normal page this gets set to 0, leading to us setting mem->pageshift to 0 when we conclude that the pte returned by find_linux_pte() didn't match the page we were looking at; (c) comparing 'compshift', which is a page order, i.e. log base 2 of the number of pages, with 'pageshift', which is a log base 2 of the number of bytes. To fix these problems, this patch introduces 'cur_ua' to hold the current user address and uses that in the find_linux_pte() call; introduces 'pteshift' to hold the hugepage shift found by find_linux_pte(); and compares 'pteshift' with 'compshift + PAGE_SHIFT' rather than 'compshift'. The patch also moves the local_irq_restore to the point after the PTE pointer returned by find_linux_pte() has been dereferenced because otherwise the PTE could change underneath us, and adds a check to avoid doing the find_linux_pte() call once mem->pageshift has been reduced to PAGE_SHIFT, as an optimization. Fixes: 76fa4975 ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by:
Paul Mackerras <paulus@ozlabs.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paolo Bonzini authored
commit 0027ff2a upstream. Two bug fixes: 1) missing entries in the l1d_param array; this can cause a host crash if an access attempts to reach the missing entry. Future-proof the get function against any overflows as well. However, the two entries VMENTER_L1D_FLUSH_EPT_DISABLED and VMENTER_L1D_FLUSH_NOT_REQUIRED must not be accepted by the parse function, so disable them there. 2) invalid values must be rejected even if the CPU does not have the bug, so test for them before checking boot_cpu_has(X86_BUG_L1TF) ... and a small refactoring, since the .cmd field is redundant with the index in the array. Reported-by:
Bandan Das <bsd@redhat.com> Cc: stable@vger.kernel.org Fixes: a7b9020bSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
zhangyi (F) authored
commit 3df6f61f upstream. Commit ea0212f4 (power: auto select CONFIG_SRCU) made the code in drivers/base/power/wakeup.c use SRCU instead of RCU, but it forgot to select CONFIG_SRCU in Kconfig, which leads to the following build error if CONFIG_SRCU is not selected somewhere else: drivers/built-in.o: In function `wakeup_source_remove': (.text+0x3c6fc): undefined reference to `synchronize_srcu' drivers/built-in.o: In function `pm_print_active_wakeup_sources': (.text+0x3c7a8): undefined reference to `__srcu_read_lock' drivers/built-in.o: In function `pm_print_active_wakeup_sources': (.text+0x3c84c): undefined reference to `__srcu_read_unlock' drivers/built-in.o: In function `device_wakeup_arm_wake_irqs': (.text+0x3d1d8): undefined reference to `__srcu_read_lock' drivers/built-in.o: In function `device_wakeup_arm_wake_irqs': (.text+0x3d228): undefined reference to `__srcu_read_unlock' drivers/built-in.o: In function `device_wakeup_disarm_wake_irqs': (.text+0x3d24c): undefined reference to `__srcu_read_lock' drivers/built-in.o: In function `device_wakeup_disarm_wake_irqs': (.text+0x3d29c): undefined reference to `__srcu_read_unlock' drivers/built-in.o:(.data+0x4158): undefined reference to `process_srcu' Fix this error by selecting CONFIG_SRCU when PM_SLEEP is enabled. Fixes: ea0212f4 (power: auto select CONFIG_SRCU) Cc: 4.2+ <stable@vger.kernel.org> # 4.2+ Signed-off-by:
zhangyi (F) <yi.zhang@huawei.com> [ rjw: Minor subject/changelog fixups ] Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Henry Willard authored
commit 2a3eb51e upstream. If cppc_cpufreq.ko is deleted at the same time that tuned-adm is changing profiles, there is a small chance that a race can occur between cpufreq_dbs_governor_exit() and cpufreq_dbs_governor_limits() resulting in a system failure when the latter tries to use policy->governor_data that has been freed by the former. This patch uses gov_dbs_data_mutex to synchronize access. Fixes: e788892b (cpufreq: governor: Get rid of governor events) Signed-off-by:
Henry Willard <henry.willard@oracle.com> [ rjw: Subject, minor white space adjustment ] Cc: 4.8+ <stable@vger.kernel.org> # 4.8+ Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Kalauskas authored
commit c8bd134a upstream. The call to strlcpy in backing_dev_store is incorrect. It should take the size of the destination buffer instead of the size of the source buffer. Additionally, ignore the newline character (\n) when reading the new file_name buffer. This makes it possible to set the backing_dev as follows: echo /dev/sdX > /sys/block/zram0/backing_dev The reason it worked before was the fact that strlcpy() copies 'len - 1' bytes, which is strlen(buf) - 1 in our case, so it accidentally didn't copy the trailing new line symbol. Which also means that "echo -n /dev/sdX" most likely was broken. Signed-off-by:
Peter Kalauskas <peskal@google.com> Link: http://lkml.kernel.org/r/20180813061623.GC64836@rodete-desktop-imager.corp.google.comAcked-by:
Minchan Kim <minchan@kernel.org> Reviewed-by:
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: <stable@vger.kernel.org> [4.14+] Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Amir Goldstein authored
commit 67810693 upstream. Only upper dir can be impure, but if we are in the middle of iterating a lower real dir, dir could be copied up and marked impure. We only want the impure cache if we started iterating a real upper dir to begin with. Aditya Kali reported that the following reproducer hits the WARN_ON(!cache->refcount) in ovl_get_cache(): docker run --rm drupal:8.5.4-fpm-alpine \ sh -c 'cd /var/www/html/vendor/symfony && \ chown -R www-data:www-data . && ls -l .' Reported-by:
Aditya Kali <adityakali@google.com> Tested-by:
Aditya Kali <adityakali@google.com> Fixes: 4edb83bb ('ovl: constant d_ino for non-merge dirs') Cc: <stable@vger.kernel.org> # v4.14 Signed-off-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Miklos Szeredi <mszeredi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafael David Tinoco authored
commit 6afebb70 upstream. Fixes https://bugs.linaro.org/show_bug.cgi?id=3903 LTP Functional tests have caused a bad paging request when triggering the regmap_read_debugfs() logic of the device PMIC Hi6553 (reading regmap/f8000000.pmic/registers file during read_all test): Unable to handle kernel paging request at virtual address ffff0 [ffff00000984e000] pgd=0000000077ffe803, pud=0000000077ffd803,0 Internal error: Oops: 96000007 [#1] SMP ... Hardware name: HiKey Development Board (DT) ... Call trace: regmap_mmio_read8+0x24/0x40 regmap_mmio_read+0x48/0x70 _regmap_bus_reg_read+0x38/0x48 _regmap_read+0x68/0x170 regmap_read+0x50/0x78 regmap_read_debugfs+0x1a0/0x308 regmap_map_read_file+0x48/0x58 full_proxy_read+0x68/0x98 __vfs_read+0x48/0x80 vfs_read+0x94/0x150 SyS_read+0x6c/0xd8 el0_svc_naked+0x30/0x34 Code: aa1e03e0 d503201f f9400280 8b334000 (39400000) Investigations have showed that, when triggered by debugfs read() handler, the mmio regmap logic was reading a bigger (16k) register area than the one mapped by devm_ioremap_resource() during hi655x-pmic probe time (4k). This commit changes hi655x's max register, according to HW specs, to be the same as the one declared in the pmic device in hi6220's dts, fixing the issue. Cc: <stable@vger.kernel.org> #v4.9 #v4.14 #v4.16 #v4.17 Signed-off-by:
Rafael David Tinoco <rafael.tinoco@linaro.org> Signed-off-by:
Lee Jones <lee.jones@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Rostedt (VMware) authored
commit 016f8ffc upstream. While debugging another bug, I was looking at all the synchronize*() functions being used in kernel/trace, and noticed that trace_uprobes was using synchronize_sched(), with a comment to synchronize with {u,ret}_probe_trace_func(). When looking at those functions, the data is protected with "rcu_read_lock()" and not with "rcu_read_lock_sched()". This is using the wrong synchronize_*() function. Link: http://lkml.kernel.org/r/20180809160553.469e1e32@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 70ed91c6 ("tracing/uprobes: Support ftrace_event_file base multibuffer") Acked-by:
Oleg Nesterov <oleg@redhat.com> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kamalesh Babulal authored
commit 6e9df95b upstream. livepatch module author can pass module name/old function name with more than the defined character limit. With obj->name length greater than MODULE_NAME_LEN, the livepatch module gets loaded but waits forever on the module specified by obj->name to be loaded. It also populates a /sys directory with an untruncated object name. In the case of funcs->old_name length greater then KSYM_NAME_LEN, it would not match against any of the symbol table entries. Instead loop through the symbol table comparing them against a nonexisting function, which can be avoided. The same issues apply, to misspelled/incorrect names. At least gatekeep the modules with over the limit string length, by checking for their length during livepatch module registration. Cc: stable@vger.kernel.org Signed-off-by:
Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by:
Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Rostedt (VMware) authored
commit d1c392c9 upstream. I hit the following splat in my tests: ------------[ cut here ]------------ IRQs not enabled as expected WARNING: CPU: 3 PID: 0 at kernel/time/tick-sched.c:982 tick_nohz_idle_enter+0x44/0x8c Modules linked in: ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables ipv6 CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.0-rc2-test+ #2 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014 EIP: tick_nohz_idle_enter+0x44/0x8c Code: ec 05 00 00 00 75 26 83 b8 c0 05 00 00 00 75 1d 80 3d d0 36 3e c1 00 75 14 68 94 63 12 c1 c6 05 d0 36 3e c1 01 e8 04 ee f8 ff <0f> 0b 58 fa bb a0 e5 66 c1 e8 25 0f 04 00 64 03 1d 28 31 52 c1 8b EAX: 0000001c EBX: f26e7f8c ECX: 00000006 EDX: 00000007 ESI: f26dd1c0 EDI: 00000000 EBP: f26e7f40 ESP: f26e7f38 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010296 CR0: 80050033 CR2: 0813c6b0 CR3: 2f342000 CR4: 001406f0 Call Trace: do_idle+0x33/0x202 cpu_startup_entry+0x61/0x63 start_secondary+0x18e/0x1ed startup_32_smp+0x164/0x168 irq event stamp: 18773830 hardirqs last enabled at (18773829): [<c040150c>] trace_hardirqs_on_thunk+0xc/0x10 hardirqs last disabled at (18773830): [<c040151c>] trace_hardirqs_off_thunk+0xc/0x10 softirqs last enabled at (18773824): [<c0ddaa6f>] __do_softirq+0x25f/0x2bf softirqs last disabled at (18773767): [<c0416bbe>] call_on_stack+0x45/0x4b ---[ end trace b7c64aa79e17954a ]--- After a bit of debugging, I found what was happening. This would trigger when performing "perf" with a high NMI interrupt rate, while enabling and disabling function tracer. Ftrace uses breakpoints to convert the nops at the start of functions to calls to the function trampolines. The breakpoint traps disable interrupts and this makes calls into lockdep via the trace_hardirqs_off_thunk in the entry.S code. What happens is the following: do_idle { [interrupts enabled] <interrupt> [interrupts disabled] TRACE_IRQS_OFF [lockdep says irqs off] [...] TRACE_IRQS_IRET test if pt_regs say return to interrupts enabled [yes] TRACE_IRQS_ON [lockdep says irqs are on] <nmi> nmi_enter() { printk_nmi_enter() [traced by ftrace] [ hit ftrace breakpoint ] <breakpoint exception> TRACE_IRQS_OFF [lockdep says irqs off] [...] TRACE_IRQS_IRET [return from breakpoint] test if pt_regs say interrupts enabled [no] [iret back to interrupt] [iret back to code] tick_nohz_idle_enter() { lockdep_assert_irqs_enabled() [lockdep say no!] Although interrupts are indeed enabled, lockdep thinks it is not, and since we now do asserts via lockdep, it gives a false warning. The issue here is that printk_nmi_enter() is called before lockdep_off(), which disables lockdep (for this reason) in NMIs. By simply not allowing ftrace to see printk_nmi_enter() (via notrace annotation) we keep lockdep from getting confused. Cc: stable@vger.kernel.org Fixes: 42a0bb3f ("printk/nmi: generic solution for safe printk in NMI") Acked-by:
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by:
Petr Mladek <pmladek@suse.com> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Rostedt (VMware) authored
commit 757d9140 upstream. Masami Hiramatsu reported: Current trace-enable attribute in sysfs returns an error if user writes the same setting value as current one, e.g. # cat /sys/block/sda/trace/enable 0 # echo 0 > /sys/block/sda/trace/enable bash: echo: write error: Invalid argument # echo 1 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable bash: echo: write error: Device or resource busy But this is not a preferred behavior, it should ignore if new setting is same as current one. This fixes the problem as below. # cat /sys/block/sda/trace/enable 0 # echo 0 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable # echo 1 > /sys/block/sda/trace/enable Link: http://lkml.kernel.org/r/20180816103802.08678002@gandalf.local.home Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Cc: stable@vger.kernel.org Fixes: cd649b8b ("blktrace: remove sysfs_blk_trace_enable_show/store()") Reported-by:
Masami Hiramatsu <mhiramat@kernel.org> Tested-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Rostedt (VMware) authored
commit f143641b upstream. Currently, when one echo's in 1 into tracing_on, the current tracer's "start()" function is executed, even if tracing_on was already one. This can lead to strange side effects. One being that if the hwlat tracer is enabled, and someone does "echo 1 > tracing_on" into tracing_on, the hwlat tracer's start() function is called again which will recreate another kernel thread, and make it unable to remove the old one. Link: http://lkml.kernel.org/r/1533120354-22923-1-git-send-email-erica.bugden@linutronix.de Cc: stable@vger.kernel.org Fixes: 2df8f8a6 ("tracing: Fix regression with irqsoff tracer and tracing_on file") Reported-by:
Erica Bugden <erica.bugden@linutronix.de> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Johan Hovold authored
commit 5c8b84f4 upstream. Do not set the system power-off callback and omap power-off rtc pointer until we're done setting up our device to avoid leaving stale pointers around after a late probe error. Fixes: 97ea1906 ("rtc: omap: Support ext_wakeup configuration") Cc: stable <stable@vger.kernel.org> # 4.9 Cc: Marcin Niestroj <m.niestroj@grinn-global.com> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by:
Johan Hovold <johan@kernel.org> Acked-by:
Tony Lindgren <tony@atomide.com> Reviewed-by:
Marcin Niestroj <m.niestroj@grinn-global.com> Signed-off-by:
Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nadav Amit authored
commit c3cc1b0f upstream. Currently, when all modules, including VMCI and VMware balloon are built into the kernel, the initialization of the balloon happens before the VMCI is probed. As a result, the balloon fails to initialize the VMCI doorbell, which it uses to get asynchronous requests for balloon size changes. The problem can be seen in the logs, in the form of the following message: "vmw_balloon: failed to initialize vmci doorbell" The driver would work correctly but slightly less efficiently, probing for requests periodically. This patch changes the balloon to be initialized using late_initcall() instead of module_init() to address this issue. It does not address a situation in which VMCI is built as a module and the balloon is built into the kernel. Fixes: 48e3d668 ("VMware balloon: Enable notification via VMCI") Cc: stable@vger.kernel.org Reviewed-by:
Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by:
Nadav Amit <namit@vmware.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nadav Amit authored
commit ce664331 upstream. When vmballoon_vmci_init() sets a doorbell using VMCI_DOORBELL_SET, for some reason it does not consider the status and looks at the result. However, the hypervisor does not update the result - it updates the status. This might cause VMCI doorbell not to be enabled, resulting in degraded performance. Fixes: 48e3d668 ("VMware balloon: Enable notification via VMCI") Cc: stable@vger.kernel.org Reviewed-by:
Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by:
Nadav Amit <namit@vmware.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nadav Amit authored
commit 5081efd1 upstream. If the hypervisor sets 2MB batching is on, while batching is cleared, the balloon code breaks. In this case the legacy mechanism is used with 2MB page. The VM would report a 2MB page is ballooned, and the hypervisor would only take the first 4KB. While the hypervisor should not report such settings, make the code more robust by not enabling 2MB support without batching. Fixes: 365bd7ef ("VMware balloon: Support 2m page ballooning.") Cc: stable@vger.kernel.org Reviewed-by:
Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by:
Nadav Amit <nadav.amit@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nadav Amit authored
commit 09755690 upstream. When balloon batching is not supported by the hypervisor, the guest frame number (GFN) must fit in 32-bit. However, due to a bug, this check was mistakenly ignored. In practice, when total RAM is greater than 16TB, the balloon does not work currently, making this bug unlikely to happen. Fixes: ef0f8f11 ("VMware balloon: partially inline vmballoon_reserve_page.") Cc: stable@vger.kernel.org Reviewed-by:
Xavier Deguillard <xdeguillard@vmware.com> Signed-off-by:
Nadav Amit <namit@vmware.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chanwoo Choi authored
commit 8a9dbb77 upstream. Previously, extcon used the spinlock before calling the notifier_call_chain to prevent the scheduled out of task and to prevent the notification delay. When spinlock is locked for sending the notification, deadlock issue occured on the side of extcon consumer device. To fix this issue, extcon consumer device should always use the work. it is always not reasonable to use work. To fix this issue on extcon consumer device, release locking when sending the notification of connector state. Fixes: ab11af04 ("extcon: Add the synchronization extcon APIs to support the notification") Cc: stable@vger.kernel.org Cc: Roger Quadros <rogerq@ti.com> Cc: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by:
Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lars-Peter Clausen authored
commit 9a5094ca upstream. A sysfs write callback function needs to either return the number of consumed characters or an error. The ad952x_store() function currently returns 0 if the input value was "0", this will signal that no characters have been consumed and the function will be called repeatedly in a loop indefinitely. Fix this by returning number of supplied characters to indicate that the whole input string has been consumed. Signed-off-by:
Lars-Peter Clausen <lars@metafoo.de> Signed-off-by:
Alexandru Ardelean <alexandru.ardelean@analog.com> Fixes: cd1678f9 ("iio: frequency: New driver for AD9523 SPI Low Jitter Clock Generator") Cc: <Stable@vger.kernel.org> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lars-Peter Clausen authored
commit 5a4e33c1 upstream. Fix the displayed phase for the ad9523 driver. Currently the most significant decimal place is dropped and all other digits are shifted one to the left. This is due to a multiplication by 10, which is not necessary, so remove it. Signed-off-by:
Lars-Peter Clausen <lars@metafoo.de> Signed-off-by:
Alexandru Ardelean <alexandru.ardelean@analog.com> Fixes: cd1678f9 ("iio: frequency: New driver for AD9523 SPI Low Jitter Clock Generator") Cc: <Stable@vger.kernel.org> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Gustavo A. R. Silva authored
commit c5b974be upstream. The IIO_CHAN_INFO_LOW_PASS_FILTER_3DB_FREQUENCY case is missing a return and will fall through to the default case and errorenously return -EINVAL. Fix this by adding in missing *return ret*. Fixes: 626f971b ("staging:iio:accel:sca3000 Add write support to the low pass filter control") Reported-by:
Jonathan Cameron <jic23@kernel.org> Signed-off-by:
Gustavo A. R. Silva <gustavo@embeddedor.com> Cc: <Stable@vger.kernel.org> Signed-off-by:
Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dexuan Cui authored
commit d3b26dd7 upstream. Before setting channel->rescind in vmbus_rescind_cleanup(), we should make sure the channel callback won't run any more, otherwise a high-level driver like pci_hyperv, which may be infinitely waiting for the host VSP's response and notices the channel has been rescinded, can't safely give up: e.g., in hv_pci_protocol_negotiation() -> wait_for_response(), it's unsafe to exit from wait_for_response() and proceed with the on-stack variable "comp_pkt" popped. The issue was originally spotted by Michael Kelley <mikelley@microsoft.com>. In vmbus_close_internal(), the patch also minimizes the range protected by disabling/enabling channel->callback_event: we don't really need that for the whole function. Signed-off-by:
Dexuan Cui <decui@microsoft.com> Reviewed-by:
Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Michael Kelley <mikelley@microsoft.com> Signed-off-by:
K. Y. Srinivasan <kys@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tycho Andersen authored
commit a5ba1d95 upstream. We have reports of the following crash: PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0" #0 [ffff88085c6db710] machine_kexec at ffffffff81046239 #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248 #2 [ffff88085c6db830] oops_end at ffffffff81008ae7 #3 [ffff88085c6db860] no_context at ffffffff81050b8f #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75 #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83 #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122 [exception RIP: uart_put_char+149] RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006 RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081 RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120 RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320 R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000 R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544 #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2 #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016 #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194 #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2 #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d #20 [ffff88085c6dbeb0] kthread at ffffffff81096384 #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f after slogging through some dissasembly: ffffffff814b6720 <uart_put_char>: ffffffff814b6720: 55 push %rbp ffffffff814b6721: 48 89 e5 mov %rsp,%rbp ffffffff814b6724: 48 83 ec 20 sub $0x20,%rsp ffffffff814b6728: 48 89 1c 24 mov %rbx,(%rsp) ffffffff814b672c: 4c 89 64 24 08 mov %r12,0x8(%rsp) ffffffff814b6731: 4c 89 6c 24 10 mov %r13,0x10(%rsp) ffffffff814b6736: 4c 89 74 24 18 mov %r14,0x18(%rsp) ffffffff814b673b: e8 b0 8e 58 00 callq ffffffff81a3f5f0 <mcount> ffffffff814b6740: 4c 8b a7 88 02 00 00 mov 0x288(%rdi),%r12 ffffffff814b6747: 45 31 ed xor %r13d,%r13d ffffffff814b674a: 41 89 f6 mov %esi,%r14d ffffffff814b674d: 49 83 bc 24 70 01 00 cmpq $0x0,0x170(%r12) ffffffff814b6754: 00 00 ffffffff814b6756: 49 8b 9c 24 80 01 00 mov 0x180(%r12),%rbx ffffffff814b675d: 00 ffffffff814b675e: 74 2f je ffffffff814b678f <uart_put_char+0x6f> ffffffff814b6760: 48 89 df mov %rbx,%rdi ffffffff814b6763: e8 a8 67 58 00 callq ffffffff81a3cf10 <_raw_spin_lock_irqsave> ffffffff814b6768: 41 8b 8c 24 78 01 00 mov 0x178(%r12),%ecx ffffffff814b676f: 00 ffffffff814b6770: 89 ca mov %ecx,%edx ffffffff814b6772: f7 d2 not %edx ffffffff814b6774: 41 03 94 24 7c 01 00 add 0x17c(%r12),%edx ffffffff814b677b: 00 ffffffff814b677c: 81 e2 ff 0f 00 00 and $0xfff,%edx ffffffff814b6782: 75 23 jne ffffffff814b67a7 <uart_put_char+0x87> ffffffff814b6784: 48 89 c6 mov %rax,%rsi ffffffff814b6787: 48 89 df mov %rbx,%rdi ffffffff814b678a: e8 e1 64 58 00 callq ffffffff81a3cc70 <_raw_spin_unlock_irqrestore> ffffffff814b678f: 44 89 e8 mov %r13d,%eax ffffffff814b6792: 48 8b 1c 24 mov (%rsp),%rbx ffffffff814b6796: 4c 8b 64 24 08 mov 0x8(%rsp),%r12 ffffffff814b679b: 4c 8b 6c 24 10 mov 0x10(%rsp),%r13 ffffffff814b67a0: 4c 8b 74 24 18 mov 0x18(%rsp),%r14 ffffffff814b67a5: c9 leaveq ffffffff814b67a6: c3 retq ffffffff814b67a7: 49 8b 94 24 70 01 00 mov 0x170(%r12),%rdx ffffffff814b67ae: 00 ffffffff814b67af: 48 63 c9 movslq %ecx,%rcx ffffffff814b67b2: 41 b5 01 mov $0x1,%r13b ffffffff814b67b5: 44 88 34 0a mov %r14b,(%rdx,%rcx,1) ffffffff814b67b9: 41 8b 94 24 78 01 00 mov 0x178(%r12),%edx ffffffff814b67c0: 00 ffffffff814b67c1: 83 c2 01 add $0x1,%edx ffffffff814b67c4: 81 e2 ff 0f 00 00 and $0xfff,%edx ffffffff814b67ca: 41 89 94 24 78 01 00 mov %edx,0x178(%r12) ffffffff814b67d1: 00 ffffffff814b67d2: eb b0 jmp ffffffff814b6784 <uart_put_char+0x64> ffffffff814b67d4: 66 66 66 2e 0f 1f 84 data32 data32 nopw %cs:0x0(%rax,%rax,1) ffffffff814b67db: 00 00 00 00 00 for our build, this is crashing at: circ->buf[circ->head] = c; Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf) protected by the "per-port mutex", which based on uart_port_check() is state->port.mutex. Indeed, the lock acquired in uart_put_char() is uport->lock, i.e. not the same lock. Anyway, since the lock is not acquired, if uart_shutdown() is called, the last chunk of that function may release state->xmit.buf before its assigned to null, and cause the race above. To fix it, let's lock uport->lock when allocating/deallocating state->xmit.buf in addition to the per-port mutex. v2: switch to locking uport->lock on allocation/deallocation instead of locking the per-port mutex in uart_put_char. Note that since uport->lock is a spin lock, we have to switch the allocation to GFP_ATOMIC. v3: move the allocation outside the lock, so we can switch back to GFP_KERNEL Signed-off-by:
Tycho Andersen <tycho@tycho.ws> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mikulas Patocka authored
commit bc9e9cf0 upstream. dm-crypt should only increase device limits, it should not decrease them. This fixes a bug where the user could creates a crypt device with 1024 sector size on the top of scsi device that had 4096 logical block size. The limit 4096 would be lost and the user could incorrectly send 1024-I/Os to the crypt device. Cc: stable@vger.kernel.org Signed-off-by:
Mikulas Patocka <mpatocka@redhat.com> Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ilya Dryomov authored
commit 5b1fe7be upstream. Quoting Documentation/device-mapper/cache.txt: The 'dirty' state for a cache block changes far too frequently for us to keep updating it on the fly. So we treat it as a hint. In normal operation it will be written when the dm device is suspended. If the system crashes all cache blocks will be assumed dirty when restarted. This got broken in commit f177940a ("dm cache metadata: switch to using the new cursor api for loading metadata") in 4.9, which removed the code that consulted cmd->clean_when_opened (CLEAN_SHUTDOWN on-disk flag) when loading cache blocks. This results in data corruption on an unclean shutdown with dirty cache blocks on the fast device. After the crash those blocks are considered clean and may get evicted from the cache at any time. This can be demonstrated by doing a lot of reads to trigger individual evictions, but uncache is more predictable: ### Disable auto-activation in lvm.conf to be able to do uncache in ### time (i.e. see uncache doing flushing) when the fix is applied. # xfs_io -d -c 'pwrite -b 4M -S 0xaa 0 1G' /dev/vdb # vgcreate vg_cache /dev/vdb /dev/vdc # lvcreate -L 1G -n lv_slowdev vg_cache /dev/vdb # lvcreate -L 512M -n lv_cachedev vg_cache /dev/vdc # lvcreate -L 256M -n lv_metadev vg_cache /dev/vdc # lvconvert --type cache-pool --cachemode writeback vg_cache/lv_cachedev --poolmetadata vg_cache/lv_metadev # lvconvert --type cache vg_cache/lv_slowdev --cachepool vg_cache/lv_cachedev # xfs_io -d -c 'pwrite -b 4M -S 0xbb 0 512M' /dev/mapper/vg_cache-lv_slowdev # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ # dmsetup status vg_cache-lv_slowdev 0 2097152 cache 8 27/65536 128 8192/8192 1 100 0 0 0 8192 7065 2 metadata2 writeback 2 migration_threshold 2048 smq 0 rw - ^^^^ 7065 * 64k = 441M yet to be written to the slow device # echo b >/proc/sysrq-trigger # vgchange -ay vg_cache # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ # lvconvert --uncache vg_cache/lv_slowdev Flushing 0 blocks for cache vg_cache/lv_slowdev. Logical volume "lv_cachedev" successfully removed Logical volume vg_cache/lv_slowdev is not cached. # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2 0fe00000: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ 0fe00010: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ This is the case with both v1 and v2 cache pool metatata formats. After applying this patch: # vgchange -ay vg_cache # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ # lvconvert --uncache vg_cache/lv_slowdev Flushing 3724 blocks for cache vg_cache/lv_slowdev. ... Flushing 71 blocks for cache vg_cache/lv_slowdev. Logical volume "lv_cachedev" successfully removed Logical volume vg_cache/lv_slowdev is not cached. # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ Cc: stable@vger.kernel.org Fixes: f177940a ("dm cache metadata: switch to using the new cursor api for loading metadata") Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mike Snitzer authored
commit fd2fa954 upstream. policy_hint_size starts as 0 during __write_initial_superblock(). It isn't until the policy is loaded that policy_hint_size is set in-core (cmd->policy_hint_size). But it never got recorded in the on-disk superblock because __commit_transaction() didn't deal with transfering the in-core cmd->policy_hint_size to the on-disk superblock. The in-core cmd->policy_hint_size gets initialized by metadata_open()'s __begin_transaction_flags() which re-reads all superblock fields. Because the superblock's policy_hint_size was never properly stored, when the cache was created, hints_array_available() would always return false when re-activating a previously created cache. This means __load_mappings() always considered the hints invalid and never made use of the hints (these hints served to optimize). Another detremental side-effect of this oversight is the cache_check utility would fail with: "invalid hint width: 0" Cc: stable@vger.kernel.org Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-