- 31 May, 2019 18 commits
-
-
Filipe Manana authored
commit 72bd2323 upstream. Currently when we fail to COW a path at btrfs_update_root() we end up always aborting the transaction. However all the current callers of btrfs_update_root() are able to deal with errors returned from it, many do end up aborting the transaction themselves (directly or not, such as the transaction commit path), other BUG_ON() or just gracefully cancel whatever they were doing. When syncing the fsync log, we call btrfs_update_root() through tree-log.c:update_log_root(), and if it returns an -ENOSPC error, the log sync code does not abort the transaction, instead it gracefully handles the error and returns -EAGAIN to the fsync handler, so that it falls back to a transaction commit. Any other error different from -ENOSPC, makes the log sync code abort the transaction. So remove the transaction abort from btrfs_update_log() when we fail to COW a path to update the root item, so that if an -ENOSPC failure happens we avoid aborting the current transaction and have a chance of the fsync succeeding after falling back to a transaction commit. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203413 Fixes: 79787eaa ("btrfs: replace many BUG_ONs with proper error handling") Cc: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josef Bacik authored
commit 8fca9550 upstream. If we have an error writing out a delalloc range in btrfs_punch_hole_lock_range we'll unlock the inode and then goto out_only_mutex, where we will again unlock the inode. This is bad, don't do this. Fixes: f27451f2 ("Btrfs: add support for fallocate's zero range operation") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andreas Gruenbacher authored
commit 5a5ec83d upstream. Commit 4d207133 changed the types of the statistic values in struct gfs2_lkstats from s64 to u64. Because of that, what should be a signed value in gfs2_update_stats turned into an unsigned value. When shifted right, we end up with a large positive value instead of a small negative value, which results in an incorrect variance estimate. Fixes: 4d207133 ("gfs2: Make statistics unsigned, suitable for use with do_div()") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christoph Hellwig authored
commit a98d9ae9 upstream. DMA allocations that can't sleep may return non-remapped addresses, but we do not properly handle them in the mmap and get_sgtable methods. Resolve non-vmalloc addresses using virt_to_page to handle this corner case. Cc: <stable@vger.kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ard Biesheuvel authored
commit b2eed9b5 upstream. The following commit 7290d580 ("module: use relative references for __ksymtab entries") updated the ksymtab handling of some KASLR capable architectures so that ksymtab entries are emitted as pairs of 32-bit relative references. This reduces the size of the entries, but more importantly, it gets rid of statically assigned absolute addresses, which require fixing up at boot time if the kernel is self relocating (which takes a 24 byte RELA entry for each member of the ksymtab struct). Since ksymtab entries are always part of the same module as the symbol they export, it was assumed at the time that a 32-bit relative reference is always sufficient to capture the offset between a ksymtab entry and its target symbol. Unfortunately, this is not always true: in the case of per-CPU variables, a per-CPU variable's base address (which usually differs from the actual address of any of its per-CPU copies) is allocated in the vicinity of the ..data.percpu section in the core kernel (i.e., in the per-CPU reserved region which follows the section containing the core kernel's statically allocated per-CPU variables). Since we randomize the module space over a 4 GB window covering the core kernel (based on the -/+ 4 GB range of an ADRP/ADD pair), we may end up putting the core kernel out of the -/+ 2 GB range of 32-bit relative references of module ksymtab entries that refer to per-CPU variables. So reduce the module randomization range a bit further. We lose 1 bit of randomization this way, but this is something we can tolerate. Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Williams authored
commit 52f476a3 upstream. Jeff discovered that performance improves from ~375K iops to ~519K iops on a simple psync-write fio workload when moving the location of 'struct page' from the default PMEM location to DRAM. This result is surprising because the expectation is that 'struct page' for dax is only needed for third party references to dax mappings. For example, a dax-mapped buffer passed to another system call for direct-I/O requires 'struct page' for sending the request down the driver stack and pinning the page. There is no usage of 'struct page' for first party access to a file via read(2)/write(2) and friends. However, this "no page needed" expectation is violated by CONFIG_HARDENED_USERCOPY and the check_copy_size() performed in copy_from_iter_full_nocache() and copy_to_iter_mcsafe(). The check_heap_object() helper routine assumes the buffer is backed by a slab allocator (DRAM) page and applies some checks. Those checks are invalid, dax pages do not originate from the slab, and redundant, dax_iomap_actor() has already validated that the I/O is within bounds. Specifically that routine validates that the logical file offset is within bounds of the file, then it does a sector-to-pfn translation which validates that the physical mapping is within bounds of the block device. Bypass additional hardened usercopy overhead and call the 'no check' versions of the copy_{to,from}_iter operations directly. Fixes: 0aed55af ("x86, uaccess: introduce copy_from_iter_flushcache...") Cc: <stable@vger.kernel.org> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Matthew Wilcox <willy@infradead.org> Reported-and-tested-by: Jeff Smits <jeff.smits@intel.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Suthikulpanit, Suravee authored
commit c9bcd3e3 upstream. Current logic does not allow VCPU to be loaded onto CPU with APIC ID 255. This should be allowed since the host physical APIC ID field in the AVIC Physical APIC table entry is an 8-bit value, and APIC ID 255 is valid in system with x2APIC enabled. Instead, do not allow VCPU load if the host APIC ID cannot be represented by an 8-bit value. Also, use the more appropriate AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK instead of AVIC_MAX_PHYSICAL_ID_COUNT. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Trac Hoang authored
commit ec0970e0 upstream. The iproc host eMMC/SD controller hold time does not meet the specification in the HS50 mode. This problem can be mitigated by disabling the HISPD bit; thus forcing the controller output data to be driven on the falling clock edges rather than the rising clock edges. Stable tag (v4.12+) chosen to assist stable kernel maintainers so that the change does not produce merge conflicts backporting to older kernel versions. In reality, the timing bug existed since the driver was first introduced but there is no need for this driver to be supported in kernel versions that old. Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Trac Hoang <trac.hoang@broadcom.com> Signed-off-by: Scott Branden <scott.branden@broadcom.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Trac Hoang authored
commit b7dfa695 upstream. The iproc host eMMC/SD controller hold time does not meet the specification in the HS50 mode. This problem can be mitigated by disabling the HISPD bit; thus forcing the controller output data to be driven on the falling clock edges rather than the rising clock edges. This change applies only to the Cygnus platform. Stable tag (v4.12+) chosen to assist stable kernel maintainers so that the change does not produce merge conflicts backporting to older kernel versions. In reality, the timing bug existed since the driver was first introduced but there is no need for this driver to be supported in kernel versions that old. Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Trac Hoang <trac.hoang@broadcom.com> Signed-off-by: Scott Branden <scott.branden@broadcom.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Axtens authored
commit 009b30ac upstream. The kernel self-tests picked up an issue with CTR mode: alg: skcipher: p8_aes_ctr encryption test failed (wrong result) on test vector 3, cfg="uneven misaligned splits, may sleep" Test vector 3 has an IV of FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD, so after 3 increments it should wrap around to 0. In the aesp8-ppc code from OpenSSL, there are two paths that increment IVs: the bulk (8 at a time) path, and the individual path which is used when there are fewer than 8 AES blocks to process. In the bulk path, the IV is incremented with vadduqm: "Vector Add Unsigned Quadword Modulo", which does 128-bit addition. In the individual path, however, the IV is incremented with vadduwm: "Vector Add Unsigned Word Modulo", which instead does 4 32-bit additions. Thus the IV would instead become FFFFFFFFFFFFFFFFFFFFFFFF00000000, throwing off the result. Use vadduqm. This was probably a typo originally, what with q and w being adjacent. It is a pretty narrow edge case: I am really impressed by the quality of the kernel self-tests! Fixes: 5c380d62 ("crypto: vmx - Add support for VMS instructions by ASM") Cc: stable@vger.kernel.org Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Nayna Jain <nayna@linux.ibm.com> Tested-by: Nayna Jain <nayna@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Martin K. Petersen authored
commit 8acf608e upstream. This reverts commit 20bd1d02. This patch introduced regressions for devices that come online in read-only state and subsequently switch to read-write. Given how the partition code is currently implemented it is not possible to persist the read-only flag across a device revalidate call. This may need to get addressed in the future since it is common for user applications to proactively call BLKRRPART. Reverting this commit will re-introduce a regression where a device-initiated revalidate event will cause the admin state to be forgotten. A separate patch will address this issue. Fixes: 20bd1d02 ("scsi: sd: Keep disk read-only when re-reading partition") Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andrea Parri authored
commit a0934fd2 upstream. This barrier only applies to the read-modify-write operations; in particular, it does not apply to the atomic_set() primitive. Replace the barrier with an smp_mb(). Fixes: 6c0ca7ae ("sbitmap: fix wakeup hang after sbq resize") Cc: stable@vger.kernel.org Reported-by: "Paul E. McKenney" <paulmck@linux.ibm.com> Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Omar Sandoval <osandov@fb.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: linux-block@vger.kernel.org Cc: "Paul E. McKenney" <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andrea Parri authored
commit f381c6a4 upstream. This barrier only applies to the read-modify-write operations; in particular, it does not apply to the atomic_set() primitive. Replace the barrier with an smp_mb(). Fixes: dac56212 ("bio: skip atomic inc/dec of ->bi_cnt for most use cases") Cc: stable@vger.kernel.org Reported-by: "Paul E. McKenney" <paulmck@linux.ibm.com> Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ming Lei <ming.lei@redhat.com> Cc: linux-block@vger.kernel.org Cc: "Paul E. McKenney" <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paolo Bonzini authored
commit 66f61c92 upstream. Commit 11988499 ("KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes", 2019-04-02) introduced a "return false" in a function returning int, and anyway set_efer has a "nonzero on error" conventon so it should be returning 1. Reported-by: Pavel Machek <pavel@denx.de> Fixes: 11988499 ("KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes") Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Damien Le Moal authored
commit 0916878d upstream. For a single device mount using a zoned block device, the zone information for the device is stored in the sbi->devs single entry array and sbi->s_ndevs is set to 1. This differs from a single device mount using a regular block device which does not allocate sbi->devs and sets sbi->s_ndevs to 0. However, sbi->s_devs == 0 condition is used throughout the code to differentiate a single device mount from a multi-device mount where sbi->s_ndevs is always larger than 1. This results in problems with single zoned block device volumes as these are treated as multi-device mounts but do not have the start_blk and end_blk information set. One of the problem observed is skipping of zone discard issuing resulting in write commands being issued to full zones or unaligned to a zone write pointer. Fix this problem by simply treating the cases sbi->s_ndevs == 0 (single regular block device mount) and sbi->s_ndevs == 1 (single zoned block device mount) in the same manner. This is done by introducing the helper function f2fs_is_multi_device() and using this helper in place of direct tests of sbi->s_ndevs value, improving code readability. Fixes: 7bb3a371 ("f2fs: Fix zoned block device support") Cc: <stable@vger.kernel.org> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Kara authored
commit 82a25b02 upstream. We didn't wait for outstanding direct IO during truncate in nojournal mode (as we skip orphan handling in that case). This can lead to fs corruption or stale data exposure if truncate ends up freeing blocks and these get reallocated before direct IO finishes. Fix the condition determining whether the wait is necessary. CC: stable@vger.kernel.org Fixes: 1c9114f9 ("ext4: serialize unlocked dio reads with truncate") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Kara authored
commit ee0ed02c upstream. It is possible that unlinked inode enters ext4_setattr() (e.g. if somebody calls ftruncate(2) on unlinked but still open file). In such case we should not delete the inode from the orphan list if truncate fails. Note that this is mostly a theoretical concern as filesystem is corrupted if we reach this path anyway but let's be consistent in our orphan handling. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Rostedt (VMware) authored
commit 693713cb upstream. User Mode Linux does not have access to the ip or sp fields of the pt_regs, and accessing them causes UML to fail to build. Hide the int3_emulate_jmp() and int3_emulate_call() instructions from UML, as it doesn't need them anyway. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 May, 2019 22 commits
-
-
Greg Kroah-Hartman authored
-
Yifeng Li authored
commit 9dc20113 upstream. A fallthrough in switch/case was introduced in f627caf5 ("fbdev: sm712fb: fix crashes and garbled display during DPMS modesetting"), due to my copy-paste error, which would cause the memory clock frequency for SM720 to be programmed to SM712. Since it only reprograms the clock to a different frequency, it's only a benign issue without visible side-effect, so it also evaded Sudip Mukherjee's code review and regression tests. scripts/checkpatch.pl also failed to discover the issue, possibly due to nested switch statements. This issue was found by Stephen Rothwell by building linux-next with -Wimplicit-fallthrough. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: f627caf5 ("fbdev: sm712fb: fix crashes and garbled display during DPMS modesetting") Signed-off-by: Yifeng Li <tomli@tomli.me> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Borkmann authored
commit 50b045a8 upstream. One of the biggest issues we face right now with picking LRU map over regular hash table is that a map walk out of user space, for example, to just dump the existing entries or to remove certain ones, will completely mess up LRU eviction heuristics and wrong entries such as just created ones will get evicted instead. The reason for this is that we mark an entry as "in use" via bpf_lru_node_set_ref() from system call lookup side as well. Thus upon walk, all entries are being marked, so information of actual least recently used ones are "lost". In case of Cilium where it can be used (besides others) as a BPF based connection tracker, this current behavior causes disruption upon control plane changes that need to walk the map from user space to evict certain entries. Discussion result from bpfconf [0] was that we should simply just remove marking from system call side as no good use case could be found where it's actually needed there. Therefore this patch removes marking for regular LRU and per-CPU flavor. If there ever should be a need in future, the behavior could be selected via map creation flag, but due to mentioned reason we avoid this here. [0] http://vger.kernel.org/bpfconf.html Fixes: 29ba732a ("bpf: Add BPF_MAP_TYPE_LRU_HASH") Fixes: 8f844938 ("bpf: Add BPF_MAP_TYPE_LRU_PERCPU_HASH") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Borkmann authored
commit c6110222 upstream. Add a callback map_lookup_elem_sys_only() that map implementations could use over map_lookup_elem() from system call side in case the map implementation needs to handle the latter differently than from the BPF data path. If map_lookup_elem_sys_only() is set, this will be preferred pick for map lookups out of user space. This hook is used in a follow-up fix for LRU map, but once development window opens, we can convert other map types from map_lookup_elem() (here, the one called upon BPF_MAP_LOOKUP_ELEM cmd is meant) over to use the callback to simplify and clean up the latter. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chenbo Feng authored
commit e547ff3f upstream. For iptable module to load a bpf program from a pinned location, it only retrieve a loaded program and cannot change the program content so requiring a write permission for it might not be necessary. Also when adding or removing an unrelated iptable rule, it might need to flush and reload the xt_bpf related rules as well and triggers the inode permission check. It might be better to remove the write premission check for the inode so we won't need to grant write access to all the processes that flush and restore iptables rules. Signed-off-by: Chenbo Feng <fengc@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Kroah-Hartman authored
This reverts commit 118d38a3 which is commit 8184d44c upstream. Tommi reports that this patch breaks the build, it's not really needed so let's revert it. Reported-by: Tommi Rantala <tommi.t.rantala@nokia.com> Cc: Stanislav Fomichev <sdf@google.com> Cc: Sasha Levin <sashal@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John Garry authored
commit 0b777eee upstream. In commit 376991db ("driver core: Postpone DMA tear-down until after devres release"), we changed the ordering of tearing down the device DMA ops and releasing all the device's resources; this was because the DMA ops should be maintained until we release the device's managed DMA memories. However, we have seen another crash on an arm64 system when a device driver probe fails: hisi_sas_v3_hw 0000:74:02.0: Adding to iommu group 2 scsi host1: hisi_sas_v3_hw BUG: Bad page state in process swapper/0 pfn:313f5 page:ffff7e0000c4fd40 count:1 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0xfffe00000001000(reserved) raw: 0fffe00000001000 ffff7e0000c4fd48 ffff7e0000c4fd48 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set bad because of flags: 0x1000(reserved) Modules linked in: CPU: 49 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1-43081-g22d97fd-dirty #1433 Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 - V1.12.01 01/29/2019 Call trace: dump_backtrace+0x0/0x118 show_stack+0x14/0x1c dump_stack+0xa4/0xc8 bad_page+0xe4/0x13c free_pages_check_bad+0x4c/0xc0 __free_pages_ok+0x30c/0x340 __free_pages+0x30/0x44 __dma_direct_free_pages+0x30/0x38 dma_direct_free+0x24/0x38 dma_free_attrs+0x9c/0xd8 dmam_release+0x20/0x28 release_nodes+0x17c/0x220 devres_release_all+0x34/0x54 really_probe+0xc4/0x2c8 driver_probe_device+0x58/0xfc device_driver_attach+0x68/0x70 __driver_attach+0x94/0xdc bus_for_each_dev+0x5c/0xb4 driver_attach+0x20/0x28 bus_add_driver+0x14c/0x200 driver_register+0x6c/0x124 __pci_register_driver+0x48/0x50 sas_v3_pci_driver_init+0x20/0x28 do_one_initcall+0x40/0x25c kernel_init_freeable+0x2b8/0x3c0 kernel_init+0x10/0x100 ret_from_fork+0x10/0x18 Disabling lock debugging due to kernel taint BUG: Bad page state in process swapper/0 pfn:313f6 page:ffff7e0000c4fd80 count:1 mapcount:0 mapping:0000000000000000 index:0x0 [ 89.322983] flags: 0xfffe00000001000(reserved) raw: 0fffe00000001000 ffff7e0000c4fd88 ffff7e0000c4fd88 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 The crash occurs for the same reason. In this case, on the really_probe() failure path, we are still clearing the DMA ops prior to releasing the device's managed memories. This patch fixes this issue by reordering the DMA ops teardown and the call to devres_release_all() on the failure path. Reported-by: Xiang Chen <chenxiang66@hisilicon.com> Tested-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> [jpg: backport to 4.19.x and earlier] Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nigel Croxon authored
commit b2176a1d upstream. The problem is that any 'uptodate' vs 'disks' check is not precise in this path. Put a "WARN_ON(!test_bit(R5_UPTODATE, &dev->flags)" on the device that might try to kick off writes and then skip the action. Better to prevent the raid driver from taking unexpected action *and* keep the system alive vs killing the machine with BUG_ON. Note: fixed warning reported by kbuild test robot <lkp@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Nigel Croxon <ncroxon@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Song Liu authored
commit a25d8c32 upstream. This reverts commit 4f4fd7c5. Cc: Dan Williams <dan.j.williams@intel.com> Cc: Nigel Croxon <ncroxon@redhat.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jiri Olsa authored
[ Upstream commit 6f55967a ] New race in x86_pmu_stop() was introduced by replacing the atomic __test_and_clear_bit() of cpuc->active_mask by separate test_bit() and __clear_bit() calls in the following commit: 3966c3fe ("x86/perf/amd: Remove need to check "running" bit in NMI handler") The race causes panic for PEBS events with enabled callchains: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 ... RIP: 0010:perf_prepare_sample+0x8c/0x530 Call Trace: <NMI> perf_event_output_forward+0x2a/0x80 __perf_event_overflow+0x51/0xe0 handle_pmi_common+0x19e/0x240 intel_pmu_handle_irq+0xad/0x170 perf_event_nmi_handler+0x2e/0x50 nmi_handle+0x69/0x110 default_do_nmi+0x3e/0x100 do_nmi+0x11a/0x180 end_repeat_nmi+0x16/0x1a RIP: 0010:native_write_msr+0x6/0x20 ... </NMI> intel_pmu_disable_event+0x98/0xf0 x86_pmu_stop+0x6e/0xb0 x86_pmu_del+0x46/0x140 event_sched_out.isra.97+0x7e/0x160 ... The event is configured to make samples from PEBS drain code, but when it's disabled, we'll go through NMI path instead, where data->callchain will not get allocated and we'll crash: x86_pmu_stop test_bit(hwc->idx, cpuc->active_mask) intel_pmu_disable_event(event) { ... intel_pmu_pebs_disable(event); ... EVENT OVERFLOW -> <NMI> intel_pmu_handle_irq handle_pmi_common TEST PASSES -> test_bit(bit, cpuc->active_mask)) perf_event_overflow perf_prepare_sample { ... if (!(sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY)) data->callchain = perf_callchain(event, regs); CRASH -> size += data->callchain->nr; } </NMI> ... x86_pmu_disable_event(event) } __clear_bit(hwc->idx, cpuc->active_mask); Fixing this by disabling the event itself before setting off the PEBS bit. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Arcari <darcari@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Lendacky Thomas <Thomas.Lendacky@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 3966c3fe ("x86/perf/amd: Remove need to check "running" bit in NMI handler") Link: http://lkml.kernel.org/r/20190504151556.31031-1-jolsa@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Arnaldo Carvalho de Melo authored
[ Upstream commit bf561d3c ] While cross building perf to the ARC architecture on a fedora 30 host, we were failing with: CC /tmp/build/perf/bench/numa.o bench/numa.c: In function ‘worker_thread’: bench/numa.c:1261:12: error: ‘RUSAGE_THREAD’ undeclared (first use in this function); did you mean ‘SIGEV_THREAD’? getrusage(RUSAGE_THREAD, &rusage); ^~~~~~~~~~~~~ SIGEV_THREAD bench/numa.c:1261:12: note: each undeclared identifier is reported only once for each function it appears in [perfbuilder@60d5802468f6 perf]$ /arc_gnu_2019.03-rc1_prebuilt_uclibc_le_archs_linux_install/bin/arc-linux-gcc --version | head -1 arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225 [perfbuilder@60d5802468f6 perf]$ Trying to reproduce a report by Vineet, I noticed that, with just cross-built zlib and numactl libraries, I ended up with the above failure. So, since RUSAGE_THREAD is available as a define, check for that and numactl libraries, I ended up with the above failure. So, since RUSAGE_THREAD is available as a define in the system headers, check if it is defined in the 'perf bench numa' sources and define it if not. Now it builds and I have to figure out if the problem reported by Vineet only takes place if we have libelf or some other library available. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Olsa <jolsa@kernel.org> Cc: linux-snps-arc@lists.infradead.org Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Link: https://lkml.kernel.org/n/tip-2wb4r1gir9xrevbpq7qp0amk@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Al Viro authored
[ Upstream commit 4e903604 ] To choose whether to pick the GID from the old (16bit) or new (32bit) field, we should check if the old gid field is set to 0xffff. Mainline checks the old *UID* field instead - cut'n'paste from the corresponding code in ufs_get_inode_uid(). Fixes: 252e211eSigned-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Gary Hook authored
[ Upstream commit b51ce374 ] Enablement of AMD's Secure Memory Encryption feature is determined very early after start_kernel() is entered. Part of this procedure involves scanning the command line for the parameter 'mem_encrypt'. To determine intended state, the function sme_enable() uses library functions cmdline_find_option() and strncmp(). Their use occurs early enough such that it cannot be assumed that any instrumentation subsystem is initialized. For example, making calls to a KASAN-instrumented function before KASAN is set up will result in the use of uninitialized memory and a boot failure. When AMD's SME support is enabled, conditionally disable instrumentation of these dependent functions in lib/string.c and arch/x86/lib/cmdline.c. [ bp: Get rid of intermediary nostackp var and cleanup whitespace. ] Fixes: aca20d54 ("x86/mm: Add support to make use of Secure Memory Encryption") Reported-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Boris Brezillon <bbrezillon@kernel.org> Cc: Coly Li <colyli@suse.de> Cc: "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: "luto@kernel.org" <luto@kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "mingo@redhat.com" <mingo@redhat.com> Cc: "peterz@infradead.org" <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/155657657552.7116.18363762932464011367.stgit@sosrh3.amd.comSigned-off-by: Sasha Levin <sashal@kernel.org>
-
Tobin C. Harding authored
[ Upstream commit 9a4f26cc ] Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add(). Signed-off-by: Tobin C. Harding <tobin@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tobin C. Harding <tobin@kernel.org> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: http://lkml.kernel.org/r/20190430001144.24890-1-tobin@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Luca Coelho authored
[ Upstream commit de1887c0 ] We don't check for the validity of the lengths in the packet received from the firmware. If the MPDU length received in the rx descriptor is too short to contain the header length and the crypt length together, we may end up trying to copy a negative number of bytes (headlen - hdrlen < 0) which will underflow and cause us to try to copy a huge amount of data. This causes oopses such as this one: BUG: unable to handle kernel paging request at ffff896be2970000 PGD 5e201067 P4D 5e201067 PUD 5e205067 PMD 16110d063 PTE 8000000162970161 Oops: 0003 [#1] PREEMPT SMP NOPTI CPU: 2 PID: 1824 Comm: irq/134-iwlwifi Not tainted 4.19.33-04308-geea41cf4930f #1 Hardware name: [...] RIP: 0010:memcpy_erms+0x6/0x10 Code: 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe RSP: 0018:ffffa4630196fc60 EFLAGS: 00010287 RAX: ffff896be2924618 RBX: ffff896bc8ecc600 RCX: 00000000fffb4610 RDX: 00000000fffffff8 RSI: ffff896a835e2a38 RDI: ffff896be2970000 RBP: ffffa4630196fd30 R08: ffff896bc8ecc600 R09: ffff896a83597000 R10: ffff896bd6998400 R11: 000000000200407f R12: ffff896a83597050 R13: 00000000fffffff8 R14: 0000000000000010 R15: ffff896a83597038 FS: 0000000000000000(0000) GS:ffff896be8280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff896be2970000 CR3: 000000005dc12002 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: iwl_mvm_rx_mpdu_mq+0xb51/0x121b [iwlmvm] iwl_pcie_rx_handle+0x58c/0xa89 [iwlwifi] iwl_pcie_irq_rx_msix_handler+0xd9/0x12a [iwlwifi] irq_thread_fn+0x24/0x49 irq_thread+0xb0/0x122 kthread+0x138/0x140 ret_from_fork+0x1f/0x40 Fix that by checking the lengths for correctness and trigger a warning to show that we have received wrong data. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Bjørn Mork authored
[ Upstream commit 88ef66a2 ] Adding device entries found in vendor modified versions of this driver. Function maps for some of the devices follow: WNC D16Q1, D16Q5, D18Q1 LTE CAT3 module (1435:0918) MI_00 Qualcomm HS-USB Diagnostics MI_01 Android Debug interface MI_02 Qualcomm HS-USB Modem MI_03 Qualcomm Wireless HS-USB Ethernet Adapter MI_04 Qualcomm Wireless HS-USB Ethernet Adapter MI_05 Qualcomm Wireless HS-USB Ethernet Adapter MI_06 USB Mass Storage Device T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1435 ProdID=0918 Rev= 2.32 S: Manufacturer=Android S: Product=Android S: SerialNumber=0123456789ABCDEF C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=32ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms WNC D18 LTE CAT3 module (1435:d182) MI_00 Qualcomm HS-USB Diagnostics MI_01 Androd Debug interface MI_02 Qualcomm HS-USB Modem MI_03 Qualcomm HS-USB NMEA MI_04 Qualcomm Wireless HS-USB Ethernet Adapter MI_05 Qualcomm Wireless HS-USB Ethernet Adapter MI_06 USB Mass Storage Device ZM8510/ZM8620/ME3960 (19d2:0396) MI_00 ZTE Mobile Broadband Diagnostics Port MI_01 ZTE Mobile Broadband AT Port MI_02 ZTE Mobile Broadband Modem MI_03 ZTE Mobile Broadband NDIS Port (qmi_wwan) MI_04 ZTE Mobile Broadband ADB Port ME3620_X (19d2:1432) MI_00 ZTE Diagnostics Device MI_01 ZTE UI AT Interface MI_02 ZTE Modem Device MI_03 ZTE Mobile Broadband Network Adapter MI_04 ZTE Composite ADB Interface Reported-by: Lars Melin <larsm17@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Peter Zijlstra authored
[ Upstream commit 0edd6b64 ] Unless the very next line is schedule(), or implies it, one must not use preempt_enable_no_resched(). It can cause a preemption to go missing and thereby cause arbitrary delays, breaking the PREEMPT=y invariant. Cc: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Andrey Smirnov authored
[ Upstream commit 349ced99 ] Fix a similar endless event loop as was done in commit 8dcf3217 ("i2c: prevent endless uevent loop with CONFIG_I2C_DEBUG_CORE"): The culprit is the dev_dbg printk in the i2c uevent handler. If this is activated (for instance by CONFIG_I2C_DEBUG_CORE) it results in an endless loop with systemd-journald. This happens if user-space scans the system log and reads the uevent file to get information about a newly created device, which seems fair use to me. Unfortunately reading the "uevent" file uses the same function that runs for creating the uevent for a new device, generating the next syslog entry Both CONFIG_I2C_DEBUG_CORE and CONFIG_POWER_SUPPLY_DEBUG were reported in https://bugs.freedesktop.org/show_bug.cgi?id=76886 but only former seems to have been fixed. Drop debug prints as it was done in I2C subsystem to resolve the issue. Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Cc: Chris Healy <cphealy@gmail.com> Cc: linux-pm@vger.kernel.org Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Andrew Jones authored
[ Upstream commit 811328fc ] A failed KVM_ARM_VCPU_INIT should not set the vcpu target, as the vcpu target is used by kvm_vcpu_initialized() to determine if other vcpu ioctls may proceed. We need to set the target before calling kvm_reset_vcpu(), but if that call fails, we should then unset it and clear the feature bitmap while we're at it. Signed-off-by: Andrew Jones <drjones@redhat.com> [maz: Simplified patch, completed commit message] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Kangjie Lu authored
[ Upstream commit 22e8860c ] regmap_update_bits could fail and deserves a check. The patch adds the checks and if it fails, returns its error code upstream. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Bhagavathi Perumal S authored
[ Upstream commit f1267cf3 ] The txq of vif is added to active_txqs list for ATF TXQ scheduling in the function ieee80211_queue_skb(), but it was not properly removed before freeing the txq object. It was causing use after free of the txq objects from the active_txqs list, result was kernel panic due to invalid memory access. Fix kernel invalid memory access by properly removing txq object from active_txqs list before free the object. Signed-off-by: Bhagavathi Perumal S <bperumal@codeaurora.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Vitaly Kuznetsov authored
[ Upstream commit da66761c ] It was reported that with some special Multi Processor Group configuration, e.g: bcdedit.exe /set groupsize 1 bcdedit.exe /set maxgroup on bcdedit.exe /set groupaware on for a 16-vCPU guest WS2012 shows BSOD on boot when PV TLB flush mechanism is in use. Tracing kvm_hv_flush_tlb immediately reveals the issue: kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2 The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES, however, processor_mask is 0x0 and no HV_FLUSH_ALL_PROCESSORS is specified. We don't flush anything and apparently it's not what Windows expects. TLFS doesn't say anything about such requests and newer Windows versions seem to be unaffected. This all feels like a WS2012 bug, which is, however, easy to workaround in KVM: let's flush everything when we see an empty flush request, over-flushing doesn't hurt. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-