- 22 Jun, 2019 40 commits
-
-
Yang Shi authored
commit 7a30df49 upstream. A few new fields were added to mmu_gather to make TLB flush smarter for huge page by telling what level of page table is changed. __tlb_reset_range() is used to reset all these page table state to unchanged, which is called by TLB flush for parallel mapping changes for the same range under non-exclusive lock (i.e. read mmap_sem). Before commit dd2283f2 ("mm: mmap: zap pages with read mmap_sem in munmap"), the syscalls (e.g. MADV_DONTNEED, MADV_FREE) which may update PTEs in parallel don't remove page tables. But, the forementioned commit may do munmap() under read mmap_sem and free page tables. This may result in program hang on aarch64 reported by Jan Stancek. The problem could be reproduced by his test program with slightly modified below. ---8<--- static int map_size = 4096; static int num_iter = 500; static long threads_total; static void *distant_area; void *map_write_unmap(void *ptr) { int *fd = ptr; unsigned char *map_address; int i, j = 0; for (i = 0; i < num_iter; i++) { map_address = mmap(distant_area, (size_t) map_size, PROT_WRITE | PROT_READ, MAP_SHARED | MAP_ANONYMOUS, -1, 0); if (map_address == MAP_FAILED) { perror("mmap"); exit(1); } for (j = 0; j < map_size; j++) map_address[j] = 'b'; if (munmap(map_address, map_size) == -1) { perror("munmap"); exit(1); } } return NULL; } void *dummy(void *ptr) { return NULL; } int main(void) { pthread_t thid[2]; /* hint for mmap in map_write_unmap() */ distant_area = mmap(0, DISTANT_MMAP_SIZE, PROT_WRITE | PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); munmap(distant_area, (size_t)DISTANT_MMAP_SIZE); distant_area += DISTANT_MMAP_SIZE / 2; while (1) { pthread_create(&thid[0], NULL, map_write_unmap, NULL); pthread_create(&thid[1], NULL, dummy, NULL); pthread_join(thid[0], NULL); pthread_join(thid[1], NULL); } } ---8<--- The program may bring in parallel execution like below: t1 t2 munmap(map_address) downgrade_write(&mm->mmap_sem); unmap_region() tlb_gather_mmu() inc_tlb_flush_pending(tlb->mm); free_pgtables() tlb->freed_tables = 1 tlb->cleared_pmds = 1 pthread_exit() madvise(thread_stack, 8M, MADV_DONTNEED) zap_page_range() tlb_gather_mmu() inc_tlb_flush_pending(tlb->mm); tlb_finish_mmu() if (mm_tlb_flush_nested(tlb->mm)) __tlb_reset_range() __tlb_reset_range() would reset freed_tables and cleared_* bits, but this may cause inconsistency for munmap() which do free page tables. Then it may result in some architectures, e.g. aarch64, may not flush TLB completely as expected to have stale TLB entries remained. Use fullmm flush since it yields much better performance on aarch64 and non-fullmm doesn't yields significant difference on x86. The original proposed fix came from Jan Stancek who mainly debugged this issue, I just wrapped up everything together. Jan's testing results: v5.2-rc2-24-gbec7550c -------------------------- mean stddev real 37.382 2.780 user 1.420 0.078 sys 54.658 1.855 v5.2-rc2-24-gbec7550c + "mm: mmu_gather: remove __tlb_reset_range() for force flush" ---------------------------------------------------------------------------------------_ mean stddev real 37.119 2.105 user 1.548 0.087 sys 55.698 1.357 [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1558322252-113575-1-git-send-email-yang.shi@linux.alibaba.com Fixes: dd2283f2 ("mm: mmap: zap pages with read mmap_sem in munmap") Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Jan Stancek <jstancek@redhat.com> Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Suggested-by: Will Deacon <will.deacon@arm.com> Tested-by: Will Deacon <will.deacon@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: <stable@vger.kernel.org> [4.20+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tobin C. Harding authored
[ Upstream commit b9fba67b ] If a call to kobject_init_and_add() fails we should call kobject_put() otherwise we leak memory. Add call to kobject_put() in the error path of call to kobject_init_and_add(). Please note, this has the side effect that the release method is called if kobject_init_and_add() fails. Link: http://lkml.kernel.org/r/20190513033458.2824-1-tobin@kernel.orgSigned-off-by: Tobin C. Harding <tobin@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Amit Cohen authored
[ Upstream commit 275e928f ] Force of 56G is not supported by hardware in Ethernet devices. This configuration fails with a bad parameter error from firmware. Add check of this case. Instead of trying to set 56G with autoneg off, return a meaningful error. Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Amit Cohen <amitc@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Jason Yan authored
[ Upstream commit 3b054179 ] The sas_port(phy->port) allocated in sas_ex_discover_expander() will not be deleted when the expander failed to discover. This will cause resource leak and a further issue of kernel BUG like below: [159785.843156] port-2:17:29: trying to add phy phy-2:17:29 fails: it's already part of another port [159785.852144] ------------[ cut here ]------------ [159785.856833] kernel BUG at drivers/scsi/scsi_transport_sas.c:1086! [159785.863000] Internal error: Oops - BUG: 0 [#1] SMP [159785.867866] CPU: 39 PID: 16993 Comm: kworker/u96:2 Tainted: G W OE 4.19.25-vhulk1901.1.0.h111.aarch64 #1 [159785.878458] Hardware name: Huawei Technologies Co., Ltd. Hi1620EVBCS/Hi1620EVBCS, BIOS Hi1620 CS B070 1P TA 03/21/2019 [159785.889231] Workqueue: 0000:74:02.0_disco_q sas_discover_domain [159785.895224] pstate: 40c00009 (nZcv daif +PAN +UAO) [159785.900094] pc : sas_port_add_phy+0x188/0x1b8 [159785.904524] lr : sas_port_add_phy+0x188/0x1b8 [159785.908952] sp : ffff0001120e3b80 [159785.912341] x29: ffff0001120e3b80 x28: 0000000000000000 [159785.917727] x27: ffff802ade8f5400 x26: ffff0000681b7560 [159785.923111] x25: ffff802adf11a800 x24: ffff0000680e8000 [159785.928496] x23: ffff802ade8f5728 x22: ffff802ade8f5708 [159785.933880] x21: ffff802adea2db40 x20: ffff802ade8f5400 [159785.939264] x19: ffff802adea2d800 x18: 0000000000000010 [159785.944649] x17: 00000000821bf734 x16: ffff00006714faa0 [159785.950033] x15: ffff0000e8ab4ecf x14: 7261702079646165 [159785.955417] x13: 726c612073277469 x12: ffff00006887b830 [159785.960802] x11: ffff00006773eaa0 x10: 7968702079687020 [159785.966186] x9 : 0000000000002453 x8 : 726f702072656874 [159785.971570] x7 : 6f6e6120666f2074 x6 : ffff802bcfb21290 [159785.976955] x5 : ffff802bcfb21290 x4 : 0000000000000000 [159785.982339] x3 : ffff802bcfb298c8 x2 : 337752b234c2ab00 [159785.987723] x1 : 337752b234c2ab00 x0 : 0000000000000000 [159785.993108] Process kworker/u96:2 (pid: 16993, stack limit = 0x0000000072dae094) [159786.000576] Call trace: [159786.003097] sas_port_add_phy+0x188/0x1b8 [159786.007179] sas_ex_get_linkrate.isra.5+0x134/0x140 [159786.012130] sas_ex_discover_expander+0x128/0x408 [159786.016906] sas_ex_discover_dev+0x218/0x4c8 [159786.021249] sas_ex_discover_devices+0x9c/0x1a8 [159786.025852] sas_discover_root_expander+0x134/0x160 [159786.030802] sas_discover_domain+0x1b8/0x1e8 [159786.035148] process_one_work+0x1b4/0x3f8 [159786.039230] worker_thread+0x54/0x470 [159786.042967] kthread+0x134/0x138 [159786.046269] ret_from_fork+0x10/0x18 [159786.049918] Code: 91322300 f0004402 91178042 97fe4c9b (d4210000) [159786.056083] Modules linked in: hns3_enet_ut(OE) hclge(OE) hnae3(OE) hisi_sas_test_hw(OE) hisi_sas_test_main(OE) serdes(OE) [159786.067202] ---[ end trace 03622b9e2d99e196 ]--- [159786.071893] Kernel panic - not syncing: Fatal exception [159786.077190] SMP: stopping secondary CPUs [159786.081192] Kernel Offset: disabled [159786.084753] CPU features: 0x2,a2a00a38 Fixes: 2908d778 ("[SCSI] aic94xx: new driver") Reported-by: Jian Luo <luojian5@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
YueHaibing authored
[ Upstream commit 12e750bc ] If alloc_workqueue fails in alua_init, it should return -ENOMEM, otherwise it will trigger null-ptr-deref while unloading module which calls destroy_workqueue dereference wq->lock like this: BUG: KASAN: null-ptr-deref in __lock_acquire+0x6b4/0x1ee0 Read of size 8 at addr 0000000000000080 by task syz-executor.0/7045 CPU: 0 PID: 7045 Comm: syz-executor.0 Tainted: G C 5.1.0+ #28 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 Call Trace: dump_stack+0xa9/0x10e __kasan_report+0x171/0x18d ? __lock_acquire+0x6b4/0x1ee0 kasan_report+0xe/0x20 __lock_acquire+0x6b4/0x1ee0 lock_acquire+0xb4/0x1b0 __mutex_lock+0xd8/0xb90 drain_workqueue+0x25/0x290 destroy_workqueue+0x1f/0x3f0 __x64_sys_delete_module+0x244/0x330 do_syscall_64+0x72/0x2a0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: 03197b61 ("scsi_dh_alua: Use workqueue for RTPG") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Lianbo Jiang authored
[ Upstream commit 1d94f06e ] When SME is enabled, the smartpqi driver won't work on the HP DL385 G10 machine, which causes the failure of kernel boot because it fails to allocate pqi error buffer. Please refer to the kernel log: .... [ 9.431749] usbcore: registered new interface driver uas [ 9.441524] Microsemi PQI Driver (v1.1.4-130) [ 9.442956] i40e 0000:04:00.0: fw 6.70.48768 api 1.7 nvm 10.2.5 [ 9.447237] smartpqi 0000:23:00.0: Microsemi Smart Family Controller found Starting dracut initqueue hook... [ OK ] Started Show Plymouth Boot Scre[ 9.471654] Broadcom NetXtreme-C/E driver bnxt_en v1.9.1 en. [ OK ] Started Forward Password Requests to Plymouth Directory Watch. [[0;[ 9.487108] smartpqi 0000:23:00.0: failed to allocate PQI error buffer .... [ 139.050544] dracut-initqueue[949]: Warning: dracut-initqueue timeout - starting timeout scripts [ 139.589779] dracut-initqueue[949]: Warning: dracut-initqueue timeout - starting timeout scripts Basically, the fact that the coherent DMA mask value wasn't set caused the driver to fall back to SWIOTLB when SME is active. For correct operation, lets call the dma_set_mask_and_coherent() to properly set the mask for both streaming and coherent, in order to inform the kernel about the devices DMA addressing capabilities. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Don Brace <don.brace@microsemi.com> Tested-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Varun Prakash authored
[ Upstream commit cc555759 ] ip_dev_find() can return NULL so add a check for NULL pointer. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Max Uvarov authored
[ Upstream commit 2b892649 ] PHY_INTERFACE_MODE_RGMII_RXID is less then TXID so code to set tx delay is never called. Fixes: 2a10154a ("net: phy: dp83867: Add TI dp83867 phy") Signed-off-by: Max Uvarov <muvarov@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Max Uvarov authored
[ Upstream commit 1a97a477 ] After reset SGMII Autoneg timer is set to 2us (bits 6 and 5 are 01). That is not enough to finalize autonegatiation on some devices. Increase this timer duration to maximum supported 16ms. Signed-off-by: Max Uvarov <muvarov@gmail.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Max Uvarov authored
[ Upstream commit 333061b9 ] For supporting 10Mps speed in SGMII mode DP83867_10M_SGMII_RATE_ADAPT bit of DP83867_10M_SGMII_CFG register has to be cleared by software. That does not affect speeds 100 and 1000 so can be done on init. Signed-off-by: Max Uvarov <muvarov@gmail.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Russell King authored
[ Upstream commit c6787263 ] Ensure that we supply the same phy interface mode to mac_link_down() as we did for the corresponding mac_link_up() call. This ensures that MAC drivers that use the phy interface mode in these methods can depend on mac_link_down() always corresponding to a mac_link_up() call for the same interface mode. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Jes Sorensen authored
[ Upstream commit 41de54c6 ] If blk_mq_init_allocated_queue() fails, make sure to free the poll stat callback struct allocated. Signed-off-by: Jes Sorensen <jsorensen@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Yoshihiro Shimoda authored
[ Upstream commit 315ca92d ] The sh_eth_close() resets the MAC and then calls phy_stop() so that mdio read access result is incorrect without any error according to kernel trace like below: ifconfig-216 [003] .n.. 109.133124: mdio_access: ee700000.ethernet-ffffffff read phy:0x01 reg:0x00 val:0xffff According to the hardware manual, the RMII mode should be set to 1 before operation the Ethernet MAC. However, the previous code was not set to 1 after the driver issued the soft_reset in sh_eth_dev_exit() so that the mdio read access result seemed incorrect. To fix the issue, this patch adds a condition and set the RMII mode register in sh_eth_dev_exit() for R-Car Gen2 and RZ/A1 SoCs. Note that when I have tried to move the sh_eth_dev_exit() calling after phy_stop() on sh_eth_close(), but it gets worse (kernel panic happened and it seems that a register is accessed while the clock is off). Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Sami Tolvanen authored
[ Upstream commit 1e29ab31 ] Calling sys_ni_syscall through a syscall_fn_t pointer trips indirect call Control-Flow Integrity checking due to a function type mismatch. Use SYSCALL_DEFINE0 for __arm64_sys_ni_syscall instead and remove the now unnecessary casts. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Sami Tolvanen authored
[ Upstream commit 0e358bd7 ] Although a syscall defined using SYSCALL_DEFINE0 doesn't accept parameters, use the correct function type to avoid indirect call type mismatches with Control-Flow Integrity checking. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Sami Tolvanen authored
[ Upstream commit 8ef8f368 ] Syscall wrappers in <asm/syscall_wrapper.h> use const struct pt_regs * as the argument type. Use const in syscall_fn_t as well to fix indirect call type mismatches with Control-Flow Integrity checking. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Geert Uytterhoeven authored
[ Upstream commit 6954158a ] With gcc 4.1: sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_switch_fetching_mode’: sound/firewire/fireface/ff-protocol-latter.c:97: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_begin_session’: sound/firewire/fireface/ff-protocol-latter.c:170: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c:197: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c:205: warning: integer constant is too large for ‘long’ type sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_finish_session’: sound/firewire/fireface/ff-protocol-latter.c:214: warning: integer constant is too large for ‘long’ type Fix this by adding the missing "ULL" suffixes. Add the same suffix to the last constant, to maintain consistency. Fixes: fd1cc9de ("ALSA: fireface: add support for Fireface UCX") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Paul Mackerras authored
[ Upstream commit 5a3f4936 ] Currently the HV KVM code takes the kvm->lock around calls to kvm_for_each_vcpu() and kvm_get_vcpu_by_id() (which can call kvm_for_each_vcpu() internally). However, that leads to a lock order inversion problem, because these are called in contexts where the vcpu mutex is held, but the vcpu mutexes nest within kvm->lock according to Documentation/virtual/kvm/locking.txt. Hence there is a possibility of deadlock. To fix this, we simply don't take the kvm->lock mutex around these calls. This is safe because the implementations of kvm_for_each_vcpu() and kvm_get_vcpu_by_id() have been designed to be able to be called locklessly. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Paul Mackerras authored
[ Upstream commit 1659e27d ] Currently the Book 3S KVM code uses kvm->lock to synchronize access to the kvm->arch.rtas_tokens list. Because this list is scanned inside kvmppc_rtas_hcall(), which is called with the vcpu mutex held, taking kvm->lock cause a lock inversion problem, which could lead to a deadlock. To fix this, we add a new mutex, kvm->arch.rtas_token_lock, which nests inside the vcpu mutexes, and use that instead of kvm->lock when accessing the rtas token list. This removes the lockdep_assert_held() in kvmppc_rtas_tokens_free(). At this point we don't hold the new mutex, but that is OK because kvmppc_rtas_tokens_free() is only called when the whole VM is being destroyed, and at that point nothing can be looking up a token in the list. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Paul Mackerras authored
[ Upstream commit 0d4ee88d ] Currently the HV KVM code uses kvm->lock in conjunction with a flag, kvm->arch.mmu_ready, to synchronize MMU setup and hold off vcpu execution until the MMU-related data structures are ready. However, this means that kvm->lock is being taken inside vcpu->mutex, which is contrary to Documentation/virtual/kvm/locking.txt and results in lockdep warnings. To fix this, we add a new mutex, kvm->arch.mmu_setup_lock, which nests inside the vcpu mutexes, and is taken in the places where kvm->lock was taken that are related to MMU setup. Additionally we take the new mutex in the vcpu creation code at the point where we are creating a new vcore, in order to provide mutual exclusion with kvmppc_update_lpcr() and ensure that an update to kvm->arch.lpcr doesn't get missed, which could otherwise lead to a stale vcore->lpcr value. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Gen Zhang authored
[ Upstream commit 50fbc13d ] In flush_cache_ent(), 'ce->ce_path' is allocated by kstrdup_const(). It should be freed by kfree_const(), rather than kfree(). Signed-off-by: Gen Zhang <blackgod016574@gmail.com> Reviewed-by: Paulo Alcantara <palcantara@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Ross Lagerwall authored
[ Upstream commit d10e0cc1 ] During a suspend/resume, the xenwatch thread waits for all outstanding xenstore requests and transactions to complete. This does not work correctly for transactions started by userspace because it waits for them to complete after freezing userspace threads which means the transactions have no way of completing, resulting in a deadlock. This is trivial to reproduce by running this script and then suspending the VM: import pyxs, time c = pyxs.client.Client(xen_bus_path="/dev/xen/xenbus") c.connect() c.transaction() time.sleep(3600) Even if this deadlock were resolved, misbehaving userspace should not prevent a VM from being migrated. So, instead of waiting for these transactions to complete before suspending, store the current generation id for each transaction when it is started. The global generation id is incremented during resume. If the caller commits the transaction and the generation id does not match the current generation id, return EAGAIN so that they try again. If the transaction was instead discarded, return OK since no changes were made anyway. This only affects users of the xenbus file interface. In-kernel users of xenbus are assumed to be well-behaved and complete all transactions before freezing. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
YueHaibing authored
[ Upstream commit 41349672 ] Fixes gcc '-Wunused-but-set-variable' warning: drivers/xen/pvcalls-front.c: In function pvcalls_front_sendmsg: drivers/xen/pvcalls-front.c:543:25: warning: variable bedata set but not used [-Wunused-but-set-variable] drivers/xen/pvcalls-front.c: In function pvcalls_front_recvmsg: drivers/xen/pvcalls-front.c:638:25: warning: variable bedata set but not used [-Wunused-but-set-variable] They are never used since introduction. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Madalin Bucur authored
[ Upstream commit 7aae703f ] Make sure only the portals for the online CPUs are used. Without this change, there are issues when someone boots with maxcpus=n, with n < actual number of cores available as frames either received or corresponding to the transmit confirmation path would be offered for dequeue to the offline CPU portals, getting lost. Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Randy Dunlap authored
[ Upstream commit 9a626c4a ] Fix build errors on ia64 when DISCONTIGMEM=y and NUMA=y by exporting paddr_to_nid(). Fixes these build errors: ERROR: "paddr_to_nid" [sound/core/snd-pcm.ko] undefined! ERROR: "paddr_to_nid" [net/sunrpc/sunrpc.ko] undefined! ERROR: "paddr_to_nid" [fs/cifs/cifs.ko] undefined! ERROR: "paddr_to_nid" [drivers/video/fbdev/core/fb.ko] undefined! ERROR: "paddr_to_nid" [drivers/usb/mon/usbmon.ko] undefined! ERROR: "paddr_to_nid" [drivers/usb/core/usbcore.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/raid1.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/dm-mod.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/dm-crypt.ko] undefined! ERROR: "paddr_to_nid" [drivers/md/dm-bufio.ko] undefined! ERROR: "paddr_to_nid" [drivers/ide/ide-core.ko] undefined! ERROR: "paddr_to_nid" [drivers/ide/ide-cd_mod.ko] undefined! ERROR: "paddr_to_nid" [drivers/gpu/drm/drm.ko] undefined! ERROR: "paddr_to_nid" [drivers/char/agp/agpgart.ko] undefined! ERROR: "paddr_to_nid" [drivers/block/nbd.ko] undefined! ERROR: "paddr_to_nid" [drivers/block/loop.ko] undefined! ERROR: "paddr_to_nid" [drivers/block/brd.ko] undefined! ERROR: "paddr_to_nid" [crypto/ccm.ko] undefined! Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64@vger.kernel.org Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Thomas Richter authored
[ Upstream commit 6738028d ] Command 'perf record' and 'perf report' on a system without kernel debuginfo packages uses /proc/kallsyms and /proc/modules to find addresses for kernel and module symbols. On x86 this works for root and non-root users. On s390, when invoked as non-root user, many of the following warnings are shown and module symbols are missing: proc/{kallsyms,modules} inconsistency while looking for "[sha1_s390]" module! Command 'perf record' creates a list of module start addresses by parsing the output of /proc/modules and creates a PERF_RECORD_MMAP record for the kernel and each module. The following function call sequence is executed: machine__create_kernel_maps machine__create_module modules__parse machine__create_module --> for each line in /proc/modules arch__fix_module_text_start Function arch__fix_module_text_start() is s390 specific. It opens file /sys/module/<name>/sections/.text to extract the module's .text section start address. On s390 the module loader prepends a header before the first section, whereas on x86 the module's text section address is identical the the module's load address. However module section files are root readable only. For non-root the read operation fails and machine__create_module() returns an error. Command perf record does not generate any PERF_RECORD_MMAP record for loaded modules. Later command perf report complains about missing module maps. To fix this function arch__fix_module_text_start() always returns success. For root users there is no change, for non-root users the module's load address is used as module's text start address (the prepended header then counts as part of the text section). This enable non-root users to use module symbols and avoid the warning when perf report is executed. Output before: [tmricht@m83lp54 perf]$ ./perf report -D | fgrep MMAP 0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text Output after: [tmricht@m83lp54 perf]$ ./perf report -D | fgrep MMAP 0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text 0 0x1b8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../autofs4.ko.xz 0 0x250 [0xa8]: PERF_RECORD_MMAP ... x /lib/modules/.../sha_common.ko.xz 0 0x2f8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../des_generic.ko.xz Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Link: http://lkml.kernel.org/r/20190522144601.50763-4-tmricht@linux.ibm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Namhyung Kim authored
[ Upstream commit 6584140b ] It seems that the current code lacks holding the namespace lock in thread__namespaces(). Otherwise it can see inconsistent results. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Krister Johansen <kjlx@templeofstupid.com> Link: http://lkml.kernel.org/r/20190522053250.207156-2-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Harald Freudenberger authored
[ Upstream commit 7379e652 ] The zcrypt device driver does not handle CPRBs which address a control domain correctly. This fix introduces a workaround: The domain field of the request CPRB is checked if there is a valid domain value in there. If this is true and the value is a control only domain (a domain which is enabled in the crypto config ADM mask but disabled in the AQM mask) the CPRB is forwarded to the default usage domain. If there is no default domain, the request is rejected with an ENODEV. This fix is important for maintaining crypto adapters. For example one LPAR can use a crypto adapter domain ('Control and Usage') but another LPAR needs to be able to maintain this adapter domain ('Control'). Scenarios like this did not work properly and the patch enables this. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Shawn Landden authored
[ Upstream commit 97acec7d ] This strncat() is safe because the buffer was allocated with zalloc(), however gcc doesn't know that. Since the string always has 4 non-null bytes, just use memcpy() here. CC /home/shawn/linux/tools/perf/util/data-convert-bt.o In file included from /usr/include/string.h:494, from /home/shawn/linux/tools/lib/traceevent/event-parse.h:27, from util/data-convert-bt.c:22: In function ‘strncat’, inlined from ‘string_set_value’ at util/data-convert-bt.c:274:4: /usr/include/powerpc64le-linux-gnu/bits/string_fortified.h:136:10: error: ‘__builtin_strncat’ output may be truncated copying 4 bytes from a string of length 4 [-Werror=stringop-truncation] 136 | return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest)); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Shawn Landden <shawn@git.icu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> LPU-Reference: 20190518183238.10954-1-shawn@git.icu Link: https://lkml.kernel.org/n/tip-289f1jice17ta7tr3tstm9jm@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Sahitya Tummala authored
[ Upstream commit f6122ed2 ] In the vfs_statx() context, during path lookup, the dentry gets added to sd->s_dentry via configfs_attach_attr(). In the end, vfs_statx() kills the dentry by calling path_put(), which invokes configfs_d_iput(). Ideally, this dentry must be removed from sd->s_dentry but it doesn't if the sd->s_count >= 3. As a result, sd->s_dentry is holding reference to a stale dentry pointer whose memory is already freed up. This results in use-after-free issue, when this stale sd->s_dentry is accessed later in configfs_readdir() path. This issue can be easily reproduced, by running the LTP test case - sh fs_racer_file_list.sh /config (https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/fs/racer/fs_racer_file_list.sh) Fixes: 76ae281f ('configfs: fix race between dentry put and lookup') Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Bard Liao authored
[ Upstream commit fa763f1b ] We observed the same issue as reported by commit a8d7bde2 ("ALSA: hda - Force polling mode on CFL for fixing codec communication") We don't have a better solution. So apply the same workaround to CNL. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Yingjoe Chen authored
[ Upstream commit a0692f0e ] If I2C_M_RECV_LEN check failed, msgs[i].buf allocated by memdup_user will not be freed. Pump index up so it will be freed. Fixes: 838bfa60 ("i2c-dev: Add support for I2C_M_RECV_LEN") Signed-off-by: Yingjoe Chen <yingjoe.chen@mediatek.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Dmitry Bogdanov authored
[ Upstream commit eaeb3b74 ] Driver stops producing skbs on ring if a packet with FCS error was coalesced into LRO session. Ring gets hang forever. Thats a logical error in driver processing descriptors: When rx_stat indicates MAC Error, next pointer and eop flags are not filled. This confuses driver so it waits for descriptor 0 to be filled by HW. Solution is fill next pointer and eop flag even for packets with FCS error. Fixes: bab6de8f ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions.") Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Igor Russkikh authored
[ Upstream commit 31bafc49 ] In case no other traffic happening on the ring, full tx cleanup may not be completed. That may cause socket buffer to overflow and tx traffic to stuck until next activity on the ring happens. This is due to logic error in budget variable decrementor. Variable is compared with zero, and then post decremented, causing it to become MAX_INT. Solution is remove decrementor from the `for` statement and rewrite it in a clear way. Fixes: b647d398 ("net: aquantia: Add tx clean budget and valid budget handling logic") Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Lucas Stach authored
[ Upstream commit 1396500d ] The devcoredump needs to operate on a stable state of the MMU while it is writing the MMU state to the coredump. The missing lock allowed both the userspace submit, as well as the GPU job finish paths to mutate the MMU state while a coredump is under way. Fixes: a8c21a54 (drm/etnaviv: add initial etnaviv DRM driver) Reported-by: David Jander <david@protonic.nl> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: David Jander <david@protonic.nl> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Rafael J. Wysocki authored
[ Upstream commit 9a51c6b1 ] Both acpi_pci_need_resume() and acpi_dev_needs_resume() check if the current ACPI wakeup configuration of the device matches what is expected as far as system wakeup from sleep states is concerned, as reflected by the device_may_wakeup() return value for the device. However, they only should do that if wakeup.flags.valid is set for the device's ACPI companion, because otherwise the wakeup.prepare_count value for it is meaningless. Add the missing wakeup.flags.valid checks to these functions. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Kees Cook authored
[ Upstream commit 3e66b7cc ] Building with Clang reports the redundant use of MODULE_DEVICE_TABLE(): drivers/net/ethernet/dec/tulip/de4x5.c:2110:1: error: redefinition of '__mod_eisa__de4x5_eisa_ids_device_table' MODULE_DEVICE_TABLE(eisa, de4x5_eisa_ids); ^ ./include/linux/module.h:229:21: note: expanded from macro 'MODULE_DEVICE_TABLE' extern typeof(name) __mod_##type##__##name##_device_table \ ^ <scratch space>:90:1: note: expanded from here __mod_eisa__de4x5_eisa_ids_device_table ^ drivers/net/ethernet/dec/tulip/de4x5.c:2100:1: note: previous definition is here MODULE_DEVICE_TABLE(eisa, de4x5_eisa_ids); ^ ./include/linux/module.h:229:21: note: expanded from macro 'MODULE_DEVICE_TABLE' extern typeof(name) __mod_##type##__##name##_device_table \ ^ <scratch space>:85:1: note: expanded from here __mod_eisa__de4x5_eisa_ids_device_table ^ This drops the one further from the table definition to match the common use of MODULE_DEVICE_TABLE(). Fixes: 07563c71 ("EISA bus MODALIAS attributes support") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Ioana Radulescu authored
[ Upstream commit bd8460fa ] Use PTR_ERR_OR_ZERO instead of PTR_ERR in cases where zero is a valid input. Reported by smatch. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Ioana Radulescu authored
[ Upstream commit 5a20a093 ] Smatch reports a potential spectre vulnerability in the dpaa2-eth driver, where the value of rxnfc->fs.location (which is provided from user-space) is used as index in an array. Add a call to array_index_nospec() to sanitize the access. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Pavel Begunkov authored
[ Upstream commit a278682d ] If io_copy_iov() fails, it will break the loop and report success, albeit partially completed operation. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
-