- 03 Aug, 2018 40 commits
-
-
Thierry Escande authored
[ Upstream commit 9960521c ] This patch fixes the following warning during boot: do not call blocking ops when !TASK_RUNNING; state=1 set at [<(ptrval)>] qca_setup+0x194/0x750 [hci_uart] WARNING: CPU: 2 PID: 1878 at kernel/sched/core.c:6135 __might_sleep+0x7c/0x88 In qca_set_baudrate(), the current task state is set to TASK_UNINTERRUPTIBLE before going to sleep for 300ms. It was then restored to TASK_INTERRUPTIBLE. This patch sets the current task state back to TASK_RUNNING instead. Signed-off-by:
Thierry Escande <thierry.escande@linaro.org> Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shaul Triebitz authored
[ Upstream commit 0f22e400 ] Make sure the rx_allocator worker is canceled before running the rx_init routine. rx_init frees and re-allocates all rxb's pages. The rx_allocator worker also allocates pages for the used rxb's. Running rx_init and rx_allocator simultaniously causes a kernel panic. Fix that by canceling the work in rx_init. Signed-off-by:
Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by:
Luca Coelho <luciano.coelho@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ethan Lien authored
[ Upstream commit e73e81b6 ] [Problem description and how we fix it] We should balance dirty metadata pages at the end of btrfs_finish_ordered_io, since a small, unmergeable random write can potentially produce dirty metadata which is multiple times larger than the data itself. For example, a small, unmergeable 4KiB write may produce: 16KiB dirty leaf (and possibly 16KiB dirty node) in subvolume tree 16KiB dirty leaf (and possibly 16KiB dirty node) in checksum tree 16KiB dirty leaf (and possibly 16KiB dirty node) in extent tree Although we do call balance dirty pages in write side, but in the buffered write path, most metadata are dirtied only after we reach the dirty background limit (which by far only counts dirty data pages) and wakeup the flusher thread. If there are many small, unmergeable random writes spread in a large btree, we'll find a burst of dirty pages exceeds the dirty_bytes limit after we wakeup the flusher thread - which is not what we expect. In our machine, it caused out-of-memory problem since a page cannot be dropped if it is marked dirty. Someone may worry about we may sleep in btrfs_btree_balance_dirty_nodelay, but since we do btrfs_finish_ordered_io in a separate worker, it will not stop the flusher consuming dirty pages. Also, we use different worker for metadata writeback endio, sleep in btrfs_finish_ordered_io help us throttle the size of dirty metadata pages. [Reproduce steps] To reproduce the problem, we need to do 4KiB write randomly spread in a large btree. In our 2GiB RAM machine: 1) Create 4 subvolumes. 2) Run fio on each subvolume: [global] direct=0 rw=randwrite ioengine=libaio bs=4k iodepth=16 numjobs=1 group_reporting size=128G runtime=1800 norandommap time_based randrepeat=0 3) Take snapshot on each subvolume and repeat fio on existing files. 4) Repeat step (3) until we get large btrees. In our case, by observing btrfs_root_item->bytes_used, we have 2GiB of metadata in each subvolume tree and 12GiB of metadata in extent tree. 5) Stop all fio, take snapshot again, and wait until all delayed work is completed. 6) Start all fio. Few seconds later we hit OOM when the flusher starts to work. It can be reproduced even when using nocow write. Signed-off-by:
Ethan Lien <ethanlien@synology.com> Reviewed-by:
David Sterba <dsterba@suse.com> [ add comment ] Signed-off-by:
David Sterba <dsterba@suse.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Kiszka authored
[ Upstream commit 3bbce531 ] Fix a memory leak by freeing the PCI resource list in devm_pci_release_host_bridge_dev(). Fixes: 5c3f18cc ("PCI: Add devm_pci_alloc_host_bridge() interface") Signed-off-by:
Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shuah Khan (Samsung OSG) authored
[ Upstream commit 5c30a038 ] When intel_pstate test is skipped because of unmet dependencies and/or unsupported configuration, it returns 0 which is treated as a pass by the Kselftest framework. This leads to false positive result even when the test could not be run. Change it to return kselftest skip code when a test gets skipped to clearly report that the test could not be run. Kselftest framework SKIP code is 4 and the framework prints appropriate messages to indicate that the test is skipped. Signed-off-by:
Shuah Khan (Samsung OSG) <shuah@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shuah Khan (Samsung OSG) authored
[ Upstream commit b27f0259 ] When memfd test is skipped because of unmet dependencies and/or unsupported configuration, it returns non-zero value which is treated as a fail by the Kselftest framework. This leads to false negative result even when the test could not be run. Change it to return kselftest skip code when a test gets skipped to clearly report that the test could not be run. Added an explicit check for root user at the start of memfd hugetlbfs test and return skip code if a non-root user attempts to run it. In addition, return skip code when not enough huge pages are available to run the test. Kselftest framework SKIP code is 4 and the framework prints appropriate messages to indicate that the test is skipped. Signed-off-by:
Shuah Khan (Samsung OSG) <shuah@kernel.org> Reviewed-by:
Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by:
Shuah Khan (Samsung OSG) <shuah@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Díaz authored
[ Upstream commit e9d33f14 ] A few changes improve the overall usability of the test: * fix a hard-coded maximum frequency (3300), * don't adjust the CPU frequency if only evaluating results, * fix a comparison for multiple frequencies. A symptom of that last issue looked like this: ./run.sh: line 107: [: too many arguments ./run.sh: line 110: 3099 3099 3100-3100: syntax error in expression (error token is \"3099 3100-3100\") Because a check will count how many differente frequencies there are among the CPUs of the system, and after they are tallied another read is performed, which might produce different results. Signed-off-by:
Daniel Díaz <daniel.diaz@linaro.org> Signed-off-by:
Shuah Khan (Samsung OSG) <shuah@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kan Liang authored
[ Upstream commit d71f11c0 ] For Nehalem and Westmere, there is only one fixed counter for W-Box. There is no index which is bigger than UNCORE_PMC_IDX_FIXED. It is not correct to use >= to check fixed counter. The code quality issue will bring problem when new counter index is introduced. Signed-off-by:
Kan Liang <kan.liang@intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: acme@kernel.org Cc: eranian@google.com Link: http://lkml.kernel.org/r/1525371913-10597-2-git-send-email-kan.liang@intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kan Liang authored
[ Upstream commit 4749f819 ] There is no index which is bigger than UNCORE_PMC_IDX_FIXED. The only exception is client IMC uncore, which has been specially handled. For generic code, it is not correct to use >= to check fixed counter. The code quality issue will bring problem when a new counter index is introduced. Signed-off-by:
Kan Liang <kan.liang@intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: acme@kernel.org Cc: eranian@google.com Link: http://lkml.kernel.org/r/1525371913-10597-3-git-send-email-kan.liang@intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael Grzeschik authored
[ Upstream commit de19ca6f ] As the amount of available ports varies by the kernels build configuration. To remove the limitation of the fixed 128 ports we allocate the amount of idevs by using the number we get from the kernel. Signed-off-by:
Michael Grzeschik <m.grzeschik@pengutronix.de> Acked-by:
Shuah Khan (Samsung OSG) <shuah@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shuah Khan (Samsung OSG) authored
[ Upstream commit d179f99a ] detach_port() fails to call usbip_vhci_driver_close() from its error path after usbip_vhci_detach_device() returns failure, leaking memory allocated in usbip_vhci_driver_open() and holding udev_context and udev references. Fix it to call usbip_vhci_driver_close(). Signed-off-by:
Shuah Khan (Samsung OSG) <shuah@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Filippo Muzzini authored
[ Upstream commit a12bffeb ] In bfq_requests_merged(), there is a deadlock because the lock on bfqq->bfqd->lock is held by the calling function, but the code of this function tries to grab the lock again. This deadlock is currently hidden by another bug (fixed by next commit for this source file), which causes the body of bfq_requests_merged() to be never executed. This commit removes the deadlock by removing the lock/unlock pair. Signed-off-by:
Filippo Muzzini <filippo.muzzini@outlook.it> Signed-off-by:
Paolo Valente <paolo.valente@linaro.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chao Yu authored
[ Upstream commit 27319ba4 ] Thread GC thread - f2fs_ioc_start_atomic_write - get_dirty_pages - filemap_write_and_wait_range - f2fs_gc - do_garbage_collect - gc_data_segment - move_data_page - f2fs_is_atomic_file - set_page_dirty - set_inode_flag(, FI_ATOMIC_FILE) Dirty data page can still be generated by GC in race condition as above call stack. This patch adds fi->dio_rwsem[WRITE] in f2fs_ioc_start_atomic_write to avoid such race. Signed-off-by:
Chao Yu <yuchao0@huawei.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chao Yu authored
[ Upstream commit c22aecd7 ] dquot_initialize() can fail due to any exception inside quota subsystem, f2fs needs to be aware of it, and return correct return value to caller. Signed-off-by:
Chao Yu <yuchao0@huawei.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sahitya Tummala authored
[ Upstream commit 60b2b4ee ] f2fs_ioc_shutdown() ioctl gets stuck in the below path when issued with F2FS_GOING_DOWN_FULLSYNC option. __switch_to+0x90/0xc4 percpu_down_write+0x8c/0xc0 freeze_super+0xec/0x1e4 freeze_bdev+0xc4/0xcc f2fs_ioctl+0xc0c/0x1ce0 f2fs_compat_ioctl+0x98/0x1f0 Signed-off-by:
Sahitya Tummala <stummala@codeaurora.org> Reviewed-by:
Chao Yu <yuchao0@huawei.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chao Yu authored
[ Upstream commit e5e5732d ] After revoking atomic write, related LBA can be reused by others, so we need to wait page writeback before reusing the LBA, in order to avoid interference between old atomic written in-flight IO and new IO. Signed-off-by:
Chao Yu <yuchao0@huawei.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chao Yu authored
[ Upstream commit 64c74a7a ] - f2fs_fill_super - recover_fsync_data - recover_data - del_fsync_inode - iput - iput_final - write_inode_now - f2fs_write_inode - f2fs_balance_fs - f2fs_balance_fs_bg - sync_dirty_inodes With data_flush mount option, during recovery, in order to avoid entering above writeback flow, let's detect recovery status and do skip in f2fs_balance_fs_bg. Signed-off-by:
Chao Yu <yuchao0@huawei.com> Signed-off-by:
Yunlei He <heyunlei@huawei.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chao Yu authored
[ Upstream commit 14a28559 ] This patch fixes error path of move_data_page: - clear cold data flag if it fails to write page. - redirty page for non-ENOMEM case. Signed-off-by:
Chao Yu <yuchao0@huawei.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Anatoly Pugachev authored
[ Upstream commit 4071e67c ] The following patch disables loading of f2fs module on architectures which have PAGE_SIZE > 4096 , since it is impossible to mount f2fs on such architectures , log messages are: mount: /mnt: wrong fs type, bad option, bad superblock on /dev/vdiskb1, missing codepage or helper program, or other error. /dev/vdiskb1: F2FS filesystem, UUID=1d8b9ca4-2389-4910-af3b-10998969f09c, volume name "" May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid page_cache_size (8192), supports only 4KB May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Can't find valid F2FS filesystem in 1th superblock May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid page_cache_size (8192), supports only 4KB May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Can't find valid F2FS filesystem in 2th superblock May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid page_cache_size (8192), supports only 4KB which was introduced by git commit 5c9b4692 tested on git kernel 4.17.0-rc6-00309-gec30dcf7 with patch applied: modprobe: ERROR: could not insert 'f2fs': Invalid argument May 28 01:40:28 v215 kernel: F2FS not supported on PAGE_SIZE(8192) != 4096 Signed-off-by:
Anatoly Pugachev <matorola@gmail.com> Reviewed-by:
Chao Yu <yuchao0@huawei.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Trond Myklebust authored
[ Upstream commit ae55e59d ] If the server recalls the layout that was just handed out, we risk hitting a race as described in RFC5661 Section 2.10.6.3 unless we ensure that we release the sequence slot after processing the LAYOUTGET operation that was sent as part of the OPEN compound. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alexey Kodanev authored
[ Upstream commit 9c7f96fd ] The patch moves the "trans->msg_type == NFT_MSG_NEWSET" check before using nft_trans_set(trans). Otherwise we can get out of bounds read. For example, KASAN reported the one when running 0001_cache_handling_0 nft test. In this case "trans->msg_type" was NFT_MSG_NEWTABLE: [75517.177808] BUG: KASAN: slab-out-of-bounds in nft_set_lookup_global+0x22f/0x270 [nf_tables] [75517.279094] Read of size 8 at addr ffff881bdb643fc8 by task nft/7356 ... [75517.375605] CPU: 26 PID: 7356 Comm: nft Tainted: G E 4.17.0-rc7.1.x86_64 #1 [75517.489587] Hardware name: Oracle Corporation SUN SERVER X4-2 [75517.618129] Call Trace: [75517.648821] dump_stack+0xd1/0x13b [75517.691040] ? show_regs_print_info+0x5/0x5 [75517.742519] ? kmsg_dump_rewind_nolock+0xf5/0xf5 [75517.799300] ? lock_acquire+0x143/0x310 [75517.846738] print_address_description+0x85/0x3a0 [75517.904547] kasan_report+0x18d/0x4b0 [75517.949892] ? nft_set_lookup_global+0x22f/0x270 [nf_tables] [75518.019153] ? nft_set_lookup_global+0x22f/0x270 [nf_tables] [75518.088420] ? nft_set_lookup_global+0x22f/0x270 [nf_tables] [75518.157689] nft_set_lookup_global+0x22f/0x270 [nf_tables] [75518.224869] nf_tables_newsetelem+0x1a5/0x5d0 [nf_tables] [75518.291024] ? nft_add_set_elem+0x2280/0x2280 [nf_tables] [75518.357154] ? nla_parse+0x1a5/0x300 [75518.401455] ? kasan_kmalloc+0xa6/0xd0 [75518.447842] nfnetlink_rcv+0xc43/0x1bdf [nfnetlink] [75518.507743] ? nfnetlink_rcv+0x7a5/0x1bdf [nfnetlink] [75518.569745] ? nfnl_err_reset+0x3c0/0x3c0 [nfnetlink] [75518.631711] ? lock_acquire+0x143/0x310 [75518.679133] ? netlink_deliver_tap+0x9b/0x1070 [75518.733840] ? kasan_unpoison_shadow+0x31/0x40 [75518.788542] netlink_unicast+0x45d/0x680 [75518.837111] ? __isolate_free_page+0x890/0x890 [75518.891913] ? netlink_attachskb+0x6b0/0x6b0 [75518.944542] netlink_sendmsg+0x6fa/0xd30 [75518.993107] ? netlink_unicast+0x680/0x680 [75519.043758] ? netlink_unicast+0x680/0x680 [75519.094402] sock_sendmsg+0xd9/0x160 [75519.138810] ___sys_sendmsg+0x64d/0x980 [75519.186234] ? copy_msghdr_from_user+0x350/0x350 [75519.243118] ? lock_downgrade+0x650/0x650 [75519.292738] ? do_raw_spin_unlock+0x5d/0x250 [75519.345456] ? _raw_spin_unlock+0x24/0x30 [75519.395065] ? __handle_mm_fault+0xbde/0x3410 [75519.448830] ? sock_setsockopt+0x3d2/0x1940 [75519.500516] ? __lock_acquire.isra.25+0xdc/0x19d0 [75519.558448] ? lock_downgrade+0x650/0x650 [75519.608057] ? __audit_syscall_entry+0x317/0x720 [75519.664960] ? __fget_light+0x58/0x250 [75519.711325] ? __sys_sendmsg+0xde/0x170 [75519.758850] __sys_sendmsg+0xde/0x170 [75519.804193] ? __ia32_sys_shutdown+0x90/0x90 [75519.856725] ? syscall_trace_enter+0x897/0x10e0 [75519.912354] ? trace_event_raw_event_sys_enter+0x920/0x920 [75519.979432] ? __audit_syscall_entry+0x720/0x720 [75520.036118] do_syscall_64+0xa3/0x3d0 [75520.081248] ? prepare_exit_to_usermode+0x47/0x1d0 [75520.139904] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [75520.201680] RIP: 0033:0x7fc153320ba0 [75520.245772] RSP: 002b:00007ffe294c3638 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [75520.337708] RAX: ffffffffffffffda RBX: 00007ffe294c4820 RCX: 00007fc153320ba0 [75520.424547] RDX: 0000000000000000 RSI: 00007ffe294c46b0 RDI: 0000000000000003 [75520.511386] RBP: 00007ffe294c47b0 R08: 0000000000000004 R09: 0000000002114090 [75520.598225] R10: 00007ffe294c30a0 R11: 0000000000000246 R12: 00007ffe294c3660 [75520.684961] R13: 0000000000000001 R14: 00007ffe294c3650 R15: 0000000000000001 [75520.790946] Allocated by task 7356: [75520.833994] kasan_kmalloc+0xa6/0xd0 [75520.878088] __kmalloc+0x189/0x450 [75520.920107] nft_trans_alloc_gfp+0x20/0x190 [nf_tables] [75520.983961] nf_tables_newtable+0xcd0/0x1bd0 [nf_tables] [75521.048857] nfnetlink_rcv+0xc43/0x1bdf [nfnetlink] [75521.108655] netlink_unicast+0x45d/0x680 [75521.157013] netlink_sendmsg+0x6fa/0xd30 [75521.205271] sock_sendmsg+0xd9/0x160 [75521.249365] ___sys_sendmsg+0x64d/0x980 [75521.296686] __sys_sendmsg+0xde/0x170 [75521.341822] do_syscall_64+0xa3/0x3d0 [75521.386957] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [75521.467867] Freed by task 23454: [75521.507804] __kasan_slab_free+0x132/0x180 [75521.558137] kfree+0x14d/0x4d0 [75521.596005] free_rt_sched_group+0x153/0x280 [75521.648410] sched_autogroup_create_attach+0x19a/0x520 [75521.711330] ksys_setsid+0x2ba/0x400 [75521.755529] __ia32_sys_setsid+0xa/0x10 [75521.802850] do_syscall_64+0xa3/0x3d0 [75521.848090] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [75521.929000] The buggy address belongs to the object at ffff881bdb643f80 which belongs to the cache kmalloc-96 of size 96 [75522.079797] The buggy address is located 72 bytes inside of 96-byte region [ffff881bdb643f80, ffff881bdb643fe0) [75522.221234] The buggy address belongs to the page: [75522.280100] page:ffffea006f6d90c0 count:1 mapcount:0 mapping:0000000000000000 index:0x0 [75522.377443] flags: 0x2fffff80000100(slab) [75522.426956] raw: 002fffff80000100 0000000000000000 0000000000000000 0000000180200020 [75522.521275] raw: ffffea006e6fafc0 0000000c0000000c ffff881bf180f400 0000000000000000 [75522.615601] page dumped because: kasan: bad access detected Fixes: 37a9cc52 ("netfilter: nf_tables: add generation mask to sets") Signed-off-by:
Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Javier González authored
[ Upstream commit e37d0798 ] When cleaning up buffer entries as we wrap up, their state should be "completed". If any of the entries is in "submitted" state, it means that something bad has happened. Trigger a warning immediately instead of waiting for the state flag to eventually be updated, thus hiding the issue. Signed-off-by:
Javier González <javier@cnexlabs.com> Signed-off-by:
Matias Bjørling <mb@lightnvm.io> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Leon Romanovsky authored
[ Upstream commit 2468b82d ] Let's perform checks in-place instead of BUG_ONs. Signed-off-by:
Leon Romanovsky <leonro@mellanox.com> Signed-off-by:
Doug Ledford <dledford@redhat.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicholas Piggin authored
[ Upstream commit 926bc2f1 ] The stores to update the SLB shadow area must be made as they appear in the C code, so that the hypervisor does not see an entry with mismatched vsid and esid. Use WRITE_ONCE for this. GCC has been observed to elide the first store to esid in the update, which means that if the hypervisor interrupts the guest after storing to vsid, it could see an entry with old esid and new vsid, which may possibly result in memory corruption. Signed-off-by:
Nicholas Piggin <npiggin@gmail.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stewart Smith authored
[ Upstream commit 447808bf ] time_init() will set up tb_ticks_per_usec based on reality. time_init() is called *after* udbg_init_opal_common() during boot. from arch/powerpc/kernel/time.c: unsigned long tb_ticks_per_usec = 100; /* sane default */ Currently, all powernv systems have a timebase frequency of 512mhz (512000000/1000000 == 0x200) - although there's nothing written down anywhere that I can find saying that we couldn't make that different based on the requirements in the ISA. So, we've been (accidentally) thwacking the (currently) correct (for powernv at least) value for tb_ticks_per_usec earlier than we otherwise would have. The "sane default" seems to be adequate for our purposes between udbg_init_opal_common() and time_init() being called, and if it isn't, then we should probably be setting it somewhere that isn't hvc_opal.c! Signed-off-by:
Stewart Smith <stewart@linux.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sam Bobroff authored
[ Upstream commit 46d4be41 ] Correct two cases where eeh_pcid_get() is used to reference the driver's module but the reference is dropped before the driver pointer is used. In eeh_rmv_device() also refactor a little so that only two calls to eeh_pcid_put() are needed, rather than three and the reference isn't taken at all if it wasn't needed. Signed-off-by:
Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michal Suchanek authored
[ Upstream commit a6b3964a ] A no-op form of ori (or immediate of 0 into r31 and the result stored in r31) has been re-tasked as a speculation barrier. The instruction only acts as a barrier on newer machines with appropriate firmware support. On older CPUs it remains a harmless no-op. Implement barrier_nospec using this instruction. mpe: The semantics of the instruction are believed to be that it prevents execution of subsequent instructions until preceding branches have been fully resolved and are no longer executing speculatively. There is no further documentation available at this time. Signed-off-by:
Michal Suchanek <msuchanek@suse.de> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christophe Leroy authored
[ Upstream commit 1128bb78 ] commit 87a156fb ("Align hot loops of some string functions") degraded the performance of string functions by adding useless nops A simple benchmark on an 8xx calling 100000x a memchr() that matches the first byte runs in 41668 TB ticks before this patch and in 35986 TB ticks after this patch. So this gives an improvement of approx 10% Another benchmark doing the same with a memchr() matching the 128th byte runs in 1011365 TB ticks before this patch and 1005682 TB ticks after this patch, so regardless on the number of loops, removing those useless nops improves the test by 5683 TB ticks. Fixes: 87a156fb ("Align hot loops of some string functions") Signed-off-by:
Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Cong Wang authored
[ Upstream commit cb2595c1 ] ucma_process_join() will free the new allocated "mc" struct, if there is any error after that, especially the copy_to_user(). But in parallel, ucma_leave_multicast() could find this "mc" through idr_find() before ucma_process_join() frees it, since it is already published. So "mc" could be used in ucma_leave_multicast() after it is been allocated and freed in ucma_process_join(), since we don't refcnt it. Fix this by separating "publish" from ID allocation, so that we can get an ID first and publish it later after copy_to_user(). Fixes: c8f6a362 ("RDMA/cma: Add multicast communication support") Reported-by:
Noam Rathaus <noamr@beyondsecurity.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
Jason Gunthorpe <jgg@mellanox.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Benjamin Poirier authored
[ Upstream commit fff200ca ] There have been multiple reports of crashes that look like kernel: RIP: 0010:[<ffffffff8110303f>] timecounter_read+0xf/0x50 [...] kernel: Call Trace: kernel: [<ffffffffa0806b0f>] e1000e_phc_gettime+0x2f/0x60 [e1000e] kernel: [<ffffffffa0806c5d>] e1000e_systim_overflow_work+0x1d/0x80 [e1000e] kernel: [<ffffffff810992c5>] process_one_work+0x155/0x440 kernel: [<ffffffff81099e16>] worker_thread+0x116/0x4b0 kernel: [<ffffffff8109f422>] kthread+0xd2/0xf0 kernel: [<ffffffff8163184f>] ret_from_fork+0x3f/0x70 These can be traced back to the fact that e1000e_systim_reset() skips the timecounter_init() call if e1000e_get_base_timinca() returns -EINVAL, which leads to a null deref in timecounter_read(). Commit 83129b37 ("e1000e: fix systim issues", v4.2-rc1) reworked e1000e_get_base_timinca() in such a way that it can return -EINVAL for e1000_pch_spt if the SYSCFI bit is not set in TSYNCRXCTL. Some experimentation has shown that on I219 (e1000_pch_spt, "MAC: 12") adapters, the E1000_TSYNCRXCTL_SYSCFI flag is unstable; TSYNCRXCTL reads sometimes don't have the SYSCFI bit set. Retrying the read shortly after finds the bit to be set. This was observed at boot (probe) but also link up and link down. Moreover, the phc (PTP Hardware Clock) seems to operate normally even after reads where SYSCFI=0. Therefore, remove this register read and unconditionally set the clock parameters. Reported-by:
Achim Mildenberger <admin@fph.physik.uni-karlsruhe.de> Message-Id: <20180425065243.g5mqewg5irkwgwgv@f2> Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1075876 Fixes: 83129b37 ("e1000e: fix systim issues") Signed-off-by:
Benjamin Poirier <bpoirier@suse.com> Tested-by:
Aaron Brown <aaron.f.brown@intel.com> Signed-off-by:
Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chengguang Xu authored
[ Upstream commit c36ed50d ] On currently logic: when I specify rasize=0~1 then it will be 4096. when I specify rasize=2~4097 then it will be 8192. Make it the same as rsize & wsize. Signed-off-by:
Chengguang Xu <cgxu519@gmx.com> Reviewed-by:
"Yan, Zheng" <zyan@redhat.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Wang YanQing authored
[ Upstream commit 68565a1a ] The names for BPF_ALU64 | BPF_ARSH are emit_a32_arsh_*, the names for BPF_ALU64 | BPF_LSH are emit_a32_lsh_*, but the names for BPF_ALU64 | BPF_RSH are emit_a32_lsr_*. For consistence reason, let's rename emit_a32_lsr_* to emit_a32_rsh_*. This patch also corrects a wrong comment. Fixes: 39c13c20 ("arm: eBPF JIT compiler") Signed-off-by:
Wang YanQing <udknight@gmail.com> Cc: Shubham Bansal <illusionist.neo@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux@armlinux.org.uk Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sergey Senozhatsky authored
[ Upstream commit 554755be ] Drop the in_nmi() check from printk_safe_flush_on_panic() and attempt to re-init (IOW unlock) locked logbuf spinlock from panic CPU regardless of its context. Otherwise, theoretically, we can deadlock on logbuf trying to flush per-CPU buffers: a) Panic CPU is running in non-NMI context b) Panic CPU sends out shutdown IPI via reboot vector c) Panic CPU fails to stop all remote CPUs d) Panic CPU sends out shutdown IPI via NMI vector One of the CPUs that we bring down via NMI vector can hold logbuf spin lock (theoretically). Link: http://lkml.kernel.org/r/20180530070350.10131-1-sergey.senozhatsky@gmail.com To: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by:
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by:
Petr Mladek <pmladek@suse.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marco Felsch authored
[ Upstream commit 44ee54aa ] The DA9063 watchdog has only one register field to store the timeout value and to enable the watchdog. The watchdog gets enabled if the value is not zero. There is no issue if the watchdog is already running but it leads into problems if the watchdog is disabled. If the watchdog is disabled and only the timeout value should be prepared the watchdog gets enabled too. Add a check to get the current watchdog state and update the watchdog timeout value on hw-side only if the watchdog is already active. Fixes: 5e9c16e3 ("watchdog: Add DA9063 PMIC watchdog driver.") Signed-off-by:
Marco Felsch <m.felsch@pengutronix.de> Reviewed-by:
Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Laurentiu Tudor authored
[ Upstream commit 0cdd431c ] Add the required iommu_dma_map_msi_msg() when composing the MSI message, otherwise the interrupts will not work. Signed-off-by:
Laurentiu Tudor <laurentiu.tudor@nxp.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: jason@lakedaemon.net Cc: marc.zyngier@arm.com Cc: zhiqiang.hou@nxp.com Cc: minghuan.lian@nxp.com Link: https://lkml.kernel.org/r/20180605122727.12831-1-laurentiu.tudor@nxp.comSigned-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jozsef Kadlecsik authored
[ Upstream commit bd975e69 ] When listing sets with timeout support, there's a probability that just timing out entries with "0" timeout value is listed/saved. However when restoring the saved list, the zero timeout value means permanent elelements. The new behaviour is that timing out entries are listed with "timeout 1" instead of zero. Fixes netfilter bugzilla #1258. Signed-off-by:
Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florent Fourcot authored
[ Upstream commit cbdebe48 ] Userspace `ipset` command forbids family option for hash:mac type: ipset create test hash:mac family inet4 ipset v6.30: Unknown argument: `family' However, this check is not done in kernel itself. When someone use external netlink applications (pyroute2 python library for example), one can create hash:mac with invalid family and inconsistant results from userspace (`ipset` command cannot read set content anymore). This patch enforce the logic in kernel, and forbids insertion of hash:mac with a family set. Since IP_SET_PROTO_UNDEF is defined only for hash:mac, this patch has no impact on other hash:* sets Signed-off-by:
Florent Fourcot <florent.fourcot@wifirst.fr> Signed-off-by:
Victorien Molle <victorien.molle@wifirst.fr> Signed-off-by:
Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jiri Olsa authored
[ Upstream commit ceac7b79 ] Currently all the event parsing fails end up in the event_pmu rule, and display misleading help like: $ perf stat -e inst kill event syntax error: 'inst' \___ Cannot find PMU `inst'. Missing kernel support? ... The reason is that the event_pmu is too strong and match also single string. Changing it to force the '/' separators to be part of the rule, and getting the proper error now: $ perf stat -e inst kill event syntax error: 'inst' \___ parser error Run 'perf list' for a list of valid events ... Suggested-by:
Adrian Hunter <adrian.hunter@intel.com> Signed-off-by:
Jiri Olsa <jolsa@kernel.org> Tested-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180605121416.31645-1-jolsa@kernel.orgSigned-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alexandre Belloni authored
[ Upstream commit abfdff44 ] When using RTC_ALM_SET or RTC_WKALM_SET with rtc_wkalrm.enabled not set, rtc_timer_enqueue() is not called and rtc_set_alarm() may succeed but the subsequent RTC_AIE_ON ioctl will fail. RTC_ALM_READ would also fail in that case. Ensure rtc_set_alarm() fails when alarms are not supported to avoid letting programs think the alarms are working for a particular RTC when they are not. Signed-off-by:
Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mathieu Malaterre authored
[ Upstream commit a38965bf ] __printf is useful to verify format and arguments. Remove the following warning (with W=1): mm/slub.c:721:2: warning: function might be possible candidate for `gnu_printf' format attribute [-Wsuggest-attribute=format] Link: http://lkml.kernel.org/r/20180505200706.19986-1-malat@debian.orgSigned-off-by:
Mathieu Malaterre <malat@debian.org> Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-