- 03 Jun, 2016 10 commits
-
-
Mel Gorman authored
The optimistic fast path may use cpuset_current_mems_allowed instead of of a NULL nodemask supplied by the caller for cpuset allocations. The preferred zone is calculated on this basis for statistic purposes and as a starting point in the zonelist iterator. However, if the context can ignore memory policies due to being atomic or being able to ignore watermarks then the starting point in the zonelist iterator is no longer correct. This patch resets the zonelist iterator in the allocator slowpath if the context can ignore memory policies. This will alter the zone used for statistics but only after it is known that it makes sense for that context. Resetting it before entering the slowpath would potentially allow an ALLOC_CPUSET allocation to be accounted for against the wrong zone. Note that while nodemask is not explicitly set to the original nodemask, it would only have been overwritten if cpuset_enabled() and it was reset before the slowpath was entered. Link: http://lkml.kernel.org/r/20160602103936.GU2527@techsingularity.net Fixes: c33d6c06 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mel Gorman authored
Geert Uytterhoeven reported the following problem that bisected to commit c33d6c06 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") on m68k/ARAnyM BUG: scheduling while atomic: cron/668/0x10c9a0c0 Modules linked in: CPU: 0 PID: 668 Comm: cron Not tainted 4.6.0-atari-05133-gc33d6c06 #364 Call Trace: [<0003d7d0>] __schedule_bug+0x40/0x54 __schedule+0x312/0x388 __schedule+0x0/0x388 prepare_to_wait+0x0/0x52 schedule+0x64/0x82 schedule_timeout+0xda/0x104 set_next_entity+0x18/0x40 pick_next_task_fair+0x78/0xda io_schedule_timeout+0x36/0x4a bit_wait_io+0x0/0x40 bit_wait_io+0x12/0x40 __wait_on_bit+0x46/0x76 wait_on_page_bit_killable+0x64/0x6c bit_wait_io+0x0/0x40 wake_bit_function+0x0/0x4e __lock_page_or_retry+0xde/0x124 do_scan_async+0x114/0x17c lookup_swap_cache+0x24/0x4e handle_mm_fault+0x626/0x7de find_vma+0x0/0x66 down_read+0x0/0xe wait_on_page_bit_killable_timeout+0x77/0x7c find_vma+0x16/0x66 do_page_fault+0xe6/0x23a res_func+0xa3c/0x141a buserr_c+0x190/0x6d4 res_func+0xa3c/0x141a buserr+0x20/0x28 res_func+0xa3c/0x141a buserr+0x20/0x28 The relationship is not obvious but it's due to a failure to rescan the full zonelist after the fair zone allocation policy exhausts the batch count. While this is a functional problem, it's also a performance issue. A page allocator microbenchmark showed the following 4.7.0-rc1 4.7.0-rc1 vanilla reset-v1r2 Min alloc-odr0-1 327.00 ( 0.00%) 326.00 ( 0.31%) Min alloc-odr0-2 235.00 ( 0.00%) 235.00 ( 0.00%) Min alloc-odr0-4 198.00 ( 0.00%) 198.00 ( 0.00%) Min alloc-odr0-8 170.00 ( 0.00%) 170.00 ( 0.00%) Min alloc-odr0-16 156.00 ( 0.00%) 156.00 ( 0.00%) Min alloc-odr0-32 150.00 ( 0.00%) 150.00 ( 0.00%) Min alloc-odr0-64 146.00 ( 0.00%) 146.00 ( 0.00%) Min alloc-odr0-128 145.00 ( 0.00%) 145.00 ( 0.00%) Min alloc-odr0-256 155.00 ( 0.00%) 155.00 ( 0.00%) Min alloc-odr0-512 168.00 ( 0.00%) 165.00 ( 1.79%) Min alloc-odr0-1024 175.00 ( 0.00%) 174.00 ( 0.57%) Min alloc-odr0-2048 180.00 ( 0.00%) 180.00 ( 0.00%) Min alloc-odr0-4096 187.00 ( 0.00%) 186.00 ( 0.53%) Min alloc-odr0-8192 190.00 ( 0.00%) 190.00 ( 0.00%) Min alloc-odr0-16384 191.00 ( 0.00%) 191.00 ( 0.00%) Min alloc-odr1-1 736.00 ( 0.00%) 445.00 ( 39.54%) Min alloc-odr1-2 343.00 ( 0.00%) 335.00 ( 2.33%) Min alloc-odr1-4 277.00 ( 0.00%) 270.00 ( 2.53%) Min alloc-odr1-8 238.00 ( 0.00%) 233.00 ( 2.10%) Min alloc-odr1-16 224.00 ( 0.00%) 218.00 ( 2.68%) Min alloc-odr1-32 210.00 ( 0.00%) 208.00 ( 0.95%) Min alloc-odr1-64 207.00 ( 0.00%) 203.00 ( 1.93%) Min alloc-odr1-128 276.00 ( 0.00%) 202.00 ( 26.81%) Min alloc-odr1-256 206.00 ( 0.00%) 202.00 ( 1.94%) Min alloc-odr1-512 207.00 ( 0.00%) 202.00 ( 2.42%) Min alloc-odr1-1024 208.00 ( 0.00%) 205.00 ( 1.44%) Min alloc-odr1-2048 213.00 ( 0.00%) 212.00 ( 0.47%) Min alloc-odr1-4096 218.00 ( 0.00%) 216.00 ( 0.92%) Min alloc-odr1-8192 341.00 ( 0.00%) 219.00 ( 35.78%) Note that order-0 allocations are unaffected but higher orders get a small boost from this patch and a large reduction in system CPU usage overall as can be seen here: 4.7.0-rc1 4.7.0-rc1 vanilla reset-v1r2 User 85.32 86.31 System 2221.39 2053.36 Elapsed 2368.89 2202.47 Fixes: c33d6c06 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") Link: http://lkml.kernel.org/r/20160531100848.GR2527@techsingularity.netSigned-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
Oleg has noted that siglock usage in try_oom_reaper is both pointless and dangerous. signal_group_exit can be checked lockless. The problem is that sighand becomes NULL in __exit_signal so we can crash. Fixes: 3ef22dff ("oom, oom_reaper: try to reap tasks which skip regular OOM killer path") Link: http://lkml.kernel.org/r/1464679423-30218-1-git-send-email-mhocko@kernel.orgSigned-off-by: Michal Hocko <mhocko@suse.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
In DEBUG_VM kernel, we can hit infinite loop for order == 0 in buffered_rmqueue() when check_new_pcp() returns 1, because the bad page is never removed from the pcp list. Fix this by removing the page before retrying. Also we don't need to check if page is non-NULL, because we simply grab it from the list which was just tested for being non-empty. Fixes: 479f854a ("mm, page_alloc: defer debugging checks of pages allocated from the PCP") Link: http://lkml.kernel.org/r/20160530090154.GM2527@techsingularity.netSigned-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
Some lines in a commit log appear to be commit SHA1 ids like: ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 0123456789ab ("commit description")' Link: http://lkml.kernel.org/r/40e03fd7aaf1f55c75d787128d6d17c5a71226c2.1464358556.git.vdavydov@virtuozzo.com Reduce the false positives. Link: http://lkml.kernel.org/r/eda977eaa8328fef42bb3c87935d97e10ea8ff67.1464384023.git.joe@perches.comSigned-off-by: Joe Perches <joe@perches.com> Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vitaly Wool authored
Fix erroneous z3fold header access in a HEADLESS page in reclaim function, and change one remaining direct handle-to-buddy conversion to use the appropriate helper. Link: http://lkml.kernel.org/r/5748706F.9020208@gmail.comSigned-off-by: Vitaly Wool <vitalywool@gmail.com> Reviewed-by: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjenning@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Tejun Heo authored
memcg_offline_kmem() may be called from memcg_free_kmem() after a css init failure. memcg_free_kmem() is a ->css_free callback which is called without cgroup_mutex and memcg_offline_kmem() ends up using css_for_each_descendant_pre() without any locking. Fix it by adding rcu read locking around it. mkdir: cannot create directory `65530': No space left on device =============================== [ INFO: suspicious RCU usage. ] 4.6.0-work+ #321 Not tainted ------------------------------- kernel/cgroup.c:4008 cgroup_mutex or RCU read lock required! [ 527.243970] other info that might help us debug this: [ 527.244715] rcu_scheduler_active = 1, debug_locks = 0 2 locks held by kworker/0:5/1664: #0: ("cgroup_destroy"){.+.+..}, at: [<ffffffff81060ab5>] process_one_work+0x165/0x4a0 #1: ((&css->destroy_work)#3){+.+...}, at: [<ffffffff81060ab5>] process_one_work+0x165/0x4a0 [ 527.248098] stack backtrace: CPU: 0 PID: 1664 Comm: kworker/0:5 Not tainted 4.6.0-work+ #321 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014 Workqueue: cgroup_destroy css_free_work_fn Call Trace: dump_stack+0x68/0xa1 lockdep_rcu_suspicious+0xd7/0x110 css_next_descendant_pre+0x7d/0xb0 memcg_offline_kmem.part.44+0x4a/0xc0 mem_cgroup_css_free+0x1ec/0x200 css_free_work_fn+0x49/0x5e0 process_one_work+0x1c5/0x4a0 worker_thread+0x49/0x490 kthread+0xea/0x100 ret_from_fork+0x1f/0x40 Link: http://lkml.kernel.org/r/20160526203018.GG23194@mtj.duckdns.orgSigned-off-by: Tejun Heo <tj@kernel.org> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> [4.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Yang Shi authored
Per the discussion with Joonsoo Kim [1], we need check the return value of lookup_page_ext() for all call sites since it might return NULL in some cases, although it is unlikely, i.e. memory hotplug. Tested with ltp with "page_owner=0". [1] http://lkml.kernel.org/r/20160519002809.GA10245@js1304-P5Q-DELUXE [akpm@linux-foundation.org: fix build-breaking typos] [arnd@arndb.de: fix build problems from lookup_page_ext] Link: http://lkml.kernel.org/r/6285269.2CksypHdYp@wuerfel [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1464023768-31025-1-git-send-email-yang.shi@linaro.orgSigned-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Corey Minyard authored
Commit 7ff9554b ("printk: convert byte-buffer to variable-length record buffer") introduced a record based printk buffer. Modify gdbmacros.txt to parse this new structure so dmesg will work properly. Link: http://lkml.kernel.org/r/1463515794-1599-1-git-send-email-minyard@acm.orgSigned-off-by: Corey Minyard <cminyard@mvista.com> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Guillermo Julián Moreno authored
When remapping pages accounting for 4G or more memory space, the operation 'count << PAGE_SHIFT' overflows as it is performed on an integer. Solution: cast before doing the bitshift. [akpm@linux-foundation.org: fix vm_unmap_ram() also] [akpm@linux-foundation.org: fix vmap() as well, per Guillermo] Link: http://lkml.kernel.org/r/etPan.57175fb3.7a271c6b.2bd@naudit.esSigned-off-by: Guillermo Julián Moreno <guillermo.julian@naudit.es> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 02 Jun, 2016 11 commits
-
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull KVM fixes from Radim Krčmář: "ARM: - two fixes for 4.6 vgic [Christoffer] (cc stable) - six fixes for 4.7 vgic [Marc] x86: - six fixes from syzkaller reports [Paolo] (two of them cc stable) - allow OS X to boot [Dmitry] - don't trust compilers [Nadav]" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi KVM: fail KVM_SET_VCPU_EVENTS with invalid exception number KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID kvm: x86: avoid warning on repeated KVM_SET_TSS_ADDR KVM: Handle MSR_IA32_PERF_CTL KVM: x86: avoid write-tearing of TDP KVM: arm/arm64: vgic-new: Removel harmful BUG_ON arm64: KVM: vgic-v3: Relax synchronization when SRE==1 arm64: KVM: vgic-v3: Prevent the guest from messing with ICC_SRE_EL1 arm64: KVM: Make ICC_SRE_EL1 access return the configured SRE value KVM: arm/arm64: vgic-v3: Always resample level interrupts KVM: arm/arm64: vgic-v2: Always resample level interrupts KVM: arm/arm64: vgic-v3: Clear all dirty LRs KVM: arm/arm64: vgic-v2: Clear all dirty LRs
-
Paolo Bonzini authored
MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to any of bits 63:32. However, this is not detected at KVM_SET_DEBUGREGS time, and the next KVM_RUN oopses: general protection fault: 0000 [#1] SMP CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 [...] Call Trace: [<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm] [<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm] [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480 [<ffffffff812418a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71 Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 RIP [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40 RSP <ffff88005836bd50> Testcase (beautified/reduced from syzkaller output): #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[8]; int main() { struct kvm_debugregs dr = { 0 }; r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); memcpy(&dr, "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72" "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8" "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9" "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb", 48); r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr); r[6] = ioctl(r[4], KVM_RUN, 0); } Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Paolo Bonzini authored
This causes an ugly dmesg splat. Beautified syzkaller testcase: #include <unistd.h> #include <sys/syscall.h> #include <sys/ioctl.h> #include <fcntl.h> #include <linux/kvm.h> long r[8]; int main() { struct kvm_irq_routing ir = { 0 }; r[2] = open("/dev/kvm", O_RDWR); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_SET_GSI_ROUTING, &ir); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Paolo Bonzini authored
Found by syzkaller: BUG: unable to handle kernel NULL pointer dereference at 0000000000000120 IP: [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm] PGD 6f80b067 PUD b6535067 PMD 0 Oops: 0000 [#1] SMP CPU: 3 PID: 4988 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1 [...] Call Trace: [<ffffffffa0795f62>] irqfd_update+0x32/0xc0 [kvm] [<ffffffffa0796c7c>] kvm_irqfd+0x3dc/0x5b0 [kvm] [<ffffffffa07943f4>] kvm_vm_ioctl+0x164/0x6f0 [kvm] [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480 [<ffffffff812418a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a1062>] tracesys_phase2+0x84/0x89 Code: b5 71 a7 e0 5b 41 5c 41 5d 5d f3 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 8f 10 2e 00 00 31 c0 48 89 e5 <39> 91 20 01 00 00 76 6a 48 63 d2 48 8b 94 d1 28 01 00 00 48 85 RIP [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm] RSP <ffff8800926cbca8> CR2: 0000000000000120 Testcase: #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[26]; int main() { memset(r, -1, sizeof(r)); r[2] = open("/dev/kvm", 0); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); struct kvm_irqfd ifd; ifd.fd = syscall(SYS_eventfd2, 5, 0); ifd.gsi = 3; ifd.flags = 2; ifd.resamplefd = ifd.fd; r[25] = ioctl(r[3], KVM_IRQFD, &ifd); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Paolo Bonzini authored
This cannot be returned by KVM_GET_VCPU_EVENTS, so it is okay to return EINVAL. It causes a WARN from exception_type: WARNING: CPU: 3 PID: 16732 at arch/x86/kvm/x86.c:345 exception_type+0x49/0x50 [kvm]() CPU: 3 PID: 16732 Comm: a.out Tainted: G W 4.4.6-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 0000000000000286 000000006308a48b ffff8800bec7fcf8 ffffffff813b542e 0000000000000000 ffffffffa0966496 ffff8800bec7fd30 ffffffff810a40f2 ffff8800552a8000 0000000000000000 00000000002c267c 0000000000000001 Call Trace: [<ffffffff813b542e>] dump_stack+0x63/0x85 [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0 [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20 [<ffffffffa0924809>] exception_type+0x49/0x50 [kvm] [<ffffffffa0934622>] kvm_arch_vcpu_ioctl_run+0x10a2/0x14e0 [kvm] [<ffffffffa091c04d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm] [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480 [<ffffffff812414a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71 ---[ end trace b1a0391266848f50 ]--- Testcase (beautified/reduced from syzkaller output): #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <fcntl.h> #include <sys/ioctl.h> #include <linux/kvm.h> long r[31]; int main() { memset(r, -1, sizeof(r)); r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[7] = ioctl(r[3], KVM_CREATE_VCPU, 0); struct kvm_vcpu_events ve = { .exception.injected = 1, .exception.nr = 0xd4 }; r[27] = ioctl(r[7], KVM_SET_VCPU_EVENTS, &ve); r[30] = ioctl(r[7], KVM_RUN, 0); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Paolo Bonzini authored
This causes an ugly dmesg splat. Beautified syzkaller testcase: #include <unistd.h> #include <sys/syscall.h> #include <sys/ioctl.h> #include <fcntl.h> #include <linux/kvm.h> long r[8]; int main() { struct kvm_cpuid2 c = { 0 }; r[2] = open("/dev/kvm", O_RDWR); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 0x8); r[7] = ioctl(r[4], KVM_SET_CPUID, &c); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Paolo Bonzini authored
Found by syzkaller: WARNING: CPU: 3 PID: 15175 at arch/x86/kvm/x86.c:7705 __x86_set_memory_region+0x1dc/0x1f0 [kvm]() CPU: 3 PID: 15175 Comm: a.out Tainted: G W 4.4.6-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 0000000000000286 00000000950899a7 ffff88011ab3fbf0 ffffffff813b542e 0000000000000000 ffffffffa0966496 ffff88011ab3fc28 ffffffff810a40f2 00000000000001fd 0000000000003000 ffff88014fc50000 0000000000000000 Call Trace: [<ffffffff813b542e>] dump_stack+0x63/0x85 [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0 [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20 [<ffffffffa09251cc>] __x86_set_memory_region+0x1dc/0x1f0 [kvm] [<ffffffffa092521b>] x86_set_memory_region+0x3b/0x60 [kvm] [<ffffffffa09bb61c>] vmx_set_tss_addr+0x3c/0x150 [kvm_intel] [<ffffffffa092f4d4>] kvm_arch_vm_ioctl+0x654/0xbc0 [kvm] [<ffffffffa091d31a>] kvm_vm_ioctl+0x9a/0x6f0 [kvm] [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480 [<ffffffff812414a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71 Testcase: #include <unistd.h> #include <sys/ioctl.h> #include <fcntl.h> #include <string.h> #include <linux/kvm.h> long r[8]; int main() { memset(r, -1, sizeof(r)); r[2] = open("/dev/kvm", O_RDONLY|O_TRUNC); r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul); r[5] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul); r[7] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Dmitry Bilunov authored
Intel CPUs having Turbo Boost feature implement an MSR to provide a control interface via rdmsr/wrmsr instructions. One could detect the presence of this feature by issuing one of these instructions and handling the #GP exception which is generated in case the referenced MSR is not implemented by the CPU. KVM's vCPU model behaves exactly as a real CPU in this case by injecting a fault when MSR_IA32_PERF_CTL is called (which KVM does not support). However, some operating systems use this register during an early boot stage in which their kernel is not capable of handling #GP correctly, causing #DP and finally a triple fault effectively resetting the vCPU. This patch implements a dummy handler for MSR_IA32_PERF_CTL to avoid the crashes. Signed-off-by: Dmitry Bilunov <kmeaw@yandex-team.ru> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Nadav Amit authored
In theory, nothing prevents the compiler from write-tearing PTEs, or split PTE writes. These partially-modified PTEs can be fetched by other cores and cause mayhem. I have not really encountered such case in real-life, but it does seem possible. For example, the compiler may try to do something creative for kvm_set_pte_rmapp() and perform multiple writes to the PTE. Signed-off-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarmRadim Krčmář authored
KVM/ARM Fixes for v4.7-rc2 Fixes for the vgic, 2 of the patches address a bug introduced in v4.6 while the rest are for the new vgic.
-
Marc Zyngier authored
When changing the active bit from an MMIO trap, we decide to explode if the intid is that of a private interrupt. This flawed logic comes from the fact that we were assuming that kvm_vcpu_kick() as called by kvm_arm_halt_vcpu() would not return before the called vcpu responded, but this is not the case, so we need to perform this wait even for private interrupts. Dropping the BUG_ON seems like the right thing to do. [ Commit message tweaked by Christoffer ] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
- 01 Jun, 2016 5 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrlLinus Torvalds authored
Pull pin control fixes from Linus Walleij: "Here are three pin control fixes for v4.7. Not much, and just driver fixes: - add device tree matches to MAINTAINERS - inversion bug in the Nomadik driver - dual edge handling bug in the mediatek driver" * tag 'pinctrl-v4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: mediatek: fix dual-edge code defect MAINTAINERS: Add file patterns for pinctrl device tree bindings pinctrl: nomadik: fix inversion of gpio direction
-
git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-bufLinus Torvalds authored
Pull dma-buf updates from Sumit Semwal: - use of vma_pages instead of explicit computation - DocBook and headerdoc updates for dma-buf * tag 'dma-buf-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf: dma-buf: use vma_pages() fence: add missing descriptions for fence doc: update/fixup dma-buf related DocBook reservation: add headerdoc comments dma-buf: headerdoc fixes
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Fix negative error code usage in ATM layer, from Stefan Hajnoczi. 2) If CONFIG_SYSCTL is disabled, the default TTL is not initialized properly. From Ezequiel Garcia. 3) Missing spinlock init in mvneta driver, from Gregory CLEMENT. 4) Missing unlocks in hwmb error paths, also from Gregory CLEMENT. 5) Fix deadlock on team->lock when propagating features, from Ivan Vecera. 6) Work around buffer offset hw bug in alx chips, from Feng Tang. 7) Fix double listing of SCTP entries in sctp_diag dumps, from Xin Long. 8) Various statistics bug fixes in mlx4 from Eric Dumazet. 9) Fix some randconfig build errors wrt fou ipv6 from Arnd Bergmann. 10) All of l2tp was namespace aware, but the ipv6 support code was not doing so. From Shmulik Ladkani. 11) Handle on-stack hrtimers properly in pktgen, from Guenter Roeck. 12) Propagate MAC changes properly through VLAN devices, from Mike Manning. 13) Fix memory leak in bnx2x_init_one(), from Vitaly Kuznetsov. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits) sfc: Track RPS flow IDs per channel instead of per function usbnet: smsc95xx: fix link detection for disabled autonegotiation virtio_net: fix virtnet_open and virtnet_probe competing for try_fill_recv bnx2x: avoid leaking memory on bnx2x_init_one() failures fou: fix IPv6 Kconfig options openvswitch: update checksum in {push,pop}_mpls sctp: sctp_diag should dump sctp socket type net: fec: update dirty_tx even if no skb vlan: Propagate MAC address to VLANs atm: iphase: off by one in rx_pkt() atm: firestream: add more reserved strings vxlan: Accept user specified MTU value when create new vxlan link net: pktgen: Call destroy_hrtimer_on_stack() timer: Export destroy_hrtimer_on_stack() net: l2tp: Make l2tp_ip6 namespace aware Documentation: ip-sysctl.txt: clarify secure_redirects sfc: use flow dissector helpers for aRFS ieee802154: fix logic error in ieee802154_llsec_parse_dev_addr net: nps_enet: Disable interrupts before napi reschedule net/lapb: tuse %*ph to dump buffers ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds authored
Pull sparc fixes from David Miller: "sparc64 mmu context allocation and trap return bug fixes" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix return from trap window fill crashes. sparc: Harden signal return frame checks. sparc64: Take ctx_alloc_lock properly in hugetlb_setup().
-
Jon Cooper authored
Otherwise we get confused when two flows on different channels get the same flow ID. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 31 May, 2016 14 commits
-
-
Christoph Fritz authored
To detect link status up/down for connections where autonegotiation is explicitly disabled, we don't get an irq but need to poll the status register for link up/down detection. This patch adds a workqueue to poll for link status. Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
wangyunjian authored
In function virtnet_open() and virtnet_probe(), func try_fill_recv() may be executed at the same time. VQ in virtqueue_add() has not been protected well and BUG_ON will be triggered when virito_net.ko being removed. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vitaly Kuznetsov authored
bnx2x_init_bp() allocates memory with bnx2x_alloc_mem_bp() so if we fail later in bnx2x_init_one() we need to free this memory with bnx2x_free_mem_bp() to avoid leakages. E.g. I'm observing memory leaks reported by kmemleak when a failure (unrelated) happens in bnx2x_vfpf_acquire(). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
The Kconfig options I added to work around broken compilation ended up screwing up things more, as I used the wrong symbol to control compilation of the file, resulting in IPv6 fou support to never be built into the kernel. Changing CONFIG_NET_FOU_IPV6_TUNNELS to CONFIG_IPV6_FOU fixes that problem, I had renamed the symbol in one location but not the other, and as the file is never being used by other kernel code, this did not lead to a build failure that I would have caught. After that fix, another issue with the same patch becomes obvious, as we 'select INET6_TUNNEL', which is related to IPV6_TUNNEL, but not the same, and this can still cause the original build failure when IPV6_TUNNEL is not built-in but IPV6_FOU is. The fix is equally trivial, we just need to select the right symbol. I have successfully build 350 randconfig kernels with this patch and verified that the driver is now being built. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Valentin Rothberg <valentinrothberg@gmail.com> Fixes: fabb13db ("fou: add Kconfig options for IPv6 support") Signed-off-by: David S. Miller <davem@davemloft.net>
-
Simon Horman authored
In the case of CHECKSUM_COMPLETE the skb checksum should be updated in {push,pop}_mpls() as they the type in the ethernet header. As suggested by Pravin Shelar. Cc: Pravin Shelar <pshelar@nicira.com> Fixes: 25cd9ba0 ("openvswitch: Add basic MPLS support to kernel") Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
Now we cannot distinguish that one sk is a udp or sctp style when we use ss to dump sctp_info. it's necessary to dump it as well. For sctp_diag, ss support is not officially available, thus there are no official users of this yet, so we can add this field in the middle of sctp_info without breaking user API. v1->v2: - move 'sctpi_s_type' field to the end of struct sctp_info, so that it won't cause incompatibility with applications already built. - add __reserved3 in sctp_info to make sure sctp_info is 8-byte alignment. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Troy Kisky authored
If dirty_tx isn't updated, then dma_unmap_single can be called twice. This fixes a [ 58.420980] ------------[ cut here ]------------ [ 58.425667] WARNING: CPU: 0 PID: 377 at /home/schurig/d/mkarm/linux-4.5/lib/dma-debug.c:1096 check_unmap+0x9d0/0xab8() [ 58.436405] fec 2188000.ethernet: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=66 bytes] encountered by Holger Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com> Tested-by: <holgerschurig@gmail.com> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mike Manning authored
The MAC address of the physical interface is only copied to the VLAN when it is first created, resulting in an inconsistency after MAC address changes of only newly created VLANs having an up-to-date MAC. The VLANs should continue inheriting the MAC address of the physical interface until the VLAN MAC address is explicitly set to any value. This allows IPv6 EUI64 addresses for the VLAN to reflect any changes to the MAC of the physical interface and thus for DAD to behave as expected. Signed-off-by: Mike Manning <mmanning@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
The iadev->rx_open[] array holds "iadev->num_vc" pointers (this code assumes that pointers are 32 bits). So the > here should be >= or else we could end up reading a garbage pointer from one element beyond the end of the array. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
This bug was there when the driver was first added in back in year 2000. It causes a Smatch warning: drivers/atm/firestream.c:849 process_incoming() error: buffer overflow 'res_strings' 60 <= 63 There are supposed to be 64 entries in this array and the missing strings are clearly in the 30 40 range. I added them as reserved 37 to reserved 40. It's possible that strings are really supposed to be added in the middle instead of at the end, but this approach is safe, in that it fixes the bug and doesn't break anything that wasn't already broken. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Chen Haiquan authored
When create a new vxlan link, example: ip link add vtap mtu 1440 type vxlan vni 1 dev eth0 The argument "mtu" has no effect, because it is not set to conf->mtu. The default value is used in vxlan_dev_configure function. This problem was introduced by commit 0dfbdf41 (vxlan: Factor out device configuration). Fixes: 0dfbdf41 (vxlan: Factor out device configuration) Signed-off-by: Chen Haiquan <oc@yunify.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Guenter Roeck authored
If CONFIG_DEBUG_OBJECTS_TIMERS=y, hrtimer_init_on_stack() requires a matching call to destroy_hrtimer_on_stack() to clean up timer debug objects. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Guenter Roeck authored
hrtimer_init_on_stack() needs a matching call to destroy_hrtimer_on_stack(), so both need to be exported. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Muhammad Falak R Wani authored
Replace explicit computation of vma page count by a call to vma_pages(). Also, include <linux/mm.h> Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
-