1. 15 Mar, 2018 10 commits
  2. 11 Mar, 2018 10 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.26 · 96427a51
      Greg Kroah-Hartman authored
      96427a51
    • Radim Krčmář's avatar
      KVM: x86: fix backward migration with async_PF · dc6fb79d
      Radim Krčmář authored
      commit fe2a3027 upstream.
      
      Guests on new hypersiors might set KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT
      bit when enabling async_PF, but this bit is reserved on old hypervisors,
      which results in a failure upon migration.
      
      To avoid breaking different cases, we are checking for CPUID feature bit
      before enabling the feature and nothing else.
      
      Fixes: 52a5c155 ("KVM: async_pf: Let guest support delivery of async_pf from guest mode")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [jwang: port to 4.14]
      Signed-off-by: default avatarJack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc6fb79d
    • Daniel Borkmann's avatar
      bpf, ppc64: fix out of bounds access in tail call · a91064ff
      Daniel Borkmann authored
      [ upstream commit d269176e ]
      
      While working on 16338a9b ("bpf, arm64: fix out of bounds access in
      tail call") I noticed that ppc64 JIT is partially affected as well. While
      the bound checking is correctly performed as unsigned comparison, the
      register with the index value however, is never truncated into 32 bit
      space, so e.g. a index value of 0x100000000ULL with a map of 1 element
      would pass with PPC_CMPLW() whereas we later on continue with the full
      64 bit register value. Therefore, as we do in interpreter and other JITs
      truncate the value to 32 bit initially in order to fix access.
      
      Fixes: ce076141 ("powerpc/bpf: Implement support for tail calls")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: default avatarNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Tested-by: default avatarNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a91064ff
    • Daniel Borkmann's avatar
      bpf: allow xadd only on aligned memory · 3e272a8c
      Daniel Borkmann authored
      [ upstream commit ca369602 ]
      
      The requirements around atomic_add() / atomic64_add() resp. their
      JIT implementations differ across architectures. E.g. while x86_64
      seems just fine with BPF's xadd on unaligned memory, on arm64 it
      triggers via interpreter but also JIT the following crash:
      
        [  830.864985] Unable to handle kernel paging request at virtual address ffff8097d7ed6703
        [...]
        [  830.916161] Internal error: Oops: 96000021 [#1] SMP
        [  830.984755] CPU: 37 PID: 2788 Comm: test_verifier Not tainted 4.16.0-rc2+ #8
        [  830.991790] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.29 07/17/2017
        [  830.998998] pstate: 80400005 (Nzcv daif +PAN -UAO)
        [  831.003793] pc : __ll_sc_atomic_add+0x4/0x18
        [  831.008055] lr : ___bpf_prog_run+0x1198/0x1588
        [  831.012485] sp : ffff00001ccabc20
        [  831.015786] x29: ffff00001ccabc20 x28: ffff8017d56a0f00
        [  831.021087] x27: 0000000000000001 x26: 0000000000000000
        [  831.026387] x25: 000000c168d9db98 x24: 0000000000000000
        [  831.031686] x23: ffff000008203878 x22: ffff000009488000
        [  831.036986] x21: ffff000008b14e28 x20: ffff00001ccabcb0
        [  831.042286] x19: ffff0000097b5080 x18: 0000000000000a03
        [  831.047585] x17: 0000000000000000 x16: 0000000000000000
        [  831.052885] x15: 0000ffffaeca8000 x14: 0000000000000000
        [  831.058184] x13: 0000000000000000 x12: 0000000000000000
        [  831.063484] x11: 0000000000000001 x10: 0000000000000000
        [  831.068783] x9 : 0000000000000000 x8 : 0000000000000000
        [  831.074083] x7 : 0000000000000000 x6 : 000580d428000000
        [  831.079383] x5 : 0000000000000018 x4 : 0000000000000000
        [  831.084682] x3 : ffff00001ccabcb0 x2 : 0000000000000001
        [  831.089982] x1 : ffff8097d7ed6703 x0 : 0000000000000001
        [  831.095282] Process test_verifier (pid: 2788, stack limit = 0x0000000018370044)
        [  831.102577] Call trace:
        [  831.105012]  __ll_sc_atomic_add+0x4/0x18
        [  831.108923]  __bpf_prog_run32+0x4c/0x70
        [  831.112748]  bpf_test_run+0x78/0xf8
        [  831.116224]  bpf_prog_test_run_xdp+0xb4/0x120
        [  831.120567]  SyS_bpf+0x77c/0x1110
        [  831.123873]  el0_svc_naked+0x30/0x34
        [  831.127437] Code: 97fffe97 17ffffec 00000000 f9800031 (885f7c31)
      
      Reason for this is because memory is required to be aligned. In
      case of BPF, we always enforce alignment in terms of stack access,
      but not when accessing map values or packet data when the underlying
      arch (e.g. arm64) has CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS set.
      
      xadd on packet data that is local to us anyway is just wrong, so
      forbid this case entirely. The only place where xadd makes sense in
      fact are map values; xadd on stack is wrong as well, but it's been
      around for much longer. Specifically enforce strict alignment in case
      of xadd, so that we handle this case generically and avoid such crashes
      in the first place.
      
      Fixes: 17a52670 ("bpf: verifier (add verifier core)")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e272a8c
    • Eric Dumazet's avatar
      bpf: add schedule points in percpu arrays management · e1760b35
      Eric Dumazet authored
      [ upstream commit 32fff239 ]
      
      syszbot managed to trigger RCU detected stalls in
      bpf_array_free_percpu()
      
      It takes time to allocate a huge percpu map, but even more time to free
      it.
      
      Since we run in process context, use cond_resched() to yield cpu if
      needed.
      
      Fixes: a10423b8 ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e1760b35
    • Daniel Borkmann's avatar
      bpf, arm64: fix out of bounds access in tail call · 03549a34
      Daniel Borkmann authored
      [ upstream commit 16338a9b ]
      
      I recently noticed a crash on arm64 when feeding a bogus index
      into BPF tail call helper. The crash would not occur when the
      interpreter is used, but only in case of JIT. Output looks as
      follows:
      
        [  347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
        [...]
        [  347.043065] [fffb850e96492510] address between user and kernel address ranges
        [  347.050205] Internal error: Oops: 96000004 [#1] SMP
        [...]
        [  347.190829] x13: 0000000000000000 x12: 0000000000000000
        [  347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
        [  347.201427] x9 : 0000000000000000 x8 : 0000000000000000
        [  347.206726] x7 : 0000000000000000 x6 : 001c991738000000
        [  347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
        [  347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
        [  347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
        [  347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
        [  347.235221] Call trace:
        [  347.237656]  0xffff000002f3a4fc
        [  347.240784]  bpf_test_run+0x78/0xf8
        [  347.244260]  bpf_prog_test_run_skb+0x148/0x230
        [  347.248694]  SyS_bpf+0x77c/0x1110
        [  347.251999]  el0_svc_naked+0x30/0x34
        [  347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
        [...]
      
      In this case the index used in BPF r3 is the same as in r1
      at the time of the call, meaning we fed a pointer as index;
      here, it had the value 0xffff808fd7cf0500 which sits in x2.
      
      While I found tail calls to be working in general (also for
      hitting the error cases), I noticed the following in the code
      emission:
      
        # bpftool p d j i 988
        [...]
        38:   ldr     w10, [x1,x10]
        3c:   cmp     w2, w10
        40:   b.ge    0x000000000000007c              <-- signed cmp
        44:   mov     x10, #0x20                      // #32
        48:   cmp     x26, x10
        4c:   b.gt    0x000000000000007c
        50:   add     x26, x26, #0x1
        54:   mov     x10, #0x110                     // #272
        58:   add     x10, x1, x10
        5c:   lsl     x11, x2, #3
        60:   ldr     x11, [x10,x11]                  <-- faulting insn (f86b694b)
        64:   cbz     x11, 0x000000000000007c
        [...]
      
      Meaning, the tests passed because commit ddb55992 ("arm64:
      bpf: implement bpf_tail_call() helper") was using signed compares
      instead of unsigned which as a result had the test wrongly passing.
      
      Change this but also the tail call count test both into unsigned
      and cap the index as u32. Latter we did as well in 90caccdd
      ("bpf: fix bpf_tail_call() x64 JIT") and is needed in addition here,
      too. Tested on HiSilicon Hi1616.
      
      Result after patch:
      
        # bpftool p d j i 268
        [...]
        38:	ldr	w10, [x1,x10]
        3c:	add	w2, w2, #0x0
        40:	cmp	w2, w10
        44:	b.cs	0x0000000000000080
        48:	mov	x10, #0x20                  	// #32
        4c:	cmp	x26, x10
        50:	b.hi	0x0000000000000080
        54:	add	x26, x26, #0x1
        58:	mov	x10, #0x110                 	// #272
        5c:	add	x10, x1, x10
        60:	lsl	x11, x2, #3
        64:	ldr	x11, [x10,x11]
        68:	cbz	x11, 0x0000000000000080
        [...]
      
      Fixes: ddb55992 ("arm64: bpf: implement bpf_tail_call() helper")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      03549a34
    • Daniel Borkmann's avatar
      bpf, x64: implement retpoline for tail call · 7e657aa3
      Daniel Borkmann authored
      [ upstream commit a493a87f ]
      
      Implement a retpoline [0] for the BPF tail call JIT'ing that converts
      the indirect jump via jmp %rax that is used to make the long jump into
      another JITed BPF image. Since this is subject to speculative execution,
      we need to control the transient instruction sequence here as well
      when CONFIG_RETPOLINE is set, and direct it into a pause + lfence loop.
      The latter aligns also with what gcc / clang emits (e.g. [1]).
      
      JIT dump after patch:
      
        # bpftool p d x i 1
         0: (18) r2 = map[id:1]
         2: (b7) r3 = 0
         3: (85) call bpf_tail_call#12
         4: (b7) r0 = 2
         5: (95) exit
      
      With CONFIG_RETPOLINE:
      
        # bpftool p d j i 1
        [...]
        33:	cmp    %edx,0x24(%rsi)
        36:	jbe    0x0000000000000072  |*
        38:	mov    0x24(%rbp),%eax
        3e:	cmp    $0x20,%eax
        41:	ja     0x0000000000000072  |
        43:	add    $0x1,%eax
        46:	mov    %eax,0x24(%rbp)
        4c:	mov    0x90(%rsi,%rdx,8),%rax
        54:	test   %rax,%rax
        57:	je     0x0000000000000072  |
        59:	mov    0x28(%rax),%rax
        5d:	add    $0x25,%rax
        61:	callq  0x000000000000006d  |+
        66:	pause                      |
        68:	lfence                     |
        6b:	jmp    0x0000000000000066  |
        6d:	mov    %rax,(%rsp)         |
        71:	retq                       |
        72:	mov    $0x2,%eax
        [...]
      
        * relative fall-through jumps in error case
        + retpoline for indirect jump
      
      Without CONFIG_RETPOLINE:
      
        # bpftool p d j i 1
        [...]
        33:	cmp    %edx,0x24(%rsi)
        36:	jbe    0x0000000000000063  |*
        38:	mov    0x24(%rbp),%eax
        3e:	cmp    $0x20,%eax
        41:	ja     0x0000000000000063  |
        43:	add    $0x1,%eax
        46:	mov    %eax,0x24(%rbp)
        4c:	mov    0x90(%rsi,%rdx,8),%rax
        54:	test   %rax,%rax
        57:	je     0x0000000000000063  |
        59:	mov    0x28(%rax),%rax
        5d:	add    $0x25,%rax
        61:	jmpq   *%rax               |-
        63:	mov    $0x2,%eax
        [...]
      
        * relative fall-through jumps in error case
        - plain indirect jump as before
      
        [0] https://support.google.com/faqs/answer/7625886
        [1] https://github.com/gcc-mirror/gcc/commit/a31e654fa107be968b802786d747e962c2fcdb2bSigned-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7e657aa3
    • Yonghong Song's avatar
      bpf: fix rcu lockdep warning for lpm_trie map_free callback · 853223c2
      Yonghong Song authored
      [ upstream commit 6c5f6102 ]
      
      Commit 9a3efb6b ("bpf: fix memory leak in lpm_trie map_free callback function")
      fixed a memory leak and removed unnecessary locks in map_free callback function.
      Unfortrunately, it introduced a lockdep warning. When lockdep checking is turned on,
      running tools/testing/selftests/bpf/test_lpm_map will have:
      
        [   98.294321] =============================
        [   98.294807] WARNING: suspicious RCU usage
        [   98.295359] 4.16.0-rc2+ #193 Not tainted
        [   98.295907] -----------------------------
        [   98.296486] /home/yhs/work/bpf/kernel/bpf/lpm_trie.c:572 suspicious rcu_dereference_check() usage!
        [   98.297657]
        [   98.297657] other info that might help us debug this:
        [   98.297657]
        [   98.298663]
        [   98.298663] rcu_scheduler_active = 2, debug_locks = 1
        [   98.299536] 2 locks held by kworker/2:1/54:
        [   98.300152]  #0:  ((wq_completion)"events"){+.+.}, at: [<00000000196bc1f0>] process_one_work+0x157/0x5c0
        [   98.301381]  #1:  ((work_completion)(&map->work)){+.+.}, at: [<00000000196bc1f0>] process_one_work+0x157/0x5c0
      
      Since actual trie tree removal happens only after no other
      accesses to the tree are possible, replacing
        rcu_dereference_protected(*slot, lockdep_is_held(&trie->lock))
      with
        rcu_dereference_protected(*slot, 1)
      fixed the issue.
      
      Fixes: 9a3efb6b ("bpf: fix memory leak in lpm_trie map_free callback function")
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      853223c2
    • Yonghong Song's avatar
      bpf: fix memory leak in lpm_trie map_free callback function · 62a2caa5
      Yonghong Song authored
      [ upstream commit 9a3efb6b ]
      
      There is a memory leak happening in lpm_trie map_free callback
      function trie_free. The trie structure itself does not get freed.
      
      Also, trie_free function did not do synchronize_rcu before freeing
      various data structures. This is incorrect as some rcu_read_lock
      region(s) for lookup, update, delete or get_next_key may not complete yet.
      The fix is to add synchronize_rcu in the beginning of trie_free.
      The useless spin_lock is removed from this function as well.
      
      Fixes: b95a5c4d ("bpf: add a longest prefix match trie map implementation")
      Reported-by: default avatarMathieu Malaterre <malat@debian.org>
      Reported-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Tested-by: default avatarMathieu Malaterre <malat@debian.org>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      62a2caa5
    • Daniel Borkmann's avatar
      bpf: fix mlock precharge on arraymaps · d9fd73c6
      Daniel Borkmann authored
      [ upstream commit 9c2d63b8 ]
      
      syzkaller recently triggered OOM during percpu map allocation;
      while there is work in progress by Dennis Zhou to add __GFP_NORETRY
      semantics for percpu allocator under pressure, there seems also a
      missing bpf_map_precharge_memlock() check in array map allocation.
      
      Given today the actual bpf_map_charge_memlock() happens after the
      find_and_alloc_map() in syscall path, the bpf_map_precharge_memlock()
      is there to bail out early before we go and do the map setup work
      when we find that we hit the limits anyway. Therefore add this for
      array map as well.
      
      Fixes: 6c905981 ("bpf: pre-allocate hash map elements")
      Fixes: a10423b8 ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
      Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Cc: Dennis Zhou <dennisszhou@gmail.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9fd73c6
  3. 09 Mar, 2018 20 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.25 · 8773f9bf
      Greg Kroah-Hartman authored
      8773f9bf
    • Sagi Grimberg's avatar
      nvme-rdma: don't suppress send completions · df11c226
      Sagi Grimberg authored
      commit b4b591c8 upstream.
      
      The entire completions suppress mechanism is currently broken because the
      HCA might retry a send operation (due to dropped ack) after the nvme
      transaction has completed.
      
      In order to handle this, we signal all send completions and introduce a
      separate done handler for async events as they will be handled differently
      (as they don't include in-capsule data by definition).
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Reviewed-by: default avatarMax Gurtovoy <maxg@mellanox.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      df11c226
    • NeilBrown's avatar
      md: only allow remove_and_add_spares when no sync_thread running. · 9474d8fa
      NeilBrown authored
      commit 39772f0a upstream.
      
      The locking protocols in md assume that a device will
      never be removed from an array during resync/recovery/reshape.
      When that isn't happening, rcu or reconfig_mutex is needed
      to protect an rdev pointer while taking a refcount.  When
      it is happening, that protection isn't needed.
      
      Unfortunately there are cases were remove_and_add_spares() is
      called when recovery might be happening: is state_store(),
      slot_store() and hot_remove_disk().
      In each case, this is just an optimization, to try to expedite
      removal from the personality so the device can be removed from
      the array.  If resync etc is happening, we just have to wait
      for md_check_recover to find a suitable time to call
      remove_and_add_spares().
      
      This optimization and not essential so it doesn't
      matter if it fails.
      So change remove_and_add_spares() to abort early if
      resync/recovery/reshape is happening, unless it is called
      from md_check_recovery() as part of a newly started recovery.
      The parameter "this" is only NULL when called from
      md_check_recovery() so when it is NULL, there is no need to abort.
      
      As this can result in a NULL dereference, the fix is suitable
      for -stable.
      
      cc: yuyufen <yuyufen@huawei.com>
      Cc: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
      Fixes: 8430e7e0 ("md: disconnect device from personality before trying to remove it.")
      Cc: stable@ver.kernel.org (v4.8+)
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarShaohua Li <sh.li@alibaba-inc.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9474d8fa
    • Adam Ford's avatar
      ARM: dts: LogicPD Torpedo: Fix I2C1 pinmux · 4df591f7
      Adam Ford authored
      commit 74402055 upstream.
      
      The pinmuxing was missing for I2C1 which was causing intermittent issues
      with the PMIC which is connected to I2C1.  The bootloader did not quite
      configure the I2C1 either, so when running at 2.6MHz, it was generating
      errors at time.
      
      This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
      
      Fixes: 687c2767 ("ARM: dts: Add minimal support for LogicPD Torpedo
      DM3730 devkit")
      Signed-off-by: default avatarAdam Ford <aford173@gmail.com>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4df591f7
    • Adam Ford's avatar
      ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux · 2b844657
      Adam Ford authored
      commit 84c7efd6 upstream.
      
      The pinmuxing was missing for I2C1 which was causing intermittent issues
      with the PMIC which is connected to I2C1.  The bootloader did not quite
      configure the I2C1 either, so when running at 2.6MHz, it was generating
      errors at times.
      
      This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz
      
      Fixes: ab8dd3ae ("ARM: DTS: Add minimal Support for Logic PD DM3730
      SOM-LV")
      Signed-off-by: default avatarAdam Ford <aford173@gmail.com>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2b844657
    • Kai Heng Feng's avatar
      ACPI / bus: Parse tables as term_list for Dell XPS 9570 and Precision M5530 · b2190cc3
      Kai Heng Feng authored
      commit 36904703 upstream.
      
      The i2c touchpad on Dell XPS 9570 and Precision M5530 doesn't work out
      of box.
      
      The touchpad relies on its _INI method to update its _HID value from
      XXXX0000 to SYNA2393.
      
      Also, the _STA relies on value of I2CN to report correct status.
      
      Set acpi_gbl_parse_table_as_term_list so the value of I2CN can be
      correctly set up, and _INI can get run. The ACPI table in this machine
      is designed to get parsed this way.
      
      Also, change the quirk table to a more generic name.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=198515Signed-off-by: default avatarKai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2190cc3
    • Eric Biggers's avatar
      KVM/x86: remove WARN_ON() for when vm_munmap() fails · b95f8ca8
      Eric Biggers authored
      commit 103c763c upstream.
      
      On x86, special KVM memslots such as the TSS region have anonymous
      memory mappings created on behalf of userspace, and these mappings are
      removed when the VM is destroyed.
      
      It is however possible for removing these mappings via vm_munmap() to
      fail.  This can most easily happen if the thread receives SIGKILL while
      it's waiting to acquire ->mmap_sem.   This triggers the 'WARN_ON(r < 0)'
      in __x86_set_memory_region().  syzkaller was able to hit this, using
      'exit()' to send the SIGKILL.  Note that while the vm_munmap() failure
      results in the mapping not being removed immediately, it is not leaked
      forever but rather will be freed when the process exits.
      
      It's not really possible to handle this failure properly, so almost
      every other caller of vm_munmap() doesn't check the return value.  It's
      a limitation of having the kernel manage these mappings rather than
      userspace.
      
      So just remove the WARN_ON() so that users can't spam the kernel log
      with this warning.
      
      Fixes: f0d648bd ("KVM: x86: map/unmap private slots in __x86_set_memory_region")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarJack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b95f8ca8
    • Tianyu Lan's avatar
      KVM/x86: Fix wrong macro references of X86_CR0_PG_BIT and X86_CR4_PAE_BIT in kvm_valid_sregs() · 61546237
      Tianyu Lan authored
      commit 37b95951 upstream.
      
      kvm_valid_sregs() should use X86_CR0_PG and X86_CR4_PAE to check bit
      status rather than X86_CR0_PG_BIT and X86_CR4_PAE_BIT. This patch is
      to fix it.
      
      Fixes: f2981033(KVM/x86: Check input paging mode when cs.l is set)
      Reported-by: default avatarJeremi Piotrowski <jeremi.piotrowski@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarTianyu Lan <Tianyu.Lan@microsoft.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarJack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      61546237
    • Ard Biesheuvel's avatar
      PCI/ASPM: Deal with missing root ports in link state handling · db98acd6
      Ard Biesheuvel authored
      commit ee8bdfb6 upstream.
      
      Even though it is unconventional, some PCIe host implementations omit the
      root ports entirely, and simply consist of a host bridge (which is not
      modeled as a device in the PCI hierarchy) and a link.
      
      When the downstream device is an endpoint, our current code does not seem
      to mind this unusual configuration. However, when PCIe switches are
      involved, the ASPM code assumes that any downstream switch port has a
      parent, and blindly dereferences the bus->parent->self field of the pci_dev
      struct to chain the downstream link state to the link state of the root
      port. Given that the root port is missing, the link is not modeled at all,
      and nor is the link state, and attempting to access it results in a NULL
      pointer dereference and a crash.
      
      Avoid this by allowing the link state chain to terminate at the downstream
      port if no root port exists.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db98acd6
    • Radim Krčmář's avatar
      KVM: x86: fix vcpu initialization with userspace lapic · b4830f3a
      Radim Krčmář authored
      commit b7e31be3 upstream.
      
      Moving the code around broke this rare configuration.
      Use this opportunity to finally call lapic reset from vcpu reset.
      
      Reported-by: syzbot+fb7a33a4b6c35007a72b@syzkaller.appspotmail.com
      Suggested-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Fixes: 0b2e9904 ("KVM: x86: move LAPIC initialization after VMCS creation")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b4830f3a
    • Paolo Bonzini's avatar
      KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() · 1f17daea
      Paolo Bonzini authored
      commit 946fbbc1 upstream.
      
      vmx_vcpu_run() and svm_vcpu_run() are large functions, and giving
      branch hints to the compiler can actually make a substantial cycle
      difference by keeping the fast path contiguous in memory.
      
      With this optimization, the retpoline-guest/retpoline-host case is
      about 50 cycles faster.
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarJim Mattson <jmattson@google.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: KarimAllah Ahmed <karahmed@amazon.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20180222154318.20361-3-pbonzini@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f17daea
    • Paolo Bonzini's avatar
      KVM: x86: move LAPIC initialization after VMCS creation · 03d62460
      Paolo Bonzini authored
      commit 0b2e9904 upstream.
      
      The initial reset of the local APIC is performed before the VMCS has been
      created, but it tries to do a vmwrite:
      
       vmwrite error: reg 810 value 4a00 (err 18944)
       CPU: 54 PID: 38652 Comm: qemu-kvm Tainted: G        W I      4.16.0-0.rc2.git0.1.fc28.x86_64 #1
       Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0003.090520141303 09/05/2014
       Call Trace:
        vmx_set_rvi [kvm_intel]
        vmx_hwapic_irr_update [kvm_intel]
        kvm_lapic_reset [kvm]
        kvm_create_lapic [kvm]
        kvm_arch_vcpu_init [kvm]
        kvm_vcpu_init [kvm]
        vmx_create_vcpu [kvm_intel]
        kvm_vm_ioctl [kvm]
      
      Move it later, after the VMCS has been created.
      
      Fixes: 4191db26 ("KVM: x86: Update APICv on APIC reset")
      Cc: stable@vger.kernel.org
      Cc: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      03d62460
    • Paolo Bonzini's avatar
      KVM/x86: Remove indirect MSR op calls from SPEC_CTRL · 0d62a56d
      Paolo Bonzini authored
      commit ecb586bd upstream.
      
      Having a paravirt indirect call in the IBRS restore path is not a
      good idea, since we are trying to protect from speculative execution
      of bogus indirect branch targets.  It is also slower, so use
      native_wrmsrl() on the vmentry path too.
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarJim Mattson <jmattson@google.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: KarimAllah Ahmed <karahmed@amazon.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Fixes: d28b387f
      Link: http://lkml.kernel.org/r/20180222154318.20361-2-pbonzini@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0d62a56d
    • Wanpeng Li's avatar
      KVM: mmu: Fix overlap between public and private memslots · 7135aaf3
      Wanpeng Li authored
      commit b28676bb upstream.
      
      Reported by syzkaller:
      
          pte_list_remove: ffff9714eb1f8078 0->BUG
          ------------[ cut here ]------------
          kernel BUG at arch/x86/kvm/mmu.c:1157!
          invalid opcode: 0000 [#1] SMP
          RIP: 0010:pte_list_remove+0x11b/0x120 [kvm]
          Call Trace:
           drop_spte+0x83/0xb0 [kvm]
           mmu_page_zap_pte+0xcc/0xe0 [kvm]
           kvm_mmu_prepare_zap_page+0x81/0x4a0 [kvm]
           kvm_mmu_invalidate_zap_all_pages+0x159/0x220 [kvm]
           kvm_arch_flush_shadow_all+0xe/0x10 [kvm]
           kvm_mmu_notifier_release+0x6c/0xa0 [kvm]
           ? kvm_mmu_notifier_release+0x5/0xa0 [kvm]
           __mmu_notifier_release+0x79/0x110
           ? __mmu_notifier_release+0x5/0x110
           exit_mmap+0x15a/0x170
           ? do_exit+0x281/0xcb0
           mmput+0x66/0x160
           do_exit+0x2c9/0xcb0
           ? __context_tracking_exit.part.5+0x4a/0x150
           do_group_exit+0x50/0xd0
           SyS_exit_group+0x14/0x20
           do_syscall_64+0x73/0x1f0
           entry_SYSCALL64_slow_path+0x25/0x25
      
      The reason is that when creates new memslot, there is no guarantee for new
      memslot not overlap with private memslots. This can be triggered by the
      following program:
      
         #include <fcntl.h>
         #include <pthread.h>
         #include <setjmp.h>
         #include <signal.h>
         #include <stddef.h>
         #include <stdint.h>
         #include <stdio.h>
         #include <stdlib.h>
         #include <string.h>
         #include <sys/ioctl.h>
         #include <sys/stat.h>
         #include <sys/syscall.h>
         #include <sys/types.h>
         #include <unistd.h>
         #include <linux/kvm.h>
      
         long r[16];
      
         int main()
         {
      	void *p = valloc(0x4000);
      
      	r[2] = open("/dev/kvm", 0);
      	r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);
      
      	uint64_t addr = 0xf000;
      	ioctl(r[3], KVM_SET_IDENTITY_MAP_ADDR, &addr);
      	r[6] = ioctl(r[3], KVM_CREATE_VCPU, 0x0ul);
      	ioctl(r[3], KVM_SET_TSS_ADDR, 0x0ul);
      	ioctl(r[6], KVM_RUN, 0);
      	ioctl(r[6], KVM_RUN, 0);
      
      	struct kvm_userspace_memory_region mr = {
      		.slot = 0,
      		.flags = KVM_MEM_LOG_DIRTY_PAGES,
      		.guest_phys_addr = 0xf000,
      		.memory_size = 0x4000,
      		.userspace_addr = (uintptr_t) p
      	};
      	ioctl(r[3], KVM_SET_USER_MEMORY_REGION, &mr);
      	return 0;
         }
      
      This patch fixes the bug by not adding a new memslot even if it
      overlaps with private memslots.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Eric Biggers <ebiggers3@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      7135aaf3
    • Wanpeng Li's avatar
      KVM: X86: Fix SMRAM accessing even if VM is shutdown · 1ebf9ab6
      Wanpeng Li authored
      commit 95e057e2 upstream.
      
      Reported by syzkaller:
      
         WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
         CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
         RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
         Call Trace:
          vmx_handle_exit+0xbd/0xe20 [kvm_intel]
          kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
          kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
          do_vfs_ioctl+0xa4/0x6a0
          SyS_ioctl+0x79/0x90
          entry_SYSCALL_64_fastpath+0x25/0x9c
      
      The testcase creates a first thread to issue KVM_SMI ioctl, and then creates
      a second thread to mmap and operate on the same vCPU.  This triggers a race
      condition when running the testcase with multiple threads. Sometimes one thread
      exits with a triple fault while another thread mmaps and operates on the same
      vCPU.  Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler
      results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE
      in kvm_handle_bad_page(), which will go on to cause an emulation failure and an
      exit with KVM_EXIT_INTERNAL_ERROR.
      
      Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58@syzkaller.appspotmail.com
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ebf9ab6
    • Paolo Bonzini's avatar
      KVM: x86: extend usage of RET_MMIO_PF_* constants · f925158c
      Paolo Bonzini authored
      commit 9b8ebbdb upstream.
      
      The x86 MMU if full of code that returns 0 and 1 for retry/emulate.  Use
      the existing RET_MMIO_PF_RETRY/RET_MMIO_PF_EMULATE enum, renaming it to
      drop the MMIO part.
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Cc: Thomas Backlund <tmb@mageia.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f925158c
    • Arnd Bergmann's avatar
      ARM: kvm: fix building with gcc-8 · e0c7b2b1
      Arnd Bergmann authored
      commit 67870eb1 upstream.
      
      In banked-sr.c, we use a top-level '__asm__(".arch_extension virt")'
      statement to allow compilation of a multi-CPU kernel for ARMv6
      and older ARMv7-A that don't normally support access to the banked
      registers.
      
      This is considered to be a programming error by the gcc developers
      and will no longer work in gcc-8, where we now get a build error:
      
      /tmp/cc4Qy7GR.s:34: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_usr'
      /tmp/cc4Qy7GR.s:41: Error: Banked registers are not available with this architecture. -- `mrs r3,ELR_hyp'
      /tmp/cc4Qy7GR.s:55: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_svc'
      /tmp/cc4Qy7GR.s:62: Error: Banked registers are not available with this architecture. -- `mrs r3,LR_svc'
      /tmp/cc4Qy7GR.s:69: Error: Banked registers are not available with this architecture. -- `mrs r3,SPSR_svc'
      /tmp/cc4Qy7GR.s:76: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_abt'
      
      Passign the '-march-armv7ve' flag to gcc works, and is ok here, because
      we know the functions won't ever be called on pre-ARMv7VE machines.
      Unfortunately, older compiler versions (4.8 and earlier) do not understand
      that flag, so we still need to keep the asm around.
      
      Backporting to stable kernels (4.6+) is needed to allow those to be built
      with future compilers as well.
      
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84129
      Fixes: 33280b4c ("ARM: KVM: Add banked registers save/restore")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0c7b2b1
    • Ulf Magnusson's avatar
      ARM: mvebu: Fix broken PL310_ERRATA_753970 selects · fc6be8bc
      Ulf Magnusson authored
      commit 8aa36a8d upstream.
      
      The MACH_ARMADA_375 and MACH_ARMADA_38X boards select ARM_ERRATA_753970,
      but it was renamed to PL310_ERRATA_753970 by commit fa0ce403 ("ARM:
      7162/1: errata: tidy up Kconfig options for PL310 errata workarounds").
      
      Fix the selects to use the new name.
      
      Discovered with the
      https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined.py
      script.
      Fixes: fa0ce403 ("ARM: 7162/1: errata: tidy up Kconfig options for
      PL310 errata workarounds"
      cc: stable@vger.kernel.org
      Signed-off-by: default avatarUlf Magnusson <ulfalizer@gmail.com>
      Signed-off-by: default avatarGregory CLEMENT <gregory.clement@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc6be8bc
    • Daniel Schultz's avatar
      ARM: dts: rockchip: Remove 1.8 GHz operation point from phycore som · 4c02f016
      Daniel Schultz authored
      commit 5ce0bad4 upstream.
      
      Rockchip recommends to run the CPU cores only with operations points of
      1.6 GHz or lower.
      
      Removed the cpu0 node with too high operation points and use the default
      values instead.
      
      Fixes: 903d31e3 ("ARM: dts: rockchip: Add support for phyCORE-RK3288 SoM")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDaniel Schultz <d.schultz@phytec.de>
      Signed-off-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c02f016
    • Arnd Bergmann's avatar
      ARM: orion: fix orion_ge00_switch_board_info initialization · 8dc356e5
      Arnd Bergmann authored
      commit 8337d083 upstream.
      
      A section type mismatch warning shows up when building with LTO,
      since orion_ge00_mvmdio_bus_name was put in __initconst but not marked
      const itself:
      
      include/linux/of.h: In function 'spear_setup_of_timer':
      arch/arm/mach-spear/time.c:207:34: error: 'timer_of_match' causes a section type conflict with 'orion_ge00_mvmdio_bus_name'
       static const struct of_device_id timer_of_match[] __initconst = {
                                        ^
      arch/arm/plat-orion/common.c:475:32: note: 'orion_ge00_mvmdio_bus_name' was declared here
       static __initconst const char *orion_ge00_mvmdio_bus_name = "orion-mii";
                                      ^
      
      As pointed out by Andrew Lunn, it should in fact be 'const' but not
      '__initconst' because the string is never copied but may be accessed
      after the init sections are freed. To fix that, I get rid of the
      extra symbol and rewrite the initialization in a simpler way that
      assigns both the bus_id and modalias statically.
      
      I spotted another theoretical bug in the same place, where d->netdev[i]
      may be an out of bounds access, this can be fixed by moving the device
      assignment into the loop.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8dc356e5