1. 05 Jul, 2017 15 commits
    • Serhey Popovych's avatar
      ipv6: Do not leak throw route references · 640a09c6
      Serhey Popovych authored
      
      [ Upstream commit 07f61557 ]
      
      While commit 73ba57bf ("ipv6: fix backtracking for throw routes")
      does good job on error propagation to the fib_rules_lookup()
      in fib rules core framework that also corrects throw routes
      handling, it does not solve route reference leakage problem
      happened when we return -EAGAIN to the fib_rules_lookup()
      and leave routing table entry referenced in arg->result.
      
      If rule with matched throw route isn't last matched in the
      list we overwrite arg->result losing reference on throw
      route stored previously forever.
      
      We also partially revert commit ab997ad4 ("ipv6: fix the
      incorrect return value of throw route") since we never return
      routing table entry with dst.error == -EAGAIN when
      CONFIG_IPV6_MULTIPLE_TABLES is on. Also there is no point
      to check for RTF_REJECT flag since it is always set throw
      route.
      
      Fixes: 73ba57bf ("ipv6: fix backtracking for throw routes")
      Signed-off-by: default avatarSerhey Popovych <serhe.popovych@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      640a09c6
    • Bert Kenward's avatar
      sfc: provide dummy definitions of vswitch functions · 9de17701
      Bert Kenward authored
      
      efx_probe_all() calls efx->type->vswitching_probe during probe. For
      SFC4000 (Falcon) NICs this function is not defined, leading to a BUG
      with the top of the call stack similar to:
        ? efx_pci_probe_main+0x29a/0x830
        efx_pci_probe+0x7d3/0xe70
      
      vswitching_restore and vswitching_remove also need to be defined.
      
      Fixed in mainline by:
      commit 5a6681e2 ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver")
      
      Fixes: 6d8aaaf6 ("sfc: create VEB vswitch and vport above default firmware setup")
      Signed-off-by: default avatarBert Kenward <bkenward@solarflare.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9de17701
    • Gao Feng's avatar
      net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev · 1f8bb605
      Gao Feng authored
      
      [ Upstream commit 9745e362 ]
      
      The register_vlan_device would invoke free_netdev directly, when
      register_vlan_dev failed. It would trigger the BUG_ON in free_netdev
      if the dev was already registered. In this case, the netdev would be
      freed in netdev_run_todo later.
      
      So add one condition check now. Only when dev is not registered, then
      free it directly.
      
      The following is the part coredump when netdev_upper_dev_link failed
      in register_vlan_dev. I removed the lines which are too long.
      
      [  411.237457] ------------[ cut here ]------------
      [  411.237458] kernel BUG at net/core/dev.c:7998!
      [  411.237484] invalid opcode: 0000 [#1] SMP
      [  411.237705]  [last unloaded: 8021q]
      [  411.237718] CPU: 1 PID: 12845 Comm: vconfig Tainted: G            E   4.12.0-rc5+ #6
      [  411.237737] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
      [  411.237764] task: ffff9cbeb6685580 task.stack: ffffa7d2807d8000
      [  411.237782] RIP: 0010:free_netdev+0x116/0x120
      [  411.237794] RSP: 0018:ffffa7d2807dbdb0 EFLAGS: 00010297
      [  411.237808] RAX: 0000000000000002 RBX: ffff9cbeb6ba8fd8 RCX: 0000000000001878
      [  411.237826] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000000
      [  411.237844] RBP: ffffa7d2807dbdc8 R08: 0002986100029841 R09: 0002982100029801
      [  411.237861] R10: 0004000100029980 R11: 0004000100029980 R12: ffff9cbeb6ba9000
      [  411.238761] R13: ffff9cbeb6ba9060 R14: ffff9cbe60f1a000 R15: ffff9cbeb6ba9000
      [  411.239518] FS:  00007fb690d81700(0000) GS:ffff9cbebb640000(0000) knlGS:0000000000000000
      [  411.239949] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  411.240454] CR2: 00007f7115624000 CR3: 0000000077cdf000 CR4: 00000000003406e0
      [  411.240936] Call Trace:
      [  411.241462]  vlan_ioctl_handler+0x3f1/0x400 [8021q]
      [  411.241910]  sock_ioctl+0x18b/0x2c0
      [  411.242394]  do_vfs_ioctl+0xa1/0x5d0
      [  411.242853]  ? sock_alloc_file+0xa6/0x130
      [  411.243465]  SyS_ioctl+0x79/0x90
      [  411.243900]  entry_SYSCALL_64_fastpath+0x1e/0xa9
      [  411.244425] RIP: 0033:0x7fb69089a357
      [  411.244863] RSP: 002b:00007ffcd04e0fc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
      [  411.245445] RAX: ffffffffffffffda RBX: 00007ffcd04e2884 RCX: 00007fb69089a357
      [  411.245903] RDX: 00007ffcd04e0fd0 RSI: 0000000000008983 RDI: 0000000000000003
      [  411.246527] RBP: 00007ffcd04e0fd0 R08: 0000000000000000 R09: 1999999999999999
      [  411.246976] R10: 000000000000053f R11: 0000000000000202 R12: 0000000000000004
      [  411.247414] R13: 00007ffcd04e1128 R14: 00007ffcd04e2888 R15: 0000000000000001
      [  411.249129] RIP: free_netdev+0x116/0x120 RSP: ffffa7d2807dbdb0
      Signed-off-by: default avatarGao Feng <gfree.wind@vip.163.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f8bb605
    • Wei Wang's avatar
      decnet: always not take dst->__refcnt when inserting dst into hash table · f50f2e0c
      Wei Wang authored
      
      [ Upstream commit 76371d2e ]
      
      In the existing dn_route.c code, dn_route_output_slow() takes
      dst->__refcnt before calling dn_insert_route() while dn_route_input_slow()
      does not take dst->__refcnt before calling dn_insert_route().
      This makes the whole routing code very buggy.
      In dn_dst_check_expire(), dnrt_free() is called when rt expires. This
      makes the routes inserted by dn_route_output_slow() not able to be
      freed as the refcnt is not released.
      In dn_dst_gc(), dnrt_drop() is called to release rt which could
      potentially cause the dst->__refcnt to be dropped to -1.
      In dn_run_flush(), dst_free() is called to release all the dst. Again,
      it makes the dst inserted by dn_route_output_slow() not able to be
      released and also, it does not wait on the rcu and could potentially
      cause crash in the path where other users still refer to this dst.
      
      This patch makes sure both input and output path do not take
      dst->__refcnt before calling dn_insert_route() and also makes sure
      dnrt_free()/dst_free() is called when removing dst from the hash table.
      The only difference between those 2 calls is that dnrt_free() waits on
      the rcu while dst_free() does not.
      Signed-off-by: default avatarWei Wang <weiwan@google.com>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f50f2e0c
    • Eli Cohen's avatar
      net/mlx5: Wait for FW readiness before initializing command interface · 93911697
      Eli Cohen authored
      
      [ Upstream commit 6c780a02 ]
      
      Before attempting to initialize the command interface we must wait till
      the fw_initializing bit is clear.
      
      If we fail to meet this condition the hardware will drop our
      configuration, specifically the descriptors page address.  This scenario
      can happen when the firmware is still executing an FLR flow and did not
      finish yet so the driver needs to wait for that to finish.
      
      Fixes: e3297246 ('net/mlx5_core: Wait for FW readiness on startup')
      Signed-off-by: default avatarEli Cohen <eli@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93911697
    • Xin Long's avatar
      ipv6: fix calling in6_ifa_hold incorrectly for dad work · 0d1effe9
      Xin Long authored
      
      [ Upstream commit f8a894b2 ]
      
      Now when starting the dad work in addrconf_mod_dad_work, if the dad work
      is idle and queued, it needs to hold ifa.
      
      The problem is there's one gap in [1], during which if the pending dad work
      is removed elsewhere. It will miss to hold ifa, but the dad word is still
      idea and queue.
      
              if (!delayed_work_pending(&ifp->dad_work))
                      in6_ifa_hold(ifp);
                          <--------------[1]
              mod_delayed_work(addrconf_wq, &ifp->dad_work, delay);
      
      An use-after-free issue can be caused by this.
      
      Chen Wei found this issue when WARN_ON(!hlist_unhashed(&ifp->addr_lst)) in
      net6_ifa_finish_destroy was hit because of it.
      
      As Hannes' suggestion, this patch is to fix it by holding ifa first in
      addrconf_mod_dad_work, then calling mod_delayed_work and putting ifa if
      the dad_work is already in queue.
      
      Note that this patch did not choose to fix it with:
      
        if (!mod_delayed_work(delay))
                in6_ifa_hold(ifp);
      
      As with it, when delay == 0, dad_work would be scheduled immediately, all
      addrconf_mod_dad_work(0) callings had to be moved under ifp->lock.
      Reported-by: default avatarWei Chen <weichen@redhat.com>
      Suggested-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0d1effe9
    • WANG Cong's avatar
      igmp: add a missing spin_lock_init() · 4feb6121
      WANG Cong authored
      
      [ Upstream commit b4846fc3 ]
      
      Andrey reported a lockdep warning on non-initialized
      spinlock:
      
       INFO: trying to register non-static key.
       the code is fine but needs lockdep annotation.
       turning off the locking correctness validator.
       CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       Call Trace:
        __dump_stack lib/dump_stack.c:16
        dump_stack+0x292/0x395 lib/dump_stack.c:52
        register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755
        ? 0xffffffffa0000000
        __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255
        lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
        __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135
        _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175
        spin_lock_bh ./include/linux/spinlock.h:304
        ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076
        igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194
        ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736
      
      We miss a spin_lock_init() in igmpv3_add_delrec(), probably
      because previously we never use it on this code path. Since
      we already unlink it from the global mc_tomb list, it is
      probably safe not to acquire this spinlock here. It does not
      harm to have it although, to avoid conditional locking.
      
      Fixes: c38b7d32 ("igmp: acquire pmc lock for ip_mc_clear_src()")
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4feb6121
    • WANG Cong's avatar
      igmp: acquire pmc lock for ip_mc_clear_src() · ee8d5f9f
      WANG Cong authored
      
      [ Upstream commit c38b7d32 ]
      
      Andrey reported a use-after-free in add_grec():
      
              for (psf = *psf_list; psf; psf = psf_next) {
      		...
                      psf_next = psf->sf_next;
      
      where the struct ip_sf_list's were already freed by:
      
       kfree+0xe8/0x2b0 mm/slub.c:3882
       ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
       ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
       ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
       inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
       sock_release+0x8d/0x1e0 net/socket.c:597
       sock_close+0x16/0x20 net/socket.c:1072
      
      This happens because we don't hold pmc->lock in ip_mc_clear_src()
      and a parallel mr_ifc_timer timer could jump in and access them.
      
      The RCU lock is there but it is merely for pmc itself, this
      spinlock could actually ensure we don't access them in parallel.
      
      Thanks to Eric and Long for discussion on this bug.
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ee8d5f9f
    • Jia-Ju Bai's avatar
      net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx · 7de53eed
      Jia-Ju Bai authored
      
      [ Upstream commit f146e872 ]
      
      The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the
      function call path is:
      cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock)
        cfctrl_linkdown_req
          cfpkt_create
            cfpkt_create_pfx
              alloc_skb(GFP_KERNEL) --> may sleep
      cfserl_receive (acquire the lock by rcu_read_lock)
        cfpkt_split
          cfpkt_create_pfx
            alloc_skb(GFP_KERNEL) --> may sleep
      
      There is "in_interrupt" in cfpkt_create_pfx to decide use "GFP_KERNEL" or
      "GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function
      is called under a rcu read lock, instead in interrupt.
      
      To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx.
      Signed-off-by: default avatarJia-Ju Bai <baijiaju1990@163.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7de53eed
    • Krister Johansen's avatar
      Fix an intermittent pr_emerg warning about lo becoming free. · 030a77d2
      Krister Johansen authored
      
      [ Upstream commit f186ce61 ]
      
      It looks like this:
      
      Message from syslogd@flamingo at Apr 26 00:45:00 ...
       kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4
      
      They seem to coincide with net namespace teardown.
      
      The message is emitted by netdev_wait_allrefs().
      
      Forced a kdump in netdev_run_todo, but found that the refcount on the lo
      device was already 0 at the time we got to the panic.
      
      Used bcc to check the blocking in netdev_run_todo.  The only places
      where we're off cpu there are in the rcu_barrier() and msleep() calls.
      That behavior is expected.  The msleep time coincides with the amount of
      time we spend waiting for the refcount to reach zero; the rcu_barrier()
      wait times are not excessive.
      
      After looking through the list of callbacks that the netdevice notifiers
      invoke in this path, it appears that the dst_dev_event is the most
      interesting.  The dst_ifdown path places a hold on the loopback_dev as
      part of releasing the dev associated with the original dst cache entry.
      Most of our notifier callbacks are straight-forward, but this one a)
      looks complex, and b) places a hold on the network interface in
      question.
      
      I constructed a new bcc script that watches various events in the
      liftime of a dst cache entry.  Note that dst_ifdown will take a hold on
      the loopback device until the invalidated dst entry gets freed.
      
      [      __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183
          __dst_free
          rcu_nocb_kthread
          kthread
          ret_from_fork
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      030a77d2
    • Mateusz Jurczyk's avatar
      af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers · 0fc0fad0
      Mateusz Jurczyk authored
      
      [ Upstream commit defbcf2d ]
      
      Verify that the caller-provided sockaddr structure is large enough to
      contain the sa_family field, before accessing it in bind() and connect()
      handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
      size of the corresponding memory region, very short sockaddrs (zero or
      one byte long) result in operating on uninitialized memory while
      referencing .sa_family.
      Signed-off-by: default avatarMateusz Jurczyk <mjurczyk@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0fc0fad0
    • Mintz, Yuval's avatar
      net: Zero ifla_vf_info in rtnl_fill_vfinfo() · e2c3ee00
      Mintz, Yuval authored
      
      [ Upstream commit 0eed9cf5 ]
      
      Some of the structure's fields are not initialized by the
      rtnetlink. If driver doesn't set those in ndo_get_vf_config(),
      they'd leak memory to user.
      Signed-off-by: default avatarYuval Mintz <Yuval.Mintz@cavium.com>
      CC: Michal Schmidt <mschmidt@redhat.com>
      Reviewed-by: default avatarGreg Rose <gvrose8192@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e2c3ee00
    • Mateusz Jurczyk's avatar
      decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb · dedb088a
      Mateusz Jurczyk authored
      
      [ Upstream commit dd0da17b ]
      
      Verify that the length of the socket buffer is sufficient to cover the
      nlmsghdr structure before accessing the nlh->nlmsg_len field for further
      input sanitization. If the client only supplies 1-3 bytes of data in
      sk_buff, then nlh->nlmsg_len remains partially uninitialized and
      contains leftover memory from the corresponding kernel allocation.
      Operating on such data may result in indeterminate evaluation of the
      nlmsg_len < sizeof(*nlh) expression.
      
      The bug was discovered by a runtime instrumentation designed to detect
      use of uninitialized memory in the kernel. The patch prevents this and
      other similar tools (e.g. KMSAN) from flagging this behavior in the future.
      Signed-off-by: default avatarMateusz Jurczyk <mjurczyk@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dedb088a
    • Alexander Potapenko's avatar
      net: don't call strlen on non-terminated string in dev_set_alias() · e79948e2
      Alexander Potapenko authored
      
      [ Upstream commit c28294b9 ]
      
      KMSAN reported a use of uninitialized memory in dev_set_alias(),
      which was caused by calling strlcpy() (which in turn called strlen())
      on the user-supplied non-terminated string.
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e79948e2
    • Willem de Bruijn's avatar
      ipv6: release dst on error in ip6_dst_lookup_tail · d68a4e38
      Willem de Bruijn authored
      commit 00ea1cee upstream.
      
      If ip6_dst_lookup_tail has acquired a dst and fails the IPv4-mapped
      check, release the dst before returning an error.
      
      Fixes: ec5e3b0a ("ipv6: Inhibit IPv4-mapped src address on the wire.")
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d68a4e38
  2. 29 Jun, 2017 25 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.4.75 · 6ee496d7
      Greg Kroah-Hartman authored
      6ee496d7
    • Guilherme G. Piccoli's avatar
      nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too · cb7be08d
      Guilherme G. Piccoli authored
      commit b5a10c5f upstream.
      
      Commit 54adc010 ("nvme/quirk: Add a delay before checking for adapter
      readiness") introduced a quirk to adapters that cannot read the bit
      NVME_CSTS_RDY right after register NVME_REG_CC is set; these adapters
      need a delay or else the action of reading the bit NVME_CSTS_RDY could
      somehow corrupt adapter's registers state and it never recovers.
      
      When this quirk was added, we checked ctrl->tagset in order to avoid
      quirking in probe time, supposing we would never require such delay
      during probe. Well, it was too optimistic; we in fact need this quirk
      at probe time in some cases, like after a kexec.
      
      In some experiments, after abnormal shutdown of machine (aka power cord
      unplug), we booted into our bootloader in Power, which is a Linux kernel,
      and kexec'ed into another distro. If this kexec is too quick, we end up
      reaching the probe of NVMe adapter in that distro when adapter is in
      bad state (not fully initialized on our bootloader). What happens next
      is that nvme_wait_ready() is unable to complete, except if the quirk is
      enabled.
      
      So, this patch removes the original ctrl->tagset verification in order
      to enable the quirk even on probe time.
      
      Fixes: 54adc010 ("nvme/quirk: Add a delay before checking for adapter readiness")
      Reported-by: default avatarAndrew Byrne <byrneadw@ie.ibm.com>
      Reported-by: default avatarJaime A. H. Gomez <jahgomez@mx1.ibm.com>
      Reported-by: default avatarZachary D. Myers <zdmyers@us.ibm.com>
      Signed-off-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Acked-by: default avatarJeffrey Lien <Jeff.Lien@wdc.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      [mauricfo: backport to v4.4.70 without nvme quirk handling & nvme_ctrl]
      Signed-off-by: default avatarMauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
      Tested-by: default avatarNarasimhan Vaidyanathan <vnarasimhan@in.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cb7be08d
    • Guilherme G. Piccoli's avatar
      nvme/quirk: Add a delay before checking for adapter readiness · bddc8027
      Guilherme G. Piccoli authored
      commit 54adc010 upstream.
      
      When disabling the controller, the specification says the register
      NVME_REG_CC should be written and then driver needs to wait the
      adapter to be ready, which is checked by reading another register
      bit (NVME_CSTS_RDY). There's a timeout validation in this checking,
      so in case this timeout is reached the driver gives up and removes
      the adapter from the system.
      
      After a firmware activation procedure, the PCI_DEVICE(0x1c58, 0x0003)
      (HGST adapter) end up being removed if we issue a reset_controller,
      because driver keeps verifying the NVME_REG_CSTS until the timeout is
      reached. This patch adds a necessary quirk for this adapter, by
      introducing a delay before nvme_wait_ready(), so the reset procedure
      is able to be completed. This quirk is needed because just increasing
      the timeout is not enough in case of this adapter - the driver must
      wait before start reading NVME_REG_CSTS register on this specific
      device.
      Signed-off-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      [mauricfo: backport to v4.4.70 without nvme quirk handling & nvme_ctrl]
      Signed-off-by: default avatarMauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
      Tested-by: default avatarNarasimhan Vaidyanathan <vnarasimhan@in.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bddc8027
    • Russell King's avatar
      net: phy: fix marvell phy status reading · e5f87c73
      Russell King authored
      commit 898805e0 upstream.
      
      The Marvell driver incorrectly provides phydev->lp_advertising as the
      logical and of the link partner's advert and our advert.  This is
      incorrect - this field is supposed to store the link parter's unmodified
      advertisment.
      
      This allows ethtool to report the correct link partner auto-negotiation
      status.
      
      Fixes: be937f1f ("Marvell PHY m88e1111 driver fix")
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e5f87c73
    • Yendapally Reddy Dhananjaya Reddy's avatar
      net: phy: Initialize mdio clock at probe function · 9b54821d
      Yendapally Reddy Dhananjaya Reddy authored
      commit bb1a6197 upstream.
      
      USB PHYs need the MDIO clock divisor enabled earlier to work.
      Initialize mdio clock divisor in probe function. The ext bus
      bit available in the same register will be used by mdio mux
      to enable external mdio.
      Signed-off-by: default avatarYendapally Reddy Dhananjaya Reddy <yendapally.reddy@broadcom.com>
      Fixes: ddc24ae1 ("net: phy: Broadcom iProc MDIO bus driver")
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarJon Mason <jon.mason@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9b54821d
    • William Wu's avatar
      usb: gadget: f_fs: avoid out of bounds access on comp_desc · 889caad4
      William Wu authored
      commit b7f73850 upstream.
      
      Companion descriptor is only used for SuperSpeed endpoints,
      if the endpoints are HighSpeed or FullSpeed, the Companion
      descriptor will not allocated, so we can only access it if
      gadget is SuperSpeed.
      
      I can reproduce this issue on Rockchip platform rk3368 SoC
      which supports USB 2.0, and use functionfs for ADB. Kernel
      build with CONFIG_KASAN=y and CONFIG_SLUB_DEBUG=y report
      the following BUG:
      
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in ffs_func_set_alt+0x224/0x3a0 at addr ffffffc0601f6509
      Read of size 1 by task swapper/0/0
      ============================================================================
      BUG kmalloc-256 (Not tainted): kasan: bad access detected
      ----------------------------------------------------------------------------
      
      Disabling lock debugging due to kernel taint
      INFO: Allocated in ffs_func_bind+0x52c/0x99c age=1275 cpu=0 pid=1
      alloc_debug_processing+0x128/0x17c
      ___slab_alloc.constprop.58+0x50c/0x610
      __slab_alloc.isra.55.constprop.57+0x24/0x34
      __kmalloc+0xe0/0x250
      ffs_func_bind+0x52c/0x99c
      usb_add_function+0xd8/0x1d4
      configfs_composite_bind+0x48c/0x570
      udc_bind_to_driver+0x6c/0x170
      usb_udc_attach_driver+0xa4/0xd0
      gadget_dev_desc_UDC_store+0xcc/0x118
      configfs_write_file+0x1a0/0x1f8
      __vfs_write+0x64/0x174
      vfs_write+0xe4/0x200
      SyS_write+0x68/0xc8
      el0_svc_naked+0x24/0x28
      INFO: Freed in inode_doinit_with_dentry+0x3f0/0x7c4 age=1275 cpu=7 pid=247
      ...
      Call trace:
      [<ffffff900808aab4>] dump_backtrace+0x0/0x230
      [<ffffff900808acf8>] show_stack+0x14/0x1c
      [<ffffff90084ad420>] dump_stack+0xa0/0xc8
      [<ffffff90082157cc>] print_trailer+0x188/0x198
      [<ffffff9008215948>] object_err+0x3c/0x4c
      [<ffffff900821b5ac>] kasan_report+0x324/0x4dc
      [<ffffff900821aa38>] __asan_load1+0x24/0x50
      [<ffffff90089eb750>] ffs_func_set_alt+0x224/0x3a0
      [<ffffff90089d3760>] composite_setup+0xdcc/0x1ac8
      [<ffffff90089d7394>] android_setup+0x124/0x1a0
      [<ffffff90089acd18>] _setup+0x54/0x74
      [<ffffff90089b6b98>] handle_ep0+0x3288/0x4390
      [<ffffff90089b9b44>] dwc_otg_pcd_handle_out_ep_intr+0x14dc/0x2ae4
      [<ffffff90089be85c>] dwc_otg_pcd_handle_intr+0x1ec/0x298
      [<ffffff90089ad680>] dwc_otg_pcd_irq+0x10/0x20
      [<ffffff9008116328>] handle_irq_event_percpu+0x124/0x3ac
      [<ffffff9008116610>] handle_irq_event+0x60/0xa0
      [<ffffff900811af30>] handle_fasteoi_irq+0x10c/0x1d4
      [<ffffff9008115568>] generic_handle_irq+0x30/0x40
      [<ffffff90081159b4>] __handle_domain_irq+0xac/0xdc
      [<ffffff9008080e9c>] gic_handle_irq+0x64/0xa4
      ...
      Memory state around the buggy address:
        ffffffc0601f6400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ffffffc0601f6480: 00 00 00 00 00 00 00 00 00 00 06 fc fc fc fc fc
       >ffffffc0601f6500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                             ^
        ffffffc0601f6580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
        ffffffc0601f6600: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
      ==================================================================
      Signed-off-by: default avatarWilliam Wu <william.wu@rock-chips.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Cc: Jerry Zhang <zhangjerry@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      889caad4
    • Michael Ellerman's avatar
      powerpc/slb: Force a full SLB flush when we insert for a bad EA · db7130d6
      Michael Ellerman authored
      [Note this patch is not upstream. The bug fix was fixed differently in
      upstream prior to the bug being identified.]
      
      The SLB miss handler calls slb_allocate_realmode() in order to create an
      SLB entry for the faulting address. At the very start of that function
      we check that the faulting Effective Address (EA) is less than
      PGTABLE_RANGE (ignoring the region), ie. is it an address which could
      possibly fit in the virtual address space.
      
      For an EA which fails that test, we branch out of line (to label 8), but
      we still go on to create an SLB entry for the address. The SLB entry we
      create has a VSID of 0, which means it will never match anything in the
      hash table and so can't actually translate to a physical address.
      
      However that SLB entry will be inserted in the SLB, and so needs to be
      managed properly like any other SLB entry. In particular we need to
      insert the SLB entry in the SLB cache, so that it will be flushed when
      the process is descheduled.
      
      And that is where the bugs begin. The first bug is that slb_finish_load()
      uses cr7 to decide if it should insert the SLB entry into the SLB cache.
      When we come from the invalid EA case we don't set cr7, it just has some
      junk value from userspace. So we may or may not insert the SLB entry in
      the SLB cache. If we fail to insert it, we may then incorrectly leave it
      in the SLB when the process is descheduled.
      
      The second bug is that even if we do happen to add the entry to the SLB
      cache, we do not have enough bits in the SLB cache to remember the full
      ESID value for very large EAs.
      
      For example if a process branches to 0x788c545a18000000, that results in
      a 256MB SLB entry with an ESID of 0x788c545a1. But each entry in the SLB
      cache is only 32-bits, meaning we truncate the ESID to 0x88c545a1. This
      has the same effect as the first bug, we incorrectly leave the SLB entry
      in the SLB when the process is descheduled.
      
      When a process accesses an invalid EA it results in a SEGV signal being
      sent to the process, which typically results in the process being
      killed. Process death isn't instantaneous however, the process may catch
      the SEGV signal and continue somehow, or the kernel may start writing a
      core dump for the process, either of which means it's possible for the
      process to be preempted while its processing the SEGV but before it's
      been killed.
      
      If that happens, when the process is scheduled back onto the CPU we will
      allocate a new SLB entry for the NIP, which will insert a second entry
      into the SLB for the bad EA. Because we never flushed the original
      entry, due to either bug one or two, we now have two SLB entries that
      match the same EA.
      
      If another access is made to that EA, either by the process continuing
      after catching the SEGV, or by a second process accessing the same bad
      EA on the same CPU, we will trigger an SLB multi-hit machine check
      exception. This has been observed happening in the wild.
      
      The fix is when we hit the invalid EA case, we mark the SLB cache as
      being full. This causes us to not insert the truncated ESID into the SLB
      cache, and means when the process is switched out we will flush the
      entire SLB. Note that this works both for the original fault and for a
      subsequent call to slb_allocate_realmode() from switch_slb().
      
      Because we mark the SLB cache as full, it doesn't really matter what
      value is in cr7, but rather than leaving it as something random we set
      it to indicate the address was a kernel address. That also skips the
      attempt to insert it in the SLB cache which is a nice side effect.
      
      Another way to fix the bug would be to make the entries in the SLB cache
      wider, so that we don't truncate the ESID. However this would be a more
      intrusive change as it alters the size and layout of the paca.
      
      This bug was fixed in upstream by commit f0f558b1 ("powerpc/mm:
      Preserve CFAR value on SLB miss caused by access to bogus address"),
      which changed the way we handle a bad EA entirely removing this bug in
      the process.
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db7130d6
    • Joël Esponde's avatar
      mtd: spi-nor: fix spansion quad enable · 8fcb215c
      Joël Esponde authored
      commit 807c1625 upstream.
      
      With the S25FL127S nor flash part, each writing to the configuration
      register takes hundreds of ms. During that  time, no more accesses to
      the flash should be done (even reads).
      
      This commit adds a wait loop after the register writing until the flash
      finishes its work.
      
      This issue could make rootfs mounting fail when the latter was done too
      much closely to this quad enable bit setting step. And in this case, a
      driver as UBIFS may try to recover the filesystem and may broke it
      completely.
      Signed-off-by: default avatarJoël Esponde <joel.esponde@honeywell.com>
      Signed-off-by: default avatarCyrille Pitchen <cyrille.pitchen@atmel.com>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8fcb215c
    • Tobias Wolf's avatar
      of: Add check to of_scan_flat_dt() before accessing initial_boot_params · 7dfea167
      Tobias Wolf authored
      commit 3ec75441 upstream.
      
      An empty __dtb_start to __dtb_end section might result in
      initial_boot_params being null for arch/mips/ralink. This showed that the
      boot process hangs indefinitely in of_scan_flat_dt().
      Signed-off-by: default avatarTobias Wolf <dev-NTEO@vplace.de>
      Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/14605/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7dfea167
    • David Howells's avatar
      rxrpc: Fix several cases where a padded len isn't checked in ticket decode · eab38dfd
      David Howells authored
      commit 5f2f9765 upstream.
      
      This fixes CVE-2017-7482.
      
      When a kerberos 5 ticket is being decoded so that it can be loaded into an
      rxrpc-type key, there are several places in which the length of a
      variable-length field is checked to make sure that it's not going to
      overrun the available data - but the data is padded to the nearest
      four-byte boundary and the code doesn't check for this extra.  This could
      lead to the size-remaining variable wrapping and the data pointer going
      over the end of the buffer.
      
      Fix this by making the various variable-length data checks use the padded
      length.
      Reported-by: default avatar石磊 <shilei-c@360.cn>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Reviewed-by: default avatarMarc Dionne <marc.c.dionne@auristor.com>
      Reviewed-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eab38dfd
    • Johan Hovold's avatar
      USB: usbip: fix nonconforming hub descriptor · 800d7454
      Johan Hovold authored
      commit ec963b41 upstream.
      
      Fix up the root-hub descriptor to accommodate the variable-length
      DeviceRemovable and PortPwrCtrlMask fields, while marking all ports as
      removable (and leaving the reserved bit zero unset).
      
      Also add a build-time constraint on VHCI_HC_PORTS which must never be
      greater than USB_MAXCHILDREN (but this was only enforced through a
      KConfig constant).
      
      This specifically fixes the descriptor layout whenever VHCI_HC_PORTS is
      greater than seven (default is 8).
      
      Fixes: 04679b34 ("Staging: USB/IP: add client driver")
      Cc: Takahiro Hirofuchi <hirofuchi@users.sourceforge.net>
      Cc: Valentina Manea <valentina.manea.m@gmail.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Acked-by: default avatarShuah Khan <shuahkh@osg.samsung.com>
      [ johan: backport to v4.4, which uses VHCI_NPORTS ]
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      800d7454
    • Alex Deucher's avatar
      drm/amdgpu: adjust default display clock · 525e496a
      Alex Deucher authored
      commit 52b482b0 upstream.
      
      Increase the default display clock on newer asics to
      accomodate some high res modes with really high refresh
      rates.
      
      bug: https://bugs.freedesktop.org/show_bug.cgi?id=93826Acked-by: default avatarChunming Zhou <david1.zhou@amd.com>
      Acked-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      525e496a
    • Alex Deucher's avatar
      drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating · 52652784
      Alex Deucher authored
      commit 05b4017b upstream.
      
      We were using the wrong structure which lead to an overflow
      on some boards.
      
      bug: https://bugs.freedesktop.org/show_bug.cgi?id=101387Acked-by: default avatarChunming Zhou <david1.zhou@amd.com>
      Acked-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      52652784
    • Alex Deucher's avatar
    • Alex Deucher's avatar
    • Nicholas Bellinger's avatar
      iscsi-target: Reject immediate data underflow larger than SCSI transfer length · fe8003da
      Nicholas Bellinger authored
      commit abb85a9b upstream.
      
      When iscsi WRITE underflow occurs there are two different scenarios
      that can happen.
      
      Normally in practice, when an EDTL vs. SCSI CDB TRANSFER LENGTH
      underflow is detected, the iscsi immediate data payload is the
      smaller SCSI CDB TRANSFER LENGTH.
      
      That is, when a host fabric LLD is using a fixed size EDTL for
      a specific control CDB, the SCSI CDB TRANSFER LENGTH and actual
      SCSI payload ends up being smaller than EDTL.  In iscsi, this
      means the received iscsi immediate data payload matches the
      smaller SCSI CDB TRANSFER LENGTH, because there is no more
      SCSI payload to accept beyond SCSI CDB TRANSFER LENGTH.
      
      However, it's possible for a malicous host to send a WRITE
      underflow where EDTL is larger than SCSI CDB TRANSFER LENGTH,
      but incoming iscsi immediate data actually matches EDTL.
      
      In the wild, we've never had a iscsi host environment actually
      try to do this.
      
      For this special case, it's wrong to truncate part of the
      control CDB payload and continue to process the command during
      underflow when immediate data payload received was larger than
      SCSI CDB TRANSFER LENGTH, so go ahead and reject and drop the
      bogus payload as a defensive action.
      
      Note this potential bug was originally relaxed by the following
      for allowing WRITE underflow in MSFT FCP host environments:
      
         commit c72c5250
         Author: Roland Dreier <roland@purestorage.com>
         Date:   Wed Jul 22 15:08:18 2015 -0700
      
            target: allow underflow/overflow for PR OUT etc. commands
      
      Cc: Roland Dreier <roland@purestorage.com>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fe8003da
    • Nicholas Bellinger's avatar
      target: Fix kref->refcount underflow in transport_cmd_finish_abort · d374be75
      Nicholas Bellinger authored
      commit 73d4e580 upstream.
      
      This patch fixes a se_cmd->cmd_kref underflow during CMD_T_ABORTED
      when a fabric driver drops it's second reference from below the
      target_core_tmr.c based callers of transport_cmd_finish_abort().
      
      Recently with the conversion of kref to refcount_t, this bug was
      manifesting itself as:
      
      [705519.601034] refcount_t: underflow; use-after-free.
      [705519.604034] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 20116.512 msecs
      [705539.719111] ------------[ cut here ]------------
      [705539.719117] WARNING: CPU: 3 PID: 26510 at lib/refcount.c:184 refcount_sub_and_test+0x33/0x51
      
      Since the original kref atomic_t based kref_put() didn't check for
      underflow and only invoked the final callback when zero was reached,
      this bug did not manifest in practice since all se_cmd memory is
      using preallocated tags.
      
      To address this, go ahead and propigate the existing return from
      transport_put_cmd() up via transport_cmd_finish_abort(), and
      change transport_cmd_finish_abort() + core_tmr_handle_tas_abort()
      callers to only do their local target_put_sess_cmd() if necessary.
      Reported-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Tested-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Tested-by: default avatarGary Guo <ghg@datera.io>
      Tested-by: default avatarChu Yuan Lin <cyl@datera.io>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d374be75
    • John Stultz's avatar
      time: Fix clock->read(clock) race around clocksource changes · 1fecf397
      John Stultz authored
      commit ceea5e37 upstream.
      
      In tests, which excercise switching of clocksources, a NULL
      pointer dereference can be observed on AMR64 platforms in the
      clocksource read() function:
      
      u64 clocksource_mmio_readl_down(struct clocksource *c)
      {
      	return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
      }
      
      This is called from the core timekeeping code via:
      
      	cycle_now = tkr->read(tkr->clock);
      
      tkr->read is the cached tkr->clock->read() function pointer.
      When the clocksource is changed then tkr->clock and tkr->read
      are updated sequentially. The code above results in a sequential
      load operation of tkr->read and tkr->clock as well.
      
      If the store to tkr->clock hits between the loads of tkr->read
      and tkr->clock, then the old read() function is called with the
      new clock pointer. As a consequence the read() function
      dereferences a different data structure and the resulting 'reg'
      pointer can point anywhere including NULL.
      
      This problem was introduced when the timekeeping code was
      switched over to use struct tk_read_base. Before that, it was
      theoretically possible as well when the compiler decided to
      reload clock in the code sequence:
      
           now = tk->clock->read(tk->clock);
      
      Add a helper function which avoids the issue by reading
      tk_read_base->clock once into a local variable clk and then issue
      the read function via clk->read(clk). This guarantees that the
      read() function always gets the proper clocksource pointer handed
      in.
      
      Since there is now no use for the tkr.read pointer, this patch
      also removes it, and to address stopping the fast timekeeper
      during suspend/resume, it introduces a dummy clocksource to use
      rather then just a dummy read function.
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <stephen.boyd@linaro.org>
      Cc: Miroslav Lichvar <mlichvar@redhat.com>
      Cc: Daniel Mentz <danielmentz@google.com>
      Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1fecf397
    • Daniel Drake's avatar
      Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list · 255ad85b
      Daniel Drake authored
      commit 817ae460 upstream.
      
      Without this quirk, the touchpad is not responsive on this product, with
      the following message repeated in the logs:
      
       psmouse serio1: bad data from KBC - timeout
      
      Add it to the notimeout list alongside other similar Fujitsu laptops.
      Signed-off-by: default avatarDaniel Drake <drake@endlessm.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      255ad85b
    • Naveen N. Rao's avatar
      powerpc/kprobes: Pause function_graph tracing during jprobes handling · 3ee9033e
      Naveen N. Rao authored
      commit a9f8553e upstream.
      
      This fixes a crash when function_graph and jprobes are used together.
      This is essentially commit 237d28db ("ftrace/jprobes/x86: Fix
      conflict between jprobes and function graph tracing"), but for powerpc.
      
      Jprobes breaks function_graph tracing since the jprobe hook needs to use
      jprobe_return(), which never returns back to the hook, but instead to
      the original jprobe'd function. The solution is to momentarily pause
      function_graph tracing before invoking the jprobe hook and re-enable it
      when returning back to the original jprobe'd function.
      
      Fixes: 6794c782 ("powerpc64: port of the function graph tracer")
      Signed-off-by: default avatarNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Acked-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3ee9033e
    • Eric W. Biederman's avatar
      signal: Only reschedule timers on signals timers have sent · bc7b3e99
      Eric W. Biederman authored
      commit 57db7e4a upstream.
      
      Thomas Gleixner  wrote:
      > The CRIU support added a 'feature' which allows a user space task to send
      > arbitrary (kernel) signals to itself. The changelog says:
      >
      >   The kernel prevents sending of siginfo with positive si_code, because
      >   these codes are reserved for kernel.  I think we can allow a task to
      >   send such a siginfo to itself.  This operation should not be dangerous.
      >
      > Quite contrary to that claim, it turns out that it is outright dangerous
      > for signals with info->si_code == SI_TIMER. The following code sequence in
      > a user space task allows to crash the kernel:
      >
      >    id = timer_create(CLOCK_XXX, ..... signo = SIGX);
      >    timer_set(id, ....);
      >    info->si_signo = SIGX;
      >    info->si_code = SI_TIMER:
      >    info->_sifields._timer._tid = id;
      >    info->_sifields._timer._sys_private = 2;
      >    rt_[tg]sigqueueinfo(..., SIGX, info);
      >    sigemptyset(&sigset);
      >    sigaddset(&sigset, SIGX);
      >    rt_sigtimedwait(sigset, info);
      >
      > For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this
      > results in a kernel crash because sigwait() dequeues the signal and the
      > dequeue code observes:
      >
      >   info->si_code == SI_TIMER && info->_sifields._timer._sys_private != 0
      >
      > which triggers the following callchain:
      >
      >  do_schedule_next_timer() -> posix_cpu_timer_schedule() -> arm_timer()
      >
      > arm_timer() executes a list_add() on the timer, which is already armed via
      > the timer_set() syscall. That's a double list add which corrupts the posix
      > cpu timer list. As a consequence the kernel crashes on the next operation
      > touching the posix cpu timer list.
      >
      > Posix clocks which are internally implemented based on hrtimers are not
      > affected by this because hrtimer_start() can handle already armed timers
      > nicely, but it's a reliable way to trigger the WARN_ON() in
      > hrtimer_forward(), which complains about calling that function on an
      > already armed timer.
      
      This problem has existed since the posix timer code was merged into
      2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to
      inject not just a signal (which linux has supported since 1.0) but the
      full siginfo of a signal.
      
      The core problem is that the code will reschedule in response to
      signals getting dequeued not just for signals the timers sent but
      for other signals that happen to a si_code of SI_TIMER.
      
      Avoid this confusion by testing to see if the queued signal was
      preallocated as all timer signals are preallocated, and so far
      only the timer code preallocates signals.
      
      Move the check for if a timer needs to be rescheduled up into
      collect_signal where the preallocation check must be performed,
      and pass the result back to dequeue_signal where the code reschedules
      timers.   This makes it clear why the code cares about preallocated
      timers.
      Reported-by: default avatarThomas Gleixner <tglx@linutronix.de>
      History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Reference: 66dd34ad ("signal: allow to send any siginfo to itself")
      Reference: 1669ce53 ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO")
      Fixes: db8b50ba ("[PATCH] POSIX clocks & timers")
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bc7b3e99
    • Sebastian Parschauer's avatar
      HID: Add quirk for Dell PIXART OEM mouse · 005253ff
      Sebastian Parschauer authored
      commit 3db28271 upstream.
      
      This mouse is also known under other IDs. It needs the quirk
      ALWAYS_POLL or will disconnect in runlevel 1 or 3.
      Signed-off-by: default avatarSebastian Parschauer <sparschauer@suse.de>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      005253ff
    • Pavel Shilovsky's avatar
      CIFS: Improve readdir verbosity · 63ba840a
      Pavel Shilovsky authored
      commit dcd87838 upstream.
      
      Downgrade the loglevel for SMB2 to prevent filling the log
      with messages if e.g. readdir was interrupted. Also make SMB2
      and SMB1 codepaths do the same logging during readdir.
      Signed-off-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      63ba840a
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Preserve userspace HTM state properly · 824b9506
      Paul Mackerras authored
      commit 46a704f8 upstream.
      
      If userspace attempts to call the KVM_RUN ioctl when it has hardware
      transactional memory (HTM) enabled, the values that it has put in the
      HTM-related SPRs TFHAR, TFIAR and TEXASR will get overwritten by
      guest values.  To fix this, we detect this condition and save those
      SPR values in the thread struct, and disable HTM for the task.  If
      userspace goes to access those SPRs or the HTM facility in future,
      a TM-unavailable interrupt will occur and the handler will reload
      those SPRs and re-enable HTM.
      
      If userspace has started a transaction and suspended it, we would
      currently lose the transactional state in the guest entry path and
      would almost certainly get a "TM Bad Thing" interrupt, which would
      cause the host to crash.  To avoid this, we detect this case and
      return from the KVM_RUN ioctl with an EINVAL error, with the KVM
      exit reason set to KVM_EXIT_FAIL_ENTRY.
      
      Fixes: b005255e ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      824b9506
    • Ilya Matveychikov's avatar
      lib/cmdline.c: fix get_options() overflow while parsing ranges · 7b88f761
      Ilya Matveychikov authored
      commit a91e0f68 upstream.
      
      When using get_options() it's possible to specify a range of numbers,
      like 1-100500.  The problem is that it doesn't track array size while
      calling internally to get_range() which iterates over the range and
      fills the memory with numbers.
      
      Link: http://lkml.kernel.org/r/2613C75C-B04D-4BFF-82A6-12F97BA0F620@gmail.comSigned-off-by: default avatarIlya V. Matveychikov <matvejchikov@gmail.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b88f761