1. 26 Jun, 2017 40 commits
    • Xin Long's avatar
      sctp: sctp_addr_id2transport should verify the addr before looking up assoc · 86e9b2ee
      Xin Long authored
      [ Upstream commit 6f29a130 ]
      
      sctp_addr_id2transport is a function for sockopt to look up assoc by
      address. As the address is from userspace, it can be a v4-mapped v6
      address. But in sctp protocol stack, it always handles a v4-mapped
      v6 address as a v4 address. So it's necessary to convert it to a v4
      address before looking up assoc by address.
      
      This patch is to fix it by calling sctp_verify_addr in which it can do
      this conversion before calling sctp_endpoint_lookup_assoc, just like
      what sctp_sendmsg and __sctp_connect do for the address from users.
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      86e9b2ee
    • hayeswang's avatar
      r8152: re-schedule napi for tx · 5fbc861a
      hayeswang authored
      [ Upstream commit 248b213a ]
      
      Re-schedule napi after napi_complete() for tx, if it is necessay.
      
      In r8152_poll(), if the tx is completed after tx_bottom() and before
      napi_complete(), the scheduling of napi would be lost. Then, no
      one handles the next tx until the next napi_schedule() is called.
      Signed-off-by: default avatarHayes Wang <hayeswang@realtek.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      5fbc861a
    • Y.C. Chen's avatar
      drm/ast: Fixed system hanged if disable P2A · 41e0083c
      Y.C. Chen authored
      [ Upstream commit 6c971c09 ]
      
      The original ast driver will access some BMC configuration through P2A bridge
      that can be disabled since AST2300 and after.
      It will cause system hanged if P2A bridge is disabled.
      Here is the update to fix it.
      Signed-off-by: default avatarY.C. Chen <yc_chen@aspeedtech.com>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      41e0083c
    • Lyude Paul's avatar
      drm/nouveau: Don't enabling polling twice on runtime resume · 9b50bb2b
      Lyude Paul authored
      [ Upstream commit cae9ff03 ]
      
      As it turns out, on cards that actually have CRTCs on them we're already
      calling drm_kms_helper_poll_enable(drm_dev) from
      nouveau_display_resume() before we call it in
      nouveau_pmops_runtime_resume(). This leads us to accidentally trying to
      enable polling twice, which results in a potential deadlock between the
      RPM locks and drm_dev->mode_config.mutex if we end up trying to enable
      polling the second time while output_poll_execute is running and holding
      the mode_config lock. As such, make sure we only enable polling in
      nouveau_pmops_runtime_resume() if we need to.
      
      This fixes hangs observed on the ThinkPad W541
      Signed-off-by: default avatarLyude <lyude@redhat.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Kilian Singer <kilian.singer@quantumtechnology.info>
      Cc: Lukas Wunner <lukas@wunner.de>
      Cc: David Airlie <airlied@redhat.com>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      9b50bb2b
    • Helge Deller's avatar
      parisc, parport_gsc: Fixes for printk continuation lines · c29b8f7d
      Helge Deller authored
      [ Upstream commit 83b5d1e3 ]
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      c29b8f7d
    • Alexey Khoroshilov's avatar
      net: adaptec: starfire: add checks for dma mapping errors · 8cc57997
      Alexey Khoroshilov authored
      [ Upstream commit d1156b48 ]
      
      init_ring(), refill_rx_ring() and start_tx() don't check
      if mapping dma memory succeed.
      The patch adds the checks and failure handling.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: default avatarAlexey Khoroshilov <khoroshilov@ispras.ru>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      8cc57997
    • Jack Morgenstein's avatar
      net/mlx4_core: Avoid command timeouts during VF driver device shutdown · 6d433524
      Jack Morgenstein authored
      [ Upstream commit d585df1c ]
      
      Some Hypervisors detach VFs from VMs by instantly causing an FLR event
      to be generated for a VF.
      
      In the mlx4 case, this will cause that VF's comm channel to be disabled
      before the VM has an opportunity to invoke the VF device's "shutdown"
      method.
      
      The result is that the VF driver on the VM will experience a command
      timeout during the shutdown process when the Hypervisor does not deliver
      a command-completion event to the VM.
      
      To avoid FW command timeouts on the VM when the driver's shutdown method
      is invoked, we detect the absence of the VF's comm channel at the very
      start of the shutdown process. If the comm-channel has already been
      disabled, we cause all FW commands during the device shutdown process to
      immediately return success (and thus avoid all command timeouts).
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      6d433524
    • Ben Skeggs's avatar
    • David Howells's avatar
      fscache: Clear outstanding writes when disabling a cookie · 510c2963
      David Howells authored
      [ Upstream commit 6bdded59 ]
      
      fscache_disable_cookie() needs to clear the outstanding writes on the
      cookie it's disabling because they cannot be completed after.
      
      Without this, fscache_nfs_open_file() gets stuck because it disables the
      cookie when the file is opened for writing but can't uncache the pages till
      afterwards - otherwise there's a race between the open routine and anyone
      who already has it open R/O and is still reading from it.
      
      Looking in /proc/pid/stack of the offending process shows:
      
      [<ffffffffa0142883>] __fscache_wait_on_page_write+0x82/0x9b [fscache]
      [<ffffffffa014336e>] __fscache_uncache_all_inode_pages+0x91/0xe1 [fscache]
      [<ffffffffa01740fa>] nfs_fscache_open_file+0x59/0x9e [nfs]
      [<ffffffffa01ccf41>] nfs4_file_open+0x17f/0x1b8 [nfsv4]
      [<ffffffff8117350e>] do_dentry_open+0x16d/0x2b7
      [<ffffffff811743ac>] vfs_open+0x5c/0x65
      [<ffffffff81184185>] path_openat+0x785/0x8fb
      [<ffffffff81184343>] do_filp_open+0x48/0x9e
      [<ffffffff81174710>] do_sys_open+0x13b/0x1cb
      [<ffffffff811747b9>] SyS_open+0x19/0x1b
      [<ffffffff81001c44>] do_syscall_64+0x80/0x17a
      [<ffffffff8165c2da>] return_from_SYSCALL_64+0x0/0x7a
      [<ffffffffffffffff>] 0xffffffffffffffff
      Reported-by: default avatarJianhong Yin <jiyin@redhat.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Acked-by: default avatarJeff Layton <jlayton@redhat.com>
      Acked-by: default avatarSteve Dickson <steved@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      510c2963
    • Stanislaw Gruszka's avatar
      ethtool: do not vzalloc(0) on registers dump · 42c32ac3
      Stanislaw Gruszka authored
      [ Upstream commit 3808d348 ]
      
      If ->get_regs_len() callback return 0, we allocate 0 bytes of memory,
      what print ugly warning in dmesg, which can be found further below.
      
      This happen on mac80211 devices where ieee80211_get_regs_len() just
      return 0 and driver only fills ethtool_regs structure and actually
      do not provide any dump. However I assume this can happen on other
      drivers i.e. when for some devices driver provide regs dump and for
      others do not. Hence preventing to to print warning in ethtool code
      seems to be reasonable.
      
      ethtool: vmalloc: allocation failure: 0 bytes, mode:0x24080c2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_ZERO)
      <snip>
      Call Trace:
      [<ffffffff813bde47>] dump_stack+0x63/0x8c
      [<ffffffff811b0a1f>] warn_alloc+0x13f/0x170
      [<ffffffff811f0476>] __vmalloc_node_range+0x1e6/0x2c0
      [<ffffffff811f0874>] vzalloc+0x54/0x60
      [<ffffffff8169986c>] dev_ethtool+0xb4c/0x1b30
      [<ffffffff816adbb1>] dev_ioctl+0x181/0x520
      [<ffffffff816714d2>] sock_do_ioctl+0x42/0x50
      <snip>
      Mem-Info:
      active_anon:435809 inactive_anon:173951 isolated_anon:0
       active_file:835822 inactive_file:196932 isolated_file:0
       unevictable:0 dirty:8 writeback:0 unstable:0
       slab_reclaimable:157732 slab_unreclaimable:10022
       mapped:83042 shmem:306356 pagetables:9507 bounce:0
       free:130041 free_pcp:1080 free_cma:0
      Node 0 active_anon:1743236kB inactive_anon:695804kB active_file:3343288kB inactive_file:787728kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:332168kB dirty:32kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 1225424kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
      Node 0 DMA free:15900kB min:136kB low:168kB high:200kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15900kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
      lowmem_reserve[]: 0 3187 7643 7643
      Node 0 DMA32 free:419732kB min:28124kB low:35152kB high:42180kB active_anon:541180kB inactive_anon:248988kB active_file:1466388kB inactive_file:389632kB unevictable:0kB writepending:0kB present:3370280kB managed:3290932kB mlocked:0kB slab_reclaimable:217184kB slab_unreclaimable:4180kB kernel_stack:160kB pagetables:984kB bounce:0kB free_pcp:2236kB local_pcp:660kB free_cma:0kB
      lowmem_reserve[]: 0 0 4456 4456
      Signed-off-by: default avatarStanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      42c32ac3
    • Ard Biesheuvel's avatar
      log2: make order_base_2() behave correctly on const input value zero · eaabe4b7
      Ard Biesheuvel authored
      [ Upstream commit 29905b52 ]
      
      The function order_base_2() is defined (according to the comment block)
      as returning zero on input zero, but subsequently passes the input into
      roundup_pow_of_two(), which is explicitly undefined for input zero.
      
      This has gone unnoticed until now, but optimization passes in GCC 7 may
      produce constant folded function instances where a constant value of
      zero is passed into order_base_2(), resulting in link errors against the
      deliberately undefined '____ilog2_NaN'.
      
      So update order_base_2() to adhere to its own documented interface.
      
      [ See
      
           http://marc.info/?l=linux-kernel&m=147672952517795&w=2
      
        and follow-up discussion for more background. The gcc "optimization
        pass" is really just broken, but now the GCC trunk problem seems to
        have escaped out of just specially built daily images, so we need to
        work around it in mainline.    - Linus ]
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      eaabe4b7
    • Peter Zijlstra's avatar
      kasan: respect /proc/sys/kernel/traceoff_on_warning · 8bc30cf0
      Peter Zijlstra authored
      [ Upstream commit 4f40c6e5 ]
      
      After much waiting I finally reproduced a KASAN issue, only to find my
      trace-buffer empty of useful information because it got spooled out :/
      
      Make kasan_report honour the /proc/sys/kernel/traceoff_on_warning
      interface.
      
      Link: http://lkml.kernel.org/r/20170125164106.3514-1-aryabinin@virtuozzo.comSigned-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Acked-by: default avatarAlexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      8bc30cf0
    • David Lin's avatar
      jump label: pass kbuild_cflags when checking for asm goto support · acd66665
      David Lin authored
      [ Upstream commit 35f860f9 ]
      
      Some versions of ARM GCC compiler such as Android toolchain throws in a
      '-fpic' flag by default.  This causes the gcc-goto check script to fail
      although some config would have '-fno-pic' flag in the KBUILD_CFLAGS.
      
      This patch passes the KBUILD_CFLAGS to the check script so that the
      script does not rely on the default config from different compilers.
      
      Link: http://lkml.kernel.org/r/20170120234329.78868-1-dtwlin@google.comSigned-off-by: default avatarDavid Lin <dtwlin@google.com>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Michal Marek <mmarek@suse.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      acd66665
    • Rafael J. Wysocki's avatar
      PM / runtime: Avoid false-positive warnings from might_sleep_if() · cb2098ab
      Rafael J. Wysocki authored
      [ Upstream commit a9306a63 ]
      
      The might_sleep_if() assertions in __pm_runtime_idle(),
      __pm_runtime_suspend() and __pm_runtime_resume() may generate
      false-positive warnings in some situations.  For example, that
      happens if a nested pm_runtime_get_sync()/pm_runtime_put() pair
      is executed with disabled interrupts within an outer
      pm_runtime_get_sync()/pm_runtime_put() section for the same device.
      [Generally, pm_runtime_get_sync() may sleep, so it should not be
      called with disabled interrupts, but in this particular case the
      previous pm_runtime_get_sync() guarantees that the device will not
      be suspended, so the inner pm_runtime_get_sync() will return
      immediately after incrementing the device's usage counter.]
      
      That started to happen in the i915 driver in 4.10-rc, leading to
      the following splat:
      
       BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1032
       in_atomic(): 1, irqs_disabled(): 0, pid: 1500, name: Xorg
       1 lock held by Xorg/1500:
        #0:  (&dev->struct_mutex){+.+.+.}, at:
        [<ffffffffa0680c13>] i915_mutex_lock_interruptible+0x43/0x140 [i915]
       CPU: 0 PID: 1500 Comm: Xorg Not tainted
       Call Trace:
        dump_stack+0x85/0xc2
        ___might_sleep+0x196/0x260
        __might_sleep+0x53/0xb0
        __pm_runtime_resume+0x7a/0x90
        intel_runtime_pm_get+0x25/0x90 [i915]
        aliasing_gtt_bind_vma+0xaa/0xf0 [i915]
        i915_vma_bind+0xaf/0x1e0 [i915]
        i915_gem_execbuffer_relocate_entry+0x513/0x6f0 [i915]
        i915_gem_execbuffer_relocate_vma.isra.34+0x188/0x250 [i915]
        ? trace_hardirqs_on+0xd/0x10
        ? i915_gem_execbuffer_reserve_vma.isra.31+0x152/0x1f0 [i915]
        ? i915_gem_execbuffer_reserve.isra.32+0x372/0x3a0 [i915]
        i915_gem_do_execbuffer.isra.38+0xa70/0x1a40 [i915]
        ? __might_fault+0x4e/0xb0
        i915_gem_execbuffer2+0xc5/0x260 [i915]
        ? __might_fault+0x4e/0xb0
        drm_ioctl+0x206/0x450 [drm]
        ? i915_gem_execbuffer+0x340/0x340 [i915]
        ? __fget+0x5/0x200
        do_vfs_ioctl+0x91/0x6f0
        ? __fget+0x111/0x200
        ? __fget+0x5/0x200
        SyS_ioctl+0x79/0x90
        entry_SYSCALL_64_fastpath+0x23/0xc6
      
      even though the code triggering it is correct.
      
      Unfortunately, the might_sleep_if() assertions in question are
      too coarse-grained to cover such cases correctly, so make them
      a bit less sensitive in order to avoid the false-positives.
      Reported-and-tested-by: default avatarSedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      cb2098ab
    • Arnd Bergmann's avatar
      ARM: defconfigs: make NF_CT_PROTO_SCTP and NF_CT_PROTO_UDPLITE built-in · d3121ad1
      Arnd Bergmann authored
      [ Upstream commit 5aff1d24 ]
      
      The symbols can no longer be used as loadable modules, leading to a harmless Kconfig
      warning:
      
      arch/arm/configs/imote2_defconfig:60:warning: symbol value 'm' invalid for NF_CT_PROTO_UDPLITE
      arch/arm/configs/imote2_defconfig:59:warning: symbol value 'm' invalid for NF_CT_PROTO_SCTP
      arch/arm/configs/ezx_defconfig:68:warning: symbol value 'm' invalid for NF_CT_PROTO_UDPLITE
      arch/arm/configs/ezx_defconfig:67:warning: symbol value 'm' invalid for NF_CT_PROTO_SCTP
      
      Let's make them built-in.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      d3121ad1
    • Linus Lüssing's avatar
      ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches · 4c8eb627
      Linus Lüssing authored
      [ Upstream commit a088d1d7 ]
      
      When for instance a mobile Linux device roams from one access point to
      another with both APs sharing the same broadcast domain and a
      multicast snooping switch in between:
      
      1)    (c) <~~~> (AP1) <--[SSW]--> (AP2)
      
      2)              (AP1) <--[SSW]--> (AP2) <~~~> (c)
      
      Then currently IPv6 multicast packets will get lost for (c) until an
      MLD Querier sends its next query message. The packet loss occurs
      because upon roaming the Linux host so far stayed silent regarding
      MLD and the snooping switch will therefore be unaware of the
      multicast topology change for a while.
      
      This patch fixes this by always resending MLD reports when an interface
      change happens, for instance from NO-CARRIER to CARRIER state.
      Signed-off-by: default avatarLinus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      4c8eb627
    • Stefan Brüns's avatar
      sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications · 0def8e45
      Stefan Brüns authored
      [ Upstream commit 5a70348e ]
      
      If a context is configured as dualstack ("IPv4v6"), the modem indicates
      the context activation with a slightly different indication message.
      The dual-stack indication omits the link_type (IPv4/v6) and adds
      additional address fields.
      IPv6 LSIs are identical to IPv4 LSIs, but have a different link type.
      Signed-off-by: default avatarStefan Brüns <stefan.bruens@rwth-aachen.de>
      Reviewed-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      0def8e45
    • Stefan Brüns's avatar
      sierra_net: Skip validating irrelevant fields for IDLE LSIs · 0c2950fa
      Stefan Brüns authored
      [ Upstream commit 764895d3 ]
      
      When the context is deactivated, the link_type is set to 0xff, which
      triggers a warning message, and results in a wrong link status, as
      the LSI is ignored.
      Signed-off-by: default avatarStefan Brüns <stefan.bruens@rwth-aachen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      0c2950fa
    • Ralf Baechle's avatar
      NET: mkiss: Fix panic · a9cbb7cd
      Ralf Baechle authored
      [ Upstream commit 7ba1b689 ]
      
      If a USB-to-serial adapter is unplugged, the driver re-initializes, with
      dev->hard_header_len and dev->addr_len set to zero, instead of the correct
      values.  If then a packet is sent through the half-dead interface, the
      kernel will panic due to running out of headroom in the skb when pushing
      for the AX.25 headers resulting in this panic:
      
      [<c0595468>] (skb_panic) from [<c0401f70>] (skb_push+0x4c/0x50)
      [<c0401f70>] (skb_push) from [<bf0bdad4>] (ax25_hard_header+0x34/0xf4 [ax25])
      [<bf0bdad4>] (ax25_hard_header [ax25]) from [<bf0d05d4>] (ax_header+0x38/0x40 [mkiss])
      [<bf0d05d4>] (ax_header [mkiss]) from [<c041b584>] (neigh_compat_output+0x8c/0xd8)
      [<c041b584>] (neigh_compat_output) from [<c043e7a8>] (ip_finish_output+0x2a0/0x914)
      [<c043e7a8>] (ip_finish_output) from [<c043f948>] (ip_output+0xd8/0xf0)
      [<c043f948>] (ip_output) from [<c043f04c>] (ip_local_out_sk+0x44/0x48)
      
      This patch makes mkiss behave like the 6pack driver. 6pack does not
      panic.  In 6pack.c sp_setup() (same function name here) the values for
      dev->hard_header_len and dev->addr_len are set to the same values as in
      my mkiss patch.
      
      [ralf@linux-mips.org: Massages original submission to conform to the usual
      standards for patch submissions.]
      Signed-off-by: default avatarThomas Osterried <thomas@osterried.de>
      Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      a9cbb7cd
    • Ralf Baechle's avatar
      NET: Fix /proc/net/arp for AX.25 · d914dc3b
      Ralf Baechle authored
      [ Upstream commit 4872e57c ]
      
      When sending ARP requests over AX.25 links the hwaddress in the neighbour
      cache are not getting initialized.  For such an incomplete arp entry
      ax2asc2 will generate an empty string resulting in /proc/net/arp output
      like the following:
      
      $ cat /proc/net/arp
      IP address       HW type     Flags       HW address            Mask     Device
      192.168.122.1    0x1         0x2         52:54:00:00:5d:5f     *        ens3
      172.20.1.99      0x3         0x0              *        bpq0
      
      The missing field will confuse the procfs parsing of arp(8) resulting in
      incorrect output for the device such as the following:
      
      $ arp
      Address                  HWtype  HWaddress           Flags Mask            Iface
      gateway                  ether   52:54:00:00:5d:5f   C                     ens3
      172.20.1.99                      (incomplete)                              ens3
      
      This changes the content of /proc/net/arp to:
      
      $ cat /proc/net/arp
      IP address       HW type     Flags       HW address            Mask     Device
      172.20.1.99      0x3         0x0         *                     *        bpq0
      192.168.122.1    0x1         0x2         52:54:00:00:5d:5f     *        ens3
      
      To do so it change ax2asc to put the string "*" in buf for a NULL address
      argument.  Finally the HW address field is left aligned in a 17 character
      field (the length of an ethernet HW address in the usual hex notation) for
      readability.
      Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      d914dc3b
    • Jonathan T. Leighton's avatar
      ipv6: Inhibit IPv4-mapped src address on the wire. · 68978d69
      Jonathan T. Leighton authored
      [ Upstream commit ec5e3b0a ]
      
      This patch adds a check for the problematic case of an IPv4-mapped IPv6
      source address and a destination address that is neither an IPv4-mapped
      IPv6 address nor in6addr_any, and returns an appropriate error. The
      check in done before returning from looking up the route.
      Signed-off-by: default avatarJonathan T. Leighton <jtleight@udel.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      68978d69
    • Jonathan T. Leighton's avatar
      ipv6: Handle IPv4-mapped src to in6addr_any dst. · 19708236
      Jonathan T. Leighton authored
      [ Upstream commit 052d2369 ]
      
      This patch adds a check on the type of the source address for the case
      where the destination address is in6addr_any. If the source is an
      IPv4-mapped IPv6 source address, the destination is changed to
      ::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
      is done in three locations to handle UDP calls to either connect() or
      sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
      handling an in6addr_any destination until very late, so the patch only
      needs to handle the case where the source is an IPv4-mapped IPv6
      address.
      Signed-off-by: default avatarJonathan T. Leighton <jtleight@udel.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      19708236
    • Anssi Hannula's avatar
      net: xilinx_emaclite: fix receive buffer overflow · dd4d061c
      Anssi Hannula authored
      [ Upstream commit cd224553 ]
      
      xilinx_emaclite looks at the received data to try to determine the
      Ethernet packet length but does not properly clamp it if
      proto_type == ETH_P_IP or 1500 < proto_type <= 1518, causing a buffer
      overflow and a panic via skb_panic() as the length exceeds the allocated
      skb size.
      
      Fix those cases.
      
      Also add an additional unconditional check with WARN_ON() at the end.
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Fixes: bb81b2dd ("net: add Xilinx emac lite device driver")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      dd4d061c
    • Anssi Hannula's avatar
      net: xilinx_emaclite: fix freezes due to unordered I/O · 742e7978
      Anssi Hannula authored
      [ Upstream commit acf138f1 ]
      
      The xilinx_emaclite uses __raw_writel and __raw_readl for register
      accesses. Those functions do not imply any kind of memory barriers and
      they may be reordered.
      
      The driver does not seem to take that into account, though, and the
      driver does not satisfy the ordering requirements of the hardware.
      For clear examples, see xemaclite_mdio_write() and xemaclite_mdio_read()
      which try to set MDIO address before initiating the transaction.
      
      I'm seeing system freezes with the driver with GCC 5.4 and current
      Linux kernels on Zynq-7000 SoC immediately when trying to use the
      interface.
      
      In commit 123c1407 ("net: emaclite: Do not use microblaze and ppc
      IO functions") the driver was switched from non-generic
      in_be32/out_be32 (memory barriers, big endian) to
      __raw_readl/__raw_writel (no memory barriers, native endian), so
      apparently the device follows system endianness and the driver was
      originally written with the assumption of memory barriers.
      
      Rather than try to hunt for each case of missing barrier, just switch
      the driver to use iowrite32/ioread32/iowrite32be/ioread32be depending
      on endianness instead.
      
      Tested on little-endian Zynq-7000 ARM SoC FPGA.
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Fixes: 123c1407 ("net: emaclite: Do not use microblaze and ppc IO
      functions")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      742e7978
    • Richard's avatar
      partitions/msdos: FreeBSD UFS2 file systems are not recognized · afae1d9d
      Richard authored
      [ Upstream commit 22322035 ]
      
      The code in block/partitions/msdos.c recognizes FreeBSD, OpenBSD
      and NetBSD partitions and does a reasonable job picking out OpenBSD
      and NetBSD UFS subpartitions.
      
      But for FreeBSD the subpartitions are always "bad".
      
          Kernel: <bsd:bad subpartition - ignored
      
      Though all 3 of these BSD systems use UFS as a file system, only
      FreeBSD uses relative start addresses in the subpartition
      declarations.
      
      The following patch fixes this for FreeBSD partitions and leaves
      the code for OpenBSD and NetBSD intact:
      Signed-off-by: default avatarRichard Narron <comet.berkeley@gmail.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      afae1d9d
    • Imre Deak's avatar
      PCI/PM: Add needs_resume flag to avoid suspend complete optimization · 7f6abe4c
      Imre Deak authored
      [ Upstream commit 4d071c32 ]
      
      Some drivers - like i915 - may not support the system suspend direct
      complete optimization due to differences in their runtime and system
      suspend sequence.  Add a flag that when set resumes the device before
      calling the driver's system suspend handlers which effectively disables
      the optimization.
      
      Needed by a future patch fixing suspend/resume on i915.
      
      Suggested by Rafael.
      Signed-off-by: default avatarImre Deak <imre.deak@intel.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      7f6abe4c
    • Kees Cook's avatar
      usercopy: Adjust tests to deal with SMAP/PAN · cd1c4f85
      Kees Cook authored
      [ Upstream commit f5f893c5 ]
      
      Under SMAP/PAN/etc, we cannot write directly to userspace memory, so
      this rearranges the test bytes to get written through copy_to_user().
      Additionally drops the bad copy_from_user() test that would trigger a
      memcpy() against userspace on failure.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      cd1c4f85
    • Kristina Martsenko's avatar
      arm64: entry: improve data abort handling of tagged pointers · 9da80866
      Kristina Martsenko authored
      [ Upstream commit 276e9327 ]
      
      When handling a data abort from EL0, we currently zero the top byte of
      the faulting address, as we assume the address is a TTBR0 address, which
      may contain a non-zero address tag. However, the address may be a TTBR1
      address, in which case we should not zero the top byte. This patch fixes
      that. The effect is that the full TTBR1 address is passed to the task's
      signal handler (or printed out in the kernel log).
      
      When handling a data abort from EL1, we leave the faulting address
      intact, as we assume it's either a TTBR1 address or a TTBR0 address with
      tag 0x00. This is true as far as I'm aware, we don't seem to access a
      tagged TTBR0 address anywhere in the kernel. Regardless, it's easy to
      forget about address tags, and code added in the future may not always
      remember to remove tags from addresses before accessing them. So add tag
      handling to the EL1 data abort handler as well. This also makes it
      consistent with the EL0 data abort handler.
      
      Fixes: d50240a5 ("arm64: mm: permit use of tagged pointers at EL0")
      Cc: <stable@vger.kernel.org> # 3.12.x-
      Reviewed-by: default avatarDave Martin <Dave.Martin@arm.com>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarKristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      9da80866
    • Julius Werner's avatar
      drivers: char: mem: Fix wraparound check to allow mappings up to the end · 47e49f2d
      Julius Werner authored
      [ Upstream commit 32829da5 ]
      
      A recent fix to /dev/mem prevents mappings from wrapping around the end
      of physical address space. However, the check was written in a way that
      also prevents a mapping reaching just up to the end of physical address
      space, which may be a valid use case (especially on 32-bit systems).
      This patch fixes it by checking the last mapped address (instead of the
      first address behind that) for overflow.
      
      Fixes: b299cde2 ("drivers: char: mem: Check for address space wraparound with mmap()")
      Cc: <stable@vger.kernel.org>
      Reported-by: default avatarNico Huber <nico.h@gmx.de>
      Signed-off-by: default avatarJulius Werner <jwerner@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      47e49f2d
    • Takashi Iwai's avatar
      ASoC: Fix use-after-free at card unregistration · bb3556c1
      Takashi Iwai authored
      [ Upstream commit 4efda5f2 ]
      
      soc_cleanup_card_resources() call snd_card_free() at the last of its
      procedure.  This turned out to lead to a use-after-free.
      PCM runtimes have been already removed via soc_remove_pcm_runtimes(),
      while it's dereferenced later in soc_pcm_free() called via
      snd_card_free().
      
      The fix is simple: just move the snd_card_free() call to the beginning
      of the whole procedure.  This also gives another benefit: it
      guarantees that all operations have been shut down before actually
      releasing the resources, which was racy until now.
      Reported-and-tested-by: default avatarRobert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      bb3556c1
    • Takashi Iwai's avatar
      ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT · 88c41586
      Takashi Iwai authored
      [ Upstream commit ba3021b2 ]
      
      snd_timer_user_tselect() reallocates the queue buffer dynamically, but
      it forgot to reset its indices.  Since the read may happen
      concurrently with ioctl and snd_timer_user_tselect() allocates the
      buffer via kmalloc(), this may lead to the leak of uninitialized
      kernel-space data, as spotted via KMSAN:
      
        BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10
        CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:16
         dump_stack+0x143/0x1b0 lib/dump_stack.c:52
         kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007
         kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086
         copy_to_user ./arch/x86/include/asm/uaccess.h:725
         snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004
         do_loop_readv_writev fs/read_write.c:716
         __do_readv_writev+0x94c/0x1380 fs/read_write.c:864
         do_readv_writev fs/read_write.c:894
         vfs_readv fs/read_write.c:908
         do_readv+0x52a/0x5d0 fs/read_write.c:934
         SYSC_readv+0xb6/0xd0 fs/read_write.c:1021
         SyS_readv+0x87/0xb0 fs/read_write.c:1018
      
      This patch adds the missing reset of queue indices.  Together with the
      previous fix for the ioctl/read race, we cover the whole problem.
      Reported-by: default avatarAlexander Potapenko <glider@google.com>
      Tested-by: default avatarAlexander Potapenko <glider@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      88c41586
    • Takashi Iwai's avatar
      ALSA: timer: Fix race between read and ioctl · 5d28ba6e
      Takashi Iwai authored
      [ Upstream commit d11662f4 ]
      
      The read from ALSA timer device, the function snd_timer_user_tread(),
      may access to an uninitialized struct snd_timer_user fields when the
      read is concurrently performed while the ioctl like
      snd_timer_user_tselect() is invoked.  We have already fixed the races
      among ioctls via a mutex, but we seem to have forgotten the race
      between read vs ioctl.
      
      This patch simply applies (more exactly extends the already applied
      range of) tu->ioctl_lock in snd_timer_user_tread() for closing the
      race window.
      Reported-by: default avatarAlexander Potapenko <glider@google.com>
      Tested-by: default avatarAlexander Potapenko <glider@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      5d28ba6e
    • Dan Carpenter's avatar
      drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve() · 29837be8
      Dan Carpenter authored
      [ Upstream commit f0c62e98 ]
      
      If vmalloc() fails then we need to a bit of cleanup before returning.
      
      Cc: <stable@vger.kernel.org>
      Fixes: fb1d9738 ("drm/vmwgfx: Add DRM driver for VMware Virtual GPU")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: default avatarSinclair Yeh <syeh@vmware.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      29837be8
    • Jin Yao's avatar
      perf/core: Drop kernel samples even though :u is specified · d6f90404
      Jin Yao authored
      [ Upstream commit cc1582c2 ]
      
      When doing sampling, for example:
      
        perf record -e cycles:u ...
      
      On workloads that do a lot of kernel entry/exits we see kernel
      samples, even though :u is specified. This is due to skid existing.
      
      This might be a security issue because it can leak kernel addresses even
      though kernel sampling support is disabled.
      
      The patch drops the kernel samples if exclude_kernel is specified.
      
      For example, test on Haswell desktop:
      
        perf record -e cycles:u <mgen>
        perf report --stdio
      
      Before patch applied:
      
          99.77%  mgen     mgen              [.] buf_read
           0.20%  mgen     mgen              [.] rand_buf_init
           0.01%  mgen     [kernel.vmlinux]  [k] apic_timer_interrupt
           0.00%  mgen     mgen              [.] last_free_elem
           0.00%  mgen     libc-2.23.so      [.] __random_r
           0.00%  mgen     libc-2.23.so      [.] _int_malloc
           0.00%  mgen     mgen              [.] rand_array_init
           0.00%  mgen     [kernel.vmlinux]  [k] page_fault
           0.00%  mgen     libc-2.23.so      [.] __random
           0.00%  mgen     libc-2.23.so      [.] __strcasestr
           0.00%  mgen     ld-2.23.so        [.] strcmp
           0.00%  mgen     ld-2.23.so        [.] _dl_start
           0.00%  mgen     libc-2.23.so      [.] sched_setaffinity@@GLIBC_2.3.4
           0.00%  mgen     ld-2.23.so        [.] _start
      
      We can see kernel symbols apic_timer_interrupt and page_fault.
      
      After patch applied:
      
          99.79%  mgen     mgen           [.] buf_read
           0.19%  mgen     mgen           [.] rand_buf_init
           0.00%  mgen     libc-2.23.so   [.] __random_r
           0.00%  mgen     mgen           [.] rand_array_init
           0.00%  mgen     mgen           [.] last_free_elem
           0.00%  mgen     libc-2.23.so   [.] vfprintf
           0.00%  mgen     libc-2.23.so   [.] rand
           0.00%  mgen     libc-2.23.so   [.] __random
           0.00%  mgen     libc-2.23.so   [.] _int_malloc
           0.00%  mgen     libc-2.23.so   [.] _IO_doallocbuf
           0.00%  mgen     ld-2.23.so     [.] do_lookup_x
           0.00%  mgen     ld-2.23.so     [.] open_verify.constprop.7
           0.00%  mgen     ld-2.23.so     [.] _dl_important_hwcaps
           0.00%  mgen     libc-2.23.so   [.] sched_setaffinity@@GLIBC_2.3.4
           0.00%  mgen     ld-2.23.so     [.] _start
      
      There are only userspace symbols.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Cc: kan.liang@intel.com
      Cc: mark.rutland@arm.com
      Cc: will.deacon@arm.com
      Cc: yao.jin@intel.com
      Link: http://lkml.kernel.org/r/1495706947-3744-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      d6f90404
    • Michael Bringmann's avatar
      powerpc/hotplug-mem: Fix missing endian conversion of aa_index · f4455627
      Michael Bringmann authored
      [ Upstream commit dc421b20 ]
      
      When adding or removing memory, the aa_index (affinity value) for the
      memblock must also be converted to match the endianness of the rest
      of the 'ibm,dynamic-memory' property.  Otherwise, subsequent retrieval
      of the attribute will likely lead to non-existent nodes, followed by
      using the default node in the code inappropriately.
      
      Fixes: 5f97b2a0 ("powerpc/pseries: Implement memory hotplug add in the kernel")
      Cc: stable@vger.kernel.org # v4.1+
      Signed-off-by: default avatarMichael Bringmann <mwb@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      f4455627
    • Michael Ellerman's avatar
      powerpc/numa: Fix percpu allocations to be NUMA aware · 7ee9689e
      Michael Ellerman authored
      [ Upstream commit ba4a648f ]
      
      In commit 8c272261 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we
      switched to the generic implementation of cpu_to_node(), which uses a percpu
      variable to hold the NUMA node for each CPU.
      
      Unfortunately we neglected to notice that we use cpu_to_node() in the allocation
      of our percpu areas, leading to a chicken and egg problem. In practice what
      happens is when we are setting up the percpu areas, cpu_to_node() reports that
      all CPUs are on node 0, so we allocate all percpu areas on node 0.
      
      This is visible in the dmesg output, as all pcpu allocs being in group 0:
      
        pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
        pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
        pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
        pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31
        pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39
        pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47
      
      To fix it we need an early_cpu_to_node() which can run prior to percpu being
      setup. We already have the numa_cpu_lookup_table we can use, so just plumb it
      in. With the patch dmesg output shows two groups, 0 and 1:
      
        pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
        pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
        pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
        pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31
        pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39
        pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47
      
      We can also check the data_offset in the paca of various CPUs, with the fix we
      see:
      
        CPU 0:  data_offset = 0x0ffe8b0000
        CPU 24: data_offset = 0x1ffe5b0000
      
      And we can see from dmesg that CPU 24 has an allocation on node 1:
      
        node   0: [mem 0x0000000000000000-0x0000000fffffffff]
        node   1: [mem 0x0000001000000000-0x0000001fffffffff]
      
      Cc: stable@vger.kernel.org # v3.16+
      Fixes: 8c272261 ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      7ee9689e
    • Johannes Thumshirn's avatar
      scsi: qla2xxx: don't disable a not previously enabled PCI device · eecbbd83
      Johannes Thumshirn authored
      [ Upstream commit ddff7ed4 ]
      
      When pci_enable_device() or pci_enable_device_mem() fail in
      qla2x00_probe_one() we bail out but do a call to
      pci_disable_device(). This causes the dev_WARN_ON() in
      pci_disable_device() to trigger, as the device wasn't enabled
      previously.
      
      So instead of taking the 'probe_out' error path we can directly return
      *iff* one of the pci_enable_device() calls fails.
      
      Additionally rename the 'probe_out' goto label's name to the more
      descriptive 'disable_device'.
      Signed-off-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Fixes: e315cd28 ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Reviewed-by: default avatarGiridhar Malavali <giridhar.malavali@cavium.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      eecbbd83
    • Marc Zyngier's avatar
      KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages · 4a213a0f
      Marc Zyngier authored
      [ Upstream commit d6dbdd3c ]
      
      Under memory pressure, we start ageing pages, which amounts to parsing
      the page tables. Since we don't want to allocate any extra level,
      we pass NULL for our private allocation cache. Which means that
      stage2_get_pud() is allowed to fail. This results in the following
      splat:
      
      [ 1520.409577] Unable to handle kernel NULL pointer dereference at virtual address 00000008
      [ 1520.417741] pgd = ffff810f52fef000
      [ 1520.421201] [00000008] *pgd=0000010f636c5003, *pud=0000010f56f48003, *pmd=0000000000000000
      [ 1520.429546] Internal error: Oops: 96000006 [#1] PREEMPT SMP
      [ 1520.435156] Modules linked in:
      [ 1520.438246] CPU: 15 PID: 53550 Comm: qemu-system-aar Tainted: G        W       4.12.0-rc4-00027-g1885c397eaec #7205
      [ 1520.448705] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB12A 10/26/2016
      [ 1520.463726] task: ffff800ac5fb4e00 task.stack: ffff800ce04e0000
      [ 1520.469666] PC is at stage2_get_pmd+0x34/0x110
      [ 1520.474119] LR is at kvm_age_hva_handler+0x44/0xf0
      [ 1520.478917] pc : [<ffff0000080b137c>] lr : [<ffff0000080b149c>] pstate: 40000145
      [ 1520.486325] sp : ffff800ce04e33d0
      [ 1520.489644] x29: ffff800ce04e33d0 x28: 0000000ffff40064
      [ 1520.494967] x27: 0000ffff27e00000 x26: 0000000000000000
      [ 1520.500289] x25: ffff81051ba65008 x24: 0000ffff40065000
      [ 1520.505618] x23: 0000ffff40064000 x22: 0000000000000000
      [ 1520.510947] x21: ffff810f52b20000 x20: 0000000000000000
      [ 1520.516274] x19: 0000000058264000 x18: 0000000000000000
      [ 1520.521603] x17: 0000ffffa6fe7438 x16: ffff000008278b70
      [ 1520.526940] x15: 000028ccd8000000 x14: 0000000000000008
      [ 1520.532264] x13: ffff7e0018298000 x12: 0000000000000002
      [ 1520.537582] x11: ffff000009241b93 x10: 0000000000000940
      [ 1520.542908] x9 : ffff0000092ef800 x8 : 0000000000000200
      [ 1520.548229] x7 : ffff800ce04e36a8 x6 : 0000000000000000
      [ 1520.553552] x5 : 0000000000000001 x4 : 0000000000000000
      [ 1520.558873] x3 : 0000000000000000 x2 : 0000000000000008
      [ 1520.571696] x1 : ffff000008fd5000 x0 : ffff0000080b149c
      [ 1520.577039] Process qemu-system-aar (pid: 53550, stack limit = 0xffff800ce04e0000)
      [...]
      [ 1521.510735] [<ffff0000080b137c>] stage2_get_pmd+0x34/0x110
      [ 1521.516221] [<ffff0000080b149c>] kvm_age_hva_handler+0x44/0xf0
      [ 1521.522054] [<ffff0000080b0610>] handle_hva_to_gpa+0xb8/0xe8
      [ 1521.527716] [<ffff0000080b3434>] kvm_age_hva+0x44/0xf0
      [ 1521.532854] [<ffff0000080a58b0>] kvm_mmu_notifier_clear_flush_young+0x70/0xc0
      [ 1521.539992] [<ffff000008238378>] __mmu_notifier_clear_flush_young+0x88/0xd0
      [ 1521.546958] [<ffff00000821eca0>] page_referenced_one+0xf0/0x188
      [ 1521.552881] [<ffff00000821f36c>] rmap_walk_anon+0xec/0x250
      [ 1521.558370] [<ffff000008220f78>] rmap_walk+0x78/0xa0
      [ 1521.563337] [<ffff000008221104>] page_referenced+0x164/0x180
      [ 1521.569002] [<ffff0000081f1af0>] shrink_active_list+0x178/0x3b8
      [ 1521.574922] [<ffff0000081f2058>] shrink_node_memcg+0x328/0x600
      [ 1521.580758] [<ffff0000081f23f4>] shrink_node+0xc4/0x328
      [ 1521.585986] [<ffff0000081f2718>] do_try_to_free_pages+0xc0/0x340
      [ 1521.592000] [<ffff0000081f2a64>] try_to_free_pages+0xcc/0x240
      [...]
      
      The trivial fix is to handle this NULL pud value early, rather than
      dereferencing it blindly.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: default avatarChristoffer Dall <cdall@linaro.org>
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      4a213a0f
    • Jeff Mahoney's avatar
      btrfs: fix memory leak in update_space_info failure path · 951269f9
      Jeff Mahoney authored
      [ Upstream commit 896533a7 ]
      
      If we fail to add the space_info kobject, we'll leak the memory
      for the percpu counter.
      
      Fixes: 6ab0a202 (btrfs: publish allocation data in sysfs)
      Cc: <stable@vger.kernel.org> # v3.14+
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      951269f9
    • David Sterba's avatar
      btrfs: use correct types for page indices in btrfs_page_exists_in_range · d42014c8
      David Sterba authored
      [ Upstream commit cc2b702c ]
      
      Variables start_idx and end_idx are supposed to hold a page index
      derived from the file offsets. The int type is not the right one though,
      offsets larger than 1 << 44 will get silently trimmed off the high bits.
      (1 << 44 is 16TiB)
      
      What can go wrong, if start is below the boundary and end gets trimmed:
      - if there's a page after start, we'll find it (radix_tree_gang_lookup_slot)
      - the final check "if (page->index <= end_idx)" will unexpectedly fail
      
      The function will return false, ie. "there's no page in the range",
      although there is at least one.
      
      btrfs_page_exists_in_range is used to prevent races in:
      
      * in hole punching, where we make sure there are not pages in the
        truncated range, otherwise we'll wait for them to finish and redo
        truncation, but we're going to replace the pages with holes anyway so
        the only problem is the intermediate state
      
      * lock_extent_direct: we want to make sure there are no pages before we
        lock and start DIO, to prevent stale data reads
      
      For practical occurence of the bug, there are several constaints.  The
      file must be quite large, the affected range must cross the 16TiB
      boundary and the internal state of the file pages and pending operations
      must match.  Also, we must not have started any ordered data in the
      range, otherwise we don't even reach the buggy function check.
      
      DIO locking tries hard in several places to avoid deadlocks with
      buffered IO and avoids waiting for ranges. The worst consequence seems
      to be stale data read.
      
      CC: Liu Bo <bo.li.liu@oracle.com>
      CC: stable@vger.kernel.org	# 3.16+
      Fixes: fc4adbff ("btrfs: Drop EXTENT_UPTODATE check in hole punching and direct locking")
      Reviewed-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      d42014c8