1. 17 Jun, 2017 29 commits
    • gianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page · 6e3ea31d
      Arseny Solokha authored
      
      [ Upstream commit 4af0e5bb ]
      
      In spite of switching to paged allocation of Rx buffers, the driver still
      called dma_unmap_single() in the Rx queues tear-down path.
      
      The DMA region unmapping code in free_skb_rx_queue() basically predates
      the introduction of paged allocation to the driver. While being refactored,
      it apparently hasn't reflected the change in the DMA API usage by its
      counterpart gfar_new_page().
      
      As a result, setting an interface to the DOWN state now yields the following:
      
        # ip link set eth2 down
        fsl-gianfar ffe24000.ethernet: DMA-API: device driver frees DMA memory with wrong function [device address=0x000000001ecd0000] [size=40]
        ------------[ cut here ]------------
        WARNING: CPU: 1 PID: 189 at lib/dma-debug.c:1123 check_unmap+0x8e0/0xa28
        CPU: 1 PID: 189 Comm: ip Tainted: G           O    4.9.5 #1
        task: dee73400 task.stack: dede2000
        NIP: c02101e8 LR: c02101e8 CTR: c0260d74
        REGS: dede3bb0 TRAP: 0700   Tainted: G           O     (4.9.5)
        MSR: 00021000 <CE,ME>  CR: 28002222  XER: 00000000
      
        GPR00: c02101e8 dede3c60 dee73400 000000b6 dfbd033c dfbd36c4 1f622000 dede2000
        GPR08: 00000007 c05b1634 1f622000 00000000 22002484 100a9904 00000000 00000000
        GPR16: 00000000 db4c849c 00000002 db4c8480 00000001 df142240 db4c84bc 00000000
        GPR24: c0706148 c0700000 00029000 c07552e8 c07323b4 dede3cb8 c07605e0 db535540
        NIP [c02101e8] check_unmap+0x8e0/0xa28
        LR [c02101e8] check_unmap+0x8e0/0xa28
        Call Trace:
        [dede3c60] [c02101e8] check_unmap+0x8e0/0xa28 (unreliable)
        [dede3cb0] [c02103b8] debug_dma_unmap_page+0x88/0x9c
        [dede3d30] [c02dffbc] free_skb_resources+0x2c4/0x404
        [dede3d80] [c02e39b4] gfar_close+0x24/0xc8
        [dede3da0] [c0361550] __dev_close_many+0xa0/0xf8
        [dede3dd0] [c03616f0] __dev_close+0x2c/0x4c
        [dede3df0] [c036b1b8] __dev_change_flags+0xa0/0x174
        [dede3e10] [c036b2ac] dev_change_flags+0x20/0x60
        [dede3e30] [c03e130c] devinet_ioctl+0x540/0x824
        [dede3e90] [c0347dcc] sock_ioctl+0x134/0x298
        [dede3eb0] [c0111814] do_vfs_ioctl+0xac/0x854
        [dede3f20] [c0111ffc] SyS_ioctl+0x40/0x74
        [dede3f40] [c000f290] ret_from_syscall+0x0/0x3c
        --- interrupt: c01 at 0xff45da0
            LR = 0xff45cd0
        Instruction dump:
        811d001c 7c66482e 813d0020 9061000c 807f000c 5463103a 7cc6182e 3c60c052
        386309ac 90c10008 4cc63182 4826b845 <0fe00000> 4bfffa60 3c80c052 388402c4
        ---[ end trace 695ae6d7ac1d0c47 ]---
        Mapped at:
         [<c02e22a8>] gfar_alloc_rx_buffs+0x178/0x248
         [<c02e3ef0>] startup_gfar+0x368/0x570
         [<c036aeb4>] __dev_open+0xdc/0x150
         [<c036b1b8>] __dev_change_flags+0xa0/0x174
         [<c036b2ac>] dev_change_flags+0x20/0x60
      
      Even though the issue was discovered in the 4.9 kernel, the code in question
      is identical in the current net and net-next trees.
      
      Fixes: 75354148 ("gianfar: Add paged allocation and Rx S/G")
      Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
      Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/mlx4_core: Avoid command timeouts during VF driver device shutdown · 2b9f84ef
      Jack Morgenstein authored
      
      [ Upstream commit d585df1c ]
      
      Some Hypervisors detach VFs from VMs by instantly causing an FLR event
      to be generated for a VF.
      
      In the mlx4 case, this will cause that VF's comm channel to be disabled
      before the VM has an opportunity to invoke the VF device's "shutdown"
      method.
      
      The result is that the VF driver on the VM will experience a command
      timeout during the shutdown process when the Hypervisor does not deliver
      a command-completion event to the VM.
      
      To avoid FW command timeouts on the VM when the driver's shutdown method
      is invoked, we detect the absence of the VF's comm channel at the very
      start of the shutdown process. If the comm-channel has already been
      disabled, we cause all FW commands during the device shutdown process to
      immediately return success (and thus avoid all command timeouts).

      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Ben Skeggs authored · 9c7a11e6
    • ipv6: fix flow labels when the traffic class is non-0 · 1507ea6d
      Dimitris Michailidis authored
      
      [ Upstream commit 90427ef5 ]
      
      ip6_make_flowlabel() determines the flow label for IPv6 packets. It's
      supposed to be passed a flow label, which it returns as is if non-0 and
      in some other cases, otherwise it calculates a new value.
      
      The problem is that callers often pass a flowi6.flowlabel, which may also
      contain traffic class bits. If the traffic class is non-0,
      ip6_make_flowlabel() mistakes the non-0 value it gets for a flow label and
      returns the whole thing. Thus it can return a 'flow label' longer than
      20 bits, and the low 20 bits of that are typically 0, resulting in packets
      with a 0 label. Moreover, different packets of a flow may be labeled
      differently. For a TCP flow with ECN, non-payload and payload packets get
      different labels, as exemplified by this pair of consecutive packets:
      
      (pure ACK)
      Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
          0110 .... = Version: 6
          .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
              .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
              .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
          .... .... .... 0001 1100 1110 0100 1001 = Flow Label: 0x1ce49
          Payload Length: 32
          Next Header: TCP (6)
      
      (payload)
      Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
          0110 .... = Version: 6
          .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
              .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
              .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
          .... .... .... 0000 0000 0000 0000 0000 = Flow Label: 0x00000
          Payload Length: 688
          Next Header: TCP (6)
      
      This patch allows ip6_make_flowlabel() to be passed more than just a
      flow label and has it extract the part it really wants. This was simpler
      than modifying the callers. With this patch packets like the above become
      
      Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
          0110 .... = Version: 6
          .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
              .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
              .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
          .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
          Payload Length: 32
          Next Header: TCP (6)
      
      Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
          0110 .... = Version: 6
          .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
              .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
              .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
          .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
          Payload Length: 688
          Next Header: TCP (6)
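
      The masking described above can be sketched in plain C. This is a
      simplified illustration, not the kernel code: it works in host byte
      order, the function names and compute_label() are hypothetical, and the
      "other cases" in which the real ip6_make_flowlabel() returns its input
      unchanged are left out.

```c
#include <stdint.h>

/* Host-order layout of the first 32 bits of an IPv6 header:
 * 4 bits version, 8 bits traffic class, 20 bits flow label. */
#define FLOWLABEL_MASK 0x000FFFFFu

/* Hypothetical stand-in for the label the stack would derive from a flow hash. */
uint32_t compute_label(uint32_t flow_hash)
{
    return flow_hash & FLOWLABEL_MASK;
}

/* Old behaviour: any non-zero input, traffic-class bits included,
 * is mistaken for a caller-supplied flow label and returned as is. */
uint32_t make_flowlabel_old(uint32_t flowinfo, uint32_t flow_hash)
{
    if (flowinfo)
        return flowinfo;
    return compute_label(flow_hash);
}

/* New behaviour: only the low 20 bits are treated as a flow label. */
uint32_t make_flowlabel_new(uint32_t flowinfo, uint32_t flow_hash)
{
    uint32_t label = flowinfo & FLOWLABEL_MASK;

    if (label)
        return label;
    return compute_label(flow_hash);
}
```

      With a traffic class of 0x02 and no explicit label, the input is
      0x00200000: the old variant returns it unchanged (low 20 bits all zero),
      while the new variant falls through to the computed label.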

      Signed-off-by: Dimitris Michailidis <dmichail@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • FS-Cache: Initialise stores_lock in netfs cookie · 95a4659e
      David Howells authored
      
      [ Upstream commit 62deb818 ]
      
      Initialise the stores_lock in fscache netfs cookies.  Technically, it
      shouldn't be necessary, since the netfs cookie is an index and stores no
      data, but initialising it anyway adds insignificant overhead.

      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fscache: Clear outstanding writes when disabling a cookie · 38481d7d
      David Howells authored
      
      [ Upstream commit 6bdded59 ]
      
      fscache_disable_cookie() needs to clear the outstanding writes on the
      cookie it's disabling because they cannot be completed after.
      
      Without this, nfs_fscache_open_file() gets stuck because it disables the
      cookie when the file is opened for writing but can't uncache the pages till
      afterwards - otherwise there's a race between the open routine and anyone
      who already has it open R/O and is still reading from it.
      
      Looking in /proc/pid/stack of the offending process shows:
      
      [<ffffffffa0142883>] __fscache_wait_on_page_write+0x82/0x9b [fscache]
      [<ffffffffa014336e>] __fscache_uncache_all_inode_pages+0x91/0xe1 [fscache]
      [<ffffffffa01740fa>] nfs_fscache_open_file+0x59/0x9e [nfs]
      [<ffffffffa01ccf41>] nfs4_file_open+0x17f/0x1b8 [nfsv4]
      [<ffffffff8117350e>] do_dentry_open+0x16d/0x2b7
      [<ffffffff811743ac>] vfs_open+0x5c/0x65
      [<ffffffff81184185>] path_openat+0x785/0x8fb
      [<ffffffff81184343>] do_filp_open+0x48/0x9e
      [<ffffffff81174710>] do_sys_open+0x13b/0x1cb
      [<ffffffff811747b9>] SyS_open+0x19/0x1b
      [<ffffffff81001c44>] do_syscall_64+0x80/0x17a
      [<ffffffff8165c2da>] return_from_SYSCALL_64+0x0/0x7a
      [<ffffffffffffffff>] 0xffffffffffffffff

      Reported-by: Jianhong Yin <jiyin@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Jeff Layton <jlayton@redhat.com>
      Acked-by: Steve Dickson <steved@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fscache: Fix dead object requeue · b421d230
      David Howells authored
      
      [ Upstream commit e26bfebd ]
      
      Under some circumstances, an fscache object can become queued such that
      fscache_object_work_func() can be called once the object is in the
      OBJECT_DEAD state.  This results in the kernel oopsing when it tries to
      invoke the handler for the state (which is hard coded to 0x2).
      
      The way this comes about is something like the following:
      
       (1) The object dispatcher is processing a work state for an object.  This
           is done in workqueue context.
      
       (2) An out-of-band event comes in that isn't masked, causing the object to
           be queued, say EV_KILL.
      
       (3) The object dispatcher finishes processing the current work state on
           that object and then sees there's another event to process, so,
           without returning to the workqueue core, it processes that event too.
           It then follows the chain of events that initiates until we reach
           OBJECT_DEAD without going through a wait state (such as
           WAIT_FOR_CLEARANCE).
      
           At this point, object->events may be 0, object->event_mask will be 0
           and oob_event_mask will be 0.
      
       (4) The object dispatcher returns to the workqueue processor, and in due
           course, this sees that the object's work item is still queued and
           invokes it again.
      
       (5) The current state is a work state (OBJECT_DEAD), so the dispatcher
           jumps to it - resulting in an OOPS.
      
      When I'm seeing this, the work state in (1) appears to have been either
      LOOK_UP_OBJECT or CREATE_OBJECT (object->oob_table is
      fscache_osm_lookup_oob).
      
      The window for (2) is very small:
      
       (A) object->event_mask is cleared whilst the event dispatch process is
           underway - though there's no memory barrier to force this to the top
           of the function.
      
           The window, therefore is from the time the object was selected by the
           workqueue processor and made requeueable to the time the mask was
           cleared.
      
       (B) fscache_raise_event() will only queue the object if it manages to set
           the event bit and the corresponding event_mask bit was set.
      
           The enqueuement is then deferred slightly whilst we get a ref on the
           object and get the per-CPU variable for workqueue congestion.  This
           slight deferral slightly increases the probability by allowing extra
           time for the workqueue to make the item requeueable.
      
      Handle this by giving the dead state a processor function and checking
      for the dead state address rather than seeing if the processor function is
      address 0x2.  The dead state processor function can then set a flag to
      indicate that it's occurred and give a warning if it occurs more than once
      per object.
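
      The approach in the paragraph above can be sketched as a tiny userspace
      state machine. This is only an illustration under assumed names: struct
      state, struct object, dispatch() and the two states are hypothetical and
      far simpler than the fscache object state machine; where the kernel would
      print a warning, the sketch just counts repeat visits.

```c
#include <stdbool.h>
#include <stddef.h>

struct object;

/* A state carries a real processor function instead of a magic address. */
struct state {
    const char *name;
    const struct state *(*work)(struct object *obj);
};

struct object {
    const struct state *state;
    bool dead_seen;               /* set the first time the dead state runs */
    unsigned int extra_dead_runs; /* the kernel would warn instead of counting */
};

const struct state *dead_work(struct object *obj);

const struct state object_dead = { "OBJECT_DEAD", dead_work };

/* Hypothetical work state that moves the object straight to the dead state. */
const struct state *kill_work(struct object *obj)
{
    (void)obj;
    return &object_dead;
}

const struct state object_kill = { "KILL_OBJECT", kill_work };

/* Dead-state processor: safe to call, records that it ran. */
const struct state *dead_work(struct object *obj)
{
    if (obj->dead_seen)
        obj->extra_dead_runs++;
    obj->dead_seen = true;
    return NULL; /* no further transition */
}

/* Dispatcher: identifies the terminal state by comparing the state address.
 * Returns false once the object is dead, true otherwise. */
bool dispatch(struct object *obj)
{
    const struct state *next = obj->state->work(obj);

    if (obj->state == &object_dead)
        return false;
    if (next)
        obj->state = next;
    return true;
}
```

      A spurious requeue after the object has died then calls dead_work()
      again, which is harmless, instead of jumping to an invalid address.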
      
      If this race occurs, an oops similar to the following is seen (note the RIP
      value):
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
      IP: [<0000000000000002>] 0x1
      PGD 0
      Oops: 0010 [#1] SMP
      Modules linked in: ...
      CPU: 17 PID: 16077 Comm: kworker/u48:9 Not tainted 3.10.0-327.18.2.el7.x86_64 #1
      Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015
      Workqueue: fscache_object fscache_object_work_func [fscache]
      task: ffff880302b63980 ti: ffff880717544000 task.ti: ffff880717544000
      RIP: 0010:[<0000000000000002>]  [<0000000000000002>] 0x1
      RSP: 0018:ffff880717547df8  EFLAGS: 00010202
      RAX: ffffffffa0368640 RBX: ffff880edf7a4480 RCX: dead000000200200
      RDX: 0000000000000002 RSI: 00000000ffffffff RDI: ffff880edf7a4480
      RBP: ffff880717547e18 R08: 0000000000000000 R09: dfc40a25cb3a4510
      R10: dfc40a25cb3a4510 R11: 0000000000000400 R12: 0000000000000000
      R13: ffff880edf7a4510 R14: ffff8817f6153400 R15: 0000000000000600
      FS:  0000000000000000(0000) GS:ffff88181f420000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000002 CR3: 000000000194a000 CR4: 00000000001407e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Stack:
       ffffffffa0363695 ffff880edf7a4510 ffff88093f16f900 ffff8817faa4ec00
       ffff880717547e60 ffffffff8109d5db 00000000faa4ec18 0000000000000000
       ffff8817faa4ec18 ffff88093f16f930 ffff880302b63980 ffff88093f16f900
      Call Trace:
       [<ffffffffa0363695>] ? fscache_object_work_func+0xa5/0x200 [fscache]
       [<ffffffff8109d5db>] process_one_work+0x17b/0x470
       [<ffffffff8109e4ac>] worker_thread+0x21c/0x400
       [<ffffffff8109e290>] ? rescuer_thread+0x400/0x400
       [<ffffffff810a5acf>] kthread+0xcf/0xe0
       [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
       [<ffffffff816460d8>] ret_from_fork+0x58/0x90
       [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140

      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Jeremy McNicoll <jeremymc@redhat.com>
      Tested-by: Frank Sorenson <sorenson@redhat.com>
      Tested-by: Benjamin Coddington <bcodding@redhat.com>
      Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ethtool: do not vzalloc(0) on registers dump · e6b15f0f
      Stanislaw Gruszka authored
      
      [ Upstream commit 3808d348 ]
      
      If the ->get_regs_len() callback returns 0, we allocate 0 bytes of memory,
      which prints an ugly warning in dmesg (shown further below).
      
      This happens on mac80211 devices, where ieee80211_get_regs_len() just
      returns 0 and the driver only fills the ethtool_regs structure without
      actually providing any dump. However, I assume this can happen with other
      drivers too, e.g. when a driver provides a regs dump for some devices but
      not for others. Hence preventing the warning from being printed in the
      ethtool code seems reasonable.
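
      The idea of not allocating when the reported length is zero can be
      sketched in userspace C. The helper name alloc_regs_buffer() is
      hypothetical and calloc() stands in for vzalloc(); this is a sketch of
      the guard, not the ethtool code itself.

```c
#include <stdlib.h>

/* Allocate a zeroed register-dump buffer only when the driver reports a
 * non-zero length. Returns 0 on success (with *out == NULL when there is
 * nothing to dump) and -1 if a non-zero allocation fails. */
int alloc_regs_buffer(size_t reglen, void **out)
{
    *out = NULL;
    if (reglen == 0)
        return 0;   /* zero-length dump: valid, and no allocation attempted */

    *out = calloc(1, reglen);
    return *out ? 0 : -1;
}
```

      A caller then treats a NULL buffer with a zero length as a valid empty
      dump and only reports an error when a non-zero allocation fails.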
      
      ethtool: vmalloc: allocation failure: 0 bytes, mode:0x24080c2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_ZERO)
      <snip>
      Call Trace:
      [<ffffffff813bde47>] dump_stack+0x63/0x8c
      [<ffffffff811b0a1f>] warn_alloc+0x13f/0x170
      [<ffffffff811f0476>] __vmalloc_node_range+0x1e6/0x2c0
      [<ffffffff811f0874>] vzalloc+0x54/0x60
      [<ffffffff8169986c>] dev_ethtool+0xb4c/0x1b30
      [<ffffffff816adbb1>] dev_ioctl+0x181/0x520
      [<ffffffff816714d2>] sock_do_ioctl+0x42/0x50
      <snip>
      Mem-Info:
      active_anon:435809 inactive_anon:173951 isolated_anon:0
       active_file:835822 inactive_file:196932 isolated_file:0
       unevictable:0 dirty:8 writeback:0 unstable:0
       slab_reclaimable:157732 slab_unreclaimable:10022
       mapped:83042 shmem:306356 pagetables:9507 bounce:0
       free:130041 free_pcp:1080 free_cma:0
      Node 0 active_anon:1743236kB inactive_anon:695804kB active_file:3343288kB inactive_file:787728kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:332168kB dirty:32kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 1225424kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
      Node 0 DMA free:15900kB min:136kB low:168kB high:200kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15900kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
      lowmem_reserve[]: 0 3187 7643 7643
      Node 0 DMA32 free:419732kB min:28124kB low:35152kB high:42180kB active_anon:541180kB inactive_anon:248988kB active_file:1466388kB inactive_file:389632kB unevictable:0kB writepending:0kB present:3370280kB managed:3290932kB mlocked:0kB slab_reclaimable:217184kB slab_unreclaimable:4180kB kernel_stack:160kB pagetables:984kB bounce:0kB free_pcp:2236kB local_pcp:660kB free_cma:0kB
      lowmem_reserve[]: 0 0 4456 4456

      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • log2: make order_base_2() behave correctly on const input value zero · 98066076
      Ard Biesheuvel authored
      commit 29905b52 upstream.
      
      The function order_base_2() is defined (according to the comment block)
      as returning zero on input zero, but subsequently passes the input into
      roundup_pow_of_two(), which is explicitly undefined for input zero.
      
      This has gone unnoticed until now, but optimization passes in GCC 7 may
      produce constant folded function instances where a constant value of
      zero is passed into order_base_2(), resulting in link errors against the
      deliberately undefined '____ilog2_NaN'.
      
      So update order_base_2() to adhere to its own documented interface.
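
      The documented interface can be illustrated with a small plain-C sketch.
      The names ilog2_u() and order_base_2_sketch() are hypothetical; the
      kernel implements these as macros in log2.h, and this is not the patch
      itself, only the behaviour it is meant to guarantee.

```c
/* floor(log2(n)) for n > 0, computed by shifting. */
unsigned int ilog2_u(unsigned long n)
{
    unsigned int log = 0;

    while (n >>= 1)
        log++;
    return log;
}

/* Smallest order such that (1UL << order) >= n.
 * Inputs 0 and 1 are handled explicitly and both yield 0, so no
 * round-up-to-power-of-two step is ever evaluated for zero. */
unsigned int order_base_2_sketch(unsigned long n)
{
    if (n < 2)
        return 0;
    return ilog2_u(n - 1) + 1;
}
```

      For example, inputs 0, 1, 2, 3, 4 and 5 give orders 0, 0, 1, 2, 2 and 3.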
      
      [ See
      
           http://marc.info/?l=linux-kernel&m=147672952517795&w=2
      
        and follow-up discussion for more background. The gcc "optimization
        pass" is really just broken, but now the GCC trunk problem seems to
        have escaped out of just specially built daily images, so we need to
        work around it in mainline.    - Linus ]

      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kasan: respect /proc/sys/kernel/traceoff_on_warning · 55d0f89a
      Peter Zijlstra authored
      
      [ Upstream commit 4f40c6e5 ]
      
      After much waiting I finally reproduced a KASAN issue, only to find my
      trace-buffer empty of useful information because it got spooled out :/
      
      Make kasan_report honour the /proc/sys/kernel/traceoff_on_warning
      interface.
      
      Link: http://lkml.kernel.org/r/20170125164106.3514-1-aryabinin@virtuozzo.com
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Acked-by: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • jump label: pass kbuild_cflags when checking for asm goto support · 1948d0af
      David Lin authored
      
      [ Upstream commit 35f860f9 ]
      
      Some versions of the ARM GCC compiler, such as the Android toolchain, throw
      in a '-fpic' flag by default.  This causes the gcc-goto check script to fail
      even though some configs would have the '-fno-pic' flag in KBUILD_CFLAGS.
      
      This patch passes the KBUILD_CFLAGS to the check script so that the
      script does not rely on the default config from different compilers.
      
      Link: http://lkml.kernel.org/r/20170120234329.78868-1-dtwlin@google.com
      Signed-off-by: David Lin <dtwlin@google.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Michal Marek <mmarek@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PM / runtime: Avoid false-positive warnings from might_sleep_if() · 266e02bc
      Rafael J. Wysocki authored
      
      [ Upstream commit a9306a63 ]
      
      The might_sleep_if() assertions in __pm_runtime_idle(),
      __pm_runtime_suspend() and __pm_runtime_resume() may generate
      false-positive warnings in some situations.  For example, that
      happens if a nested pm_runtime_get_sync()/pm_runtime_put() pair
      is executed with disabled interrupts within an outer
      pm_runtime_get_sync()/pm_runtime_put() section for the same device.
      [Generally, pm_runtime_get_sync() may sleep, so it should not be
      called with disabled interrupts, but in this particular case the
      previous pm_runtime_get_sync() guarantees that the device will not
      be suspended, so the inner pm_runtime_get_sync() will return
      immediately after incrementing the device's usage counter.]
      
      That started to happen in the i915 driver in 4.10-rc, leading to
      the following splat:
      
       BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1032
       in_atomic(): 1, irqs_disabled(): 0, pid: 1500, name: Xorg
       1 lock held by Xorg/1500:
        #0:  (&dev->struct_mutex){+.+.+.}, at:
        [<ffffffffa0680c13>] i915_mutex_lock_interruptible+0x43/0x140 [i915]
       CPU: 0 PID: 1500 Comm: Xorg Not tainted
       Call Trace:
        dump_stack+0x85/0xc2
        ___might_sleep+0x196/0x260
        __might_sleep+0x53/0xb0
        __pm_runtime_resume+0x7a/0x90
        intel_runtime_pm_get+0x25/0x90 [i915]
        aliasing_gtt_bind_vma+0xaa/0xf0 [i915]
        i915_vma_bind+0xaf/0x1e0 [i915]
        i915_gem_execbuffer_relocate_entry+0x513/0x6f0 [i915]
        i915_gem_execbuffer_relocate_vma.isra.34+0x188/0x250 [i915]
        ? trace_hardirqs_on+0xd/0x10
        ? i915_gem_execbuffer_reserve_vma.isra.31+0x152/0x1f0 [i915]
        ? i915_gem_execbuffer_reserve.isra.32+0x372/0x3a0 [i915]
        i915_gem_do_execbuffer.isra.38+0xa70/0x1a40 [i915]
        ? __might_fault+0x4e/0xb0
        i915_gem_execbuffer2+0xc5/0x260 [i915]
        ? __might_fault+0x4e/0xb0
        drm_ioctl+0x206/0x450 [drm]
        ? i915_gem_execbuffer+0x340/0x340 [i915]
        ? __fget+0x5/0x200
        do_vfs_ioctl+0x91/0x6f0
        ? __fget+0x111/0x200
        ? __fget+0x5/0x200
        SyS_ioctl+0x79/0x90
        entry_SYSCALL_64_fastpath+0x23/0xc6
      
      even though the code triggering it is correct.
      
      Unfortunately, the might_sleep_if() assertions in question are
      too coarse-grained to cover such cases correctly, so make them
      a bit less sensitive in order to avoid the false-positives.

      Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches · 8d228758
      Linus Lüssing authored
      
      [ Upstream commit a088d1d7 ]
      
      When for instance a mobile Linux device roams from one access point to
      another with both APs sharing the same broadcast domain and a
      multicast snooping switch in between:
      
      1)    (c) <~~~> (AP1) <--[SSW]--> (AP2)
      
      2)              (AP1) <--[SSW]--> (AP2) <~~~> (c)
      
      Then currently IPv6 multicast packets will get lost for (c) until an
      MLD Querier sends its next query message. The packet loss occurs
      because upon roaming the Linux host so far stayed silent regarding
      MLD and the snooping switch will therefore be unaware of the
      multicast topology change for a while.
      
      This patch fixes this by always resending MLD reports when an interface
      change happens, for instance from NO-CARRIER to CARRIER state.

      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • i2c: piix4: Fix request_region size · ee0cd477
      Ricardo Ribalda authored
      
      [ Upstream commit f43128c7 ]
      
      Since '701dc207 ("i2c: piix4: Avoid race conditions with IMC")' we
      are using the SMBSLVCNT register at offset 0x8. We need to request it.
      
      Fixes: 701dc207 ("i2c: piix4: Avoid race conditions with IMC")
      Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
      Signed-off-by: Jean Delvare <jdelvare@suse.de>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications · 68cac074
      Stefan Brüns authored
      
      [ Upstream commit 5a70348e ]
      
      If a context is configured as dualstack ("IPv4v6"), the modem indicates
      the context activation with a slightly different indication message.
      The dual-stack indication omits the link_type (IPv4/v6) and adds
      additional address fields.
      IPv6 LSIs are identical to IPv4 LSIs, but have a different link type.

      Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
      Reviewed-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sierra_net: Skip validating irrelevant fields for IDLE LSIs · d95ffdd3
      Stefan Brüns authored
      
      [ Upstream commit 764895d3 ]
      
      When the context is deactivated, the link_type is set to 0xff, which
      triggers a warning message, and results in a wrong link status, as
      the LSI is ignored.

      Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: hns: Fix the device being used for dma mapping during TX · 716cca0a
      Kejian Yan authored
      
      [ Upstream commit b85ea006 ]
      
      This patch fixes the device being used to DMA map skb->data.
      An erroneous device assignment causes a crash when the SMMU is enabled.
      This happens during TX, since the buffer gets DMA mapped with the device
      corresponding to net_device and gets unmapped using the device
      related to DSAF.

      Signed-off-by: Kejian Yan <yankejian@huawei.com>
      Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
      Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Ralf Baechle's avatar
      NET: mkiss: Fix panic · aacf9de1
      Ralf Baechle authored
      
      [ Upstream commit 7ba1b689 ]
      
      If a USB-to-serial adapter is unplugged, the driver re-initializes, with
      dev->hard_header_len and dev->addr_len set to zero, instead of the correct
      values.  If then a packet is sent through the half-dead interface, the
      kernel will panic due to running out of headroom in the skb when pushing
      for the AX.25 headers resulting in this panic:
      
      [<c0595468>] (skb_panic) from [<c0401f70>] (skb_push+0x4c/0x50)
      [<c0401f70>] (skb_push) from [<bf0bdad4>] (ax25_hard_header+0x34/0xf4 [ax25])
      [<bf0bdad4>] (ax25_hard_header [ax25]) from [<bf0d05d4>] (ax_header+0x38/0x40 [mkiss])
      [<bf0d05d4>] (ax_header [mkiss]) from [<c041b584>] (neigh_compat_output+0x8c/0xd8)
      [<c041b584>] (neigh_compat_output) from [<c043e7a8>] (ip_finish_output+0x2a0/0x914)
      [<c043e7a8>] (ip_finish_output) from [<c043f948>] (ip_output+0xd8/0xf0)
      [<c043f948>] (ip_output) from [<c043f04c>] (ip_local_out_sk+0x44/0x48)
      
      This patch makes mkiss behave like the 6pack driver. 6pack does not
      panic.  In 6pack.c sp_setup() (same function name here) the values for
      dev->hard_header_len and dev->addr_len are set to the same values as in
      my mkiss patch.
      
      [ralf@linux-mips.org: Massages original submission to conform to the usual
      standards for patch submissions.]
      Signed-off-by: Thomas Osterried <thomas@osterried.de>
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      aacf9de1
    • Ralf Baechle's avatar
      NET: Fix /proc/net/arp for AX.25 · b9e9045d
      Ralf Baechle authored
      
      [ Upstream commit 4872e57c ]
      
      When sending ARP requests over AX.25 links the hwaddress in the neighbour
      cache is not getting initialized.  For such an incomplete arp entry
      ax2asc2 will generate an empty string resulting in /proc/net/arp output
      like the following:
      
      $ cat /proc/net/arp
      IP address       HW type     Flags       HW address            Mask     Device
      192.168.122.1    0x1         0x2         52:54:00:00:5d:5f     *        ens3
      172.20.1.99      0x3         0x0              *        bpq0
      
      The missing field will confuse the procfs parsing of arp(8) resulting in
      incorrect output for the device such as the following:
      
      $ arp
      Address                  HWtype  HWaddress           Flags Mask            Iface
      gateway                  ether   52:54:00:00:5d:5f   C                     ens3
      172.20.1.99                      (incomplete)                              ens3
      
      This changes the content of /proc/net/arp to:
      
      $ cat /proc/net/arp
      IP address       HW type     Flags       HW address            Mask     Device
      172.20.1.99      0x3         0x0         *                     *        bpq0
      192.168.122.1    0x1         0x2         52:54:00:00:5d:5f     *        ens3
      
      To do so it changes ax2asc to put the string "*" in buf for a NULL address
      argument.  Finally the HW address field is left aligned in a 17 character
      field (the length of an ethernet HW address in the usual hex notation) for
      readability.
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b9e9045d
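The row layout described in the commit above can be illustrated with a small formatter. This is only a sketch in Python, not the kernel's seq_file code: the function names and the exact column widths other than the 17-character HW address field are assumptions chosen to resemble the sample output.

```python
def format_hw_address(addr):
    """Return "*" when no hardware address is known (assumed placeholder rule)."""
    return addr if addr else "*"


def format_arp_row(ip, hw_type, flags, hw_addr, mask, device):
    # The HW address is left-aligned in a 17-character field, the length
    # of an Ethernet address written as xx:xx:xx:xx:xx:xx. The other
    # widths are illustrative assumptions.
    return "%-16s 0x%-10x0x%-10x%-17s     %-8s %s" % (
        ip, hw_type, flags, format_hw_address(hw_addr), mask, device)


if __name__ == "__main__":
    print(format_arp_row("192.168.122.1", 0x1, 0x2, "52:54:00:00:5d:5f", "*", "ens3"))
    print(format_arp_row("172.20.1.99", 0x3, 0x0, None, "*", "bpq0"))
```

Because every row now carries the same number of whitespace-separated fields, a parser that splits on whitespace sees the device name in the same position for complete and incomplete entries.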
    • Jonathan T. Leighton's avatar
      ipv6: Inhibit IPv4-mapped src address on the wire. · 23287661
      Jonathan T. Leighton authored
      
      [ Upstream commit ec5e3b0a ]
      
      This patch adds a check for the problematic case of an IPv4-mapped IPv6
      source address and a destination address that is neither an IPv4-mapped
      IPv6 address nor in6addr_any, and returns an appropriate error. The
      check is done before returning from looking up the route.
      Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      23287661
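The condition described in the commit above can be expressed compactly. The following is a sketch using Python's standard ipaddress module, not the kernel's route lookup code; the function name check_src_dst is hypothetical.

```python
import ipaddress


def check_src_dst(src, dst):
    """Return False for the rejected case: an IPv4-mapped source with a
    destination that is neither IPv4-mapped nor the unspecified address."""
    src = ipaddress.IPv6Address(src)
    dst = ipaddress.IPv6Address(dst)
    src_mapped = src.ipv4_mapped is not None
    dst_mapped = dst.ipv4_mapped is not None
    dst_any = dst == ipaddress.IPv6Address("::")
    return not (src_mapped and not dst_mapped and not dst_any)
```

For example, a source of ::ffff:10.0.0.1 with destination 2001:db8::1 is refused, while the same source with destination ::ffff:10.0.0.2 or :: passes this check.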
    • Jonathan T. Leighton's avatar
      ipv6: Handle IPv4-mapped src to in6addr_any dst. · 8faccb2b
      Jonathan T. Leighton authored
      
      [ Upstream commit 052d2369 ]
      
      This patch adds a check on the type of the source address for the case
      where the destination address is in6addr_any. If the source is an
      IPv4-mapped IPv6 source address, the destination is changed to
      ::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
      is done in three locations to handle UDP calls to either connect() or
      sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
      handling an in6addr_any destination until very late, so the patch only
      needs to handle the case where the source is an IPv4-mapped IPv6
      address.
      Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8faccb2b
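The destination rewrite described in the commit above can be sketched as follows. This is a Python model using the standard ipaddress module, not the kernel code in the UDP and TCP paths; rewrite_any_destination is a hypothetical name.

```python
import ipaddress

IN6ADDR_ANY = ipaddress.IPv6Address("::")


def rewrite_any_destination(src, dst):
    """Pick a loopback destination when dst is the unspecified address."""
    src = ipaddress.IPv6Address(src)
    dst = ipaddress.IPv6Address(dst)
    if dst != IN6ADDR_ANY:
        return dst  # nothing to rewrite
    if src.ipv4_mapped is not None:
        # IPv4-mapped source: use the IPv4-mapped loopback address.
        return ipaddress.IPv6Address("::ffff:127.0.0.1")
    return ipaddress.IPv6Address("::1")
```

Choosing the loopback address by source type keeps both ends of the connection in the same address family.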
    • Anssi Hannula's avatar
      net: xilinx_emaclite: fix receive buffer overflow · 10a76297
      Anssi Hannula authored
      
      [ Upstream commit cd224553 ]
      
      xilinx_emaclite looks at the received data to try to determine the
      Ethernet packet length but does not properly clamp it if
      proto_type == ETH_P_IP or 1500 < proto_type <= 1518, causing a buffer
      overflow and a panic via skb_panic() as the length exceeds the allocated
      skb size.
      
      Fix those cases.
      
      Also add an additional unconditional check with WARN_ON() at the end.
      Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
      Fixes: bb81b2dd ("net: add Xilinx emac lite device driver")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      10a76297
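The idea of deriving a frame length from received data and then clamping it can be sketched as below. This is a simplified Python model and not the driver's actual logic: the function rx_length, its parameters, and the way each case computes a length are assumptions made for illustration. Only the need to bound the result by the buffer size comes from the commit text above.

```python
ETH_HLEN = 14        # Ethernet header length
ETH_FCS_LEN = 4      # frame check sequence length
ETH_DATA_LEN = 1500  # maximum payload
ETH_FRAME_LEN = ETH_HLEN + ETH_DATA_LEN
ETH_P_IP = 0x0800


def rx_length(proto_type, ip_total_len, maxlen):
    """Guess the frame length from the type/length field, bounded by maxlen."""
    if proto_type > ETH_DATA_LEN:
        if proto_type == ETH_P_IP:
            # Length taken from the (untrusted) IP header.
            length = ip_total_len + ETH_HLEN + ETH_FCS_LEN
        else:
            # Unknown type: assume a full-size frame.
            length = ETH_FRAME_LEN + ETH_FCS_LEN
    else:
        # The field is a payload length.
        length = proto_type + ETH_HLEN + ETH_FCS_LEN
    # Unconditional final clamp so the result never exceeds the buffer.
    return min(length, maxlen)
```

The final min() plays the role of a last-line check: whatever the earlier branches compute, the returned length cannot exceed the space that was allocated.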
    • Anssi Hannula's avatar
      net: xilinx_emaclite: fix freezes due to unordered I/O · 7f71f22a
      Anssi Hannula authored
      
      [ Upstream commit acf138f1 ]
      
      The xilinx_emaclite uses __raw_writel and __raw_readl for register
      accesses. Those functions do not imply any kind of memory barriers and
      they may be reordered.
      
      The driver does not seem to take that into account, though, and the
      driver does not satisfy the ordering requirements of the hardware.
      For clear examples, see xemaclite_mdio_write() and xemaclite_mdio_read()
      which try to set MDIO address before initiating the transaction.
      
      I'm seeing system freezes with the driver with GCC 5.4 and current
      Linux kernels on Zynq-7000 SoC immediately when trying to use the
      interface.
      
      In commit 123c1407 ("net: emaclite: Do not use microblaze and ppc
      IO functions") the driver was switched from non-generic
      in_be32/out_be32 (memory barriers, big endian) to
      __raw_readl/__raw_writel (no memory barriers, native endian), so
      apparently the device follows system endianness and the driver was
      originally written with the assumption of memory barriers.
      
      Rather than try to hunt for each case of missing barrier, just switch
      the driver to use iowrite32/ioread32/iowrite32be/ioread32be depending
      on endianness instead.
      
      Tested on little-endian Zynq-7000 ARM SoC FPGA.
      Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
      Fixes: 123c1407 ("net: emaclite: Do not use microblaze and ppc IO
      functions")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7f71f22a
    • Sachin Prabhu's avatar
      Call echo service immediately after socket reconnect · 2ba464a4
      Sachin Prabhu authored
      commit b8c60012 upstream.
      
      Commit 4fcd1813 ("Fix reconnect to not defer smb3 session reconnect
      long after socket reconnect") changes the behaviour of the SMB2 echo
      service and causes it to renegotiate after a socket reconnect. However
      under default settings, the echo service could take up to 120 seconds to
      be scheduled.
      
      The patch forces the echo service to be called immediately resulting in a
      negotiate call being made immediately on reconnect.
      Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
      Acked-by: Sachin Prabhu <sprabhu@redhat.com>
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2ba464a4
    • Malcolm Priestley's avatar
      staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory. · 691fe561
      Malcolm Priestley authored
      commit baabd567 upstream.
      
      The driver attempts to alter memory that is mapped to PCI device.
      
      This is because tx_fwinfo_8190pci points to skb->data
      
      Move the pci_map_single to when completed buffer is ready to be mapped with
      psdec is empty to drop on mapping error.
      Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      691fe561
    • Fabio Estevam's avatar
      ARM: dts: imx6dl: Fix the VDD_ARM_CAP voltage for 396MHz operation · 3fc4d704
      Fabio Estevam authored
      commit 46350b71 upstream.
      
      Table 8 from MX6DL datasheet (IMX6SDLCEC Rev. 5, 06/2015):
      http://cache.nxp.com/files/32bit/doc/data_sheet/IMX6SDLCEC.pdf
      
      states the following:
      
      "LDO Output Set Point (VDD_ARM_CAP) = 1.125 V minimum for operation
      up to 396 MHz."
      
      So fix the entry by adding the 25mV margin value as done in the other
      entries of the table, which results in 1.15V for 396MHz operation.
      Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
      Signed-off-by: Shawn Guo <shawnguo@kernel.org>
      Cc: Stephane Fillod <f8cfe@free.fr>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3fc4d704
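The arithmetic in the commit above (datasheet minimum plus a 25 mV margin) can be checked in integer microvolts, which avoids floating-point rounding. This is a sketch; the names below are hypothetical and not taken from the device tree.

```python
MARGIN_UV = 25_000  # 25 mV margin, expressed in microvolts


def operating_point_uv(datasheet_min_uv, margin_uv=MARGIN_UV):
    """Return the voltage to program: datasheet minimum plus margin."""
    return datasheet_min_uv + margin_uv


if __name__ == "__main__":
    # 1.125 V minimum for operation up to 396 MHz
    print(operating_point_uv(1_125_000))
```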
    • Richard's avatar
      partitions/msdos: FreeBSD UFS2 file systems are not recognized · b28c21ba
      Richard authored
      commit 22322035 upstream.
      
      The code in block/partitions/msdos.c recognizes FreeBSD, OpenBSD
      and NetBSD partitions and does a reasonable job picking out OpenBSD
      and NetBSD UFS subpartitions.
      
      But for FreeBSD the subpartitions are always "bad".
      
          Kernel: <bsd:bad subpartition - ignored
      
      Though all 3 of these BSD systems use UFS as a file system, only
      FreeBSD uses relative start addresses in the subpartition
      declarations.
      
      The following patch fixes this for FreeBSD partitions and leaves
      the code for OpenBSD and NetBSD intact:
      Signed-off-by: Richard Narron <comet.berkeley@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b28c21ba
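The difference between relative and absolute subpartition start addresses, as described in the commit above, can be sketched as follows. This is a Python model, not the code in block/partitions/msdos.c: the flavour strings, function names, and the simple containment check are assumptions made for illustration.

```python
def absolute_start(flavour, parent_offset, sub_start):
    """Convert a subpartition start to an absolute sector number.

    Assumption: only the "freebsd" flavour stores starts relative to the
    enclosing partition; the others already store absolute values.
    """
    if flavour == "freebsd":
        return parent_offset + sub_start
    return sub_start


def subpartition_ok(parent_offset, parent_size, start, size):
    """Accept a subpartition that lies entirely inside its parent."""
    return parent_offset <= start and start + size <= parent_offset + parent_size
```

With a parent partition at sector 2048, a relative start of 16 becomes sector 2064 and passes the containment check. Read as an absolute value, the same 16 lies outside the parent and would be rejected as a bad subpartition.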
    • Heiko Carstens's avatar
      s390/vmem: fix identity mapping · 0fb2a1fe
      Heiko Carstens authored
      commit c34a6905 upstream.
      
      The identity mapping is suboptimal for the last 2GB frame. The mapping
      will be established with a mix of 4KB and 1MB mappings instead of a
      single 2GB mapping.
      
      This happens because of an off-by-one bug introduced with
      commit 50be6345 ("s390/mm: Convert bootmem to memblock").
      
      Currently the identity mapping looks like this:
      
      0x0000000080000000-0x0000000180000000        4G PUD RW
      0x0000000180000000-0x00000001fff00000     2047M PMD RW
      0x00000001fff00000-0x0000000200000000        1M PTE RW
      
      With the bug fixed it looks like this:
      
      0x0000000080000000-0x0000000200000000        6G PUD RW
      
      Fixes: 50be6345 ("s390/mm: Convert bootmem to memblock")
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Jean Delvare <jdelvare@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0fb2a1fe
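The effect of the off-by-one described in the commit above can be reproduced with a toy mapper. This is a Python sketch, not the s390 vmem code: it assumes the bug amounts to an end address that is one byte too small, and that the mapper greedily picks the largest aligned unit (2GB, 1MB, 4KB) that still fits. The function and constant names are hypothetical.

```python
SZ_4K = 1 << 12
SZ_1M = 1 << 20
SZ_2G = 1 << 31


def map_range(start, end):
    """Greedily map [start, end) and return the bytes mapped per level."""
    mapped = {"PUD": 0, "PMD": 0, "PTE": 0}
    addr = start
    while addr < end:
        if addr % SZ_2G == 0 and addr + SZ_2G <= end:
            level, step = "PUD", SZ_2G   # 2GB frame
        elif addr % SZ_1M == 0 and addr + SZ_1M <= end:
            level, step = "PMD", SZ_1M   # 1MB segment
        else:
            level, step = "PTE", SZ_4K   # 4KB page
        mapped[level] += step
        addr += step
    return mapped


if __name__ == "__main__":
    print(map_range(0x80000000, 0x200000000 - 1))  # end one byte short
    print(map_range(0x80000000, 0x200000000))      # correct exclusive end
```

Under these assumptions the short end address yields 4G of PUD, 2047M of PMD and 1M of PTE mappings, matching the first listing, while the correct end yields a single 6G PUD range.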
  2. 14 Jun, 2017 11 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.4.72 · 30c9187f
      Greg Kroah-Hartman authored
      30c9187f
    • Mark Rutland's avatar
      arm64: ensure extension of smp_store_release value · 4e528eb9
      Mark Rutland authored
      commit 994870be upstream.
      
      When an inline assembly operand's type is narrower than the register it
      is allocated to, the least significant bits of the register (up to the
      operand type's width) are valid, and any other bits are permitted to
      contain any arbitrary value. This aligns with the AAPCS64 parameter
      passing rules.
      
      Our __smp_store_release() implementation does not account for this, and
      implicitly assumes that operands have been zero-extended to the width of
      the type being stored to. Thus, we may store unknown values to memory
      when the value type is narrower than the pointer type (e.g. when storing
      a char to a long).
      
      This patch fixes the issue by casting the value operand to the same
      width as the pointer operand in all cases, which ensures that the value
      is zero-extended as we expect. We use the same union trickery as
      __smp_load_acquire and {READ,WRITE}_ONCE() to avoid GCC complaining that
      pointers are potentially cast to narrower width integers in unreachable
      paths.
      
      A whitespace issue at the top of __smp_store_release() is also
      corrected.
      
      No changes are necessary for __smp_load_acquire(). Load instructions
      implicitly clear any upper bits of the register, and the compiler will
      only consider the least significant bits of the register as valid
      regardless.
      
      Fixes: 47933ad4 ("arch: Introduce smp_load_acquire(), smp_store_release()")
      Fixes: 878a84d5 ("arm64: add missing data types in smp_load_acquire/smp_store_release")
      Cc: <stable@vger.kernel.org> # 3.14.x-
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4e528eb9
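The hazard described in the commit above can be modelled with plain integers. This is a Python sketch of the register semantics only, not the arm64 inline assembly; the function names and the example register contents are hypothetical.

```python
def store_unextended(register_value, pointer_bits):
    """Store the register as-is, truncated only to the destination width."""
    return register_value & ((1 << pointer_bits) - 1)


def store_zero_extended(register_value, value_bits, pointer_bits):
    """Keep only the bits valid for the value's type, then store."""
    value = register_value & ((1 << value_bits) - 1)
    return value & ((1 << pointer_bits) - 1)


if __name__ == "__main__":
    # A char 0x41 held in a 64-bit register whose upper bits are arbitrary.
    reg = 0xDEADBEEF00000041
    print(hex(store_unextended(reg, 64)))        # upper bits leak into memory
    print(hex(store_zero_extended(reg, 8, 64)))  # only the char is stored
```

In the first call the arbitrary upper bits end up in memory when an 8-bit value is stored through a 64-bit pointer; masking to the value's own width first gives the expected result.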
    • Mark Rutland's avatar
      arm64: armv8_deprecated: ensure extension of addr · 01ce16f4
      Mark Rutland authored
      commit 55de49f9 upstream.
      
      Our compat swp emulation holds the compat user address in an unsigned
      int, which it passes to __user_swpX_asm(). When a 32-bit value is passed
      in a register, the upper 32 bits of the register are unknown, and we
      must extend the value to 64 bits before we can use it as a base address.
      
      This patch casts the address to unsigned long to ensure it has been
      suitably extended, avoiding the potential issue, and silencing a related
      warning from clang.
      
      Fixes: bd35a4ad ("arm64: Port SWP/SWPB emulation support from arm")
      Cc: <stable@vger.kernel.org> # 3.19.x-
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      01ce16f4
    • Kees Cook's avatar
      usercopy: Adjust tests to deal with SMAP/PAN · 51ff10e7
      Kees Cook authored
      commit f5f893c5 upstream.
      
      Under SMAP/PAN/etc, we cannot write directly to userspace memory, so
      this rearranges the test bytes to get written through copy_to_user().
      Additionally drops the bad copy_from_user() test that would trigger a
      memcpy() against userspace on failure.
      
      [arnd: the test module was added in 3.14, and this backported patch
             should apply cleanly on all versions from 3.14 to 4.10.
             The original patch was in 4.11 on top of a context change.
             I saw the bug triggered with kselftest on a 4.4.y stable kernel]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      51ff10e7
    • Mike Marciniszyn's avatar
      RDMA/qib,hfi1: Fix MR reference count leak on write with immediate · 746d4893
      Mike Marciniszyn authored
      commit 1feb4006 upstream.
      
      The handling of IB_RDMA_WRITE_ONLY_WITH_IMMEDIATE will leak a memory
      reference when a buffer cannot be allocated for returning the immediate
      data.
      
      The issue is that the rkey validation has already occurred and the RNR
      nak fails to release the reference that was fruitlessly gotten.  The
      peer will send the identical single packet request when its RNR
      timer pops.
      
      The fix is to release the held reference prior to the rnr nak exit.
      This is the only sequence that requires both rkey validation and the
      buffer allocation on the same packet.
      
      Cc: Stable <stable@vger.kernel.org> # 4.7+
      Tested-by: Tadeusz Struk <tadeusz.struk@intel.com>
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      746d4893
    • Kristina Martsenko's avatar
      arm64: entry: improve data abort handling of tagged pointers · 3ccf6956
      Kristina Martsenko authored
      commit 276e9327 upstream.
      
      This backport has a minor difference from the upstream commit: it adds
      the asm-uaccess.h file, which is not present in 4.4, because 4.4 does
      not have commit b4b8664d ("arm64: don't pull uaccess.h into *.S").
      
      Original patch description:
      
      When handling a data abort from EL0, we currently zero the top byte of
      the faulting address, as we assume the address is a TTBR0 address, which
      may contain a non-zero address tag. However, the address may be a TTBR1
      address, in which case we should not zero the top byte. This patch fixes
      that. The effect is that the full TTBR1 address is passed to the task's
      signal handler (or printed out in the kernel log).
      
      When handling a data abort from EL1, we leave the faulting address
      intact, as we assume it's either a TTBR1 address or a TTBR0 address with
      tag 0x00. This is true as far as I'm aware, we don't seem to access a
      tagged TTBR0 address anywhere in the kernel. Regardless, it's easy to
      forget about address tags, and code added in the future may not always
      remember to remove tags from addresses before accessing them. So add tag
      handling to the EL1 data abort handler as well. This also makes it
      consistent with the EL0 data abort handler.
      
      Fixes: d50240a5 ("arm64: mm: permit use of tagged pointers at EL0")
      Reviewed-by: Dave Martin <Dave.Martin@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3ccf6956
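The tag handling described in the commit above can be sketched on 64-bit integers. This is a Python model, not the arm64 entry code: it assumes bit 55 of the address distinguishes TTBR0 (clear) from TTBR1 (set) addresses and that the tag occupies the top byte. The function name is hypothetical.

```python
MASK64 = (1 << 64) - 1
TAG_MASK = 0xFF << 56


def clear_address_tag(addr):
    """Zero the top byte for TTBR0 addresses; leave TTBR1 addresses intact."""
    addr &= MASK64
    if addr & (1 << 55):
        return addr                     # TTBR1 address: keep all bits
    return addr & ~TAG_MASK & MASK64    # TTBR0 address: drop the tag


if __name__ == "__main__":
    print(hex(clear_address_tag(0x5A00000012345678)))  # tagged user address
    print(hex(clear_address_tag(0xFFFF800012345678)))  # kernel address
```

Unconditionally zeroing the top byte would turn the kernel address in the second call into a different, meaningless value, which is the case the conditional avoids.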
    • Kristina Martsenko's avatar
      arm64: hw_breakpoint: fix watchpoint matching for tagged pointers · 4eaef365
      Kristina Martsenko authored
      commit 7dcd9dd8 upstream.
      
      This backport has a few small differences from the upstream commit:
       - The address tag is removed in watchpoint_handler() instead of
         get_distance_from_watchpoint(), because 4.4 does not have commit
         fdfeff0f ("arm64: hw_breakpoint: Handle inexact watchpoint
         addresses").
       - A macro is backported (untagged_addr), as it is not present in 4.4.
      
      Original patch description:
      
      When we take a watchpoint exception, the address that triggered the
      watchpoint is found in FAR_EL1. We compare it to the address of each
      configured watchpoint to see which one was hit.
      
      The configured watchpoint addresses are untagged, while the address in
      FAR_EL1 will have an address tag if the data access was done using a
      tagged address. The tag needs to be removed to compare the address to
      the watchpoints.
      
      Currently we don't remove it, and as a result can report the wrong
      watchpoint as being hit (specifically, always either the highest TTBR0
      watchpoint or lowest TTBR1 watchpoint). This patch removes the tag.
      
      Fixes: d50240a5 ("arm64: mm: permit use of tagged pointers at EL0")
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4eaef365
    • Artem Savkov's avatar
      Make __xfs_xattr_put_listen properly report errors. · bc5f31d3
      Artem Savkov authored
      commit 791cc43b upstream.
      
      Commit 2a6fba6d "xfs: only return -errno or success from attr ->put_listent"
      changes the return value of __xfs_xattr_put_listen to 0 when there is
      insufficient space in the buffer, assuming that setting context->count to -1
      would be enough, but all of the ->put_listent callers only check seen_enough.
      This results in a failed assertion:
      XFS: Assertion failed: context->count >= 0, file: fs/xfs/xfs_xattr.c, line: 175
      in insufficient buffer size case.
      
      This is only reproducible with at least 2 xattrs and only when the buffer
      gets depleted before the last one.
      
      Furthermore, if the buffer size is enough to hold the last xattr's
      name, but not enough to hold the sum of the preceding xattr names, listxattr
      won't fail with ERANGE, but will succeed, returning the last xattr's name
      without the first character. The first character ends up overwriting data
      stored at (context->alist - 1).
      Signed-off-by: Artem Savkov <asavkov@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Cc: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bc5f31d3
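The failure mode described in the commit above can be reproduced with a toy model. This Python sketch is not the XFS code: the class, the function names, and the set_seen_enough switch are hypothetical, and treating "also set seen_enough" as the remedy is an assumption made to show why callers that only check seen_enough miss the count of -1.

```python
class ListContext:
    def __init__(self, bufsize):
        self.bufsize = bufsize
        self.count = 0           # bytes used so far, or -1 on overflow
        self.seen_enough = False
        self.names = []


def put_listent(ctx, name, set_seen_enough):
    needed = len(name) + 1       # name plus terminating NUL
    if ctx.count + needed > ctx.bufsize:
        ctx.count = -1           # insufficient space
        if set_seen_enough:
            ctx.seen_enough = True
        return 0
    ctx.names.append(name)
    ctx.count += needed
    return 0


def list_attrs(names, bufsize, set_seen_enough):
    ctx = ListContext(bufsize)
    for name in names:
        put_listent(ctx, name, set_seen_enough)
        if ctx.seen_enough:      # the only thing the callers check
            break
    return ctx
```

With names "user.aaaa" and "user.b" and an 8-byte buffer, the first name does not fit and count becomes -1. If the loop keeps going, the second name is accepted starting from -1, so the reported size is 6 rather than 7, one byte short, which mirrors the missing first character. Stopping the loop leaves count at -1 for the caller to turn into an error.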
    • Trond Myklebust's avatar
      NFSv4: Don't perform cached access checks before we've OPENed the file · e8a1086a
      Trond Myklebust authored
      commit 762674f8 upstream.
      
      Donald Buczek reports that a nfs4 client incorrectly denies
      execute access based on outdated file mode (missing 'x' bit).
      After the mode on the server is 'fixed' (chmod +x) further execution
      attempts continue to fail, because the nfs ACCESS call updates
      the access parameter but not the mode parameter or the mode in
      the inode.
      
      The root cause is ultimately that the VFS is calling may_open()
      before the NFS client has a chance to OPEN the file and hence revalidate
      the access and attribute caches.
      
      Al Viro suggests:
      >>> Make nfs_permission() relax the checks when it sees MAY_OPEN, if you know
      >>> that things will be caught by server anyway?
      >>
      >> That can work as long as we're guaranteed that everything that calls
      >> inode_permission() with MAY_OPEN on a regular file will also follow up
      >> with a vfs_open() or dentry_open() on success. Is this always the
      >> case?
      >
      > 1) in do_tmpfile(), followed by do_dentry_open() (not reachable by NFS since
      > it doesn't have ->tmpfile() instance anyway)
      >
      > 2) in atomic_open(), after the call of ->atomic_open() has succeeded.
      >
      > 3) in do_last(), followed on success by vfs_open()
      >
      > That's all.  All calls of inode_permission() that get MAY_OPEN come from
      > may_open(), and there's no other callers of that puppy.
      Reported-by: Donald Buczek <buczek@molgen.mpg.de>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=109771
      Link: http://lkml.kernel.org/r/1451046656-26319-1-git-send-email-buczek@molgen.mpg.de
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e8a1086a
    • Trond Myklebust's avatar
      NFS: Ensure we revalidate attributes before using execute_ok() · 53302082
      Trond Myklebust authored
      commit 5c5fc09a upstream.
      
      Donald Buczek reports that NFS clients can also report incorrect
      results for access() due to lack of revalidation of attributes
      before calling execute_ok().
      Looking closely, it seems chdir() is afflicted with the same problem.
      
      Fix is to ensure we call nfs_revalidate_inode_rcu() or
      nfs_revalidate_inode() as appropriate before deciding to trust
      execute_ok().
      Reported-by: Donald Buczek <buczek@molgen.mpg.de>
      Link: http://lkml.kernel.org/r/1451331530-3748-1-git-send-email-buczek@molgen.mpg.de
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      53302082
    • Michal Hocko's avatar
      mm: consider memblock reservations for deferred memory initialization sizing · cb1fb15c
      Michal Hocko authored
      commit 864b9a39 upstream.
      
      We have seen an early OOM killer invocation on ppc64 systems with
      crashkernel=4096M:
      
      	kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0
      	kthreadd cpuset=/ mems_allowed=7
      	CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1
      	Call Trace:
      	  dump_stack+0xb0/0xf0 (unreliable)
      	  dump_header+0xb0/0x258
      	  out_of_memory+0x5f0/0x640
      	  __alloc_pages_nodemask+0xa8c/0xc80
      	  kmem_getpages+0x84/0x1a0
      	  fallback_alloc+0x2a4/0x320
      	  kmem_cache_alloc_node+0xc0/0x2e0
      	  copy_process.isra.25+0x260/0x1b30
      	  _do_fork+0x94/0x470
      	  kernel_thread+0x48/0x60
      	  kthreadd+0x264/0x330
      	  ret_from_kernel_thread+0x5c/0xa4
      
      	Mem-Info:
      	active_anon:0 inactive_anon:0 isolated_anon:0
      	 active_file:0 inactive_file:0 isolated_file:0
      	 unevictable:0 dirty:0 writeback:0 unstable:0
      	 slab_reclaimable:5 slab_unreclaimable:73
      	 mapped:0 shmem:0 pagetables:0 bounce:0
      	 free:0 free_pcp:0 free_cma:0
      	Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
      	lowmem_reserve[]: 0 0 0 0
      	Node 7 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB
      	0 total pagecache pages
      	0 pages in swap cache
      	Swap cache stats: add 0, delete 0, find 0/0
      	Free swap  = 0kB
      	Total swap = 0kB
      	819200 pages RAM
      	0 pages HighMem/MovableOnly
      	817481 pages reserved
      	0 pages cma reserved
      	0 pages hwpoisoned
      
      The reason is that the managed memory is too low (only 110MB) while the
      rest of the 50GB is still waiting for the deferred initialization to
      be done.  update_defer_init estimates the initial memory to initialize
      to 2GB at least but it doesn't consider any memory allocated in that
      range.  In this particular case we've had
      
      	Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB)
      
      so the low 2GB is mostly depleted.
      
      Fix this by considering memblock allocations in the initial static
      initialization estimation.  Move the max_initialise to
      reset_deferred_meminit and implement a simple memblock_reserved_memory
      helper which iterates all reserved blocks and sums the size of all that
      start below the given address.  The cumulative size is then added on top
      of the initial estimation.  This is still not ideal because
      reset_deferred_meminit doesn't consider holes and so reservation might
      be above the initial estimation which we ignore but let's make the
      logic simpler until we really need to handle more complicated cases.
      
      Fixes: 3a80a7fa ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
      Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      cb1fb15c
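The sizing rule described in the commit above (sum the reserved blocks that start below a limit and add that to the initial estimate) can be sketched as follows. This is a Python model, not the kernel helper: reserved regions are represented as (base, size) tuples, the limit is assumed to equal the initial estimate for memory starting at address 0, and initial_init_size is a hypothetical name.

```python
MB = 1 << 20


def memblock_reserved_memory(reserved, limit):
    """Sum the sizes of all reserved (base, size) blocks starting below limit."""
    return sum(size for base, size in reserved if base < limit)


def initial_init_size(estimate, reserved):
    """Initial estimate plus the reservations that start inside it."""
    return estimate + memblock_reserved_memory(reserved, estimate)


if __name__ == "__main__":
    # crashkernel reservation: 4096MB at 128MB, with a 2GB initial estimate
    reserved = [(128 * MB, 4096 * MB)]
    print(initial_init_size(2048 * MB, reserved) // MB)
```

For the crashkernel case quoted above, the 4096MB reservation starts below the 2GB estimate, so under these assumptions the amount initialised early grows to 6144MB and usable memory remains after the reservation is subtracted.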