1. 26 Sep, 2018 28 commits
  2. 19 Sep, 2018 12 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.71 · 1244bbb3
      Greg Kroah-Hartman authored
      1244bbb3
    • Linus Torvalds's avatar
      mm: get rid of vmacache_flush_all() entirely · 06274364
      Linus Torvalds authored
      commit 7a9cdebd upstream.
      
      Jann Horn points out that the vmacache_flush_all() function is not only
      potentially expensive, it's buggy too.  It also happens to be entirely
      unnecessary, because the sequence number overflow case can be avoided by
      simply making the sequence number be 64-bit.  That doesn't even grow the
      data structures in question, because the other adjacent fields are
      already 64-bit.
      
      So simplify the whole thing by just making the sequence number overflow
      case go away entirely, which gets rid of all the complications and makes
      the code faster too.  Win-win.
      
      [ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics
        also just goes away entirely with this ]
      Reported-by: default avatarJann Horn <jannh@google.com>
      Suggested-by: default avatarWill Deacon <will.deacon@arm.com>
      Acked-by: default avatarDavidlohr Bueso <dave@stgolabs.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      06274364
    • Ian Kent's avatar
      autofs: fix autofs_sbi() does not check super block type · 8b34a7b1
      Ian Kent authored
      commit 0633da48 upstream.
      
      autofs_sbi() does not check the superblock magic number to verify it has
      been given an autofs super block.
      
      Backport Note: autofs4 has been renamed to autofs upstream. As a result
      the upstream patch does not apply cleanly onto 4.14.y.
      
      Link: http://lkml.kernel.org/r/153475422934.17131.7563724552005298277.stgit@pluto.themaw.net
      Reported-by: <syzbot+87c3c541582e56943277@syzkaller.appspotmail.com>
      Signed-off-by: default avatarIan Kent <raven@themaw.net>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarZubin Mithra <zsm@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8b34a7b1
    • Jason Wang's avatar
      tuntap: fix use after free during release · daf0ca74
      Jason Wang authored
      commit 7063efd3 upstream.
      
      After commit b196d88a ("tun: fix use after free for ptr_ring") we
      need clean up tx ring during release(). But unfortunately, it tries to
      do the cleanup blindly after socket were destroyed which will lead
      another use-after-free. Fix this by doing the cleanup before dropping
      the last reference of the socket in __tun_detach().
      
      Backport Note :-
      Upstream commit moves the ptr_ring_cleanup call from tun_chr_close to
      __tun_detach. Upstream applied that patch after replacing skb_array with
      ptr_ring. This patch moves the skb_array_cleanup call from
      tun_chr_close to __tun_detach.
      Reported-by: default avatarAndrei Vagin <avagin@virtuozzo.com>
      Acked-by: default avatarAndrei Vagin <avagin@virtuozzo.com>
      Fixes: b196d88a ("tun: fix use after free for ptr_ring")
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZubin Mithra <zsm@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      daf0ca74
    • Jason Wang's avatar
      tun: fix use after free for ptr_ring · ab75811f
      Jason Wang authored
      commit b196d88a upstream.
      
      We used to initialize ptr_ring during TUNSETIFF, this is because its
      size depends on the tx_queue_len of netdevice. And we try to clean it
      up when socket were detached from netdevice. A race were spotted when
      trying to do uninit during a read which will lead a use after free for
      pointer ring. Solving this by always initialize a zero size ptr_ring
      in open() and do resizing during TUNSETIFF, and then we can safely do
      cleanup during close(). With this, there's no need for the workaround
      that was introduced by commit 4df0bfc7 ("tun: fix a memory leak
      for tfile->tx_array").
      
      Backport Note :-
      Comparison with the upstream patch:
      [1] A "semantic revert" of the changes made in
          4df0bfc7("tun: fix a memory leak for tfile->tx_array").
              4df0bfc7 was applied upstream, and then skb array was changed
      	to use ptr_ring. The upstream patch then removes the changes introduced
      	by 4df0bfc7. This backport does the same; "revert" the changes
      	made by 4df0bfc7.
      [2] xdp_rxq_info_unreg() being called in relevant locations
              As xdp_rxq_info related patches are not present in 4.14, these
      	changes are not needed in the backport.
      [3] An instance of ptr_ring_init needs to be replaced by skb_array_init
      	Inside tun_attach()
      [4] ptr_ring_cleanup needs to be replaced by skb_array_cleanup
      	Inside tun_chr_close()
      
      Note that the backport for 7063efd3 ("tuntap: fix use after free during release")
      needs to be applied on top of this patch.
      
      Reported-by: syzbot+e8b902c3c3fadf0a9dba@syzkaller.appspotmail.com
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Fixes: 1576d986 ("tun: switch to use skb array for tx")
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZubin Mithra <zsm@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ab75811f
    • Wei Yongjun's avatar
      mtd: ubi: wl: Fix error return code in ubi_wl_init() · 8626c40a
      Wei Yongjun authored
      commit 7233982a upstream.
      
      Fix to return error code -ENOMEM from the kmem_cache_alloc() error
      handling case instead of 0, as done elsewhere in this function.
      
      Fixes: f78e5623 ("ubi: fastmap: Erase outdated anchor PEBs during
      attach")
      Signed-off-by: default avatarWei Yongjun <weiyongjun1@huawei.com>
      Reviewed-by: default avatarBoris Brezillon <boris.brezillon@free-electrons.com>
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8626c40a
    • Taehee Yoo's avatar
      ip: frags: fix crash in ip_do_fragment() · 08fb833b
      Taehee Yoo authored
      commit 5d407b07 upstream
      
      A kernel crash occurrs when defragmented packet is fragmented
      in ip_do_fragment().
      In defragment routine, skb_orphan() is called and
      skb->ip_defrag_offset is set. but skb->sk and
      skb->ip_defrag_offset are same union member. so that
      frag->sk is not NULL.
      Hence crash occurrs in skb->sk check routine in ip_do_fragment() when
      defragmented packet is fragmented.
      
      test commands:
         %iptables -t nat -I POSTROUTING -j MASQUERADE
         %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000
      
      splat looks like:
      [  261.069429] kernel BUG at net/ipv4/ip_output.c:636!
      [  261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [  261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3
      [  261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600
      [  261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c
      [  261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202
      [  261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004
      [  261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8
      [  261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395
      [  261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4
      [  261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000
      [  261.174169] FS:  00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
      [  261.183012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0
      [  261.198158] Call Trace:
      [  261.199018]  ? dst_output+0x180/0x180
      [  261.205011]  ? save_trace+0x300/0x300
      [  261.209018]  ? ip_copy_metadata+0xb00/0xb00
      [  261.213034]  ? sched_clock_local+0xd4/0x140
      [  261.218158]  ? kill_l4proto+0x120/0x120 [nf_conntrack]
      [  261.223014]  ? rt_cpu_seq_stop+0x10/0x10
      [  261.227014]  ? find_held_lock+0x39/0x1c0
      [  261.233008]  ip_finish_output+0x51d/0xb50
      [  261.237006]  ? ip_fragment.constprop.56+0x220/0x220
      [  261.243011]  ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack]
      [  261.250152]  ? rcu_is_watching+0x77/0x120
      [  261.255010]  ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4]
      [  261.261033]  ? nf_hook_slow+0xb1/0x160
      [  261.265007]  ip_output+0x1c7/0x710
      [  261.269005]  ? ip_mc_output+0x13f0/0x13f0
      [  261.273002]  ? __local_bh_enable_ip+0xe9/0x1b0
      [  261.278152]  ? ip_fragment.constprop.56+0x220/0x220
      [  261.282996]  ? nf_hook_slow+0xb1/0x160
      [  261.287007]  raw_sendmsg+0x21f9/0x4420
      [  261.291008]  ? dst_output+0x180/0x180
      [  261.297003]  ? sched_clock_cpu+0x126/0x170
      [  261.301003]  ? find_held_lock+0x39/0x1c0
      [  261.306155]  ? stop_critical_timings+0x420/0x420
      [  261.311004]  ? check_flags.part.36+0x450/0x450
      [  261.315005]  ? _raw_spin_unlock_irq+0x29/0x40
      [  261.320995]  ? _raw_spin_unlock_irq+0x29/0x40
      [  261.326142]  ? cyc2ns_read_end+0x10/0x10
      [  261.330139]  ? raw_bind+0x280/0x280
      [  261.334138]  ? sched_clock_cpu+0x126/0x170
      [  261.338995]  ? check_flags.part.36+0x450/0x450
      [  261.342991]  ? __lock_acquire+0x4500/0x4500
      [  261.348994]  ? inet_sendmsg+0x11c/0x500
      [  261.352989]  ? dst_output+0x180/0x180
      [  261.357012]  inet_sendmsg+0x11c/0x500
      [ ... ]
      
      v2:
       - clear skb->sk at reassembly routine.(Eric Dumarzet)
      
      Fixes: fa0f5273 ("ip: use rb trees for IP frag queue.")
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      08fb833b
    • Peter Oskolkov's avatar
      ip: process in-order fragments efficiently · b3a0c61b
      Peter Oskolkov authored
      This patch changes the runtime behavior of IP defrag queue:
      incoming in-order fragments are added to the end of the current
      list/"run" of in-order fragments at the tail.
      
      On some workloads, UDP stream performance is substantially improved:
      
      RX: ./udp_stream -F 10 -T 2 -l 60
      TX: ./udp_stream -c -H <host> -F 10 -T 5 -l 60
      
      with this patchset applied on a 10Gbps receiver:
      
        throughput=9524.18
        throughput_units=Mbit/s
      
      upstream (net-next):
      
        throughput=4608.93
        throughput_units=Mbit/s
      Reported-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarPeter Oskolkov <posk@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      (cherry picked from commit a4fd284a)
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b3a0c61b
    • Peter Oskolkov's avatar
      ip: add helpers to process in-order fragments faster. · c91f27fb
      Peter Oskolkov authored
      This patch introduces several helper functions/macros that will be
      used in the follow-up patch. No runtime changes yet.
      
      The new logic (fully implemented in the second patch) is as follows:
      
      * Nodes in the rb-tree will now contain not single fragments, but lists
        of consecutive fragments ("runs").
      
      * At each point in time, the current "active" run at the tail is
        maintained/tracked. Fragments that arrive in-order, adjacent
        to the previous tail fragment, are added to this tail run without
        triggering the re-balancing of the rb-tree.
      
      * If a fragment arrives out of order with the offset _before_ the tail run,
        it is inserted into the rb-tree as a single fragment.
      
      * If a fragment arrives after the current tail fragment (with a gap),
        it starts a new "tail" run, as is inserted into the rb-tree
        at the end as the head of the new run.
      
      skb->cb is used to store additional information
      needed here (suggested by Eric Dumazet).
      Reported-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarPeter Oskolkov <posk@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      (cherry picked from commit 353c9cb3)
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c91f27fb
    • Dan Carpenter's avatar
      ipv4: frags: precedence bug in ip_expire() · 04b28f40
      Dan Carpenter authored
      We accidentally removed the parentheses here, but they are required
      because '!' has higher precedence than '&'.
      
      Fixes: fa0f5273 ("ip: use rb trees for IP frag queue.")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      (cherry picked from commit 70837ffe)
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      04b28f40
    • Eric Dumazet's avatar
      net: sk_buff rbnode reorg · 6b921536
      Eric Dumazet authored
      commit bffa72cf upstream
      
      skb->rbnode shares space with skb->next, skb->prev and skb->tstamp
      
      Current uses (TCP receive ofo queue and netem) need to save/restore
      tstamp, while skb->dev is either NULL (TCP) or a constant for a given
      queue (netem).
      
      Since we plan using an RB tree for TCP retransmit queue to speedup SACK
      processing with large BDP, this patch exchanges skb->dev and
      skb->tstamp.
      
      This saves some overhead in both TCP and netem.
      
      v2: removes the swtstamp field from struct tcp_skb_cb
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b921536
    • Eric Dumazet's avatar
      net: add rb_to_skb() and other rb tree helpers · 37c7cc80
      Eric Dumazet authored
      Geeralize private netem_rb_to_skb()
      
      TCP rtx queue will soon be converted to rb-tree,
      so we will need skb_rbtree_walk() helpers.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      (cherry picked from commit 18a4c0ea)
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      37c7cc80