1. 19 Oct, 2018 6 commits
    • Merge tag 'for-linus-20181019' of git://git.kernel.dk/linux-block · b2a205ff
      Greg Kroah-Hartman authored
      Jens writes:
        "Block fixes for 4.19-final
      
         Two small fixes that should go into this release."
      
      * tag 'for-linus-20181019' of git://git.kernel.dk/linux-block:
        block: don't deal with discard limit in blkdev_issue_discard()
        nvme: remove ns sibling before clearing path
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 91b15613
      Greg Kroah-Hartman authored
      David writes:
        "Networking
      
         1) Fix gro_cells leak in xfrm layer, from Li RongQing.
      
         2) BPF selftests change RLIMIT_MEMLOCK blindly, don't do that.  From
            Eric Dumazet.
      
         3) AF_XDP calls synchronize_net() under RCU lock, fix from Björn
            Töpel.
      
         4) Out of bounds packet access in _decode_session6(), from Alexei
            Starovoitov.
      
         5) Several ethtool bugs, where we copy a struct into the kernel twice
            and our validations of the values in the first copy can be
            invalidated by the second copy due to asynchronous updates to the
            memory by the user.  From Wenwen Wang.
      
         6) Missing netlink attribute validation in cls_api, from Davide
            Caratti.
      
         7) LLC SAP sockets need to be SOCK_RCU_FREE, from Cong Wang.
      
         8) rxrpc operates on wrong kvec, from Yue Haibing.
      
         9) A regression was introduced by the disassociation of route
            neighbour references in rt6_probe(), causing probe for
            neighbourless routes to not be properly rate limited.  Fix from
            Sabrina Dubroca.
      
         10) Unsafe RCU locking in tipc, from Tung Nguyen.
      
         11) Use after free in inet6_mc_check(), from Eric Dumazet.
      
         12) PMTU from icmp packets should update the SCTP transport pathmtu,
             from Xin Long.
      
         13) Missing peer put on error in rxrpc, from David Howells.
      
         14) Fix pedit in nfp driver, from Pieter Jansen van Vuuren.
      
         15) Fix overflowing shift statement in qla3xxx driver, from Nathan
             Chancellor.
      
         16) Fix Spectre v1 in ptp code, from Gustavo A. R. Silva.
      
         17) udp6_unicast_rcv_skb() interprets udpv6_queue_rcv_skb() return
             value in an inverted manner, fix from Paolo Abeni.
      
         18) Fix missed unresolved entries in ipmr dumps, from Nikolay
             Aleksandrov.
      
         19) Fix NAPI handling under high load, we can completely miss events
             when NAPI has to loop more than one time in a cycle.  From Heiner
             Kallweit."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (49 commits)
        ip6_tunnel: Fix encapsulation layout
        tipc: fix info leak from kernel tipc_event
        net: socket: fix a missing-check bug
        net: sched: Fix for duplicate class dump
        r8169: fix NAPI handling under high load
        net: ipmr: fix unresolved entry dumps
        net: mscc: ocelot: Fix comment in ocelot_vlant_wait_for_completion()
        sctp: fix the data size calculation in sctp_data_size
        virtio_net: avoid using netif_tx_disable() for serializing tx routine
        udp6: fix encap return code for resubmitting
        mlxsw: core: Fix use-after-free when flashing firmware during init
        sctp: not free the new asoc when sctp_wait_for_connect returns err
        sctp: fix race on sctp_id2asoc
        r8169: re-enable MSI-X on RTL8168g
        net: bpfilter: use get_pid_task instead of pid_task
        ptp: fix Spectre v1 vulnerability
        net: qla3xxx: Remove overflowing shift statement
        geneve, vxlan: Don't set exceptions if skb->len < mtu
        geneve, vxlan: Don't check skb_dst() twice
        sctp: get pr_assoc and pr_stream all status with SCTP_PR_SCTP_ALL instead
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 2a966610
      Greg Kroah-Hartman authored
      David writes:
        "Sparc fixes:
      
         The main bit here is fixing how fallback system calls are handled in
         the sparc vDSO.
      
         Unfortunately, I fat fingered the commit and some perf debugging
         hacks slipped into the vDSO fix, which I revert in the very next
         commit."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: Revert unintended perf changes.
        sparc: vDSO: Silence an uninitialized variable warning
        sparc: Fix syscall fallback bugs in VDSO.
    • Merge tag 'drm-fixes-2018-10-19' of git://anongit.freedesktop.org/drm/drm · 7555c5d5
      Greg Kroah-Hartman authored
      Dave writes:
        "drm fixes for 4.19 final
      
         Just a last set of misc core fixes for final.
      
         4 fixes, one use after free, one fb integration fix, one EDID fix,
         and one laptop panel quirk."
      
      * tag 'drm-fixes-2018-10-19' of git://anongit.freedesktop.org/drm/drm:
        drm/edid: VSDB yCBCr420 Deep Color mode bit definitions
        drm: fix use of freed memory in drm_mode_setcrtc
        drm: fb-helper: Reject all pixel format changing requests
        drm/edid: Add 6 bpc quirk for BOE panel in HP Pavilion 15-n233sl
    • Merge tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · eb6d938f
      Greg Kroah-Hartman authored
      Doug writes:
        "Really final for-rc pull request for 4.19
      
         Ok, so last week I thought we had sent our final pull request for
         4.19.  Well, wouldn't ya know someone went and found a couple Spectre
         v1 fixes were needed :-/.  So, a couple *very* small Spectre patches
         for this (hopefully) final -rc week."
      
      * tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/ucma: Fix Spectre v1 vulnerability
        IB/ucm: Fix Spectre v1 vulnerability
    • Merge tag 'drm-misc-fixes-2018-10-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · f8e6e1b6
      Dave Airlie authored
      drm-misc-fixes for v4.19:
      - Fix use of freed memory in drm_mode_setcrtc.
      - Reject pixel format changing requests in fb helper.
      - Add 6 bpc quirk for HP Pavilion 15-n233sl
      - Fix VSDB yCBCr420 Deep Color mode bit definitions
      Signed-off-by: Dave Airlie <airlied@redhat.com>

      From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/647fe5d0-4ec5-57cc-9f23-a4836b29e278@linux.intel.com
  2. 18 Oct, 2018 32 commits
    • ip6_tunnel: Fix encapsulation layout · d4d576f5
      Stefano Brivio authored
      Commit 058214a4 ("ip6_tun: Add infrastructure for doing
      encapsulation") added the ip6_tnl_encap() call in ip6_tnl_xmit(), before
      the call to ipv6_push_frag_opts() to append the IPv6 Tunnel Encapsulation
      Limit option (option 4, RFC 2473, par. 5.1) to the outer IPv6 header.
      
      As long as the option didn't actually end up in generated packets, this
      wasn't an issue. Then commit 89a23c8b ("ip6_tunnel: Fix missing tunnel
      encapsulation limit option") fixed sending of this option, and the
      resulting layout, e.g. for FoU, is:
      
      .-------------------.------------.----------.-------------------.----- - -
      | Outer IPv6 Header | UDP header | Option 4 | Inner IPv6 Header | Payload
      '-------------------'------------'----------'-------------------'----- - -
      
      Needless to say, FoU and GUE (at least) won't work over IPv6. The option
      is appended by default, and I couldn't find a way to disable it with the
      current iproute2.
      
      Turn this into a more reasonable:
      
      .-------------------.----------.------------.-------------------.----- - -
      | Outer IPv6 Header | Option 4 | UDP header | Inner IPv6 Header | Payload
      '-------------------'----------'------------'-------------------'----- - -
      
      With this, and with 84dad559 ("udp6: fix encap return code for
      resubmitting"), FoU and GUE work again over IPv6.
      
      Fixes: 058214a4 ("ip6_tun: Add infrastructure for doing encapsulation")
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix info leak from kernel tipc_event · b06f9d9f
      Jon Maloy authored
      We initialize a struct tipc_event allocated on the kernel stack to
      zero to avert info leak to user space.
      
      Reported-by: syzbot+057458894bc8cada4dee@syzkaller.appspotmail.com
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
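The idea behind the tipc fix can be sketched in user-space C. This is only an illustration: `struct demo_event` and `demo_event_fill()` are hypothetical names, not the real `struct tipc_event` or any kernel function. The point is that clearing the whole object first also clears compiler-inserted padding, so stale stack bytes are not copied out with the struct.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical event record (not the real struct tipc_event). On common
 * ABIs the compiler inserts padding between `kind` and `value`. */
struct demo_event {
    uint8_t  kind;
    uint32_t value;
};

/* Clear every byte of the object, padding included, and only then
 * assign the fields that carry data. */
void demo_event_fill(struct demo_event *evt, uint8_t kind, uint32_t value)
{
    memset(evt, 0, sizeof(*evt));
    evt->kind = kind;
    evt->value = value;
}
```

Without the `memset()`, whatever happened to be on the stack would remain in the padding bytes and would be exposed by a raw copy of the structure.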
    • net: socket: fix a missing-check bug · b6168562
      Wenwen Wang authored
      In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
      statement to see whether it is necessary to pre-process the ethtool
      structure, because, as mentioned in the comment, the structure
      ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
      is allocated through compat_alloc_user_space(). One thing to note here is
      that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
      partially determined by 'rule_cnt', which is actually acquired from the
      user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
      get_user(). After 'rxnfc' is allocated, the data in the original user-space
      buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
      including the 'rule_cnt' field. However, after this copy, no check is
      re-enforced on 'rxnfc->rule_cnt'. So it is possible for a malicious user to
      race to change the value in 'compat_rxnfc->rule_cnt' between these two
      copies. In this way, the attacker can bypass the previous check on
      'rule_cnt' and inject malicious data. This can cause undefined behavior of
      the kernel and introduce potential security risk.
      
      This patch avoids the above issue by copying the value acquired by
      get_user() to 'rxnfc->rule_cnt', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.
      Signed-off-by: Wenwen Wang <wang6495@umn.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
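The double-fetch pattern described in this commit can be modelled in plain C. The sketch below is a simplification under stated assumptions: `struct demo_rxnfc`, `DEMO_MAX_RULES`, `demo_racer()` and `demo_copy_rxnfc()` are all hypothetical, the racing user thread is imitated by a callback, and the destination is an ordinary struct rather than a buffer from compat_alloc_user_space().

```c
#include <stdint.h>

#define DEMO_MAX_RULES 16u   /* hypothetical validation bound */

/* Hypothetical stand-in for the ethtool_rxnfc fields that matter here. */
struct demo_rxnfc {
    uint32_t cmd;
    uint32_t rule_cnt;
};

/* Imitates a racing user thread that rewrites the count between fetches. */
void demo_racer(volatile struct demo_rxnfc *user)
{
    user->rule_cnt = 1000;
}

/* Returns 0 on success, -1 if the first-fetched count fails validation. */
int demo_copy_rxnfc(volatile struct demo_rxnfc *user, struct demo_rxnfc *dst,
                    void (*racer)(volatile struct demo_rxnfc *), int apply_fix)
{
    /* First fetch: read and validate the count. */
    uint32_t rule_cnt = user->rule_cnt;
    if (rule_cnt > DEMO_MAX_RULES)
        return -1;

    if (racer)
        racer(user);   /* user memory may change at this point */

    /* Second fetch: copy the whole structure, count included. */
    dst->cmd = user->cmd;
    dst->rule_cnt = user->rule_cnt;

    /* The fix: overwrite the re-fetched count with the validated one. */
    if (apply_fix)
        dst->rule_cnt = rule_cnt;
    return 0;
}
```

With `apply_fix` set to 0 the destination ends up holding the unvalidated value written by the racer; with it set to 1 the validated count is the one that is used afterwards.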
    • net: sched: Fix for duplicate class dump · 3c53ed8f
      Phil Sutter authored
      When dumping classes by parent, kernel would return classes twice:
      
      | # tc qdisc add dev lo root prio
      | # tc class show dev lo
      | class prio 8001:1 parent 8001:
      | class prio 8001:2 parent 8001:
      | class prio 8001:3 parent 8001:
      | # tc class show dev lo parent 8001:
      | class prio 8001:1 parent 8001:
      | class prio 8001:2 parent 8001:
      | class prio 8001:3 parent 8001:
      | class prio 8001:1 parent 8001:
      | class prio 8001:2 parent 8001:
      | class prio 8001:3 parent 8001:
      
      This comes from qdisc_match_from_root() potentially returning the root
      qdisc itself if its handle matched. Though in that case, root's classes
      were already dumped a few lines above.
      
      Fixes: cb395b20 ("net: sched: optimize class dumps")
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: fix NAPI handling under high load · 6b839b6c
      Heiner Kallweit authored
      rtl_rx() and rtl_tx() are called only if the respective bits are set
      in the interrupt status register. Under high load NAPI may not be
      able to process all data (work_done == budget) and it will schedule
      subsequent calls to the poll callback.
      rtl_ack_events() however resets the bits in the interrupt status
      register, therefore subsequent calls to rtl8169_poll() won't call
      rtl_rx() and rtl_tx() - chip interrupts are still disabled.
      
      Fix this by calling rtl_rx() and rtl_tx() independent of the bits
      set in the interrupt status register. Both functions will detect
      if there's nothing to do for them.
      
      Fixes: da78dbff ("r8169: remove work from irq handler.")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
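The failure mode in this commit can be shown with a toy model of a poll callback. Everything here is hypothetical (`struct demo_nic`, `DEMO_RX_OK`, the `demo_*` functions) and it assumes, as the commit describes, that no new status bit appears while chip interrupts are disabled. It is a sketch of the control flow, not the r8169 driver code.

```c
/* Hypothetical device state: a status register and a count of packets
 * waiting in the receive ring. */
struct demo_nic {
    unsigned status;
    int rx_pending;
};

#define DEMO_RX_OK 0x1u

/* Process at most `budget` packets; does nothing if the ring is empty. */
static int demo_rx(struct demo_nic *nic, int budget)
{
    int n = nic->rx_pending < budget ? nic->rx_pending : budget;
    nic->rx_pending -= n;
    return n;
}

/* Old behaviour: receive work runs only if the status bit is set, but
 * acknowledging the events clears that bit for the next poll. */
int demo_poll_status_gated(struct demo_nic *nic, int budget)
{
    unsigned status = nic->status;
    nic->status = 0;                 /* acknowledge events */
    return (status & DEMO_RX_OK) ? demo_rx(nic, budget) : 0;
}

/* Fixed behaviour: always call the receive routine and let it find out
 * by itself whether there is anything to do. */
int demo_poll_unconditional(struct demo_nic *nic, int budget)
{
    nic->status = 0;                 /* acknowledge events */
    return demo_rx(nic, budget);
}
```

With 10 pending packets and a budget of 4, the gated variant handles 4 packets on the first poll and none on the second, leaving 6 stranded, while the unconditional variant drains the ring over three polls.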
    • sparc: Revert unintended perf changes. · 27faeebd
      David S. Miller authored
      Some local debugging hacks accidentally slipped into the VDSO commit.
      
      Sorry!
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 2ee653f6
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2018-10-18
      
      1) Free the xfrm interface gro_cells when deleting the
         interface, otherwise we leak it. From Li RongQing.
      
      2) net/core/flow.c does not exist anymore, so remove it
         from the MAINTAINERS file.
      
      3) Fix a slab-out-of-bounds in _decode_session6.
         From Alexei Starovoitov.
      
      4) Fix RCU protection when policies are inserted into
         their bydst lists. From Florian Westphal.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • block: don't deal with discard limit in blkdev_issue_discard() · 744889b7
      Ming Lei authored
      blk_queue_split() does respect this limit via bio splitting, so there is
      no need to do that in blkdev_issue_discard(); we can then align with the
      normal bio submission path (bio_add_page() & submit_bio()).
      
      More importantly, this patch fixes one issue introduced in a22c4d7e
      ("block: re-add discard_granularity and alignment checks"), in which
      zero discard bio may be generated in case of zero alignment.
      
      Fixes: a22c4d7e ("block: re-add discard_granularity and alignment checks")
      Cc: stable@vger.kernel.org
      Cc: Ming Lin <ming.l@ssi.samsung.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Xiao Ni <xni@redhat.com>
      Tested-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • fscache: Fix out of bound read in long cookie keys · fa520c47
      Eric Sandeen authored
      fscache_set_key() can incur an out-of-bounds read, reported by KASAN:
      
       BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x5b3/0x680 [fscache]
       Read of size 4 at addr ffff88084ff056d4 by task mount.nfs/32615
      
      and also reported by syzbot at https://lkml.org/lkml/2018/7/8/236
      
        BUG: KASAN: slab-out-of-bounds in fscache_set_key fs/fscache/cookie.c:120 [inline]
        BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x7a9/0x880 fs/fscache/cookie.c:171
        Read of size 4 at addr ffff8801d3cc8bb4 by task syz-executor907/4466
      
      This happens for any index_key_len which is not divisible by 4 and is
      larger than the size of the inline key, because the code allocates exactly
      index_key_len for the key buffer, but the hashing loop is stepping through
      it 4 bytes (u32) at a time in the buf[] array.
      
      Fix this by calculating how many u32 buffers we'll need by using
      DIV_ROUND_UP, and then using kcalloc() to allocate a precleared allocation
      buffer to hold the index_key, then using that same count as the hashing
      index limit.
      
      Fixes: ec0328e4 ("fscache: Maintain a catalogue of allocated cookies")
      Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
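The allocation strategy described in this commit (round the key length up to whole u32 words, allocate a zeroed buffer of that many words, and hash exactly that many words) can be sketched in user-space C. The names `DEMO_DIV_ROUND_UP`, `demo_hash_words()` and `demo_hash_key()` are hypothetical, `calloc()` stands in for kcalloc(), and the hash function is a placeholder rather than the one fscache uses.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same arithmetic as the kernel's DIV_ROUND_UP(), under a demo name. */
#define DEMO_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Placeholder hash over whole 32-bit words (not fscache's hash). */
uint32_t demo_hash_words(const uint32_t *buf, size_t nwords)
{
    uint32_t h = 0;
    for (size_t i = 0; i < nwords; i++)
        h = h * 31u + buf[i];
    return h;
}

/* Hash a key of arbitrary byte length without reading past the buffer.
 * Returns 0 on success, -1 if the allocation fails. */
int demo_hash_key(const void *key, size_t key_len, uint32_t *hash_out)
{
    size_t nwords = DEMO_DIV_ROUND_UP(key_len, sizeof(uint32_t));
    uint32_t *buf;

    if (nwords == 0) {
        *hash_out = 0;
        return 0;
    }
    buf = calloc(nwords, sizeof(*buf));   /* zeroed, whole words */
    if (!buf)
        return -1;
    memcpy(buf, key, key_len);            /* tail bytes stay zero */
    *hash_out = demo_hash_words(buf, nwords);
    free(buf);
    return 0;
}
```

For a 5-byte key the buffer is 2 words (8 bytes), so the second word read by the hashing loop lies fully inside the allocation and its unused bytes are zero.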
    • fscache: Fix incomplete initialisation of inline key space · 1ff22883
      David Howells authored
      The inline key in struct fscache_cookie is insufficiently initialized,
      zeroing only 3 of the 4 slots, therefore an index_key_len between 13 and 15
      bytes will end up hashing uninitialized memory because the memcpy only
      partially fills the last buf[] element.
      
      Fix this by clearing fscache_cookie objects on allocation rather than using
      the slab constructor to initialise them.  We're going to pretty much fill
      in the entire struct anyway, so bringing it into our dcache writably
      shouldn't incur much overhead.
      
      This removes the need to do clearance in fscache_set_key() (where we aren't
      doing it correctly anyway).
      
      Also, we don't need to set cookie->key_len in fscache_set_key() as we
      already did it in the only caller, so remove that.
      
      Fixes: ec0328e4 ("fscache: Maintain a catalogue of allocated cookies")
      Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
      Reported-by: Eric Sandeen <sandeen@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cachefiles: fix the race between cachefiles_bury_object() and rmdir(2) · 169b8033
      Al Viro authored
      the victim might've been rmdir'ed just before the lock_rename();
      unlike the normal callers, we do not look the source up after the
      parents are locked - we know it beforehand and just recheck that it's
      still the child of what used to be its parent.  Unfortunately,
      the check is too weak - we don't spot a dead directory since its
      ->d_parent is unchanged, dentry is positive, etc.  So we sail all
      the way to ->rename(), with hosting filesystems _not_ expecting
      to be asked to rename an rmdir'ed subdirectory.
      
      The fix is easy, fortunately - the lock on parent is sufficient for
      making IS_DEADDIR() on child safe.
      
      Cc: stable@vger.kernel.org
      Fixes: 9ae326a6 (CacheFiles: A cache that backs onto a mounted filesystem)
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mremap: properly flush TLB before releasing the page · eb66ae03
      Linus Torvalds authored
      Jann Horn points out that our TLB flushing was subtly wrong for the
      mremap() case.  What makes mremap() special is that we don't follow the
      usual "add page to list of pages to be freed, then flush tlb, and then
      free pages".  No, mremap() obviously just _moves_ the page from one page
      table location to another.
      
      That matters, because mremap() thus doesn't directly control the
      lifetime of the moved page with a freelist: instead, the lifetime of the
      page is controlled by the page table locking, that serializes access to
      the entry.
      
      As a result, we need to flush the TLB not just before releasing the lock
      for the source location (to avoid any concurrent accesses to the entry),
      but also before we release the destination page table lock (to avoid the
      TLB being flushed after somebody else has already done something to that
      page).
      
      This also makes the whole "need_flush" logic unnecessary, since we now
      always end up flushing the TLB for every valid entry.
      Reported-and-tested-by: Jann Horn <jannh@google.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • LICENSES: Remove CC-BY-SA-4.0 license text · 19e6420e
      Christoph Hellwig authored
      Using non-GPL licenses for our documentation is rather problematic,
      as it can directly include other files, which generally are GPLv2
      licensed and thus not compatible.
      
      Remove this license now that the only user (idr.rst) is gone to avoid
      people semi-accidentally using it again.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Merge branch 'ida-fixes-4.19-rc8' of git://git.infradead.org/users/willy/linux-dax · ca9f672f
      Greg Kroah-Hartman authored
      Matthew writes:
        "IDA/IDR fixes for 4.19
      
         I have two tiny fixes, one for the IDA test-suite and one for the IDR
         documentation license."
      
      * 'ida-fixes-4.19-rc8' of git://git.infradead.org/users/willy/linux-dax:
        idr: Change documentation license
        test_ida: Fix lockdep warning
    • net: ipmr: fix unresolved entry dumps · eddf016b
      Nikolay Aleksandrov authored
      If the skb space runs out at an unresolved entry while dumping, we'll
      miss some unresolved entries. The reason is that the entry counter is
      zeroed between dumping resolved and unresolved mfc entries. We should
      just keep counting until the whole table is dumped, and zero the counter
      only when we move to the next table, as we have a separate table counter.
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Fixes: 8fb472c0 ("ipmr: improve hash scalability")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
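The counting scheme described in this commit, one entry counter that runs across both the resolved and the unresolved list of a table, can be sketched as a resumable dump in plain C. The `struct demo_table` layout and `demo_dump_table()` are hypothetical; integer arrays stand in for the mfc lists and `room` stands in for the space left in the skb.

```c
#include <stddef.h>

/* Hypothetical table with a resolved and an unresolved entry list. */
struct demo_table {
    const int *resolved;
    size_t n_resolved;
    const int *unresolved;
    size_t n_unresolved;
};

/* Dump up to `room` entries into `out`, resuming at entry index *cursor.
 * The counter `e` is NOT reset between the two lists, so *cursor always
 * names a unique position within the whole table. Returns the number of
 * entries written and stores the resume position back in *cursor. */
size_t demo_dump_table(const struct demo_table *t, size_t *cursor,
                       int *out, size_t room)
{
    size_t e = 0, written = 0;

    for (size_t i = 0; i < t->n_resolved; i++, e++) {
        if (e < *cursor)
            continue;            /* already dumped in an earlier call */
        if (written == room)
            goto done;           /* out of space: resume here later */
        out[written++] = t->resolved[i];
    }
    /* No "e = 0" here: keep counting into the unresolved list. */
    for (size_t i = 0; i < t->n_unresolved; i++, e++) {
        if (e < *cursor)
            continue;
        if (written == room)
            goto done;
        out[written++] = t->unresolved[i];
    }
done:
    *cursor = e;
    return written;
}
```

Calling the function repeatedly with a small `room` walks the whole table exactly once, even when a call runs out of space in the middle of the unresolved list.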
    • net: mscc: ocelot: Fix comment in ocelot_vlant_wait_for_completion() · 06a36ecb
      Gregory CLEMENT authored
      The ocelot_vlant_wait_for_completion() function is very similar to
      ocelot_mact_wait_for_completion(). It seems to have been copied, but the
      comment was not updated, so let's fix it.
      Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: fix the data size calculation in sctp_data_size · 5660b9d9
      Xin Long authored
      sctp data size should be calculated by subtracting data chunk header's
      length from chunk_hdr->length, not just data header.
      
      Fixes: 668c9beb ("sctp: implement assign_number for sctp_stream_interleave")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio_net: avoid using netif_tx_disable() for serializing tx routine · 05c998b7
      Ake Koomsin authored
      Commit 713a98d9 ("virtio-net: serialize tx routine during reset")
      introduces netif_tx_disable() after netif_device_detach() in order to
      avoid use-after-free of tx queues. However, there are two issues.
      
      1) Its operation is redundant with netif_device_detach() in case the
         interface is running.
      2) In case the interface is not running before suspending and
         resuming, the tx does not get resumed by netif_device_attach().
         This results in losing network connectivity.
      
      It is better to use netif_tx_lock_bh()/netif_tx_unlock_bh() instead for
      serializing tx routine during reset. This also preserves the symmetry
      of netif_device_detach() and netif_device_attach().
      
      Fixes: 713a98d9 ("virtio-net: serialize tx routine during reset")
      Signed-off-by: Ake Koomsin <ake@igel.co.jp>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'trace-v4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 9bd871df
      Greg Kroah-Hartman authored
      Steven writes:
        "tracing: Two fixes for 4.19
      
         This fixes two bugs:
          - Fix size mismatch of tracepoint array
          - Have preemptirq test module use same clock source of the selftest"
      
      * tag 'trace-v4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Use trace_clock_local() for looping in preemptirq_delay_test.c
        tracepoint: Fix tracepoint array element size mismatch
    • udp6: fix encap return code for resubmitting · 84dad559
      Paolo Abeni authored
      The commit eb63f296 ("udp6: add missing checks on edumux packet
      processing") used the same return code convention of the ipv4 counterpart,
      but ipv6 uses the opposite one: positive values mean resubmit.
      
      This change addresses the issue, using positive return value for
      resubmitting. Also update the related comment, which was broken, too.
      
      Fixes: eb63f296 ("udp6: add missing checks on edumux packet processing")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: core: Fix use-after-free when flashing firmware during init · 9b3bc7db
      Ido Schimmel authored
      When the switch driver (e.g., mlxsw_spectrum) determines it needs to
      flash a new firmware version it resets the ASIC after the flashing
      process. The bus driver (e.g., mlxsw_pci) then registers itself again
      with mlxsw_core which means (among other things) that the device
      registers itself with the hwmon subsystem again.
      
      Since the device was registered with the hwmon subsystem using
      devm_hwmon_device_register_with_groups(), then the old hwmon device
      (registered before the flashing) was never unregistered and was
      referencing stale data, resulting in a use-after-free.
      
      Fix by removing reliance on device managed APIs in mlxsw_hwmon_init().
      
      Fixes: c86d62cc ("mlxsw: spectrum: Reset FW after flash")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
      Tested-by: Alexander Petrovskiy <alexpe@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: not free the new asoc when sctp_wait_for_connect returns err · c863850c
      Xin Long authored
      When sctp_wait_for_connect is called to wait for connect ready
      for sp->strm_interleave in sctp_sendmsg_to_asoc, a panic could
      be triggered if cpu is scheduled out and the new asoc is freed
      elsewhere, as it will return err and later the asoc gets freed
      again in sctp_sendmsg.
      
      [  285.840764] list_del corruption, ffff9f0f7b284078->next is LIST_POISON1 (dead000000000100)
      [  285.843590] WARNING: CPU: 1 PID: 8861 at lib/list_debug.c:47 __list_del_entry_valid+0x50/0xa0
      [  285.846193] Kernel panic - not syncing: panic_on_warn set ...
      [  285.846193]
      [  285.848206] CPU: 1 PID: 8861 Comm: sctp_ndata Kdump: loaded Not tainted 4.19.0-rc7.label #584
      [  285.850559] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  285.852164] Call Trace:
      ...
      [  285.872210]  ? __list_del_entry_valid+0x50/0xa0
      [  285.872894]  sctp_association_free+0x42/0x2d0 [sctp]
      [  285.873612]  sctp_sendmsg+0x5a4/0x6b0 [sctp]
      [  285.874236]  sock_sendmsg+0x30/0x40
      [  285.874741]  ___sys_sendmsg+0x27a/0x290
      [  285.875304]  ? __switch_to_asm+0x34/0x70
      [  285.875872]  ? __switch_to_asm+0x40/0x70
      [  285.876438]  ? ptep_set_access_flags+0x2a/0x30
      [  285.877083]  ? do_wp_page+0x151/0x540
      [  285.877614]  __sys_sendmsg+0x58/0xa0
      [  285.878138]  do_syscall_64+0x55/0x180
      [  285.878669]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      This issue is similar to the one fixed in commit ca3af4dd
      ("sctp: do not free asoc when it is already dead in sctp_sendmsg").
      But this one can't be fixed by returning -ESRCH for the dead asoc
      in sctp_wait_for_connect, as it will break sctp_connect's return
      value to users.
      
      This patch is to simply set err to -ESRCH before it returns to
      sctp_sendmsg when any err is returned by sctp_wait_for_connect
      for sp->strm_interleave, so that no asoc would be freed due to
      this.
      
      When users see this error, they will know the packet hasn't been
      sent. It also makes sense not to free the asoc just because waiting for
      the connection fails, as with the second call to sctp_wait_for_connect
      in sctp_sendmsg_to_asoc.
      
      Fixes: 668c9beb ("sctp: implement assign_number for sctp_stream_interleave")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: fix race on sctp_id2asoc · b336deca
      Marcelo Ricardo Leitner authored
      syzbot reported a use-after-free involving sctp_id2asoc.  Dmitry Vyukov
      helped to root cause it and it is because of reading the asoc after it
      was freed:
      
              CPU 1                       CPU 2
      (working on socket 1)            (working on socket 2)
                                       sctp_association_destroy
      sctp_id2asoc
         spin lock
           grab the asoc from idr
         spin unlock
                                          spin lock
                                            remove asoc from idr
                                          spin unlock
                                          free(asoc)
         if asoc->base.sk != sk ... [*]
      
      This can only be hit when trying to fetch asocs from different sockets.
      As we have a single IDR for all asocs, in all SCTP sockets, their ids
      are unique on the system. An application can try to send on an id that
      belongs to another socket, and the check in [*] protects against such
      usage. But it did not consider that, because that asoc may belong to
      another socket, it may be freed in parallel (read: under another socket
      lock).
      
      We fix it by moving the checks in [*] into the protected region. This
      fixes it because the asoc cannot be freed while the lock is held.
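
The locking change described above can be sketched in user space. This is only an analogy under stated assumptions, not the SCTP code: `struct sock`, `struct assoc`, `table`, `table_lock` and `lookup_assoc()` are hypothetical names, a fixed array stands in for the IDR, and a pthread mutex stands in for the spin lock.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects. */
struct sock { int id; };
struct assoc { struct sock *sk; };

#define TABLE_SIZE 8

/* One table shared by all sockets, like the single IDR in the text. */
struct assoc *table[TABLE_SIZE];
pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Return the association stored under id, but only if it belongs to sk. */
struct assoc *lookup_assoc(struct sock *sk, unsigned int id)
{
    struct assoc *asoc = NULL;

    pthread_mutex_lock(&table_lock);
    if (id < TABLE_SIZE)
        asoc = table[id];
    /*
     * The ownership check runs while the lock is still held. A remover
     * has to take the same lock to unlink the entry before freeing it,
     * so asoc cannot be freed while we read asoc->sk here.
     */
    if (asoc != NULL && asoc->sk != sk)
        asoc = NULL;
    pthread_mutex_unlock(&table_lock);

    return asoc;
}
```

A caller on another socket that passes a matching id simply gets NULL back, and the object is never dereferenced outside the locked region.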
      
      Reported-by: syzbot+c7dd55d7aec49d48e49a@syzkaller.appspotmail.com
      Acked-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b336deca
    • Heiner Kallweit's avatar
      r8169: re-enable MSI-X on RTL8168g · 9675931e
      Heiner Kallweit authored
      Similar to d49c88d7 ("r8169: Enable MSI-X on RTL8106e"): after
      e9d0ba506ea8 ("PCI: Reprogram bridge prefetch registers on resume")
      we can safely assume that the root cause of the issue worked
      around by 7c53a722 ("r8169: don't use MSI-X on RTL8168g") is
      fixed as well. So let's revert it.
      
      Fixes: 7c53a722 ("r8169: don't use MSI-X on RTL8168g")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9675931e
    • Taehee Yoo's avatar
      net: bpfilter: use get_pid_task instead of pid_task · 84258438
      Taehee Yoo authored
      pid_task() dereferences the RCU-protected tasks array, but the
      shutdown_umh() routine calls it without holding rcu_read_lock().
      Use get_pid_task() instead. It is a wrapper around pid_task(): it
      takes rcu_read_lock(), calls pid_task() and, if the task isn't NULL,
      increments the task's reference count.
      
      test commands:
         %modprobe bpfilter
         %modprobe -rv bpfilter
      
      splat looks like:
      [15102.030932] =============================
      [15102.030957] WARNING: suspicious RCU usage
      [15102.030985] 4.19.0-rc7+ #21 Not tainted
      [15102.031010] -----------------------------
      [15102.031038] kernel/pid.c:330 suspicious rcu_dereference_check() usage!
      [15102.031063]
      	       other info that might help us debug this:
      
      [15102.031332]
      	       rcu_scheduler_active = 2, debug_locks = 1
      [15102.031363] 1 lock held by modprobe/1570:
      [15102.031389]  #0: 00000000580ef2b0 (bpfilter_lock){+.+.}, at: stop_umh+0x13/0x52 [bpfilter]
      [15102.031552]
                     stack backtrace:
      [15102.031583] CPU: 1 PID: 1570 Comm: modprobe Not tainted 4.19.0-rc7+ #21
      [15102.031607] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
      [15102.031628] Call Trace:
      [15102.031676]  dump_stack+0xc9/0x16b
      [15102.031723]  ? show_regs_print_info+0x5/0x5
      [15102.031801]  ? lockdep_rcu_suspicious+0x117/0x160
      [15102.031855]  pid_task+0x134/0x160
      [15102.031900]  ? find_vpid+0xf0/0xf0
      [15102.032017]  shutdown_umh.constprop.1+0x1e/0x53 [bpfilter]
      [15102.032055]  stop_umh+0x46/0x52 [bpfilter]
      [15102.032092]  __x64_sys_delete_module+0x47e/0x570
      [ ... ]
      
      Fixes: d2ba09c1 ("net: add skeleton of bpfilter kernel module")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      84258438
    • Gustavo A. R. Silva's avatar
      ptp: fix Spectre v1 vulnerability · efa61c8c
      Gustavo A. R. Silva authored
      pin_index can be indirectly controlled by user-space, hence leading
      to a potential exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      drivers/ptp/ptp_chardev.c:253 ptp_ioctl() warn: potential spectre issue
      'ops->pin_config' [r] (local cap)
      
      Fix this by sanitizing pin_index before using it to index
      ops->pin_config, and before passing it as an argument to
      function ptp_set_pinfunc(), in which it is used to index
      info->pin_config.
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
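
As a rough illustration of what sanitizing an index means here: in kernel code this is typically done with array_index_nospec(), which clamps an out-of-range index to zero without a branch. The user-space sketch below shows only the masking arithmetic; `index_mask()`, `clamp_index()` and `read_pin()` are hypothetical names, and the code is not a hardened mitigation, since an ordinary compiler is free to rearrange it.

```c
#include <limits.h>
#include <stddef.h>

/*
 * All ones if index < size, zero otherwise, computed without a branch.
 * Assumes size <= LONG_MAX and an arithmetic right shift of negative
 * values (implementation-defined in C, but what gcc and clang do).
 */
unsigned long index_mask(unsigned long index, unsigned long size)
{
    long v = ~(long)(index | (size - 1UL - index));

    return (unsigned long)(v >> (sizeof(long) * CHAR_BIT - 1));
}

/* In-range indexes pass through unchanged, out-of-range ones become 0. */
unsigned long clamp_index(unsigned long index, unsigned long size)
{
    return index & index_mask(index, size);
}

/* Bounds check first, then clamp before the index reaches the array. */
int read_pin(const int *pins, unsigned long n_pins,
             unsigned long pin_index, int *out)
{
    if (pin_index >= n_pins)
        return -1;
    pin_index = clamp_index(pin_index, n_pins);
    *out = pins[pin_index];
    return 0;
}
```

The point of the clamp is that even if the bounds check is mispredicted, the value used for the first load stays inside the array.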
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      efa61c8c
    • Dan Carpenter's avatar
      sparc: vDSO: Silence an uninitialized variable warning · 62d6f3b7
      Dan Carpenter authored
      Smatch complains that "val" would be uninitialized if kstrtoul() fails.
      
      Fixes: 9a08862a ("vDSO for sparc")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      62d6f3b7
    • Nathan Chancellor's avatar
      net: qla3xxx: Remove overflowing shift statement · 8c3bf9b6
      Nathan Chancellor authored
      Clang currently warns:
      
      drivers/net/ethernet/qlogic/qla3xxx.c:384:24: warning: signed shift
      result (0xF00000000) requires 37 bits to represent, but 'int' only has
      32 bits [-Wshift-overflow]
                          ((ISP_NVRAM_MASK << 16) | qdev->eeprom_cmd_data));
                            ~~~~~~~~~~~~~~ ^  ~~
      1 warning generated.
      
      The warning is certainly accurate since ISP_NVRAM_MASK is defined as
      (0x000F << 16) which is then shifted by 16, resulting in 64424509440,
      well above UINT_MAX.
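
The arithmetic behind the warning can be checked with a few lines of C. This is a sketch with hypothetical names (`NVRAM_MASK_SKETCH`, `shifted_again()`, `fits_in_32_bits()`); it widens to 64 bits so that the second shift is well defined and the result can be inspected.

```c
#include <stdint.h>

/* Hypothetical copy of the mask as described above: 0x000F shifted by 16. */
#define NVRAM_MASK_SKETCH (UINT32_C(0x000F) << 16)

/* Apply the second 16-bit shift in 64-bit arithmetic. */
uint64_t shifted_again(void)
{
    return (uint64_t)NVRAM_MASK_SKETCH << 16;
}

/* Nonzero if the value can be represented in an unsigned 32-bit integer. */
int fits_in_32_bits(uint64_t value)
{
    return value <= UINT32_MAX;
}
```

The mask itself is 0xF0000 and fits comfortably; shifting it again gives 0xF00000000, which is 64424509440 and no longer fits in 32 bits.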
      
      Given that this is the only location in this driver where ISP_NVRAM_MASK
      is shifted again, it seems likely that ISP_NVRAM_MASK was originally
      defined without a shift and during the move of the shift to the
      definition, this statement wasn't properly removed (since ISP_NVRAM_MASK
      is used in the statement right above this). Only the maintainers can
      confirm this since this statement has been here since the driver was
      first added to the kernel.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/127
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8c3bf9b6
    • David S. Miller's avatar
      Merge branch 'geneve-vxlan-mtu' · dc6d0f0b
      David S. Miller authored
      Stefano Brivio says:
      
      ====================
      geneve, vxlan: Don't set exceptions if skb->len < mtu
      
      This series fixes the exception abuse described in 2/2, and 1/2
      is just a preparatory change to make 2/2 less ugly.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dc6d0f0b
    • Stefano Brivio's avatar
      geneve, vxlan: Don't set exceptions if skb->len < mtu · 6b4f92af
      Stefano Brivio authored
      We shouldn't abuse exceptions: if the destination MTU is already higher
      than what we're transmitting, no exception should be created.
      
      Fixes: 52a589d5 ("geneve: update skb dst pmtu on tx path")
      Fixes: a93bf0ff ("vxlan: update skb dst pmtu on tx path")
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6b4f92af
    • Stefano Brivio's avatar
      geneve, vxlan: Don't check skb_dst() twice · 7463e4f9
      Stefano Brivio authored
      Commit f15ca723 ("net: don't call update_pmtu unconditionally") keeps
      us from trying to update the PMTU for a non-existent destination, but
      didn't clean up cases where the check was already explicit. Drop those
      redundant checks.
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7463e4f9
    • David S. Miller's avatar
      sparc: Fix syscall fallback bugs in VDSO. · 776ca154
      David S. Miller authored
      First, the trap number for 32-bit syscalls is 0x10.
      
      Also, only negate the return value when syscall error is indicated by
      the carry bit being set.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      776ca154
  3. 17 Oct, 2018 2 commits
    • Steven Rostedt (VMware)'s avatar
      tracing: Use trace_clock_local() for looping in preemptirq_delay_test.c · 12ad0cb2
      Steven Rostedt (VMware) authored
      The preemptirq_delay_test module is used by the ftrace selftest code
      that tests the latency tracers. The problem is that it uses ktime for
      the delay loop, and then checks the tracer to see if the delay loop is
      caught, but the tracer uses trace_clock_local(), which uses various
      other clocks to measure the latency. As ktime uses the clock cycles,
      and the code then converts that to nanoseconds, it causes rounding
      errors, and the preemptirq latency tests are failing due to being off
      by 1 (it expects to see a delay of 500000 us, but the delay is only
      499999 us). The rounding error in ktime is totally legit. The purpose
      of the test is to see if it can catch the delay, not to test the
      accuracy between trace_clock_local() and ktime_get(). Best to compare
      apples to apples, and have the delay loop use the same clock as the
      latency tracer does.
      
      Cc: stable@vger.kernel.org
      Fixes: f96e8577 ("lib: Add module for testing preemptoff/irqsoff latency tracers")
      Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      12ad0cb2
    • Mathieu Desnoyers's avatar
      tracepoint: Fix tracepoint array element size mismatch · 9c0be3f6
      Mathieu Desnoyers authored
      Commit 46e0c9be ("kernel: tracepoints: add support for relative
      references") changes the layout of the __tracepoints_ptrs section on
      architectures supporting relative references. However, it does so
      without turning struct tracepoint * const into const int elsewhere in
      the tracepoint code, which has the following side-effect:
      
      Setting mod->num_tracepoints is done in module.c:
      
          mod->tracepoints_ptrs = section_objs(info, "__tracepoints_ptrs",
                                               sizeof(*mod->tracepoints_ptrs),
                                               &mod->num_tracepoints);
      
      Basically, since sizeof(*mod->tracepoints_ptrs) is the size of a
      pointer (rather than sizeof(int)), num_tracepoints is erroneously set
      to half the value it should have on 64-bit architectures. So a module
      with an odd number of tracepoints misses the last tracepoint because
      the integer division truncates.
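
The miscount is plain integer division of the section size by the wrong element size. A minimal sketch, assuming 4-byte relative entries and 8-byte pointers; `count_entries()` is a hypothetical helper, not the kernel's section_objs():

```c
#include <stddef.h>

/* Number of array elements a section of section_bytes is taken to hold. */
size_t count_entries(size_t section_bytes, size_t elem_size)
{
    return section_bytes / elem_size;
}
```

With three 4-byte entries the section is 12 bytes long: dividing by 4 gives the right count of 3, while dividing by a pointer size of 8 gives 1. With four entries the wrong divisor gives 2, exactly half.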
      
      So in the module going notifier:
      
              for_each_tracepoint_range(mod->tracepoints_ptrs,
                      mod->tracepoints_ptrs + mod->num_tracepoints,
                      tp_module_going_check_quiescent, NULL);
      
      the expression (mod->tracepoints_ptrs + mod->num_tracepoints) actually
      evaluates to something within the bounds of the array, but misses the
      last tracepoint if the number of tracepoints is odd on 64-bit arch.
      
      Fix this by introducing a new typedef: tracepoint_ptr_t, which
      is either "const int" on architectures that have PREL32 relocations,
      or "struct tracepoint * const" on architectures that do not have
      this feature.
      
      Also provide a new tracepoint_ptr_deref() static inline to
      encapsulate dereferencing this type rather than duplicate code and
      ugly ifdefs within the for_each_tracepoint_range() implementation.
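
To make the "const int" variant concrete: a relative reference stores the signed 32-bit distance from the entry itself to its target, so dereferencing means adding the entry's own address to the stored value. The sketch below is a user-space illustration under that assumption; `struct tp`, `rel_store()` and `rel_deref()` are hypothetical names and not the kernel helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct tracepoint. */
struct tp { const char *name; };

/*
 * Store a self-relative reference: the distance from the slot to the
 * target. Assumes both objects are close enough for a 32-bit offset.
 */
void rel_store(int32_t *slot, const struct tp *target)
{
    *slot = (int32_t)((intptr_t)target - (intptr_t)slot);
}

/* Resolve a self-relative reference: slot address plus stored offset. */
struct tp *rel_deref(const int32_t *slot)
{
    return (struct tp *)((uintptr_t)slot + (intptr_t)*slot);
}
```

Each entry is 4 bytes regardless of pointer width, which is why code that walks such an array must step by the entry type rather than by a pointer size.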
      
      This issue appears in 4.19-rc kernels, and should ideally be fixed
      before the end of the rc cycle.
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Jessica Yu <jeyu@kernel.org>
      Link: http://lkml.kernel.org/r/20181013191050.22389-1-mathieu.desnoyers@efficios.com
      Link: http://lkml.kernel.org/r/20180704083651.24360-7-ard.biesheuvel@linaro.org
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morris <james.morris@microsoft.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Nicolas Pitre <nico@linaro.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      9c0be3f6