1. 28 Aug, 2019 23 commits
    • Takashi Iwai's avatar
      sky2: Disable MSI on yet another ASUS boards (P6Xxxx) · 189308d5
      Takashi Iwai authored
      A similar workaround for the suspend/resume problem is needed for yet
      more ASUS machines, the P6X models.  Like the previous fix, the BIOS
      doesn't provide the standard DMI_SYS_* entry, so again DMI_BOARD_*
      entries are used instead.
      Reported-and-tested-by: SteveM <swm@swm1.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      189308d5
    • David S. Miller's avatar
      Merge branch 'nfp-flower-fix-bugs-in-merge-tunnel-encap-code' · 807e3299
      David S. Miller authored
      Jakub Kicinski says:
      
      ====================
      nfp: flower: fix bugs in merge tunnel encap code
      
      John says:
      
      There are a few bugs in the merge encap code that have come to light with
      recent driver changes. Effectively, flow bind callbacks were being
      registered twice when using internal ports (new 'busy' code triggers
      this). There was also an issue with neighbour notifier messages being
      ignored for internal ports.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      807e3299
    • John Hurley's avatar
      nfp: flower: handle neighbour events on internal ports · e8024cb4
      John Hurley authored
      Recent code changes to NFP allowed the offload of neighbour entries to FW
      when the next hop device was an internal port. This allows for offload of
      tunnel encap when the end-point IP address is applied to such a port.
      
      Unfortunately, the neighbour event handler still rejects events that are
      not associated with a repr dev and so the firmware neighbour table may get
      out of sync for internal ports.
      
      Fix this by allowing internal port neighbour events to be correctly
      processed.
      
      Fixes: 45756dfe ("nfp: flower: allow tunnels to output to internal port")
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e8024cb4
    • John Hurley's avatar
      nfp: flower: prevent ingress block binds on internal ports · 739d7c57
      John Hurley authored
      Internal port TC offload is implemented through user-space applications
      (such as OvS) by adding filters at egress via TC clsact qdiscs. Indirect
      block offload support in the NFP driver accepts both ingress qdisc binds
      and egress binds if the device is an internal port. However, clsact sends
      bind notification for both ingress and egress block binds which can lead
      to the driver registering multiple callbacks and receiving multiple
      notifications of new filters.
      
      Fix this by rejecting ingress block bind callbacks when the port is
      internal and only adding filter callbacks for egress binds.
      
      Fixes: 4d12ba42 ("nfp: flower: allow offloading of matches on 'internal' ports")
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      739d7c57
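The bind rule described in the commit above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `block_binder`, `should_register_cb` and the rule that non-internal devices accept ingress binds only are hypothetical stand-ins inferred from the commit text, not the NFP driver's actual code.

```c
#include <stdbool.h>

/* Hypothetical model of the two clsact bind directions. */
enum block_binder { BINDER_INGRESS, BINDER_EGRESS };

/*
 * Returns true when a filter callback should be registered for this bind.
 * Internal ports only accept egress binds, so clsact's paired
 * ingress/egress notifications register a single callback.
 */
static bool should_register_cb(bool is_internal_port, enum block_binder binder)
{
    if (is_internal_port)
        return binder == BINDER_EGRESS;  /* reject ingress on internal ports */
    return binder == BINDER_INGRESS;     /* assumption: other devices are ingress-only */
}
```

With this rule, a clsact qdisc on an internal port yields one registration (egress) instead of two.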
    • David S. Miller's avatar
      Merge branch 'r8152-fix-side-effect' · 80a6a5d6
      David S. Miller authored
      Hayes Wang says:
      
      ====================
      r8152: fix side effect
      
      v3:
      Update the commit message for patch #1.
      
      v2:
      Replace patch #2 with "r8152: remove calling netif_napi_del".
      
      v1:
      The commit 0ee1f473 ("r8152: napi hangup fix after disconnect")
      added a check to avoid using napi_disable after netif_napi_del. However,
      the commit ffa9fec3 ("r8152: set RTL8152_UNPLUG only for real
      disconnection") made the check useless.
      
      Therefore, I revert commit 0ee1f473 ("r8152: napi hangup fix
      after disconnect") first, and add another patch to fix it.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      80a6a5d6
    • Hayes Wang's avatar
      r8152: remove calling netif_napi_del · 973dc6cf
      Hayes Wang authored
      Remove unnecessary use of netif_napi_del. This also avoids calling
      napi_disable() after netif_napi_del().
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      973dc6cf
    • Hayes Wang's avatar
      Revert "r8152: napi hangup fix after disconnect" · 49d4b141
      Hayes Wang authored
      This reverts commit 0ee1f473.
      
      The commit 0ee1f473 ("r8152: napi hangup fix after
      disconnect") adds a check on RTL8152_UNPLUG to determine
      whether calling napi_disable() is invalid in rtl8152_close()
      when rtl8152_disconnect() is called. This avoids using
      napi_disable() after calling netif_napi_del().

      However, commit ffa9fec3 ("r8152: set RTL8152_UNPLUG
      only for real disconnection") means that RTL8152_UNPLUG
      is not always set when calling rtl8152_disconnect().
      Therefore, I have to revert commit 0ee1f473 ("r8152:
      napi hangup fix after disconnect") first, and submit
      another patch to fix it.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      49d4b141
    • Davide Caratti's avatar
      net/sched: pfifo_fast: fix wrong dereference in pfifo_fast_enqueue · 092e22e5
      Davide Caratti authored
      Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
      'TCQ_F_NOLOCK' bit in the parent qdisc, we can't assume anymore that
      per-cpu counters are there in the error path of skb_array_produce().
      Otherwise, the following splat can be seen:
      
       Unable to handle kernel paging request at virtual address 0000600dea430008
       Mem abort info:
         ESR = 0x96000005
         Exception class = DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
       Data abort info:
         ISV = 0, ISS = 0x00000005
         CM = 0, WnR = 0
       user pgtable: 64k pages, 48-bit VAs, pgdp = 000000007b97530e
       [0000600dea430008] pgd=0000000000000000, pud=0000000000000000
       Internal error: Oops: 96000005 [#1] SMP
      [...]
       pstate: 10000005 (nzcV daif -PAN -UAO)
       pc : pfifo_fast_enqueue+0x524/0x6e8
       lr : pfifo_fast_enqueue+0x46c/0x6e8
       sp : ffff800d39376fe0
       x29: ffff800d39376fe0 x28: 1ffff001a07d1e40
       x27: ffff800d03e8f188 x26: ffff800d03e8f200
       x25: 0000000000000062 x24: ffff800d393772f0
       x23: 0000000000000000 x22: 0000000000000403
       x21: ffff800cca569a00 x20: ffff800d03e8ee00
       x19: ffff800cca569a10 x18: 00000000000000bf
       x17: 0000000000000000 x16: 0000000000000000
       x15: 0000000000000000 x14: ffff1001a726edd0
       x13: 1fffe4000276a9a4 x12: 0000000000000000
       x11: dfff200000000000 x10: ffff800d03e8f1a0
       x9 : 0000000000000003 x8 : 0000000000000000
       x7 : 00000000f1f1f1f1 x6 : ffff1001a726edea
       x5 : ffff800cca56a53c x4 : 1ffff001bf9a8003
       x3 : 1ffff001bf9a8003 x2 : 1ffff001a07d1dcb
       x1 : 0000600dea430000 x0 : 0000600dea430008
       Process ping (pid: 6067, stack limit = 0x00000000dc0aa557)
       Call trace:
        pfifo_fast_enqueue+0x524/0x6e8
        htb_enqueue+0x660/0x10e0 [sch_htb]
        __dev_queue_xmit+0x123c/0x2de0
        dev_queue_xmit+0x24/0x30
        ip_finish_output2+0xc48/0x1720
        ip_finish_output+0x548/0x9d8
        ip_output+0x334/0x788
        ip_local_out+0x90/0x138
        ip_send_skb+0x44/0x1d0
        ip_push_pending_frames+0x5c/0x78
        raw_sendmsg+0xed8/0x28d0
        inet_sendmsg+0xc4/0x5c0
        sock_sendmsg+0xac/0x108
        __sys_sendto+0x1ac/0x2a0
        __arm64_sys_sendto+0xc4/0x138
        el0_svc_handler+0x13c/0x298
        el0_svc+0x8/0xc
       Code: f9402e80 d538d081 91002000 8b010000 (885f7c03)
      
      Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
      before dereferencing 'qdisc->cpu_qstats'.
      
      Fixes: 8a53e616 ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
      CC: Paolo Abeni <pabeni@redhat.com>
      CC: Stefano Brivio <sbrivio@redhat.com>
      Reported-by: Li Shuang <shuali@redhat.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      092e22e5
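The "test the flag before dereferencing" pattern from the commit above can be modelled in plain C. This is a sketch only: `F_CPUSTATS`, `qdisc_model` and `count_drop` are hypothetical names, and the flag value is illustrative rather than the kernel's `TCQ_F_CPUSTATS` definition.

```c
#include <stddef.h>

#define F_CPUSTATS 0x1u  /* stands in for TCQ_F_CPUSTATS; value is illustrative */

struct qstats { unsigned int drops; };

struct qdisc_model {
    unsigned int flags;
    struct qstats *cpu_qstats; /* only valid while F_CPUSTATS is set */
    struct qstats qstats;      /* shared counters used otherwise */
};

/* Account a dropped packet without assuming per-cpu counters exist. */
static void count_drop(struct qdisc_model *q)
{
    if (q->flags & F_CPUSTATS)
        q->cpu_qstats->drops++;  /* per-cpu path */
    else
        q->qstats.drops++;       /* fallback when the flag was cleared */
}
```

The point of the design is that the flag, not the pointer value, decides which counter set is touched, so a qdisc whose flag was cleared never reaches the per-cpu pointer.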
    • Willem de Bruijn's avatar
      tcp: inherit timestamp on mtu probe · 888a5c53
      Willem de Bruijn authored
      TCP associates tx timestamp requests with a byte in the bytestream.
      If merging skbs in tcp_mtu_probe, migrate the tstamp request.
      
      Similar to MSG_EOR, do not allow moving a timestamp from any segment
      in the probe but the last. This is to avoid merging multiple timestamps.
      
      Tested with the packetdrill script at
      https://github.com/wdebruij/packetdrill/commits/mtu_probe-1
      
      Link: http://patchwork.ozlabs.org/patch/1143278/#2232897
      Fixes: 4ed2d765 ("net-timestamp: TCP timestamping")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      888a5c53
    • Vlad Buslov's avatar
      net: sched: act_sample: fix psample group handling on overwrite · dbf47a2a
      Vlad Buslov authored
      Action sample doesn't properly handle the psample_group pointer in the
      overwrite case. The following issues need to be fixed:
      
      - In tcf_sample_init() function RCU_INIT_POINTER() is used to set
        s->psample_group, even though we are neither setting the pointer to
        NULL, nor preventing concurrent readers from accessing the pointer in
        some way. Use rcu_swap_protected() instead to safely reset the pointer.
      
      - Old value of s->psample_group is not released or deallocated in any way,
        which results in a resource leak. Use psample_group_put() on the
        non-NULL value obtained with rcu_swap_protected().
      
      - The function psample_group_put() that releases the reference to struct
        psample_group pointed to by rcu-pointer s->psample_group doesn't respect
        the rcu grace period when deallocating it. Extend struct psample_group
        with an rcu head and use kfree_rcu when freeing it.
      
      Fixes: 5c5670fa ("net/sched: Introduce sample tc action")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dbf47a2a
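The swap-then-release step from the act_sample commit above can be illustrated with a single-threaded model. This sketch deliberately leaves out RCU and locking (the commit relies on rcu_swap_protected() and kfree_rcu for those); `group_model`, `action_model`, `group_put` and `action_set_group` are hypothetical names.

```c
#include <stddef.h>

struct group_model { int refcnt; };

struct action_model { struct group_model *group; };

/* Drop one reference; a NULL group is ignored. */
static void group_put(struct group_model *g)
{
    if (g)
        g->refcnt--;
}

/*
 * Install new_group and release the reference held on the old one.
 * Without the group_put() call the old group would leak on overwrite.
 */
static void action_set_group(struct action_model *a, struct group_model *new_group)
{
    struct group_model *old = a->group;

    a->group = new_group;
    group_put(old);
}
```

In the kernel the final free would additionally have to wait for an RCU grace period; that part is not modelled here.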
    • Thomas Falcon's avatar
      ibmvnic: Do not process reset during or after device removal · 36f1031c
      Thomas Falcon authored
      Currently, the ibmvnic driver will not schedule device resets
      if the device is being removed, but does not check the device
      state before the reset is actually processed. This leads to a race
      where a reset is scheduled with a valid device state but is
      processed after the driver has been removed, resulting in an oops.
      
      Fix this by checking the device state before processing a queued
      reset event.
      Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      36f1031c
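The check described in the ibmvnic commit above amounts to re-validating device state at processing time rather than only at scheduling time. A minimal sketch, assuming a hypothetical three-state model (`dev_state`, `should_process_reset` are illustrative names, not the driver's):

```c
#include <stdbool.h>

/* Hypothetical subset of device states. */
enum dev_state { DEV_OPEN, DEV_REMOVING, DEV_REMOVED };

/*
 * Called when a queued reset is about to be processed. The state may have
 * changed since the reset was scheduled, so it is checked again here.
 */
static bool should_process_reset(enum dev_state state)
{
    return state != DEV_REMOVING && state != DEV_REMOVED;
}
```

A reset that was queued while the device was open is therefore discarded if removal started in the meantime.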
    • Justin Pettit's avatar
      openvswitch: Clear the L4 portion of the key for "later" fragments. · 0754b4e8
      Justin Pettit authored
      Only the first fragment in a datagram contains the L4 headers.  When the
      Open vSwitch module parses a packet, it always sets the IP protocol
      field in the key, but can only set the L4 fields on the first fragment.
      The original behavior would not clear the L4 portion of the key, so
      garbage values would be sent in the key for "later" fragments.  This
      patch clears the L4 fields in that circumstance to prevent sending those
      garbage values as part of the upcall.
      Signed-off-by: Justin Pettit <jpettit@ovn.org>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0754b4e8
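The behaviour described in the commit above (keep the IP protocol, clear the L4 fields on "later" fragments) can be sketched as follows. The struct layout and the names `key_model` and `fill_l4` are assumptions for illustration and do not mirror the Open vSwitch flow key.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct key_model {
    uint8_t ip_proto;                        /* always set by the parser */
    struct { uint16_t src, dst, flags; } tp; /* L4 portion of the key */
};

/*
 * Fill the L4 part of the key. Later fragments carry no L4 header, so the
 * L4 portion is zeroed instead of being left with stale values.
 */
static void fill_l4(struct key_model *key, bool later_fragment,
                    uint16_t src, uint16_t dst)
{
    if (later_fragment) {
        memset(&key->tp, 0, sizeof(key->tp));
        return;
    }
    key->tp.src = src;
    key->tp.dst = dst;
}
```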
    • Greg Rose's avatar
      openvswitch: Properly set L4 keys on "later" IP fragments · ad06a566
      Greg Rose authored
      When IP fragments are reassembled before being sent to conntrack, the
      key from the last fragment is used.  Unless there are reordering
      issues, the last fragment received will not contain the L4 ports, so the
      key for the reassembled datagram won't contain them.  This patch updates
      the key once we have a reassembled datagram.
      
      The handle_fragments() function works on L3 headers so we pull the L3/L4
      flow key update code from key_extract into a new function
      'key_extract_l3l4'.  Then we add another new function
      ovs_flow_key_update_l3l4() and export it so that it is accessible by
      handle_fragments() for conntrack packet reassembly.
      Co-authored-by: Justin Pettit <jpettit@ovn.org>
      Signed-off-by: Greg Rose <gvrose8192@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ad06a566
    • Eric Dumazet's avatar
      mld: fix memory leak in mld_del_delrec() · a84d0164
      Eric Dumazet authored
      Similar to the fix done for IPv4 in commit e5b1c6c6
      ("igmp: fix memory leak in igmpv3_del_delrec()"), we need to
      make sure mca_tomb and mca_sources are not blindly overwritten.
      
      Using swap() then a call to ip6_mc_clear_src() will take care
      of the missing free.
      
      BUG: memory leak
      unreferenced object 0xffff888117d9db00 (size 64):
        comm "syz-executor247", pid 6918, jiffies 4294943989 (age 25.350s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 fe 88 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<000000005b463030>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
          [<000000005b463030>] slab_post_alloc_hook mm/slab.h:522 [inline]
          [<000000005b463030>] slab_alloc mm/slab.c:3319 [inline]
          [<000000005b463030>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
          [<00000000939cbf94>] kmalloc include/linux/slab.h:552 [inline]
          [<00000000939cbf94>] kzalloc include/linux/slab.h:748 [inline]
          [<00000000939cbf94>] ip6_mc_add1_src net/ipv6/mcast.c:2236 [inline]
          [<00000000939cbf94>] ip6_mc_add_src+0x31f/0x420 net/ipv6/mcast.c:2356
          [<00000000d8972221>] ip6_mc_source+0x4a8/0x600 net/ipv6/mcast.c:449
          [<000000002b203d0d>] do_ipv6_setsockopt.isra.0+0x1b92/0x1dd0 net/ipv6/ipv6_sockglue.c:748
          [<000000001f1e2d54>] ipv6_setsockopt+0x89/0xd0 net/ipv6/ipv6_sockglue.c:944
          [<00000000c8f7bdf9>] udpv6_setsockopt+0x4e/0x90 net/ipv6/udp.c:1558
          [<000000005a9a0c5e>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3139
          [<00000000910b37b2>] __sys_setsockopt+0x10f/0x220 net/socket.c:2084
          [<00000000e9108023>] __do_sys_setsockopt net/socket.c:2100 [inline]
          [<00000000e9108023>] __se_sys_setsockopt net/socket.c:2097 [inline]
          [<00000000e9108023>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2097
          [<00000000f4818160>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:296
          [<000000008d367e8f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 1666d49e ("mld: do not remove mld souce list info when set link down")
      Fixes: 9c8bb163 ("igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a84d0164
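The "swap, then clear" idea from the mld commit above can be shown with a small linked-list model. Overwriting the live record's list pointer would leak whatever it held; swapping moves that list into the record about to be discarded, where a single clear call frees it. All names here (`src_model`, `mc_model`, `push_source`, `clear_sources`, `restore_from_tomb`) are hypothetical.

```c
#include <stdlib.h>

struct src_model { struct src_model *next; };
struct mc_model { struct src_model *sources; };

/* Prepend one freshly allocated source entry (silently skipped on OOM). */
static void push_source(struct mc_model *mc)
{
    struct src_model *s = calloc(1, sizeof(*s));

    if (!s)
        return;
    s->next = mc->sources;
    mc->sources = s;
}

/* Free every entry on the list and return how many were freed. */
static int clear_sources(struct mc_model *mc)
{
    int n = 0;

    while (mc->sources) {
        struct src_model *s = mc->sources;

        mc->sources = s->next;
        free(s);
        n++;
    }
    return n;
}

/* Restore the tomb record's list into live, freeing live's previous list. */
static int restore_from_tomb(struct mc_model *live, struct mc_model *tomb)
{
    struct src_model *tmp = live->sources;

    live->sources = tomb->sources;  /* swap instead of blind overwrite */
    tomb->sources = tmp;
    return clear_sources(tomb);     /* frees what live used to hold */
}
```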
    • Davide Caratti's avatar
      net/sched: pfifo_fast: fix wrong dereference when qdisc is reset · 04d37cf4
      Davide Caratti authored
      Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
      'TCQ_F_NOLOCK' bit in the parent qdisc, we need to be sure that per-cpu
      counters are present when 'reset()' is called for pfifo_fast qdiscs.
      Otherwise, the following script:
      
       # tc q a dev lo handle 1: root htb default 100
       # tc c a dev lo parent 1: classid 1:100 htb \
       > rate 95Mbit ceil 100Mbit burst 64k
       [...]
       # tc f a dev lo parent 1: protocol arp basic classid 1:100
       [...]
       # tc q a dev lo parent 1:100 handle 100: pfifo_fast
       [...]
       # tc q d dev lo root
      
      can generate the following splat:
      
       Unable to handle kernel paging request at virtual address dfff2c01bd148000
       Mem abort info:
         ESR = 0x96000004
         Exception class = DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
       Data abort info:
         ISV = 0, ISS = 0x00000004
         CM = 0, WnR = 0
       [dfff2c01bd148000] address between user and kernel address ranges
       Internal error: Oops: 96000004 [#1] SMP
       [...]
       pstate: 80000005 (Nzcv daif -PAN -UAO)
       pc : pfifo_fast_reset+0x280/0x4d8
       lr : pfifo_fast_reset+0x21c/0x4d8
       sp : ffff800d09676fa0
       x29: ffff800d09676fa0 x28: ffff200012ee22e4
       x27: dfff200000000000 x26: 0000000000000000
       x25: ffff800ca0799958 x24: ffff1001940f332b
       x23: 0000000000000007 x22: ffff200012ee1ab8
       x21: 0000600de8a40000 x20: 0000000000000000
       x19: ffff800ca0799900 x18: 0000000000000000
       x17: 0000000000000002 x16: 0000000000000000
       x15: 0000000000000000 x14: 0000000000000000
       x13: 0000000000000000 x12: ffff1001b922e6e2
       x11: 1ffff001b922e6e1 x10: 0000000000000000
       x9 : 1ffff001b922e6e1 x8 : dfff200000000000
       x7 : 0000000000000000 x6 : 0000000000000000
       x5 : 1fffe400025dc45c x4 : 1fffe400025dc357
       x3 : 00000c01bd148000 x2 : 0000600de8a40000
       x1 : 0000000000000007 x0 : 0000600de8a40004
       Call trace:
        pfifo_fast_reset+0x280/0x4d8
        qdisc_reset+0x6c/0x370
        htb_reset+0x150/0x3b8 [sch_htb]
        qdisc_reset+0x6c/0x370
        dev_deactivate_queue.constprop.5+0xe0/0x1a8
        dev_deactivate_many+0xd8/0x908
        dev_deactivate+0xe4/0x190
        qdisc_graft+0x88c/0xbd0
        tc_get_qdisc+0x418/0x8a8
        rtnetlink_rcv_msg+0x3a8/0xa78
        netlink_rcv_skb+0x18c/0x328
        rtnetlink_rcv+0x28/0x38
        netlink_unicast+0x3c4/0x538
        netlink_sendmsg+0x538/0x9a0
        sock_sendmsg+0xac/0xf8
        ___sys_sendmsg+0x53c/0x658
        __sys_sendmsg+0xc8/0x140
        __arm64_sys_sendmsg+0x74/0xa8
        el0_svc_handler+0x164/0x468
        el0_svc+0x10/0x14
       Code: 910012a0 92400801 d343fc03 11000c21 (38fb6863)
      
      Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
      before dereferencing 'qdisc->cpu_qstats'.
      
      Changes since v1:
       - coding style improvements, thanks to Stefano Brivio
      
      Fixes: 8a53e616 ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
      CC: Paolo Abeni <pabeni@redhat.com>
      Reported-by: Li Shuang <shuali@redhat.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      04d37cf4
    • David S. Miller's avatar
      Merge branch 'macb-Update-ethernet-compatible-string-for-SiFive-FU540' · 2965daa3
      David S. Miller authored
      Yash Shah says:
      
      ====================
      macb: Update ethernet compatible string for SiFive FU540
      
      This patch series renames the compatible property to a more appropriate
      string. The patchset is based on Linux-5.3-rc6 and tested on the SiFive
      Unleashed board.
      
      Change history:
      Since v1:
      - Dropped PATCH3 because it's already merged
      - Change the reference url in the patch descriptions to point to a
        'lore.kernel.org' link instead of 'lkml.org'
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2965daa3
    • Yash Shah's avatar
      macb: Update compatibility string for SiFive FU540-C000 · 6342ea88
      Yash Shah authored
      Update the compatibility string for SiFive FU540-C000 as per the new
      string updated in the binding doc.
      Reference:
      https://lore.kernel.org/netdev/CAJ2_jOFEVZQat0Yprg4hem4jRrqkB72FKSeQj4p8P5KA-+rgww@mail.gmail.com/
      Signed-off-by: Yash Shah <yash.shah@sifive.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com>
      Tested-by: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6342ea88
    • Yash Shah's avatar
      macb: bindings doc: update sifive fu540-c000 binding · abecec41
      Yash Shah authored
      As per the discussion with Nicolas Ferre[0], rename the compatible property
      to a more appropriate and specific string.
      
      [0] https://lore.kernel.org/netdev/CAJ2_jOFEVZQat0Yprg4hem4jRrqkB72FKSeQj4p8P5KA-+rgww@mail.gmail.com/
      Signed-off-by: Yash Shah <yash.shah@sifive.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      abecec41
    • Eric Dumazet's avatar
      tcp: remove empty skb from write queue in error cases · fdfc5c85
      Eric Dumazet authored
      Vladimir Rutsky reported stuck TCP sessions after memory pressure
      events. An edge-triggered epoll() user would never receive an EPOLLOUT
      notification allowing them to retry a sendmsg().
      
      Jason tested the case of sk_stream_alloc_skb() returning NULL,
      but there are other paths that could lead both sendmsg() and sendpage()
      to return -1 (EAGAIN), with an empty skb queued on the write queue.
      
      This patch makes sure we remove this empty skb so that
      Jason's code can detect that the queue is empty, and
      call sk->sk_write_space(sk) accordingly.
      
      Fixes: ce5ec440 ("tcp: ensure epoll edge trigger wakeup when write queue is empty")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Reported-by: Vladimir Rutsky <rutsky@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fdfc5c85
    • Ka-Cheong Poon's avatar
      net/rds: Fix info leak in rds6_inc_info_copy() · 7d0a0658
      Ka-Cheong Poon authored
      The rds6_inc_info_copy() function has a couple struct members which
      are leaking stack information.  The ->tos field should hold actual
      information and the ->flags field needs to be zeroed out.
      
      Fixes: 3eb45036 ("rds: add type of service(tos) infrastructure")
      Fixes: b7ff8b10 ("rds: Extend RDS API for IPv6 support")
      Reported-by: 黄ID蝴蝶 <butterflyhuangxx@gmail.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d0a0658
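The info leak described in the rds commit above comes from copying a stack struct to the caller while some members were never written. A hedged sketch of the remedy follows; `info_model` and `fill_info` are hypothetical, and zeroing the whole struct first is an extra precaution added for this example, not something the commit message states.

```c
#include <stdint.h>
#include <string.h>

struct info_model {
    uint64_t seq;
    uint32_t len;
    uint8_t  tos;    /* must carry real data */
    uint8_t  flags;  /* must be zeroed */
};

/* Populate every member so no stale stack bytes reach the caller. */
static void fill_info(struct info_model *out, uint64_t seq, uint32_t len,
                      uint8_t tos)
{
    memset(out, 0, sizeof(*out)); /* assumption: also clears padding bytes */
    out->seq = seq;
    out->len = len;
    out->tos = tos;
    out->flags = 0;
}
```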
    • Feng Sun's avatar
      net: fix skb use after free in netpoll · 2c1644cf
      Feng Sun authored
      After commit baeababb
      ("tun: return NET_XMIT_DROP for dropped packets"),
      when tun_net_xmit drops packets, it frees the skb and returns NET_XMIT_DROP.
      netpoll_send_skb_on_dev then runs into the following use-after-free cases:
      1. retry netpoll_start_xmit with the freed skb;
      2. queue the freed skb in npinfo->txq.
      queue_process will also run into a use-after-free case.
      
      The first netpoll_send_skb_on_dev case was hit with the following kernel log:
      
      [  117.864773] kernel BUG at mm/slub.c:306!
      [  117.864773] invalid opcode: 0000 [#1] SMP PTI
      [  117.864774] CPU: 3 PID: 2627 Comm: loop_printmsg Kdump: loaded Tainted: P           OE     5.3.0-050300rc5-generic #201908182231
      [  117.864775] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [  117.864775] RIP: 0010:kmem_cache_free+0x28d/0x2b0
      [  117.864781] Call Trace:
      [  117.864781]  ? tun_net_xmit+0x21c/0x460
      [  117.864781]  kfree_skbmem+0x4e/0x60
      [  117.864782]  kfree_skb+0x3a/0xa0
      [  117.864782]  tun_net_xmit+0x21c/0x460
      [  117.864782]  netpoll_start_xmit+0x11d/0x1b0
      [  117.864788]  netpoll_send_skb_on_dev+0x1b8/0x200
      [  117.864789]  __br_forward+0x1b9/0x1e0 [bridge]
      [  117.864789]  ? skb_clone+0x53/0xd0
      [  117.864790]  ? __skb_clone+0x2e/0x120
      [  117.864790]  deliver_clone+0x37/0x50 [bridge]
      [  117.864790]  maybe_deliver+0x89/0xc0 [bridge]
      [  117.864791]  br_flood+0x6c/0x130 [bridge]
      [  117.864791]  br_dev_xmit+0x315/0x3c0 [bridge]
      [  117.864792]  netpoll_start_xmit+0x11d/0x1b0
      [  117.864792]  netpoll_send_skb_on_dev+0x1b8/0x200
      [  117.864792]  netpoll_send_udp+0x2c6/0x3e8
      [  117.864793]  write_msg+0xd9/0xf0 [netconsole]
      [  117.864793]  console_unlock+0x386/0x4e0
      [  117.864793]  vprintk_emit+0x17e/0x280
      [  117.864794]  vprintk_default+0x29/0x50
      [  117.864794]  vprintk_func+0x4c/0xbc
      [  117.864794]  printk+0x58/0x6f
      [  117.864795]  loop_fun+0x24/0x41 [printmsg_loop]
      [  117.864795]  kthread+0x104/0x140
      [  117.864795]  ? 0xffffffffc05b1000
      [  117.864796]  ? kthread_park+0x80/0x80
      [  117.864796]  ret_from_fork+0x35/0x40
      Signed-off-by: Feng Sun <loyou85@gmail.com>
      Signed-off-by: Xiaojun Zhao <xiaojunzhao141@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2c1644cf
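The use-after-free in the netpoll commit above arises when a "dropped" status is treated like "busy" and the buffer is retried or queued after the driver already freed it. The following userspace model sketches a retry loop that stops as soon as the driver has consumed the buffer. The enum, the callback type and the two sample drivers are hypothetical and only mimic the semantics described in the commit message.

```c
#include <stdbool.h>

/* Hypothetical transmit results: BUSY leaves the buffer with the caller. */
enum xmit_status { XMIT_OK, XMIT_DROP, XMIT_BUSY };

typedef enum xmit_status (*xmit_fn)(void *buf);

/* True when the driver consumed the buffer (sent it, or dropped and freed it). */
static bool xmit_complete(enum xmit_status st)
{
    return st == XMIT_OK || st == XMIT_DROP;
}

/* Try up to max_tries times; returns the number of attempts made. */
static int send_with_retry(xmit_fn xmit, void *buf, int max_tries)
{
    int tries = 0;

    while (tries < max_tries) {
        tries++;
        if (xmit_complete(xmit(buf)))
            break;  /* buffer is gone: never retry or queue it */
    }
    return tries;
}

/* Sample drivers used to exercise the loop. */
static enum xmit_status always_drop(void *buf) { (void)buf; return XMIT_DROP; }
static enum xmit_status always_busy(void *buf) { (void)buf; return XMIT_BUSY; }
```

A dropping driver is called exactly once, while a busy driver is retried until the attempt budget runs out.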
    • Vladimir Oltean's avatar
      net: dsa: tag_8021q: Future-proof the reserved fields in the custom VID · bcccb0a5
      Vladimir Oltean authored
      After witnessing the discussion in https://lkml.org/lkml/2019/8/14/151
      w.r.t. ioctl extensibility, it became clear that the 3 RSV bits inside
      the DSA 802.1Q tag might suffer the same fate and be useless for
      further extension.
      
      So clearly specify that the reserved bits should currently be
      transmitted as zero and ignored on receive. The DSA tagger already does
      this (and always has), and is the only known user so far (no
      Wireshark dissection plugin, etc). So there should be no incompatibility
      to speak of.
      
      Fixes: 0471dd42 ("net: dsa: tag_8021q: Create a stable binary format")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bcccb0a5
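The "transmit as zero, ignore on receive" rule for reserved bits in the commit above can be sketched with a 12-bit field. The bit positions below (2-bit direction, 3-bit switch id, 4-bit port, three reserved bits) are assumptions chosen for illustration and are not taken from the commit; the function names are hypothetical as well.

```c
#include <stdint.h>

/* Assumed layout: DIR[11:10] RSV[9] SWITCH[8:6] RSV[5:4] PORT[3:0] */
#define VID_DIR_SHIFT    10
#define VID_DIR_MASK     0x3u
#define VID_SWITCH_SHIFT 6
#define VID_SWITCH_MASK  0x7u
#define VID_PORT_MASK    0xFu
#define VID_RSV_MASK     0x230u  /* bits 9, 5 and 4 */

/* Encoder never sets reserved bits, so they go out as zero. */
static uint16_t vid_encode(unsigned dir, unsigned sw, unsigned port)
{
    return (uint16_t)(((dir & VID_DIR_MASK) << VID_DIR_SHIFT) |
                      ((sw & VID_SWITCH_MASK) << VID_SWITCH_SHIFT) |
                      (port & VID_PORT_MASK));
}

/* Decoders mask out only their own field, so reserved bits are ignored. */
static unsigned vid_decode_switch(uint16_t vid)
{
    return (vid >> VID_SWITCH_SHIFT) & VID_SWITCH_MASK;
}

static unsigned vid_decode_port(uint16_t vid)
{
    return vid & VID_PORT_MASK;
}
```

Because the decoders never inspect the reserved bits, a future sender that starts using them would still be parsed correctly by this receiver.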
    • Marco Hartmann's avatar
      Add genphy_c45_config_aneg() function to phy-c45.c · 94acaeb5
      Marco Hartmann authored
      Commit 34786005 ("net: phy: prevent PHYs w/o Clause 22 regs from calling
      genphy_config_aneg") introduced a check that aborts phy_config_aneg()
      if the phy is a C45 phy.
      This causes phy_state_machine() to call phy_error() so that the phy
      ends up in PHY_HALTED state.
      
      Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg()
      (analogous to the C22 case) so that the state machine can run
      correctly.
      
      genphy_c45_config_aneg() closely resembles mv3310_config_aneg()
      in drivers/net/phy/marvell10g.c, excluding vendor specific
      configurations for 1000BaseT.
      
      Fixes: 22b56e82 ("net: phy: replace genphy_10g_driver with genphy_c45_driver")
      Signed-off-by: Marco Hartmann <marco.hartmann@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      94acaeb5
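The dispatch change described in the commit above (fall back to a generic Clause 45 routine instead of returning an error) can be modelled as below. This is a simplified sketch: `phy_model`, `generic_c22_aneg`, `generic_c45_aneg` and `config_aneg` are hypothetical names, and the real kernel condition may involve more than a single `is_c45` flag.

```c
#include <stdbool.h>
#include <stddef.h>

struct phy_model;
typedef int (*aneg_fn)(struct phy_model *);

struct phy_model {
    bool is_c45;
    aneg_fn driver_config_aneg; /* optional driver-specific hook */
    int last_path;              /* records which generic routine ran */
};

static int generic_c22_aneg(struct phy_model *p) { p->last_path = 22; return 0; }
static int generic_c45_aneg(struct phy_model *p) { p->last_path = 45; return 0; }

/* Prefer the driver hook, otherwise pick the generic routine by PHY type. */
static int config_aneg(struct phy_model *p)
{
    if (p->driver_config_aneg)
        return p->driver_config_aneg(p);
    if (p->is_c45)
        return generic_c45_aneg(p); /* before the fix: an error was returned here */
    return generic_c22_aneg(p);
}
```

Returning success from the C45 branch is what lets the state machine keep running instead of halting the PHY.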
  2. 27 Aug, 2019 11 commits
    • Cong Wang's avatar
      net_sched: fix a NULL pointer deref in ipt action · 981471bd
      Cong Wang authored
      The net pointer in struct xt_tgdtor_param is not explicitly
      initialized and is therefore still NULL when it is dereferenced.
      So we have to find a way to pass the correct net pointer to
      ipt_destroy_target().
      
      The best way I found is to save the net pointer inside the per
      netns struct tcf_idrinfo, which keeps this patch smaller.
      
      Fixes: 0c66dc1e ("netfilter: conntrack: register hooks in netns when needed by ruleset")
      Reported-and-tested-by: itugrok@yahoo.com
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      981471bd
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 9e8312f5
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "Highlights include:
      
        Stable fixes:
      
         - Fix a page lock leak in nfs_pageio_resend()
      
         - Ensure O_DIRECT reports an error if the bytes read/written is 0
      
         - Don't handle errors if the bind/connect succeeded
      
         - Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was
           invalidated"
      
        Bugfixes:
      
         - Don't refresh attributes with mounted-on-file information
      
         - Fix return values for nfs4_file_open() and nfs_finish_open()
      
         - Fix pnfs layoutstats reporting of I/O errors
      
         - Don't use soft RPC calls for pNFS/flexfiles I/O, and don't abort
           for soft I/O errors when the user specifies a hard mount.
      
         - Various fixes to the error handling in sunrpc
      
         - Don't report writepage()/writepages() errors twice"
      
      * tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFS: remove set but not used variable 'mapping'
        NFSv2: Fix write regression
        NFSv2: Fix eof handling
        NFS: Fix writepage(s) error handling to not report errors twice
        NFS: Fix spurious EIO read errors
        pNFS/flexfiles: Don't time out requests on hard mounts
        SUNRPC: Handle connection breakages correctly in call_status()
        Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated"
        SUNRPC: Handle EADDRINUSE and ENOBUFS correctly
        pNFS/flexfiles: Turn off soft RPC calls
        SUNRPC: Don't handle errors if the bind/connect succeeded
        NFS: On fatal writeback errors, we need to call nfs_inode_remove_request()
        NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup
        NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0
        NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend()
        NFSv4: Fix return value in nfs_finish_open()
        NFSv4: Fix return values for nfs4_file_open()
        NFS: Don't refresh attributes with mounted-on-file information
      9e8312f5
    • Linus Torvalds's avatar
      Merge tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 6525771f
      Linus Torvalds authored
      Pull ARC updates from Vineet Gupta:
      
       - support for Edge Triggered IRQs in ARC IDU intc
      
       - other fixes here and there
      
      * tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        arc: prefer __section from compiler_attributes.h
        dt-bindings: IDU-intc: Add support for edge-triggered interrupts
        dt-bindings: IDU-intc: Clean up documentation
        ARCv2: IDU-intc: Add support for edge-triggered interrupts
        ARC: unwind: Mark expected switch fall-throughs
        ARC: [plat-hsdk]: allow to switch between AXI DMAC port configurations
        ARC: fix typo in setup_dma_ops log message
        ARCv2: entry: early return from exception need not clear U & DE bits
      6525771f
    • Linus Torvalds's avatar
      Merge tag 'mfd-fixes-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · 8d645408
      Linus Torvalds authored
      Pull MFD fix from Lee Jones:
       "Identify potentially unused functions in rk808 driver when !PM"
      
      * tag 'mfd-fixes-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
        mfd: rk808: Make PM function declaration static
        mfd: rk808: Mark pm functions __maybe_unused
      8d645408
    • Linus Torvalds's avatar
      Merge tag 'sound-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 0004654f
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of small fixes as usual:
      
         - More coverage of USB-audio descriptor sanity checks
      
         - A fix for mute LED regression on Conexant HD-audio codecs
      
         - A few device-specific fixes and quirks for USB-audio and HD-audio
      
         - A fix for (die-hard remaining) possible race in sequencer core
      
         - FireWire oxfw regression fix that was introduced in 5.3-rc1"
      
      * tag 'sound-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: oxfw: fix to handle correct stream for PCM playback
        ALSA: seq: Fix potential concurrent access to the deleted pool
        ALSA: usb-audio: Check mixer unit bitmap yet more strictly
        ALSA: line6: Fix memory leak at line6_init_pcm() error path
        ALSA: usb-audio: Fix invalid NULL check in snd_emuusb_set_samplerate()
        ALSA: hda/ca0132 - Add new SBZ quirk
        ALSA: usb-audio: Add implicit fb quirk for Behringer UFX1604
        ALSA: hda - Fixes inverted Conexant GPIO mic mute led
      0004654f
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 452a0444
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Use 32-bit index for tail calls in s390 bpf JIT, from Ilya
          Leoshkevich.
      
       2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for
          SMC from Jason Baron.
      
       3) ipv6_mc_may_pull() should return 0 for malformed packets, not
          -EINVAL. From Stefano Brivio.
      
       4) Don't forget to unpin umem xdp pages in error path of
          xdp_umem_reg(). From Ivan Khoronzhuk.
      
       5) Fix sta object leak in mac80211, from Johannes Berg.
      
       6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2
          switches. From Florian Fainelli.
      
       7) Revert DMA sync removal from r8169 which was causing regressions on
          some MIPS Loongson platforms. From Heiner Kallweit.
      
       8) Use after free in flow dissector, from Jakub Sitnicki.
      
       9) Fix NULL derefs of net devices during ICMP processing across
          collect_md tunnels, from Hangbin Liu.
      
      10) proto_register() memory leaks, from Zhang Lin.
      
      11) Set NLM_F_MULTI flag in multipart netlink messages consistently,
          from John Fastabend.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
        r8152: Set memory to all 0xFFs on failed reg reads
        openvswitch: Fix conntrack cache with timeout
        ipv4: mpls: fix mpls_xmit for iptunnel
        nexthop: Fix nexthop_num_path for blackhole nexthops
        net: rds: add service level support in rds-info
        net: route dump netlink NLM_F_MULTI flag missing
        s390/qeth: reject oversized SNMP requests
        sock: fix potential memory leak in proto_register()
        MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT
        xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode
        ipv4/icmp: fix rt dst dev null pointer dereference
        openvswitch: Fix log message in ovs conntrack
        bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0
        bpf: fix use after free in prog symbol exposure
        bpf: fix precision tracking in presence of bpf2bpf calls
        flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH
        Revert "r8169: remove not needed call to dma_sync_single_for_device"
        ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev
        net/ncsi: Fix the payload copying for the request coming from Netlink
        qed: Add cleanup in qed_slowpath_start()
        ...
      452a0444
    • YueHaibing's avatar
      NFS: remove set but not used variable 'mapping' · 99300a85
      YueHaibing authored
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      fs/nfs/write.c: In function nfs_page_async_flush:
      fs/nfs/write.c:609:24: warning: variable mapping set but not used [-Wunused-but-set-variable]
      
      It has not been used since commit aefb623c422e ("NFS: Fix
      writepage(s) error handling to not report errors twice")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      99300a85
    • Trond Myklebust's avatar
      NFSv2: Fix write regression · d33d4beb
      Trond Myklebust authored
      Ensure we update the write result count on success, since the
      RPC call itself does not do so.
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Tested-by: Jan Stancek <jstancek@redhat.com>
      d33d4beb
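      A minimal sketch of the behaviour described in the commit above,
      assuming a completion callback that sees both the request and the
      result. The types and the function demo_write_done() are hypothetical
      and are not the NFS client's real code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request arguments: how many bytes we asked to write. */
struct demo_write_args {
    size_t count;
};

/* Hypothetical result: how many bytes were reported as written. */
struct demo_write_res {
    size_t count;
};

/* Completion handler. The RPC call itself leaves res->count untouched,
 * so on success (status == 0) the count is copied from the request.
 * On failure the result is left as it was. */
static int demo_write_done(int status,
                           const struct demo_write_args *args,
                           struct demo_write_res *res)
{
    assert(args != NULL && res != NULL);
    if (status == 0)
        res->count = args->count;
    return status;
}
```

      Without the copy, a successful write would report zero bytes written
      in this sketch, which callers would be likely to treat as a failure.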
    • Trond Myklebust's avatar
      NFSv2: Fix eof handling · 71affe9b
      Trond Myklebust authored
      If we received a reply from the server with a zero length read and
      no error, then that implies we are at eof.
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      71affe9b
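      The rule stated in the commit above can be sketched as follows. The
      structure and the function demo_read_done() are hypothetical names
      chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical read result: byte count returned plus an eof flag. */
struct demo_read_res {
    size_t count;
    bool eof;
};

/* Completion handler: a reply with no error and zero bytes read is
 * taken to mean end of file. Any other reply leaves the flag as is. */
static int demo_read_done(int status, struct demo_read_res *res)
{
    assert(res != NULL);
    if (status == 0 && res->count == 0)
        res->eof = true;
    return status;
}
```

      Errors are deliberately excluded: a failed read that returns no data
      says nothing about where the file ends.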
    • Lee Jones's avatar
      mfd: rk808: Make PM function declaration static · 4d82fa67
      Lee Jones authored
      Avoids:
        ../drivers/mfd/rk808.c:771:1: warning: symbol 'rk8xx_pm_ops' \
          was not declared. Should it be static?
      
      Fixes: 5752bc43 ("mfd: rk808: Mark pm functions __maybe_unused")
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      4d82fa67
    • Arnd Bergmann's avatar
      mfd: rk808: Mark pm functions __maybe_unused · 5752bc43
      Arnd Bergmann authored
      The newly added suspend/resume functions are only used if CONFIG_PM
      is enabled:
      
      drivers/mfd/rk808.c:752:12: error: 'rk8xx_resume' defined but not used [-Werror=unused-function]
      drivers/mfd/rk808.c:732:12: error: 'rk8xx_suspend' defined but not used [-Werror=unused-function]
      
      Mark them as __maybe_unused so the compiler can silently drop them
      when they are not needed.
      
      Fixes: 586c1b41 ("mfd: rk808: Add RK817 and RK809 support")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      5752bc43
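      The pattern from the commit above can be tried outside the kernel
      with a small sketch. It assumes GCC or Clang; the macro is defined
      locally because __maybe_unused is a kernel macro, and DEMO_PM,
      demo_suspend(), demo_resume() and struct demo_pm_ops are hypothetical
      stand-ins for CONFIG_PM and the driver's real callbacks.

```c
#include <assert.h>

/* Local definition so the sketch builds in user space; the kernel
 * provides its own __maybe_unused in its compiler attribute headers. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

/* Hypothetical callback table, only populated when DEMO_PM is defined. */
struct demo_pm_ops {
    int (*suspend)(int state);
    int (*resume)(int state);
};

/* Without the attribute, building with DEMO_PM undefined and
 * -Werror=unused-function would fail, because nothing references
 * these two static functions. */
static int __maybe_unused demo_suspend(int state)
{
    return state + 1;
}

static int __maybe_unused demo_resume(int state)
{
    return state - 1;
}

#ifdef DEMO_PM
/* Only in this configuration are the callbacks actually referenced. */
static const struct demo_pm_ops demo_ops = {
    .suspend = demo_suspend,
    .resume = demo_resume,
};
#endif
```

      The attribute avoids wrapping the functions themselves in #ifdef
      blocks, so they are still compiled and type-checked in every
      configuration while the compiler stays free to discard them.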
  3. 26 Aug, 2019 6 commits