1. 06 Jun, 2020 1 commit
    • rhashtable: Drop raw RCU deref in nested_table_free · 4a3084aa
      Herbert Xu authored
      This patch replaces some unnecessary uses of rcu_dereference_raw
      in the rhashtable code with rcu_dereference_protected.
      
      The top-level nested table entry is only marked as RCU because it
      shares the same type as the tree entries underneath it.  So it
      doesn't need any RCU protection.
      
      We also don't need RCU protection when we're freeing a nested RCU
      table because by this stage we've long passed a memory barrier
      when anyone could change the nested table.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 05 Jun, 2020 12 commits
  3. 04 Jun, 2020 21 commits
    • net/xdp: use shift instead of 64 bit division · 7d877c35
      Pavel Machek authored
      64bit division is kind of expensive, and shift should do the job here.
      Signed-off-by: Pavel Machek (CIP) <pavel@denx.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
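      A minimal user-space sketch of the idea, assuming the divisor is a
      power of two so that it can be written as 1 << chunk_shift. The helper
      names and the "chunk" terminology are illustrative assumptions, not
      taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of whole chunks in `len` bytes when the
 * chunk size is a power of two, expressed as 1 << chunk_shift. */
static uint64_t chunks_by_shift(uint64_t len, unsigned int chunk_shift)
{
	assert(chunk_shift < 64);	/* shifting by >= 64 is undefined in C */
	return len >> chunk_shift;
}

/* Reference version using a 64-bit division, for comparison only. */
static uint64_t chunks_by_div(uint64_t len, uint64_t chunk_size)
{
	assert(chunk_size != 0);
	return len / chunk_size;
}
```

      Both helpers return the same value whenever chunk_size equals
      1 << chunk_shift; the shift form avoids a 64-bit divide, which is
      costly on 32-bit targets.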
    • crypto/chtls:Fix compile error when CONFIG_IPV6 is disabled · a624a865
      Vinay Kumar Yadav authored
      Fix compile errors and warnings when CONFIG_IPV6 is disabled, and
      fix inconsistent indenting.
      
      v1->v2:
      - Corrected errors/warnings reported by a newer gcc version
        (unused array).
      
      Fixes: 6abde0b2 ("crypto/chtls: IPv6 support for inline TLS")
      Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet_connection_sock: clear inet_num out of destroy helper · 6761893e
      Paolo Abeni authored
      Clearing the 'inet_num' field is necessary and safe if and
      only if the socket is not bound. The MPTCP protocol calls
      the destroy helper on bound sockets, as tcp_v{4,6}_syn_recv_sock
      completed successfully.
      
      Move the clearing of such field out of the common code, otherwise
      the MPTCP MP_JOIN error path will find the wrong 'inet_num' value
      on socket disposal, __inet_put_port() will acquire the wrong lock
      and bind_node removal could race with other modifiers possibly
      corrupting the bind hash table.
      Reported-and-tested-by: Christoph Paasch <cpaasch@apple.com>
      Fixes: 729cd643 ("mptcp: cope better with MP_JOIN failure")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • yam: fix possible memory leak in yam_init_driver · 98749b71
      Wang Hai authored
      If register_netdev(dev) fails, free_netdev(dev) needs
      to be called, otherwise a memory leak will occur.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lan743x: Use correct MAC_CR configuration for 1 GBit speed · 7cdee28c
      Roelof Berg authored
      Corrected the MAC_CR configuration bits for 1 GBit operation. The data
      sheet allows MAC_CR(2:1) to be 10 and also 11 for 1 GBit/s speed, but
      only 10 works correctly.
      
      Devices tested:
      Microchip Lan7431, fixed-phy mode
      Microchip Lan7430, normal phy mode
      
      Fixes: 6f197fb6 ("lan743x: Added fixed link and RGMII support")
      Signed-off-by: Roelof Berg <rberg@berg-solutions.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: freescale: remove unneeded include for ucc_geth · 09820ce8
      Valentin Longchamp authored
      net/sch_generic.h does not need to be included, remove it.
      Signed-off-by: Valentin Longchamp <valentin@longchamp.me>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: fix failing WoL · 12006848
      Heiner Kallweit authored
      The referenced change added an extra hw reset to rtl8169_net_suspend(),
      which makes WoL fail on a few chip versions. Therefore skip the extra
      reset if we're going down and WoL is enabled.
      In rtl_shutdown() rtl8169_hw_reset() is called by rtl8169_net_suspend()
      already if needed, therefore avoid the issue by removing the extra
      call. The fix was tested on a system with RTL8168g.
      
      Meanwhile rtl8169_hw_reset() does more than a hw reset and should be
      renamed. But that's net-next material.
      
      Fixes: 8ac8e8c6 ("r8169: make rtl8169_down central chip quiesce function")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: dwmac: Fix an error code in imx_dwmac_probe() · f6c1fb0a
      Dan Carpenter authored
      The code returns PTR_ERR(NULL), which is zero, i.e. success.  We should
      return -ENOMEM instead.
      
      Fixes: 94abdad6 ("net: ethernet: dwmac: add ethernet glue logic for NXP imx8 chip")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Fugang Duan <fugang.duan@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
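      The pitfall can be reproduced in user space. The sketch below uses a
      simplified stand-in for the kernel's PTR_ERR(); ptr_err(), struct priv
      and both probe functions are hypothetical names for illustration, not
      the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's PTR_ERR(): the pointer value
 * reinterpreted as a long. */
static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

struct priv {
	int dummy;
};

/* Buggy shape: a failed allocation yields NULL, and PTR_ERR(NULL) is 0,
 * so the caller is told that the probe succeeded. */
static int probe_buggy(struct priv *p)
{
	if (!p)
		return (int)ptr_err(p);
	return 0;
}

/* Fixed shape: report the allocation failure explicitly. */
static int probe_fixed(struct priv *p)
{
	if (!p)
		return -ENOMEM;
	return 0;
}

/* Exercise both variants with a failed (NULL) allocation. */
void compare_probes(void)
{
	assert(probe_buggy(NULL) == 0);		/* failure reported as success */
	assert(probe_fixed(NULL) == -ENOMEM);	/* failure reported properly */
}
```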
    • net: mdiobus: Disable preemption upon u64_stats update · c7e261d8
      Ahmed S. Darwish authored
      The u64_stats mechanism uses sequence counters to protect against 64-bit
      values tearing on 32-bit architectures. Updating u64_stats is thus a
      sequence counter write side critical section where preemption must be
      disabled.
      
      For mdiobus_stats_acct(), disable preemption upon the u64_stats update.
      It is called from process context through mdiobus_read() and
      mdiobus_write().
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
      Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • u64_stats: Document writer non-preemptibility requirement · 6501bf87
      Ahmed S. Darwish authored
      The u64_stats mechanism uses sequence counters to protect against 64-bit
      values tearing on 32-bit architectures. Updating such statistics is a
      sequence counter write side critical section.
      
      Preemption must be disabled before entering this seqcount write critical
      section.  Failing to do so, the seqcount read side can preempt the write
      side section and spin for the entire scheduler tick.  If that reader
      belongs to a real-time scheduling class, it can spin forever and the
      kernel will livelock.
      
      Document this statistics update side non-preemptibility requirement.
      
      Reword the introductory paragraph to highlight u64_stats raison d'être:
      64-bit values tearing protection on 32-bit architectures. Divide
      documentation on a basis of internal design vs. usage constraints.
      
      Reword the u64_stats header file top comment to always mention "Reader"
      or "Writer" at the start of each bullet point, making it easier to
      follow which side each point is actually for.
      
      Clarify the statement "whole thing is a NOOP on 64bit arches or UP
      kernels".  For 32-bit UP kernels, preemption is always disabled for the
      statistics read side section.
      Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
      Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
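      The reader/writer protocol described above can be sketched in user
      space with C11 atomics. This is an illustration of the sequence counter
      idea only, not the kernel's u64_stats implementation; struct stat64 and
      its helpers are hypothetical names, and a single writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a seqcount-protected 64-bit statistic that a
 * 32-bit CPU has to store as two separate halves. */
struct stat64 {
	atomic_uint seq;	/* odd while a write is in progress */
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

static void stat64_init(struct stat64 *s)
{
	atomic_init(&s->seq, 0);
	atomic_init(&s->lo, 0);
	atomic_init(&s->hi, 0);
}

/* Writer side. In the kernel, preemption must already be disabled here:
 * a reader that preempts the writer between the two increments would
 * keep seeing an odd sequence number and spin. */
static void stat64_add(struct stat64 *s, uint64_t delta)
{
	uint64_t v;

	assert((atomic_load(&s->seq) & 1) == 0);	/* single writer assumed */

	atomic_fetch_add(&s->seq, 1);			/* seq becomes odd */
	v = (((uint64_t)atomic_load(&s->hi) << 32) | atomic_load(&s->lo)) + delta;
	atomic_store(&s->lo, (uint32_t)v);
	atomic_store(&s->hi, (uint32_t)(v >> 32));
	atomic_fetch_add(&s->seq, 1);			/* seq becomes even again */
}

/* Reader side: wait out any write in progress, snapshot both halves, and
 * retry if the sequence number changed in the meantime. */
static uint64_t stat64_read(struct stat64 *s)
{
	unsigned int start;
	uint64_t v;

	do {
		while ((start = atomic_load(&s->seq)) & 1)
			;
		v = ((uint64_t)atomic_load(&s->hi) << 32) | atomic_load(&s->lo);
	} while (atomic_load(&s->seq) != start);

	return v;
}
```

      The inner spin in stat64_read() is the loop that cannot make progress
      if the writer has been preempted in the middle of its update.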
    • net: phy: fixed_phy: Remove unused seqcount · 79cbb6bc
      Ahmed S. Darwish authored
      Commit bf7afb29 ("phy: improve safety of fixed-phy MII register
      reading") protected the fixed PHY status with a sequence counter.
      
      Two years later, commit d2b97793 ("net: phy: fixed-phy: remove
      fixed_phy_update_state()") removed the sequence counter's write side
      critical section -- neutralizing its read side retry loop.
      
      Remove the unused seqcount.
      Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
      Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: core: device_rename: Use rwsem instead of a seqcount · 11d6011c
      Ahmed S. Darwish authored
      Sequence counters write paths are critical sections that must never be
      preempted, and blocking, even for CONFIG_PREEMPTION=n, is not allowed.
      
      Commit 5dbe7c17 ("net: fix kernel deadlock with interface rename and
      netdev name retrieval.") handled a deadlock, observed with
      CONFIG_PREEMPTION=n, where the devnet_rename seqcount read side was
      infinitely spinning: it got scheduled after the seqcount write side
      blocked inside its own critical section.
      
      To fix that deadlock, among other issues, the commit added a
      cond_resched() inside the read side section. While this will get the
      non-preemptible kernel eventually unstuck, the seqcount reader is fully
      exhausting its slice just spinning -- until TIF_NEED_RESCHED is set.
      
      The fix is also still broken: if the seqcount reader belongs to a
      real-time scheduling policy, it can spin forever and the kernel will
      livelock.
      
      Disabling preemption over the seqcount write side critical section will
      not work: inside it are a number of GFP_KERNEL allocations and mutex
      locking through the drivers/base/ :: device_rename() call chain.
      
      From all the above, replace the seqcount with a rwsem.
      
      Fixes: 5dbe7c17 (net: fix kernel deadlock with interface rename and netdev name retrieval.)
      Fixes: 30e6c9fa (net: devnet_rename_seq should be a seqcount)
      Fixes: c91f6df2 (sockopt: Change getsockopt() of SO_BINDTODEVICE to return an interface name)
      Cc: <stable@vger.kernel.org>
      Reported-by: kbuild test robot <lkp@intel.com> [ v1 missing up_read() on error exit ]
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com> [ v1 missing up_read() on error exit ]
      Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
      Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: qca8k: Fix "Unexpected gfp" kernel exception · 67122a79
      Michal Vokáč authored
      Commit 7e99e347 ("net: dsa: remove dsa_switch_alloc helper")
      replaced the dsa_switch_alloc helper by devm_kzalloc in all DSA
      drivers. Unfortunately it introduced a typo in the qca8k.c driver, so
      the wrong argument is passed to the devm_kzalloc function.
      
      This fix mitigates the following kernel exception:
      
        Unexpected gfp: 0x6 (__GFP_HIGHMEM|GFP_DMA32). Fixing up to gfp: 0x101 (GFP_DMA|__GFP_ZERO). Fix your code!
        CPU: 1 PID: 44 Comm: kworker/1:1 Not tainted 5.5.9-yocto-ua #1
        Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
        Workqueue: events deferred_probe_work_func
        [<c0014924>] (unwind_backtrace) from [<c00123bc>] (show_stack+0x10/0x14)
        [<c00123bc>] (show_stack) from [<c04c8fb4>] (dump_stack+0x90/0xa4)
        [<c04c8fb4>] (dump_stack) from [<c00e1b10>] (new_slab+0x20c/0x214)
        [<c00e1b10>] (new_slab) from [<c00e1cd0>] (___slab_alloc.constprop.0+0x1b8/0x540)
        [<c00e1cd0>] (___slab_alloc.constprop.0) from [<c00e2074>] (__slab_alloc.constprop.0+0x1c/0x24)
        [<c00e2074>] (__slab_alloc.constprop.0) from [<c00e4538>] (__kmalloc_track_caller+0x1b0/0x298)
        [<c00e4538>] (__kmalloc_track_caller) from [<c02cccac>] (devm_kmalloc+0x24/0x70)
        [<c02cccac>] (devm_kmalloc) from [<c030d888>] (qca8k_sw_probe+0x94/0x1ac)
        [<c030d888>] (qca8k_sw_probe) from [<c0304788>] (mdio_probe+0x30/0x54)
        [<c0304788>] (mdio_probe) from [<c02c93bc>] (really_probe+0x1e0/0x348)
        [<c02c93bc>] (really_probe) from [<c02c9884>] (driver_probe_device+0x60/0x16c)
        [<c02c9884>] (driver_probe_device) from [<c02c7fb0>] (bus_for_each_drv+0x70/0x94)
        [<c02c7fb0>] (bus_for_each_drv) from [<c02c9708>] (__device_attach+0xb4/0x11c)
        [<c02c9708>] (__device_attach) from [<c02c8148>] (bus_probe_device+0x84/0x8c)
        [<c02c8148>] (bus_probe_device) from [<c02c8cec>] (deferred_probe_work_func+0x64/0x90)
        [<c02c8cec>] (deferred_probe_work_func) from [<c0033c14>] (process_one_work+0x1d4/0x41c)
        [<c0033c14>] (process_one_work) from [<c00340a4>] (worker_thread+0x248/0x528)
        [<c00340a4>] (worker_thread) from [<c0039148>] (kthread+0x124/0x150)
        [<c0039148>] (kthread) from [<c00090d8>] (ret_from_fork+0x14/0x3c)
        Exception stack(0xee1b5fb0 to 0xee1b5ff8)
        5fa0:                                     00000000 00000000 00000000 00000000
        5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
        qca8k 2188000.ethernet-1:0a: Using legacy PHYLIB callbacks. Please migrate to PHYLINK!
        qca8k 2188000.ethernet-1:0a eth2 (uninitialized): PHY [2188000.ethernet-1:01] driver [Generic PHY]
        qca8k 2188000.ethernet-1:0a eth1 (uninitialized): PHY [2188000.ethernet-1:02] driver [Generic PHY]
      
      Fixes: 7e99e347 ("net: dsa: remove dsa_switch_alloc helper")
      Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • geneve: change from tx_error to tx_dropped on missing metadata · 9d149045
      Jiri Benc authored
      If the geneve interface is in collect_md (external) mode, it can't send any
      packets submitted directly to its net interface, as such packets won't have
      metadata attached. This is expected.
      
      However, the kernel itself sends some packets to the interface, most
      notably, IPv6 DAD, IPv6 multicast listener reports, etc. This is not wrong,
      as tunnel metadata can be specified in routing table (although technically,
      that has never worked for IPv6, but hopefully will be fixed eventually) and
      then the interface must correctly participate in IPv6 housekeeping.
      
      The problem is that any such attempt increases the tx_error counter. Just
      bringing up a geneve interface with IPv6 enabled is enough to see a number
      of tx_errors. That causes confusion among users, prompting them to find
      a network error where there is none.
      
      Change the counter used to tx_dropped. That better conveys the meaning
      (there's nothing wrong going on, just some packets are getting dropped) and
      hopefully will make admins panic less.
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ena-xdp-fixes' · a9a7d129
      David S. Miller authored
      Sameeh Jubran says:
      
      ====================
      Fix xdp in ena driver
      
      This patchset includes 2 XDP related bug fixes
      
      Difference from v1:
      * Fixed "Fixes" tag
      ====================
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: xdp: update napi budget for DROP and ABORTED · 3921a81c
      Sameeh Jubran authored
      This patch fixes two issues with XDP:
      
      1. If the XDP verdict is XDP_ABORTED we break the loop, which results in
         us handling one buffer per napi cycle instead of the total budget
         (usually 64). To overcome this simply change the xdp_verdict check to
         != XDP_PASS. When the verdict is XDP_PASS, the skb is not expected to
         be NULL.
      
      2. Update the residual budget for XDP_DROP and XDP_ABORTED, since
         packets are handled in these cases.
      
      Fixes: 548c4940 ("net: ena: Implement XDP_TX action")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
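      The two points above can be sketched as a simplified polling loop. This
      is a user-space illustration under stated assumptions, not the ena
      driver's receive path: enum verdict, struct poll_result and poll_rx()
      are hypothetical names, and each array element stands for the verdict
      of one received buffer.

```c
#include <assert.h>

/* Hypothetical stand-ins for the XDP verdicts named in the commit. */
enum verdict { V_PASS, V_TX, V_DROP, V_ABORTED };

struct poll_result {
	int work_done;	/* buffers charged against the budget */
	int passed_up;	/* buffers handed to the network stack */
};

/* Process up to `budget` buffers. Any verdict other than V_PASS means the
 * buffer was consumed by XDP: it still counts as work done, and the loop
 * continues instead of stopping after one buffer. */
static struct poll_result poll_rx(const enum verdict *verdicts, int n, int budget)
{
	struct poll_result r = { 0, 0 };
	int i;

	assert(budget >= 0);

	for (i = 0; i < n && r.work_done < budget; i++) {
		r.work_done++;
		if (verdicts[i] != V_PASS)
			continue;
		r.passed_up++;
	}
	return r;
}
```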
    • net: ena: xdp: XDP_TX: fix memory leak · cd07eccc
      Sameeh Jubran authored
      When sending at a very high packet rate, the XDP tx queues can get full and
      start dropping packets. In this case we don't free the pages, which
      results in the ena driver draining system memory.
      
      Fix:
      Simply free the pages when necessary.
      
      Fixes: 548c4940 ("net: ena: Implement XDP_TX action")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
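      A minimal user-space sketch of the ownership rule behind the fix: when
      the queue refuses a buffer, whoever still owns it has to release it.
      The queue type, its size and the function names are hypothetical, and
      malloc()/free() stand in for page allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TXQ_SIZE 2	/* hypothetical, tiny on purpose */

struct txq {
	void *slots[TXQ_SIZE];
	int used;
};

/* Try to queue a buffer. On failure, ownership stays with the caller. */
static bool txq_push(struct txq *q, void *buf)
{
	assert(q->used >= 0 && q->used <= TXQ_SIZE);

	if (q->used == TXQ_SIZE)
		return false;
	q->slots[q->used++] = buf;
	return true;
}

/* Transmit or drop. A buffer rejected by a full queue is freed here
 * rather than leaked, and the drop is counted. */
static bool xmit_or_free(struct txq *q, void *buf, int *dropped)
{
	if (txq_push(q, buf))
		return true;

	free(buf);
	(*dropped)++;
	return false;
}
```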
    • seg6: fix seg6_validate_srh() to avoid slab-out-of-bounds · bb986a50
      Ahmed Abdelsalam authored
      The seg6_validate_srh() is used to validate SRH for three cases:
      
      case1: SRH of data-plane SRv6 packets to be processed by the Linux kernel.
      case2: SRH of the netlink message received from user-space (iproute2)
      case3: SRH injected into packets through setsockopt
      
      In case1, the SRH can be encoded in the Reduced way (i.e., first SID is
      carried in DA only and not represented as SID in the SRH) and the
      seg6_validate_srh() now handles this case correctly.
      
      In case2 and case3, the SRH shouldn’t be encoded in the Reduced way
      otherwise we lose the first segment (i.e., the first hop).
      
      The current implementation of seg6_validate_srh() allows SRH of case2
      and case3 to be encoded in the Reduced way. This leads to a
      slab-out-of-bounds problem.
      
      This patch verifies SRH of case1, case2 and case3, allowing case1 to be
      reduced while preventing SRH of case2 and case3 from being reduced.
      
      Reported-by: syzbot+e8c028b62439eac42073@syzkaller.appspotmail.com
      Reported-by: YueHaibing <yuehaibing@huawei.com>
      Fixes: 0cb7498f ("seg6: fix SRH processing to comply with RFC8754")
      Signed-off-by: Ahmed Abdelsalam <ahabdels@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix NULL pointer dereference in streaming · 5e9eeccc
      Tuong Lien authored
      syzbot found the following crash:
      
      general protection fault, probably for non-canonical address 0xdffffc0000000019: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x00000000000000c8-0x00000000000000cf]
      CPU: 1 PID: 7060 Comm: syz-executor394 Not tainted 5.7.0-rc6-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__tipc_sendstream+0xbde/0x11f0 net/tipc/socket.c:1591
      Code: 00 00 00 00 48 39 5c 24 28 48 0f 44 d8 e8 fa 3e db f9 48 b8 00 00 00 00 00 fc ff df 48 8d bb c8 00 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 e2 04 00 00 48 8b 9b c8 00 00 00 48 b8 00 00 00
      RSP: 0018:ffffc90003ef7818 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8797fd9d
      RDX: 0000000000000019 RSI: ffffffff8797fde6 RDI: 00000000000000c8
      RBP: ffff888099848040 R08: ffff88809a5f6440 R09: fffffbfff1860b4c
      R10: ffffffff8c305a5f R11: fffffbfff1860b4b R12: ffff88809984857e
      R13: 0000000000000000 R14: ffff888086aa4000 R15: 0000000000000000
      FS:  00000000009b4880(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000140 CR3: 00000000a7fdf000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       tipc_sendstream+0x4c/0x70 net/tipc/socket.c:1533
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:672
       ____sys_sendmsg+0x32f/0x810 net/socket.c:2352
       ___sys_sendmsg+0x100/0x170 net/socket.c:2406
       __sys_sendmmsg+0x195/0x480 net/socket.c:2496
       __do_sys_sendmmsg net/socket.c:2525 [inline]
       __se_sys_sendmmsg net/socket.c:2522 [inline]
       __x64_sys_sendmmsg+0x99/0x100 net/socket.c:2522
       do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      RIP: 0033:0x440199
      ...
      
      This bug was bisected to commit 0a3e060f ("tipc: add test for Nagle
      algorithm effectiveness"). However, that commit is not the cause. The
      trouble comes from the base code: when sending a message with zero data
      length, we would unexpectedly end up with an empty 'txq' queue after
      'tipc_msg_append()' in Nagle mode.
      
      A similar crash can be generated even without the bisected patch but at
      the link layer when it accesses the empty queue.
      
      We solve the issues by building at least one buffer to go with socket's
      header and an optional data section that may be empty like what we had
      with the 'tipc_msg_build()'.
      
      Note: the previous commit 4c21daae ("tipc: Fix NULL pointer
      dereference in __tipc_sendstream()") is obsoleted by this one since the
      'txq' will be never empty and the check of 'skb != NULL' is unnecessary
      but it is safe anyway.
      
      Reported-by: syzbot+8eac6d030e7807c21d32@syzkaller.appspotmail.com
      Fixes: c0bceb97 ("tipc: add smart nagle feature")
      Acked-by: Jon Maloy <jmaloy@redhat.com>
      Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • genetlink: fix memory leaks in genl_family_rcv_msg_dumpit() · c36f0555
      Cong Wang authored
      There are two kinds of memory leaks in genl_family_rcv_msg_dumpit():
      
      1. Before we call ops->start(), whenever an error happens, we forget
         to free the memory allocated in genl_family_rcv_msg_dumpit().
      
      2. When ops->start() fails, the 'info' has been already installed on
         the per socket control block, so we should not free it here. More
         importantly, nlk->cb_running is still false at this point, so
         netlink_sock_destruct() cannot free it either.
      
      The first kind of memory leaks is easier to resolve, but the second
      one requires some deeper thoughts.
      
      After reviewing how netfilter handles this, the most elegant solution
      I find is just to use a similar way to allocate the memory, that is,
      moving memory allocations from caller into ops->start(). With this,
      we can solve both kinds of memory leaks: for 1), no memory allocation
      happens before ops->start(); for 2), ops->start() handles its own
      failures and 'info' is installed to the socket control block only
      on success. The only ugliness here is we have to pass all local
      variables on stack via a struct, but this is not hard to understand.
      
      Alternatively, we can introduce a ops->free() to solve this too,
      but it is overkill as only genetlink has this problem so far.
      
      Fixes: 1927f41a ("net: genetlink: introduce dump info struct to be available during dumpit op")
      Reported-by: syzbot+21f04f481f449c8db840@syzkaller.appspotmail.com
      Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Cc: YueHaibing <yuehaibing@huawei.com>
      Cc: Shaochun Chen <cscnull@gmail.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
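      The chosen approach can be sketched in user space as follows. This is
      an illustration of the ownership pattern only, not the genetlink code:
      struct ctrl_block, struct start_ctx, dump_start() and dump_done() are
      hypothetical names, and the validation_fails flag simply simulates an
      error inside the start callback.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-socket control block. */
struct ctrl_block {
	void *info;
};

/* The caller's local variables, bundled in a struct so that the start
 * callback can see them. */
struct start_ctx {
	size_t info_size;
	bool validation_fails;	/* simulates an error inside start() */
};

/* The allocation happens inside start(), so nothing is allocated before
 * it runs, and start() cleans up after its own failures. */
static int dump_start(struct ctrl_block *cb, const struct start_ctx *ctx)
{
	void *info;

	assert(cb != NULL && ctx != NULL);

	info = calloc(1, ctx->info_size);
	if (!info)
		return -ENOMEM;

	if (ctx->validation_fails) {
		free(info);
		return -EINVAL;
	}

	cb->info = info;	/* installed only on success */
	return 0;
}

/* Counterpart that releases what a successful start() installed. */
static void dump_done(struct ctrl_block *cb)
{
	free(cb->info);
	cb->info = NULL;
}
```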
    • crypto/chcr: error seen if CONFIG_CHELSIO_TLS_DEVICE isn't set · ef1c7559
      Rohit Maheshwari authored
      cxgb4_uld_in_use() is used only by cxgb4_ktls_det_feature() which
      is under CONFIG_CHELSIO_TLS_DEVICE macro.
      
      Fixes: a3ac249a ("cxgb4/chcr: Enable ktls settings at run time")
      Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 03 Jun, 2020 6 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · cb8e59cc
      Linus Torvalds authored
      Pull networking updates from David Miller:
      
       1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
          Augusto von Dentz.
      
       2) Add GSO partial support to igc, from Sasha Neftin.
      
       3) Several cleanups and improvements to r8169 from Heiner Kallweit.
      
       4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
          device self-test. From Andrew Lunn.
      
       5) Start moving away from custom driver versions, use the globally
          defined kernel version instead, from Leon Romanovsky.
      
       6) Support GRO via gro_cells in DSA layer, from Alexander Lobakin.
      
       7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.
      
       8) Add sriov and vf support to hinic, from Luo bin.
      
       9) Support Media Redundancy Protocol (MRP) in the bridging code, from
          Horatiu Vultur.
      
      10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.
      
      11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
          Dubroca. Also add ipv6 support for espintcp.
      
      12) Lots of ReST conversions of the networking documentation, from Mauro
          Carvalho Chehab.
      
      13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
          from Doug Berger.
      
      14) Allow to dump cgroup id and filter by it in inet_diag code, from
          Dmitry Yakunin.
      
      15) Add infrastructure to export netlink attribute policies to
          userspace, from Johannes Berg.
      
      16) Several optimizations to sch_fq scheduler, from Eric Dumazet.
      
      17) Fallback to the default qdisc if qdisc init fails because otherwise
          a packet scheduler init failure will make a device inoperative. From
          Jesper Dangaard Brouer.
      
      18) Several RISCV bpf jit optimizations, from Luke Nelson.
      
      19) Correct the return type of the ->ndo_start_xmit() method in several
          drivers, it's netdev_tx_t but many drivers were using
          'int'. From Yunjian Wang.
      
      20) Add an ethtool interface for PHY master/slave config, from Oleksij
          Rempel.
      
      21) Add BPF iterators, from Yonghang Song.
      
      22) Add cable test infrastructure, including ethtool interfaces, from
          Andrew Lunn. Marvell PHY driver is the first to support this
          facility.
      
      23) Remove zero-length arrays all over, from Gustavo A. R. Silva.
      
      24) Calculate and maintain an explicit frame size in XDP, from Jesper
          Dangaard Brouer.
      
      25) Add CAP_BPF, from Alexei Starovoitov.
      
      26) Support terse dumps in the packet scheduler, from Vlad Buslov.
      
      27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.
      
      28) Add devm_register_netdev(), from Bartosz Golaszewski.
      
      29) Minimize qdisc resets, from Cong Wang.
      
      30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
          eliminate set_fs/get_fs calls. From Christoph Hellwig.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
        selftests: net: ip_defrag: ignore EPERM
        net_failover: fixed rollback in net_failover_open()
        Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
        Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
        vmxnet3: allow rx flow hash ops only when rss is enabled
        hinic: add set_channels ethtool_ops support
        selftests/bpf: Add a default $(CXX) value
        tools/bpf: Don't use $(COMPILE.c)
        bpf, selftests: Use bpf_probe_read_kernel
        s390/bpf: Use bcr 0,%0 as tail call nop filler
        s390/bpf: Maintain 8-byte stack alignment
        selftests/bpf: Fix verifier test
        selftests/bpf: Fix sample_cnt shared between two threads
        bpf, selftests: Adapt cls_redirect to call csum_level helper
        bpf: Add csum_level helper for fixing up csum levels
        bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
        sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
        crypto/chtls: IPv6 support for inline TLS
        Crypto/chcr: Fixes a coccinile check error
        Crypto/chcr: Fixes compilations warnings
        ...
    • Merge branch 'uaccess.comedi' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 2e63f6ce
      Linus Torvalds authored
      Pull comedi uaccess cleanups from Al Viro:
       "Comedi compat ioctls done saner - killing the single biggest pile of
        __get_user/__put_user outside of arch/* in the process"
      
      * 'uaccess.comedi' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        comedi: get rid of compat_alloc_user_space() mess in COMEDI_CMD{,TEST} compat
        comedi: do_cmd_ioctl(): lift copyin/copyout into the caller
        comedi: do_cmdtest_ioctl(): lift copyin/copyout into the caller
        comedi: lift copy_from_user() into callers of __comedi_get_user_cmd()
        comedi: get rid of compat_alloc_user_space() mess in COMEDI_INSNLIST compat
        comedi: get rid of compat_alloc_user_space() mess in COMEDI_INSN compat
        comedi: get rid of compat_alloc_user_space() mess in COMEDI_RANGEINFO compat
        comedi: get rid of compat_alloc_user_space() mess in COMEDI_CHANINFO compat
        comedi: get rid of indirection via translated_ioctl()
        comedi: move compat ioctl handling to native fops
    • Merge branch 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · ae03c53d
      Linus Torvalds authored
      Pull splice updates from Al Viro:
       "Christoph's assorted splice cleanups"
      
      * 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs: rename pipe_buf ->steal to ->try_steal
        fs: make the pipe_buf_operations ->confirm operation optional
        fs: make the pipe_buf_operations ->steal operation optional
        trace: remove tracing_pipe_buf_ops
        pipe: merge anon_pipe_buf*_ops
        fs: simplify do_splice_from
        fs: simplify do_splice_to
      ae03c53d
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 039aeb9d
      Linus Torvalds authored
      Pull kvm updates from Paolo Bonzini:
       "ARM:
         - Move the arch-specific code into arch/arm64/kvm
      
         - Start the post-32bit cleanup
      
         - Cherry-pick a few non-invasive pre-NV patches
      
        x86:
         - Rework of TLB flushing
      
         - Rework of event injection, especially with respect to nested
           virtualization
      
         - Nested AMD event injection facelift, building on the rework of
           generic code and fixing a lot of corner cases
      
         - Nested AMD live migration support
      
         - Optimization for TSC deadline MSR writes and IPIs
      
         - Various cleanups
      
         - Asynchronous page fault cleanups (from tglx, common topic branch
           with tip tree)
      
         - Interrupt-based delivery of asynchronous "page ready" events (host
           side)
      
         - Hyper-V MSRs and hypercalls for guest debugging
      
         - VMX preemption timer fixes
      
        s390:
         - Cleanups
      
        Generic:
         - switch vCPU thread wakeup from swait to rcuwait
      
        The other architectures, and the guest side of the asynchronous page
        fault work, will come next week"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (256 commits)
        KVM: selftests: fix rdtsc() for vmx_tsc_adjust_test
        KVM: check userspace_addr for all memslots
        KVM: selftests: update hyperv_cpuid with SynDBG tests
        x86/kvm/hyper-v: Add support for synthetic debugger via hypercalls
        x86/kvm/hyper-v: enable hypercalls regardless of hypercall page
        x86/kvm/hyper-v: Add support for synthetic debugger interface
        x86/hyper-v: Add synthetic debugger definitions
        KVM: selftests: VMX preemption timer migration test
        KVM: nVMX: Fix VMX preemption timer migration
        x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit
        KVM: x86/pmu: Support full width counting
        KVM: x86/pmu: Tweak kvm_pmu_get_msr to pass 'struct msr_data' in
        KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT
        KVM: x86: acknowledgment mechanism for async pf page ready notifications
        KVM: x86: interrupt based APF 'page ready' event delivery
        KVM: introduce kvm_read_guest_offset_cached()
        KVM: rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present()
        KVM: x86: extend struct kvm_vcpu_pv_apf_data with token info
        Revert "KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously"
        KVM: VMX: Replace zero-length array with flexible-array
        ...
      039aeb9d
    • Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux · 6b2591c2
      Linus Torvalds authored
      Pull hyper-v updates from Wei Liu:
      
       - a series from Andrea to support channel reassignment
      
       - a series from Vitaly to clean up Vmbus message handling
      
       - a series from Michael to clean up and augment hyperv-tlfs.h
      
       - patches from Andy to clean up GUID usage in Hyper-V code
      
       - a few other misc patches
      
      * tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (29 commits)
        Drivers: hv: vmbus: Resolve more races involving init_vp_index()
        Drivers: hv: vmbus: Resolve race between init_vp_index() and CPU hotplug
        vmbus: Replace zero-length array with flexible-array
        Driver: hv: vmbus: drop a no long applicable comment
        hyper-v: Switch to use UUID types directly
        hyper-v: Replace open-coded variant of %*phN specifier
        hyper-v: Supply GUID pointer to printf() like functions
        hyper-v: Use UUID API for exporting the GUID (part 2)
        asm-generic/hyperv: Add definitions for Get/SetVpRegister hypercalls
        x86/hyperv: Split hyperv-tlfs.h into arch dependent and independent files
        x86/hyperv: Remove HV_PROCESSOR_POWER_STATE #defines
        KVM: x86: hyperv: Remove duplicate definitions of Reference TSC Page
        drivers: hv: remove redundant assignment to pointer primary_channel
        scsi: storvsc: Re-init stor_chns when a channel interrupt is re-assigned
        Drivers: hv: vmbus: Introduce the CHANNELMSG_MODIFYCHANNEL message type
        Drivers: hv: vmbus: Synchronize init_vp_index() vs. CPU hotplug
        Drivers: hv: vmbus: Remove the unused HV_LOCALIZED channel affinity logic
        PCI: hv: Prepare hv_compose_msi_msg() for the VMBus-channel-interrupt-to-vCPU reassignment functionality
        Drivers: hv: vmbus: Use a spin lock for synchronizing channel scheduling vs. channel removal
        hv_utils: Always execute the fcopy and vss callbacks in a tasklet
        ...
      6b2591c2
    • Merge tag 'kgdb-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux · f1e45535
      Linus Torvalds authored
      Pull kgdb updates from Daniel Thompson:
       "By far the biggest change in this cycle are the changes that allow
        much earlier debug of systems that are hooked up via UART by taking
        advantage of the earlycon framework to implement the kgdb I/O hooks
        before handing over to the regular polling I/O drivers once they are
        available. When discussing Doug's work we also found and fixed a
        broken raw_smp_processor_id() sequence in in_dbg_master().
      
        Also included are a collection of much smaller fixes and tweaks: a
        couple of tweaks to get rid of doc gen or coccicheck warnings, future
        proof some internal calculations that made implicit power-of-2
        assumptions and eliminate some rather weird handling of magic
        environment variables in kdb"
      
      * tag 'kgdb-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
        kdb: Remove the misfeature 'KDBFLAGS'
        kdb: Cleanup math with KDB_CMD_HISTORY_COUNT
        serial: amba-pl011: Support kgdboc_earlycon
        serial: 8250_early: Support kgdboc_earlycon
        serial: qcom_geni_serial: Support kgdboc_earlycon
        serial: kgdboc: Allow earlycon initialization to be deferred
        Documentation: kgdboc: Document new kgdboc_earlycon parameter
        kgdb: Don't call the deinit under spinlock
        kgdboc: Disable all the early code when kgdboc is a module
        kgdboc: Add kgdboc_earlycon to support early kgdb using boot consoles
        kgdboc: Remove useless #ifdef CONFIG_KGDB_SERIAL_CONSOLE in kgdboc
        kgdb: Prevent infinite recursive entries to the debugger
        kgdb: Delay "kgdbwait" to dbg_late_init() by default
        kgdboc: Use a platform device to handle tty drivers showing up late
        Revert "kgdboc: disable the console lock when in kgdb"
        kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
        kgdb: Return true in kgdb_nmi_poll_knock()
        kgdb: Drop malformed kernel doc comment
        kgdb: Fix spurious true from in_dbg_master()
      f1e45535