1. 01 Sep, 2022 4 commits
    • rxrpc: Fix calc of resend age · 214a9dc7
      David Howells authored
      Fix the calculation of the resend age to add a microsecond value as
      microseconds, not nanoseconds.
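
The unit mix-up described above can be illustrated in userspace. This is a minimal sketch, not the rxrpc code: ns_time_t and ns_add_us() are hypothetical names, and timestamps are assumed to be signed 64-bit nanosecond counts. Kernel ktime helpers such as ktime_add_us() perform the same scaling.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000LL

/* Assumption for this sketch: timestamps are nanosecond counts. */
typedef int64_t ns_time_t;

/* Hypothetical helper: add a delta given in microseconds to a nanosecond
 * timestamp. The delta must be scaled first; adding it unscaled would
 * treat microseconds as nanoseconds. */
static ns_time_t ns_add_us(ns_time_t t_ns, int64_t delta_us)
{
	assert(delta_us >= 0);
	return t_ns + delta_us * NSEC_PER_USEC;
}
```

With a 1,000,000 ns timestamp and a 5 us delta the helper returns 1,005,000 ns; adding the raw microsecond count would have given 1,000,005 ns.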
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Fix local destruction being repeated · d3d86303
      David Howells authored
      If the local processor work item for the rxrpc local endpoint gets requeued
      by an event (such as an incoming packet) between it getting scheduled for
      destruction and the UDP socket being closed, the rxrpc_local_destroyer()
      function can get run twice.  The second time it can hang because it can end
      up waiting for cleanup events that will never happen.
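
One common way to make a teardown path safe against being run twice is a one-shot flag. The sketch below is a userspace illustration under that assumption, not the change made by this commit; struct endpoint and endpoint_destroy() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical endpoint; not the rxrpc local endpoint structure. */
struct endpoint {
	atomic_bool dead;   /* set once teardown has started */
	int teardown_runs;  /* counts how often teardown actually ran */
};

/* Returns true if this call performed the teardown, false if an earlier
 * call already did. atomic_exchange() returns the previous value, so only
 * the first caller observes false and proceeds. */
static bool endpoint_destroy(struct endpoint *ep)
{
	assert(ep != NULL);
	if (atomic_exchange(&ep->dead, true))
		return false;
	ep->teardown_runs++;
	return true;
}
```

A requeued work item that calls endpoint_destroy() a second time returns immediately instead of waiting for cleanup events that have already been consumed.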
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Fix an insufficiently large sglist in rxkad_verify_packet_2() · 0d40f728
      David Howells authored
      rxkad_verify_packet_2() has a small stack-allocated sglist of 4 elements,
      but if that isn't sufficient for the number of fragments in the socket
      buffer, we try to allocate an sglist large enough to hold all the
      fragments.
      
      However, for large packets with a lot of fragments, this isn't sufficient
      and we need at least one additional fragment.
      
      The problem manifests as skb_to_sgvec() returning -EMSGSIZE and this then
      getting returned to userspace.  Most of the time, this isn't a problem as
      rxrpc sets a limit of 5692, big enough for 4 jumbo subpackets to be glued
      together; occasionally, however, the server will ignore the reported limit
      and give a packet that's a lot bigger - say 19852 bytes with ->nr_frags
      being 7.  skb_to_sgvec() then tries to return a "zeroth" fragment that
      seems to occur before the fragments counted by ->nr_frags and we hit the
      end of the sglist too early.
      
      Note that __skb_to_sgvec() also has an skb_walk_frags() loop that is
      recursive up to 24 deep.  I'm not sure if I need to take account of that
      too - or if there's an easy way of counting those frags too.
      
      Fix this by counting an extra frag and allocating a larger sglist based on
      that.
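
The sizing rule can be sketched in userspace as follows. This is an illustration only: struct sg_entry stands in for the kernel's scatterlist entry, sg_list_pick() is a hypothetical helper, and the sketch does not account for the nested fragment lists mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define STACK_SG_ENTRIES 4  /* size of the small on-stack list, per the text */

/* Stand-in for a scatterlist entry. */
struct sg_entry {
	void *addr;
	size_t len;
};

/* One entry per page fragment, plus one for the data preceding them. */
static size_t sg_entries_needed(size_t nr_frags)
{
	return nr_frags + 1;
}

/* Returns the list to use: the caller's stack array when it is large
 * enough, otherwise a zeroed heap array that the caller must free().
 * Returns NULL if the heap allocation fails. *nsg receives the number
 * of usable entries. */
static struct sg_entry *sg_list_pick(struct sg_entry *stack_sg,
				     size_t nr_frags, size_t *nsg)
{
	size_t need = sg_entries_needed(nr_frags);

	assert(stack_sg != NULL && nsg != NULL);
	if (need <= STACK_SG_ENTRIES) {
		*nsg = STACK_SG_ENTRIES;
		return stack_sg;
	}
	*nsg = need;
	return calloc(need, sizeof(struct sg_entry));
}
```

For the 7-fragment packet from the description, the sketch asks for 8 entries and falls back to the heap; a 3-fragment packet still fits the 4-entry stack list.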
      
      Fixes: d0d5c0cd ("rxrpc: Use skb_unshare() rather than skb_cow_data()")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Fix ICMP/ICMP6 error handling · ac56a0b4
      David Howells authored
      Because rxrpc pretends to be a tunnel on top of a UDP/UDP6 socket, allowing
      it to siphon off UDP packets early in the handling of received UDP packets
      thereby avoiding the packet going through the UDP receive queue, it doesn't
      get ICMP packets through the UDP ->sk_error_report() callback.  In fact, it
      doesn't appear that there's any usable option for getting hold of ICMP
      packets.
      
      Fix this by adding a new UDP encap hook to distribute error messages for
      UDP tunnels.  If the hook is set, then the tunnel driver will be able to
      see ICMP packets.  The hook provides the offset into the packet of the UDP
      header of the original packet that caused the notification.
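
For IPv4, the position of the embedded UDP header inside an ICMP error message follows from the packet layout: an 8-byte ICMP header, then the original IP header whose length is given by its IHL field. The sketch below computes that position in userspace. icmp_inner_udp_offset() is a hypothetical helper and is not the hook added by this commit, whose offset may be measured from a different point in the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ICMP_HDR_LEN 8   /* fixed ICMPv4 header size */
#define IPPROTO_UDP_NUM 17

/* Returns the byte offset of the embedded UDP header within an ICMPv4
 * error message, or -1 if the message is too short, the embedded packet
 * is not IPv4, or its protocol is not UDP. */
static long icmp_inner_udp_offset(const uint8_t *icmp, size_t len)
{
	assert(icmp != NULL);
	if (len < ICMP_HDR_LEN + 20)
		return -1;

	const uint8_t *ip = icmp + ICMP_HDR_LEN;
	if ((ip[0] >> 4) != 4)
		return -1;

	/* IHL counts 32-bit words. */
	size_t ihl = (size_t)(ip[0] & 0x0f) * 4;
	if (ihl < 20 || len < ICMP_HDR_LEN + ihl + 8)
		return -1;

	/* Byte 9 of the IPv4 header is the protocol field. */
	if (ip[9] != IPPROTO_UDP_NUM)
		return -1;

	return (long)(ICMP_HDR_LEN + ihl);
}
```

With an option-free embedded IPv4 header (IHL of 5), the UDP header starts 28 bytes into the ICMP message.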
      
      An alternative would be to call the ->error_handler() hook - but that
      requires that the skbuff be cloned (as ip_icmp_error() or ipv6_icmp_error()
      do), though that isn't really necessary or desirable in rxrpc's case as we
      want to parse them there and then, not queue them.
      
      Changes
      =======
      ver #3)
       - Fixed an uninitialised variable.
      
      ver #2)
       - Fixed some missing CONFIG_AF_RXRPC_IPV6 conditionals.
      
      Fixes: 5271953c ("rxrpc: Use the UDP encap_rcv hook")
      Signed-off-by: David Howells <dhowells@redhat.com>
  2. 30 Aug, 2022 2 commits
  3. 29 Aug, 2022 3 commits
    • Merge branch 'u64_stats-fixups' · cb10b0f9
      David S. Miller authored
      Sebastian Andrzej Siewior says:
      
      ====================
      net: u64_stats fixups for 32bit.
      
      While looking at the u64-stats patch
      	https://lore.kernel.org/all/20220817162703.728679-10-bigeasy@linutronix.de
      
      I noticed that u64_stats_fetch_begin() is used. The suspicious thing
      about it is that network processing, including stats update, is
      performed in NAPI and so I would expect to see
      u64_stats_fetch_begin_irq() in order to avoid updates from NAPI during
      the read. This is only needed on 32bit-UP where the seqcount is not
      used. This is addressed in 2/2. The remaining users take some kind of
      precaution and may use u64_stats_fetch_begin().
      
      I updated the previously mentioned patch to get rid of
      u64_stats_fetch_begin_irq(). If this is not considered stable patch
      worthy then it can be ignored and considered fixed by the other series
      which removes the special 32bit cases.
      
      The xrs700x driver reads and writes the counter from preemptible context
      so the only missing piece here is to at least disable preemption on the
      writer side to avoid preemption while the writer is in progress. The
      possible reader would then spin until the writer completes its write
      critical section, which is considered bad. This is addressed in 1/2 by
      using u64_stats_update_begin_irqsave() and so disabling interrupts during
      the write critical section.
      The other closest resemblance I found is mdio_bus.c::mdiobus_stats_acct()
      where preemption is disabled unconditionally. This is something I want to
      avoid since it also affects 64bit.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Use u64_stats_fetch_begin_irq() for stats fetch. · 278d3ba6
      Sebastian Andrzej Siewior authored
      On 32bit-UP u64_stats_fetch_begin() disables only preemption. If the
      reader is in preemptible context and the writer side
      (u64_stats_update_begin*()) runs in an interrupt context (IRQ or
      softirq) then the writer can update the stats during the read operation.
      This update remains undetected.
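
For context, the sequence-counter scheme that protects 64-bit counters on 32bit-SMP can be sketched with C11 atomics. This is a simplified userspace illustration that assumes a single writer; struct stats64, stats_add() and stats_read() are hypothetical names and not the kernel's u64_stats API. On 32bit-UP no sequence counter exists, which is why an interrupting writer goes unnoticed there unless the reader keeps it out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct stats64 {
	atomic_uint seq;          /* even: stable, odd: update in progress */
	_Atomic uint32_t lo, hi;  /* 64-bit counter stored as two 32-bit words */
};

/* Writer side. Assumes only one writer runs at a time. */
static void stats_add(struct stats64 *s, uint64_t n)
{
	uint64_t v = ((uint64_t)atomic_load(&s->hi) << 32) | atomic_load(&s->lo);

	v += n;
	atomic_fetch_add(&s->seq, 1);              /* seq becomes odd */
	atomic_store(&s->lo, (uint32_t)v);
	atomic_store(&s->hi, (uint32_t)(v >> 32));
	atomic_fetch_add(&s->seq, 1);              /* seq becomes even again */
}

/* Reader side: retry until both halves were read with no write in between. */
static uint64_t stats_read(struct stats64 *s)
{
	unsigned int start;
	uint32_t lo, hi;

	do {
		start = atomic_load(&s->seq);
		lo = atomic_load(&s->lo);
		hi = atomic_load(&s->hi);
	} while ((start & 1) || start != atomic_load(&s->seq));

	return ((uint64_t)hi << 32) | lo;
}
```

Without the sequence check, a reader that is interrupted between loading lo and hi could combine halves from two different counter values, for example across a carry from the low word into the high word.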
      
      Use u64_stats_fetch_begin_irq() to ensure the stats fetch on 32bit-UP
      is not interrupted by a writer. 32bit-SMP remains unaffected by this
      change.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Catherine Sullivan <csully@google.com>
      Cc: David Awogbemila <awogbemila@google.com>
      Cc: Dimitris Michailidis <dmichail@fungible.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Jeroen de Borst <jeroendb@google.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Simon Horman <simon.horman@corigine.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-wireless@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: oss-drivers@corigine.com
      Cc: stable@vger.kernel.org
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: xrs700x: Use irqsave variant for u64 stats update · 3f8ae9fe
      Sebastian Andrzej Siewior authored
      xrs700x_read_port_counters() updates the stats from a worker using the
      u64_stats_update_begin() version. This is okay on 32bit-UP since on the
      reader side preemption is disabled.
      On 32bit-SMP the writer can be preempted by the reader at which point
      the reader will spin on the seqcount until writer continues and
      completes the update.
      
      Assigning the mib_mutex mutex to the underlying seqcount would ensure
      proper synchronisation. The API for that on the u64_stats_init() side
      isn't available. Since it is the only user, just disable interrupts
      during the update.
      
      Use u64_stats_update_begin_irqsave() on the writer side to ensure an
      uninterrupted update.
      
      Fixes: ee00b24f ("net: dsa: add Arrow SpeedChips XRS700x driver")
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: George McCollister <george.mccollister@gmail.com>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: George McCollister <george.mccollister@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 27 Aug, 2022 7 commits
  5. 26 Aug, 2022 2 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 2e085ec0
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 11 non-merge commits during the last 14 day(s) which contain
      a total of 13 files changed, 61 insertions(+), 24 deletions(-).
      
      The main changes are:
      
      1) Fix BPF verifier's precision tracking around BPF ring buffer, from Kumar Kartikeya Dwivedi.
      
      2) Fix regression in tunnel key infra when passing FLOWI_FLAG_ANYSRC, from Eyal Birger.
      
      3) Fix insufficient permissions for bpf_sys_bpf() helper, from YiFei Zhu.
      
      4) Fix splat from hitting BUG when purging effective cgroup programs, from Pu Lehui.
      
      5) Fix range tracking for array poke descriptors, from Daniel Borkmann.
      
      6) Fix corrupted packets for XDP_SHARED_UMEM in aligned mode, from Magnus Karlsson.
      
      7) Fix NULL pointer splat in BPF sockmap sk_msg_recvmsg(), from Liu Jian.
      
      8) Add READ_ONCE() to bpf_jit_limit when reading from sysctl, from Kuniyuki Iwashima.
      
      9) Add BPF selftest lru_bug check to s390x deny list, from Daniel Müller.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'wireless-2022-08-26' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless · 4ba9d38b
      David S. Miller authored
      Johannes Berg says:
      
      ====================
      pull-request: wireless-2022-08-26
      
      Here are a couple of fixes for the current cycle,
      see the tag description below.
      
      Just a couple of fixes:
       * two potential leaks
       * use-after-free in certain scan races
       * warning in IBSS code
       * error return from a debugfs file was wrong
       * possible NULL-ptr-deref when station lookup fails
      
      Please pull and let me know if there's any problem.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 25 Aug, 2022 22 commits
    • Bluetooth: hci_sync: hold hdev->lock when cleanup hci_conn · 2da8eb83
      Zhengping Jiang authored
      When disconnecting all devices, hci_conn_failed is used to clean up
      the hci_conn object when it cannot be aborted.
      The function hci_conn_failed requires that the caller holds hdev->lock.
      
      Fixes: 9b3628d7 ("Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted")
      Signed-off-by: Zhengping Jiang <jiangzp@google.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: move from strlcpy with unused retval to strscpy · cb0d160f
      Wolfram Sang authored
      Follow the advice of the below link and prefer 'strscpy' in this
      subsystem. Conversion is 1:1 because the return value is not used.
      Generated by a coccinelle script.
      
      Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: hci_event: Fix checking conn for le_conn_complete_evt · f48735a9
      Archie Pusaka authored
      To prevent multiple conn complete events, we shouldn't look up the
      conn with hci_lookup_le_connect, since it requires the state to be
      BT_CONNECT. By the time the duplicate event is processed, the state
      might have changed, so we end up processing the new event anyway.
      
      Change the lookup function to hci_conn_hash_lookup_ba.
      
      Fixes: d5ebaa7c ("Bluetooth: hci_event: Ignore multiple conn complete events")
      Signed-off-by: Archie Pusaka <apusaka@chromium.org>
      Reviewed-by: Sonny Sasaka <sonnysasaka@chromium.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: ISO: Fix not handling shutdown condition · c5729093
      Luiz Augusto von Dentz authored
      In order to properly handle shutdown syscall the code shall not assume
      that the how argument is always SHUT_RDWR resulting in SHUTDOWN_MASK as
      that would result in poll to immediately report EPOLLHUP instead of
      properly waiting for disconnect_cfm (Disconnect Complete) which is
      rather important for the likes of BAP as the CIG may need to be
      reprogrammed.
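
The mapping from the shutdown(2) how argument to per-direction shutdown bits can be sketched as below. The three bit values mirror the kernel's RCV_SHUTDOWN, SEND_SHUTDOWN and SHUTDOWN_MASK definitions; shutdown_bits() is a hypothetical helper written for this illustration, not the code of this fix.

```c
#include <assert.h>
#include <sys/socket.h>  /* SHUT_RD, SHUT_WR, SHUT_RDWR */

/* Values mirror the kernel's socket shutdown flags. */
#define RCV_SHUTDOWN  1
#define SEND_SHUTDOWN 2
#define SHUTDOWN_MASK 3

/* Returns the shutdown bits matching a shutdown(2) `how` value,
 * or -1 if `how` is not one of the three valid values. */
static int shutdown_bits(int how)
{
	switch (how) {
	case SHUT_RD:
		return RCV_SHUTDOWN;
	case SHUT_WR:
		return SEND_SHUTDOWN;
	case SHUT_RDWR:
		return SHUTDOWN_MASK;
	default:
		return -1;
	}
}
```

Only SHUT_RDWR yields the full mask, so a caller that passes SHUT_WR alone does not have both directions marked as shut down.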
      
      Fixes: ccf74f23 ("Bluetooth: Add BTPROTO_ISO socket type")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: hci_sync: fix double mgmt_pending_free() in remove_adv_monitor() · 3cfbc6ac
      Tetsuo Handa authored
      syzbot is reporting double kfree() at remove_adv_monitor() [1], for
      commit 7cf5c297 ("Bluetooth: hci_sync: Refactor remove Adv
      Monitor") forgot to remove duplicated mgmt_pending_remove() when
      merging "if (err) {" path and "if (!pending) {" path.
      
      Link: https://syzkaller.appspot.com/bug?extid=915a8416bf15895b8e07 [1]
      Reported-by: syzbot <syzbot+915a8416bf15895b8e07@syzkaller.appspotmail.com>
      Fixes: 7cf5c297 ("Bluetooth: hci_sync: Refactor remove Adv Monitor")
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: MGMT: Fix Get Device Flags · 23b72814
      Luiz Augusto von Dentz authored
      Get Device Flags doesn't check whether the device actually uses an RPA,
      in which case it shall only set HCI_CONN_FLAG_REMOTE_WAKEUP if LL Privacy
      is enabled.
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: L2CAP: Fix build errors in some archs · b840304f
      Luiz Augusto von Dentz authored
      This attempts to fix the following errors:
      
      In function 'memcmp',
          inlined from 'bacmp' at ./include/net/bluetooth/bluetooth.h:347:9,
          inlined from 'l2cap_global_chan_by_psm' at
          net/bluetooth/l2cap_core.c:2003:15:
      ./include/linux/fortify-string.h:44:33: error: '__builtin_memcmp'
      specified bound 6 exceeds source size 0 [-Werror=stringop-overread]
         44 | #define __underlying_memcmp     __builtin_memcmp
            |                                 ^
      ./include/linux/fortify-string.h:420:16: note: in expansion of macro
      '__underlying_memcmp'
        420 |         return __underlying_memcmp(p, q, size);
            |                ^~~~~~~~~~~~~~~~~~~
      In function 'memcmp',
          inlined from 'bacmp' at ./include/net/bluetooth/bluetooth.h:347:9,
          inlined from 'l2cap_global_chan_by_psm' at
          net/bluetooth/l2cap_core.c:2004:15:
      ./include/linux/fortify-string.h:44:33: error: '__builtin_memcmp'
      specified bound 6 exceeds source size 0 [-Werror=stringop-overread]
         44 | #define __underlying_memcmp     __builtin_memcmp
            |                                 ^
      ./include/linux/fortify-string.h:420:16: note: in expansion of macro
      '__underlying_memcmp'
        420 |         return __underlying_memcmp(p, q, size);
            |                ^~~~~~~~~~~~~~~~~~~
      
      Fixes: 332f1795 ("Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm regression")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: hci_sync: Fix suspend performance regression · 1fd02d56
      Luiz Augusto von Dentz authored
      This attempts to fix suspend performance when there are no connections by
      not updating the event mask.
      
      Fixes: ef61b6ea ("Bluetooth: Always set event mask on suspend")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: hci_event: Fix vendor (unknown) opcode status handling · b82a26d8
      Hans de Goede authored
      Commit c8992cff ("Bluetooth: hci_event: Use of a function table to
      handle Command Complete") was (presumably) meant to only refactor things
      without any functional changes.
      
      But it does have one undesirable side-effect: before, *status would always
      be set to skb->data[0] and it might be overridden by some of the opcode
      specific handling, while now it is only set by the opcode specific handlers.
      This means that if the opcode is not known *status does not get set any
      more at all!
      
      This behavior change has broken bluetooth support for BCM4343A0 HCIs,
      the hci_bcm.c code tries to configure UART attached HCIs at a higher
      baudrate using vendor specific opcodes. The BCM4343A0 does not
      support this and this used to simply fail:
      
      [   25.646442] Bluetooth: hci0: BCM: failed to write clock (-56)
      [   25.646481] Bluetooth: hci0: Failed to set baudrate
      
      After which things would continue with the initial baudrate. But now
      that hci_cmd_complete_evt() no longer sets status for unknown opcodes
      *status is left at 0. This causes the hci_bcm.c code to think the baudrate
      has been changed on the HCI side and to also adjust the UART baudrate,
      after which communication with the HCI is broken, leading to:
      
      [   28.579042] Bluetooth: hci0: command 0x0c03 tx timeout
      [   36.961601] Bluetooth: hci0: BCM: Reset failed (-110)
      
      And non working bluetooth. Fix this by restoring the previous
      default "*status = skb->data[0]" handling for unknown opcodes.
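
The restored behaviour can be pictured as a default that opcode handlers may override. The sketch below is a userspace illustration; cc_table, handle_reset() and cmd_complete_status() are hypothetical names, and 0xfc45 in the usage note is an arbitrary vendor-range opcode chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t (*cc_handler_t)(const uint8_t *data, size_t len);

struct cc_entry {
	uint16_t opcode;
	cc_handler_t fn;
};

/* Hypothetical handler for a known opcode; it reports the first byte. */
static uint8_t handle_reset(const uint8_t *data, size_t len)
{
	(void)len;
	return data[0];
}

/* Table of known opcodes; 0x0c03 is the opcode seen in the log above. */
static const struct cc_entry cc_table[] = {
	{ 0x0c03, handle_reset },
};

/* Returns the status for a Command Complete payload. The first payload
 * byte is the default, so unknown opcodes still report a status. */
static uint8_t cmd_complete_status(uint16_t opcode, const uint8_t *data,
				   size_t len)
{
	assert(data != NULL && len >= 1);

	uint8_t status = data[0];

	for (size_t i = 0; i < sizeof(cc_table) / sizeof(cc_table[0]); i++) {
		if (cc_table[i].opcode == opcode) {
			status = cc_table[i].fn(data, len);
			break;
		}
	}
	return status;
}
```

An unknown opcode such as 0xfc45 with a payload starting with 0x01 now reports status 0x01 instead of a stale 0, so the caller can see that the command failed.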
      
      Fixes: c8992cff ("Bluetooth: hci_event: Use of a function table to handle Command Complete")
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • bpf: Don't use tnum_range on array range checking for poke descriptors · a657182a
      Daniel Borkmann authored
      Hsin-Wei reported a KASAN splat triggered by their BPF runtime fuzzer which
      is based on a customized syzkaller:
      
        BUG: KASAN: slab-out-of-bounds in bpf_int_jit_compile+0x1257/0x13f0
        Read of size 8 at addr ffff888004e90b58 by task syz-executor.0/1489
        CPU: 1 PID: 1489 Comm: syz-executor.0 Not tainted 5.19.0 #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
        1.13.0-1ubuntu1.1 04/01/2014
        Call Trace:
         <TASK>
         dump_stack_lvl+0x9c/0xc9
         print_address_description.constprop.0+0x1f/0x1f0
         ? bpf_int_jit_compile+0x1257/0x13f0
         kasan_report.cold+0xeb/0x197
         ? kvmalloc_node+0x170/0x200
         ? bpf_int_jit_compile+0x1257/0x13f0
         bpf_int_jit_compile+0x1257/0x13f0
         ? arch_prepare_bpf_dispatcher+0xd0/0xd0
         ? rcu_read_lock_sched_held+0x43/0x70
         bpf_prog_select_runtime+0x3e8/0x640
         ? bpf_obj_name_cpy+0x149/0x1b0
         bpf_prog_load+0x102f/0x2220
         ? __bpf_prog_put.constprop.0+0x220/0x220
         ? find_held_lock+0x2c/0x110
         ? __might_fault+0xd6/0x180
         ? lock_downgrade+0x6e0/0x6e0
         ? lock_is_held_type+0xa6/0x120
         ? __might_fault+0x147/0x180
         __sys_bpf+0x137b/0x6070
         ? bpf_perf_link_attach+0x530/0x530
         ? new_sync_read+0x600/0x600
         ? __fget_files+0x255/0x450
         ? lock_downgrade+0x6e0/0x6e0
         ? fput+0x30/0x1a0
         ? ksys_write+0x1a8/0x260
         __x64_sys_bpf+0x7a/0xc0
         ? syscall_enter_from_user_mode+0x21/0x70
         do_syscall_64+0x3b/0x90
         entry_SYSCALL_64_after_hwframe+0x63/0xcd
        RIP: 0033:0x7f917c4e2c2d
      
      The problem here is that a range of tnum_range(0, map->max_entries - 1) has
      limited ability to represent the concrete tight range with the tnum as the
      set of resulting states from value + mask can result in a superset of the
      actual intended range, and as such a tnum_in(range, reg->var_off) check may
      yield true when it shouldn't, for example tnum_range(0, 2) would result in
      00XX -> v = 0000, m = 0011 such that the intended set of {0, 1, 2} is here
      represented by a less precise superset of {0, 1, 2, 3}. As the register is
      known const scalar, really just use the concrete reg->var_off.value for the
      upper index check.
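
The imprecision is easy to reproduce in userspace. The sketch below is an approximation of the kernel's tnum_range() and tnum_in() helpers and may differ from them in detail; fls64() here is a plain-loop stand-in and index_in_bounds() is a hypothetical name for the concrete check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tristate number: bits set in mask are unknown, the rest come from value. */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* Position of the highest set bit, counting from 1; 0 when x is 0. */
static unsigned int fls64(uint64_t x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Smallest tnum covering [min, max]; it may cover more than that. */
static struct tnum tnum_range(uint64_t min, uint64_t max)
{
	uint64_t chi = min ^ max, delta;
	unsigned int bits = fls64(chi);

	if (bits > 63)  /* 1ULL << 64 would be undefined */
		return (struct tnum){ 0, UINT64_MAX };
	delta = (1ULL << bits) - 1;
	return (struct tnum){ min & ~delta, delta };
}

/* True if every value described by b is also described by a. */
static bool tnum_in(struct tnum a, struct tnum b)
{
	if (b.mask & ~a.mask)
		return false;
	b.value &= ~a.mask;
	return a.value == b.value;
}

/* Concrete bound check on a known constant index. */
static bool index_in_bounds(uint64_t index, uint64_t max_entries)
{
	return index < max_entries;
}
```

For a map with 3 entries, tnum_range(0, 2) yields value 0 and mask 3, so the constant 3 passes tnum_in() even though index_in_bounds(3, 3) is false.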
      
      Fixes: d2e4c1e6 ("bpf: Constant map key tracking for prog array pokes")
      Reported-by: Hsin-Wei Hung <hsinweih@uci.edu>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Shung-Hsi Yu <shung-hsi.yu@suse.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/r/984b37f9fdf7ac36831d2137415a4a915744c1b6.1661462653.git.daniel@iogearbox.net
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • Merge tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4c612826
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from ipsec and netfilter (with one broken Fixes tag).
      
        Current release - new code bugs:
      
         - dsa: don't dereference NULL extack in dsa_slave_changeupper()
      
         - dpaa: fix <1G ethernet on LS1046ARDB
      
         - neigh: don't call kfree_skb() under spin_lock_irqsave()
      
        Previous releases - regressions:
      
         - r8152: fix the RX FIFO settings when suspending
      
         - dsa: microchip: keep compatibility with device tree blobs with no
           phy-mode
      
         - Revert "net: macsec: update SCI upon MAC address change."
      
         - Revert "xfrm: update SA curlft.use_time", comply with RFC 2367
      
        Previous releases - always broken:
      
         - netfilter: conntrack: work around exceeded TCP receive window
      
         - ipsec: fix a null pointer dereference of dst->dev on a metadata dst
           in xfrm_lookup_with_ifid
      
         - moxa: get rid of asymmetry in DMA mapping/unmapping
      
         - dsa: microchip: make learning configurable and keep it off while
           standalone
      
         - ice: xsk: prohibit usage of non-balanced queue id
      
         - rxrpc: fix locking in rxrpc's sendmsg
      
        Misc:
      
         - another chunk of sysctl data race silencing"
      
      * tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
        net: lantiq_xrx200: restore buffer if memory allocation failed
        net: lantiq_xrx200: fix lock under memory pressure
        net: lantiq_xrx200: confirm skb is allocated before using
        net: stmmac: work around sporadic tx issue on link-up
        ionic: VF initial random MAC address if no assigned mac
        ionic: fix up issues with handling EAGAIN on FW cmds
        ionic: clear broken state on generation change
        rxrpc: Fix locking in rxrpc's sendmsg
        net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2
        MAINTAINERS: rectify file entry in BONDING DRIVER
        i40e: Fix incorrect address type for IPv6 flow rules
        ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
        net: Fix a data-race around sysctl_somaxconn.
        net: Fix a data-race around netdev_unregister_timeout_secs.
        net: Fix a data-race around gro_normal_batch.
        net: Fix data-races around sysctl_devconf_inherit_init_net.
        net: Fix data-races around sysctl_fb_tunnels_only_for_init_net.
        net: Fix a data-race around netdev_budget_usecs.
        net: Fix data-races around sysctl_max_skb_frags.
        net: Fix a data-race around netdev_budget.
        ...
    • Merge branch 'net-lantiq_xrx200-fix-errors-under-memory-pressure' · d974730c
      Jakub Kicinski authored
      Aleksander Jan Bajkowski says:
      
      ====================
      net: lantiq_xrx200: fix errors under memory pressure
      
      This series fixes issues that can occur in the driver under memory pressure.
      Situations where the system cannot allocate memory are rare, so the mentioned
      bugs have only been fixed recently. The patches have been tested on a BT Home
      router with the Lantiq xRX200 chipset.
      
      Changelog:
        v3: - removed netdev_err() log from the first patch
        v2:
         - the second patch has been changed, so that under memory pressure situation
           the driver will not receive packets indefinitely regardless of the NAPI budget,
         - the third patch has been added.
      ====================
      
      Link: https://lore.kernel.org/r/20220824215408.4695-1-olek2@wp.pl
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: lantiq_xrx200: restore buffer if memory allocation failed · c9c3b177
      Aleksander Jan Bajkowski authored
      In a situation where memory allocation fails, an invalid buffer address
      is stored. When this descriptor is used again, the system panics in the
      build_skb() function when accessing memory.
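
The idea of keeping a valid buffer in the descriptor when a refill fails can be sketched in userspace. This is an illustration under assumptions: struct rx_slot, rx_slot_swap() and alloc_fail() are hypothetical names, and the buffer size is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define RX_BUF_SIZE 2048  /* arbitrary buffer size for the sketch */

struct rx_slot {
	void *buf;  /* buffer currently owned by the descriptor */
};

typedef void *(*alloc_fn)(size_t);

/* Allocator that always fails, used to simulate memory pressure. */
static void *alloc_fail(size_t n)
{
	(void)n;
	return NULL;
}

/* Swaps a fresh buffer into the slot. On success returns 0 and hands the
 * previously attached (filled) buffer to the caller through *full. On
 * failure returns -ENOMEM and leaves the slot pointing at its old, still
 * valid buffer. */
static int rx_slot_swap(struct rx_slot *slot, alloc_fn alloc, void **full)
{
	void *fresh;

	assert(slot != NULL && alloc != NULL && full != NULL);
	fresh = alloc(RX_BUF_SIZE);
	if (!fresh) {
		*full = NULL;
		return -ENOMEM;  /* slot->buf is left untouched */
	}
	*full = slot->buf;
	slot->buf = fresh;
	return 0;
}
```

After a failed swap the slot can be reused as-is on the next pass, at the cost of dropping the packet that was in the old buffer.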
      
      Fixes: 7ea6cd16 ("lantiq: net: fix duplicated skb in rx descriptor ring")
      Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: lantiq_xrx200: fix lock under memory pressure · c4b6e934
      Aleksander Jan Bajkowski authored
      When the xrx200_hw_receive() function returns -ENOMEM, the NAPI poll
      function immediately returns an error.
      This is incorrect for two reasons:
      * the function terminates without enabling interrupts or scheduling NAPI,
      * the error code (-ENOMEM) is returned instead of the number of received
      packets.
      
      After the first memory allocation failure occurs, packet reception is
      locked due to disabled interrupts from DMA.
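
The two points above can be sketched as a poll loop that stops on an allocation failure but still follows the normal exit path. This is a userspace simulation, not the driver code: struct rx_ring, ring_receive_one() and ring_poll() are hypothetical names, and the irq_enabled flag stands in for completing NAPI and re-enabling the DMA interrupt.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct rx_ring {
	int pending;       /* packets waiting in the ring */
	int allocs_left;   /* successful allocations before -ENOMEM */
	bool irq_enabled;  /* stands in for the DMA interrupt state */
};

/* Receives one packet, or fails with -ENOMEM when no buffer is available. */
static int ring_receive_one(struct rx_ring *r)
{
	if (r->allocs_left == 0)
		return -ENOMEM;
	r->allocs_left--;
	r->pending--;
	return 0;
}

/* Poll routine: always returns the number of packets received, never an
 * error code, and re-enables interrupts whenever the budget was not used
 * up, including after an allocation failure. */
static int ring_poll(struct rx_ring *r, int budget)
{
	int rx = 0;

	assert(r != NULL && budget > 0);
	while (rx < budget && r->pending > 0) {
		if (ring_receive_one(r) < 0)
			break;  /* stop early but fall through to the exit path */
		rx++;
	}
	if (rx < budget)
		r->irq_enabled = true;
	return rx;
}
```

With five packets pending, a budget of eight and only two buffers available, the sketch returns 2 and leaves interrupts enabled, so reception resumes on the next interrupt.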
      
      Fixes: fe1a5642 ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
      Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: lantiq_xrx200: confirm skb is allocated before using · c8b04370
      Aleksander Jan Bajkowski authored
      xrx200_hw_receive() assumes build_skb() always works and goes straight
      to skb_reserve(). However, build_skb() can fail under memory pressure.
      
      Add a check in case build_skb() failed to allocate and return NULL.
      
      Fixes: e0155935 ("net: lantiq_xrx200: convert to build_skb")
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: stmmac: work around sporadic tx issue on link-up · a3a57bf0
      Heiner Kallweit authored
      This is a follow-up to the discussion in [0]. It seems to me that
      at least the IP version used on Amlogic SoC's sometimes has a problem
      if register MAC_CTRL_REG is written whilst the chip is still processing
      a previous write. But that's just a guess.
      Adding a delay between two writes to this register helps, but we can
      also simply omit the offending second write. This patch uses the second
      approach and is based on a suggestion from Qi Duan.
      A benefit of this approach is that we can save a few register writes, also
      on unaffected chip versions.
      
      [0] https://www.spinics.net/lists/netdev/msg831526.html
      
      Fixes: bfab27a1 ("stmmac: add the experimental PCI support")
      Suggested-by: Qi Duan <qi.duan@amlogic.com>
      Suggested-by: Jerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/e99857ce-bd90-5093-ca8c-8cd480b5a0a2@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · ef332fe1
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-08-24 (ixgbe, i40e)
      
      This series contains updates to ixgbe and i40e drivers.
      
      Jake stops incorrect resetting of SYSTIME registers when starting
      cyclecounter for ixgbe.
      
      Sylwester corrects a check on source IP address when validating destination
      for i40e.
      
      * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        i40e: Fix incorrect address type for IPv6 flow rules
        ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
      ====================
      
      Link: https://lore.kernel.org/r/20220824193748.874343-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'ionic-bug-fixes' · 92df825a
      Jakub Kicinski authored
      Shannon Nelson says:
      
      ====================
      ionic: bug fixes
      
      These are a couple of maintenance bug fixes for the Pensando ionic
      networking driver.
      
      Mohamed takes care of a "plays well with others" issue where the
      VF spec is a bit vague on VF mac addresses, but certain customers
      have come to expect behavior based on other vendor drivers.
      
      Shannon addresses a couple of corner cases seen in internal
      stress testing.
      ====================
      
      Link: https://lore.kernel.org/r/20220824165051.6185-1-snelson@pensando.io
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ionic: VF initial random MAC address if no assigned mac · 19058be7
      R Mohamed Shah authored
      Assign a random mac address to the VF interface station
      address if it boots with a zero mac address in order to match
      similar behavior seen in other VF drivers.  Handle the errors
      where the older firmware does not allow the VF to set its own
      station address.
      
      Newer firmware will allow the VF to set the station mac address
      if it hasn't already been set administratively through the PF.
      Setting it will also be allowed if the VF has trust.
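
      The behaviour described above can be sketched in userspace C. This is
      an illustration only, not the driver's code: vf_init_mac(), the
      fw_set_mac_fn callback and the fw_accept()/fw_refuse() stubs are
      hypothetical names, and the choice to leave the address untouched when
      the firmware refuses is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* True if every byte of the address is zero. */
static bool mac_is_zero(const uint8_t mac[ETH_ALEN])
{
    static const uint8_t zero[ETH_ALEN];
    return memcmp(mac, zero, ETH_ALEN) == 0;
}

/* Fill mac with a random, locally administered, unicast address. */
static void mac_random(uint8_t mac[ETH_ALEN])
{
    for (int i = 0; i < ETH_ALEN; i++)
        mac[i] = (uint8_t)(rand() & 0xff);
    mac[0] &= 0xfe; /* clear the multicast bit */
    mac[0] |= 0x02; /* set the locally administered bit */
}

/* Hypothetical stand-in for the "set station address" firmware command. */
typedef int (*fw_set_mac_fn)(const uint8_t mac[ETH_ALEN]);

/* Stub: newer firmware that accepts the request. */
static int fw_accept(const uint8_t mac[ETH_ALEN]) { (void)mac; return 0; }

/* Stub: older firmware that refuses (error code is an assumption). */
static int fw_refuse(const uint8_t mac[ETH_ALEN]) { (void)mac; return -EPERM; }

/*
 * Returns 1 if a random address was assigned, 0 if an address was already
 * present, or a negative errno if the firmware refused the request.
 */
static int vf_init_mac(uint8_t mac[ETH_ALEN], fw_set_mac_fn fw_set_mac)
{
    uint8_t candidate[ETH_ALEN];
    int err;

    assert(fw_set_mac != NULL);
    if (!mac_is_zero(mac))
        return 0;               /* administratively set; keep it */

    mac_random(candidate);
    err = fw_set_mac(candidate);
    if (err)
        return err;             /* refused; leave the address untouched */

    memcpy(mac, candidate, ETH_ALEN);
    return 1;
}
```

      Calling vf_init_mac() with fw_accept leaves a non-zero unicast address
      in place; with fw_refuse the address stays all-zero and the error is
      passed back to the caller.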
      
      Fixes: fbb39807 ("ionic: support sr-iov operations")
      Signed-off-by: R Mohamed Shah <mohamed@pensando.io>
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      19058be7
    • Shannon Nelson's avatar
      ionic: fix up issues with handling EAGAIN on FW cmds · 0fc4dd45
      Shannon Nelson authored
      In looping on FW update tests we occasionally see the
      FW_ACTIVATE_STATUS command fail while it is in its EAGAIN loop
      waiting for the FW activate step to finish inside the FW.  The
      firmware is complaining that the done bit is set when a new
      dev_cmd is going to be processed.
      
      Doing a clean on the cmd registers and doorbell before exiting
      the wait-for-done and cleaning the done bit before the sleep
      prevents this from occurring.
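
      A toy model of that ordering, as a rough sketch: struct fake_dev and
      every function below are hypothetical and only imitate the described
      symptom (firmware objecting to a new command while the done bit is
      still set). The real register layout and wait loop are not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the device command registers. */
struct fake_dev {
    bool done;            /* done bit, set by firmware */
    bool doorbell;        /* doorbell, rung by the driver */
    int  busy_polls;      /* how many times firmware will answer -EAGAIN */
    int  protocol_errors; /* commands started while done was still set */
};

/* Clean the command registers and doorbell. */
static void dev_cmd_clean(struct fake_dev *d)
{
    d->doorbell = false;
    d->done = false;
}

/* Ring the doorbell; the model counts a complaint if done is stale. */
static void dev_cmd_go(struct fake_dev *d)
{
    if (d->done)
        d->protocol_errors++;
    d->doorbell = true;
}

/* Firmware side: consume the doorbell, set done, report a status. */
static int fw_process(struct fake_dev *d)
{
    d->doorbell = false;
    d->done = true;
    if (d->busy_polls > 0) {
        d->busy_polls--;
        return -EAGAIN;
    }
    return 0;
}

/* Retry on -EAGAIN, clearing done before each sleep and cleaning on exit. */
static int dev_cmd_wait(struct fake_dev *d, int max_tries)
{
    int err = -EAGAIN;

    for (int i = 0; i < max_tries && err == -EAGAIN; i++) {
        dev_cmd_go(d);
        err = fw_process(d);
        if (err == -EAGAIN)
            d->done = false; /* clear done; a real driver would sleep here */
    }
    dev_cmd_clean(d);        /* clean registers and doorbell before leaving */
    return err;
}
```

      With the done bit cleared before each retry, the model never records a
      complaint, whether the command eventually succeeds or the retries run
      out.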
      
      Fixes: fbfb8031 ("ionic: Add hardware init and device commands")
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0fc4dd45
    • Shannon Nelson's avatar
      ionic: clear broken state on generation change · 9cb9dadb
      Shannon Nelson authored
      There is a case found in heavy testing where a link flap happens just
      before a firmware Recovery event and the driver gets stuck in the
      BROKEN state.  This comes from the driver getting interrupted by a FW
      generation change when coming back up from the link flap, and the call
      to ionic_start_queues() in ionic_link_status_check() fails.  This can be
      addressed by having the fw_up code clear the BROKEN bit if seen, rather
      than waiting for a user to manually force the interface down and then
      back up.
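
      As a minimal sketch of the state handling, assuming a simple flag word:
      the LIF_F_* values, struct lif, start_queues() and handle_fw_up() are
      hypothetical names chosen for illustration and do not reproduce the
      driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state flags. */
enum {
    LIF_F_UP     = 1u << 0,
    LIF_F_BROKEN = 1u << 1,
};

struct lif {
    unsigned int state;
};

/*
 * Bring the queues up. Fails, and marks the interface BROKEN, if the
 * firmware is not ready; refuses to run at all while BROKEN is set.
 */
static int start_queues(struct lif *lif, bool fw_ready)
{
    if (lif->state & LIF_F_BROKEN)
        return -1;
    if (!fw_ready) {
        lif->state |= LIF_F_BROKEN;
        return -1;
    }
    lif->state |= LIF_F_UP;
    return 0;
}

/* Firmware came back: clear a stale BROKEN bit, then restart the queues. */
static int handle_fw_up(struct lif *lif)
{
    lif->state &= ~(unsigned int)LIF_F_BROKEN;
    return start_queues(lif, true);
}
```

      Without the clearing step in handle_fw_up(), the model stays stuck:
      every later start_queues() call fails until something else resets the
      flag, which matches the manual down/up workaround mentioned above.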
      
      Fixes: 9e8eaf84 ("ionic: stop watchdog when in broken state")
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9cb9dadb
    • David Howells's avatar
      rxrpc: Fix locking in rxrpc's sendmsg · b0f571ec
      David Howells authored
      Fix three bugs in rxrpc's sendmsg implementation:
      
       (1) rxrpc_new_client_call() should release the socket lock when returning
           an error from rxrpc_get_call_slot().
      
       (2) rxrpc_wait_for_tx_window_intr() will return without the call mutex
           held in the event that we're interrupted by a signal whilst waiting
           for tx space on the socket or relocking the call mutex afterwards.
      
           Fix this by: (a) moving the unlock/lock of the call mutex up to
           rxrpc_send_data() such that the lock is not held around all of
           rxrpc_wait_for_tx_window*() and (b) indicating to higher callers
           whether we're returning with the lock dropped.  Note that this means
           recvmsg() will not block on this call whilst we're waiting.
      
       (3) After dropping and regaining the call mutex, rxrpc_send_data() needs
           to go and recheck the state of the tx_pending buffer and the
           tx_total_len check in case we raced with another sendmsg() on the same
           call.
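
      The locking contract in (2) and (3) can be modelled single-threaded as
      below. This is a hedged sketch, not the rxrpc code: struct toy_mutex
      and all toy_*(), send_data() and do_sendmsg() functions are
      hypothetical, and the bad_unlocks counter merely imitates the
      unlock-balance check that lockdep performs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy mutex that records unlocks of a lock that is not held. */
struct toy_mutex {
    bool held;
    int  bad_unlocks;
};

static void toy_lock(struct toy_mutex *m)
{
    m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
    if (!m->held)
        m->bad_unlocks++; /* "bad unlock balance" in lockdep terms */
    m->held = false;
}

/* Interruptible lock: fails with -EINTR if a signal is pending. */
static int toy_lock_interruptible(struct toy_mutex *m, bool signal_pending)
{
    if (signal_pending)
        return -EINTR;
    m->held = true;
    return 0;
}

/*
 * Called with the lock held. Drops it while waiting for transmit space and
 * retakes it afterwards. *dropped tells the caller whether the lock is
 * still released on return.
 */
static int send_data(struct toy_mutex *m, bool signal_pending, bool *dropped)
{
    int err;

    toy_unlock(m);      /* do not hold the lock across the wait */
    *dropped = true;

    /* waiting for the tx window would happen here */

    err = toy_lock_interruptible(m, signal_pending);
    if (err)
        return err;     /* interrupted: lock remains dropped */
    *dropped = false;

    /* state that may have changed while unlocked must be rechecked here */
    return 0;
}

/* Top-level caller: only unlocks if the callee still holds the lock. */
static int do_sendmsg(struct toy_mutex *m, bool signal_pending)
{
    bool dropped;
    int err;

    toy_lock(m);
    err = send_data(m, signal_pending, &dropped);
    if (!dropped)
        toy_unlock(m);
    return err;
}
```

      If do_sendmsg() unlocked unconditionally, the interrupted path would
      bump bad_unlocks, which is the imbalance the warning reports.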
      
      Thinking on this some more, it might make sense to have different locks for
      sendmsg() and recvmsg().  There's probably no need to make recvmsg() wait
      for sendmsg().  It does mean that recvmsg() can return MSG_EOR indicating
      that a call is dead before a sendmsg() to that call returns - but that can
      currently happen anyway.
      
      Without fix (2), something like the following can be induced:
      
      	WARNING: bad unlock balance detected!
      	5.16.0-rc6-syzkaller #0 Not tainted
      	-------------------------------------
      	syz-executor011/3597 is trying to release lock (&call->user_mutex) at:
      	[<ffffffff885163a3>] rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
      	but there are no more locks to release!
      
      	other info that might help us debug this:
      	no locks held by syz-executor011/3597.
      	...
      	Call Trace:
      	 <TASK>
      	 __dump_stack lib/dump_stack.c:88 [inline]
      	 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
      	 print_unlock_imbalance_bug include/trace/events/lock.h:58 [inline]
      	 __lock_release kernel/locking/lockdep.c:5306 [inline]
      	 lock_release.cold+0x49/0x4e kernel/locking/lockdep.c:5657
      	 __mutex_unlock_slowpath+0x99/0x5e0 kernel/locking/mutex.c:900
      	 rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
      	 rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:561
      	 sock_sendmsg_nosec net/socket.c:704 [inline]
      	 sock_sendmsg+0xcf/0x120 net/socket.c:724
      	 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
      	 ___sys_sendmsg+0xf3/0x170 net/socket.c:2463
      	 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2492
      	 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      	 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
      	 entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      [Thanks to Hawkins Jiawei and Khalid Masum for their attempts to fix this]
      
      Fixes: bc5e3a54 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
      Reported-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Tested-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
      cc: Hawkins Jiawei <yin31149@gmail.com>
      cc: Khalid Masum <khalid.masum.92@gmail.com>
      cc: Dan Carpenter <dan.carpenter@oracle.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/166135894583.600315.7170979436768124075.stgit@warthog.procyon.org.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b0f571ec