1. 25 May, 2022 2 commits
    • xfrm: do not set IPv4 DF flag when encapsulating IPv6 frames <= 1280 bytes. · 6821ad87
      Maciej Żenczykowski authored
      One may want to have DF set on large packets to support discovering
      path mtu and limiting the size of generated packets (hence not
      setting the XFRM_STATE_NOPMTUDISC tunnel flag), while still
      supporting networks that are incapable of carrying even minimal
      sized IPv6 frames (post encapsulation).
      
      Having IPv4 Don't Frag bit set on encapsulated IPv6 frames that
      are not larger than the minimum IPv6 mtu of 1280 isn't useful,
      because the resulting ICMP Fragmentation Required error isn't
      actionable (even assuming you receive it) because IPv6 will not
      drop its path mtu below 1280 anyway.  While the IPv4 stack
      could prefrag the packets post encap, this requires the ICMP
      error to be successfully delivered and causes a loss of the
      original IPv6 frame (thus requiring a retransmit and latency
      hit).  Luckily with IPv4 if we simply don't set the DF flag,
      we'll just make further fragmenting the packets some other
      router's problem.
      
      We'll still learn the correct IPv4 path mtu through encapsulation
      of larger IPv6 frames.
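The rule described above can be sketched as a small user-space model. This is only an illustration in Python, not the kernel code: the function name `outer_frag_off` and its parameters are hypothetical, and the handling of inner IPv4 packets (where DF is normally inherited from the inner header) is deliberately left out.

```python
IPV6_MIN_MTU = 1280  # minimum MTU every IPv6 link must support
IP_DF = 0x4000       # Don't Fragment bit in the IPv4 frag_off field


def outer_frag_off(inner_is_ipv6: bool, inner_len: int, nopmtudisc: bool) -> int:
    """Return the frag_off bits for the outer IPv4 header (model only).

    Hypothetical helper: DF is dropped when PMTU discovery is disabled on
    the state, or when the inner frame is IPv6 and no larger than the
    IPv6 minimum MTU, since an ICMP error for it would not be actionable.
    """
    small_ipv6 = inner_is_ipv6 and inner_len <= IPV6_MIN_MTU
    if nopmtudisc or small_ipv6:
        return 0
    return IP_DF
```

Under this model a 1280-byte IPv6 frame is sent without DF, while a 1281-byte one still carries DF and so still contributes to IPv4 path MTU discovery.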
      
      I'm still not convinced this patch is entirely sufficient to make
      everything happy... but I don't see how it could possibly
      make things worse.
      
      See also recent:
        4ff2980b 'xfrm: fix tunnel model fragmentation behavior'
      and friends
      
      Cc: Lorenzo Colitti <lorenzo@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Lina Wang <lina.wang@mediatek.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Maciej Zenczykowski <maze@google.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • Revert "net: af_key: add check for pfkey_broadcast in function pfkey_process" · 9c90c9b3
      Michal Kubecek authored
      This reverts commit 4dc2a5a8.
      
      A non-zero return value from pfkey_broadcast() does not necessarily mean
      an error occurred as this function returns -ESRCH when no registered
      listener received the message. In particular, a call with
      BROADCAST_PROMISC_ONLY flag and null one_sk argument can never return
      zero so that this commit in fact prevents processing any PF_KEY message.
      One visible effect is that racoon daemon fails to find encryption
      algorithms like aes and refuses to start.
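The failure mode can be illustrated with a small Python model. It is a sketch under stated assumptions, not the af_key code: the functions below are hypothetical stand-ins, and `listeners_reached` abstracts away how the real function walks the registered sockets.

```python
import errno


def pfkey_broadcast(listeners_reached: int) -> int:
    """Model: 0 if some listener received the message, else -ESRCH."""
    return 0 if listeners_reached > 0 else -errno.ESRCH


def process_with_check(listeners_reached: int) -> str:
    """Behaviour of the reverted commit: bail out on any non-zero return."""
    err = pfkey_broadcast(listeners_reached)
    if err:
        return "aborted"
    return "processed"


def process_ignoring_result(listeners_reached: int) -> str:
    """Behaviour after the revert: the return value is deliberately ignored."""
    pfkey_broadcast(listeners_reached)
    return "processed"
```

With no promiscuous listener registered, `process_with_check(0)` aborts even though nothing went wrong, which is the situation the message describes.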
      
      Excluding -ESRCH return value would fix this but it's not obvious that
      we really want to bail out here and most other callers of
      pfkey_broadcast() also ignore the return value. Also, as pointed out by
      Steffen Klassert, PF_KEY is kind of deprecated and newer userspace code
      should use netlink instead so that we should only disturb the code for
      really important fixes.
      
      v2: add a comment explaining why the return value is ignored
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
  2. 23 May, 2022 2 commits
    • net/smc: fix listen processing for SMC-Rv2 · 8c3b8dc5
      liuyacan authored
      In the process of checking whether RDMAv2 is available, the current
      implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
      smc buf desc, but the latter may fail. Unfortunately, the caller
      will only check the former. In this case, a NULL pointer dereference
      will occur in smc_clc_send_confirm_accept() when accessing
      conn->rmb_desc.
      
      This patch does two things:
      1. Use the return code to determine whether V2 is available.
      2. If the return code is NODEV, continue to check whether V1 is
      available.
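The two points above can be sketched as a decision function. This is a hypothetical Python model: the `Rc` values and `pick_smcr_version` are illustrative names, not the SMC decline codes, and treating every non-NODEV failure as final is an assumption drawn from the wording of point 2.

```python
from enum import Enum


class Rc(Enum):
    OK = 0
    NODEV = 1      # no usable RDMA device for this version
    OTHER_ERR = 2  # any other failure, e.g. buffer allocation


def pick_smcr_version(v2_rc: Rc, v1_rc: Rc) -> str:
    """Choose a version from return codes instead of from ib_dev_v2."""
    if v2_rc is Rc.OK:
        return "v2"
    # Only a NODEV result lets the listener go on to probe V1.
    if v2_rc is Rc.NODEV and v1_rc is Rc.OK:
        return "v1"
    return "none"
```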
      
      Fixes: e49300a6 ("net/smc: add listen processing for SMC-Rv2")
      Signed-off-by: liuyacan <liuyacan@corp.netease.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: postpone sk_refcnt increment in connect() · 75c1edf2
      liuyacan authored
      Same trigger condition as commit 86434744: setsockopt() runs in
      parallel to a connect() and switches the socket into fallback
      mode. The sk_refcnt is then incremented in smc_connect(), but the
      socket state stays in SMC_INIT (not SMC_ACTIVE). As a result, the
      corresponding sk_refcnt decrement in __smc_release() is never
      performed.
      
      Fixes: 86434744 ("net/smc: add fallback check to connect()")
      Signed-off-by: liuyacan <liuyacan@corp.netease.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 22 May, 2022 13 commits
    • net: dsa: restrict SMSC_LAN9303_I2C kconfig · 0a3ad7d3
      Randy Dunlap authored
      Since kconfig 'select' does not follow dependency chains, if symbol KSA
      selects KSB, then KSA should also depend on the same symbols that KSB
      depends on, in order to prevent Kconfig warnings and possible build
      errors.
      
      Change NET_DSA_SMSC_LAN9303_I2C and NET_DSA_SMSC_LAN9303_MDIO so that
      they are limited to VLAN_8021Q if the latter is enabled. This prevents
      the Kconfig warning:
      
      WARNING: unmet direct dependencies detected for NET_DSA_SMSC_LAN9303
        Depends on [m]: NETDEVICES [=y] && NET_DSA [=y] && (VLAN_8021Q [=m] || VLAN_8021Q [=m]=n)
        Selected by [y]:
        - NET_DSA_SMSC_LAN9303_I2C [=y] && NETDEVICES [=y] && NET_DSA [=y] && I2C [=y]
      
      Fixes: 430065e2 ("net: dsa: lan9303: add VLAN IDs to master device")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Vivien Didelot <vivien.didelot@gmail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Vladimir Oltean <olteanv@gmail.com>
      Cc: Juergen Borleis <jbe@pengutronix.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Mans Rullgard <mans@mansr.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'dpaa2-swtso-fixes' · 7e4d1c23
      David S. Miller authored
      Ioana Ciornei says:
      
      ====================
      dpaa2-eth: software TSO fixes
      
      This patch fixes the software TSO feature in dpaa2-eth.
      
      There are multiple errors that I made in the initial submission of the
      code, which I didn't catch since I was always running with passthrough
      IOMMU.
      
      The bug report came in bugzilla:
      https://bugzilla.kernel.org/show_bug.cgi?id=215886
      
      The bugs are in the Tx confirmation path, where I was trying to retrieve
      a virtual address after DMA unmapping the area. Besides that, another
      dma_unmap call was made with the wrong size.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa2-eth: unmap the SGT buffer before accessing its contents · 0a09c5b8
      Ioana Ciornei authored
      DMA unmap the Scatter/Gather table before going through the array to
      unmap and free each of the header and data chunks. This is so we do not
      touch the data between the dma_map and dma_unmap calls.
      
      Fixes: 3dc709e0 ("dpaa2-eth: add support for software TSO")
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa2-eth: use the correct software annotation field · d5f4e19a
      Ioana Ciornei authored
      The incorrect software annotation field was being used, swa->sg.sgt_size
      instead of swa->tso.sgt_size, which meant that the SGT buffer was
      unmapped with a wrong size.
      This is also confirmed by the DMA API debug prints which showed the
      following:
      
      [   38.962434] DMA-API: fsl_dpaa2_eth dpni.2: device driver frees DMA memory with different size [device address=0x0000fffffafba740] [map size=224 bytes] [unmap size=0 bytes]
      [   38.980496] WARNING: CPU: 11 PID: 1131 at kernel/dma/debug.c:973 check_unmap+0x58c/0x9b0
      [   38.988586] Modules linked in:
      [   38.991631] CPU: 11 PID: 1131 Comm: iperf3 Not tainted 5.18.0-rc7-00117-g59130eeb2b8f #1972
      [   38.999970] Hardware name: NXP Layerscape LX2160ARDB (DT)
      
      Fixes: 3dc709e0 ("dpaa2-eth: add support for software TSO")
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa2-eth: retrieve the virtual address before dma_unmap · 06d12994
      Ioana Ciornei authored
      The TSO header was DMA unmapped before the virtual address was retrieved
      and then used to free the buffer. This meant that we were actually
      removing the DMA map and then trying to search for it to help in
      retrieving the virtual address. This led to an invalid virtual address
      being used in the kfree call.
      
      Fix this by calling dpaa2_iova_to_virt() prior to the dma_unmap call.
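The ordering problem can be shown with a toy model of an IOMMU translation table. This is a Python sketch, not driver code: `IommuModel`, `free_buggy` and `free_fixed` are hypothetical names, and the model returns `None` for a missing translation, whereas the real lookup produced an invalid address.

```python
class IommuModel:
    """Toy IOVA -> virtual address table standing in for the IOMMU."""

    def __init__(self) -> None:
        self._table = {}

    def dma_map(self, virt: int, iova: int) -> None:
        self._table[iova] = virt

    def dma_unmap(self, iova: int) -> None:
        self._table.pop(iova, None)

    def iova_to_virt(self, iova: int):
        # The translation only exists while the mapping is alive.
        return self._table.get(iova)


def free_buggy(iommu: IommuModel, iova: int):
    """Unmap first, then look up: the translation is already gone."""
    iommu.dma_unmap(iova)
    return iommu.iova_to_virt(iova)


def free_fixed(iommu: IommuModel, iova: int):
    """Look up first, then unmap: the address is still resolvable."""
    virt = iommu.iova_to_virt(iova)
    iommu.dma_unmap(iova)
    return virt
```

With a passthrough IOMMU the translation does not depend on a live mapping, which is consistent with the cover letter's remark that the bug went unnoticed in that configuration.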
      
      [  487.231819] Unable to handle kernel paging request at virtual address fffffd9807000008
      
      (...)
      
      [  487.354061] Hardware name: SolidRun LX2160A Honeycomb (DT)
      [  487.359535] pstate: a0400005 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  487.366485] pc : kfree+0xac/0x304
      [  487.369799] lr : kfree+0x204/0x304
      [  487.373191] sp : ffff80000c4eb120
      [  487.376493] x29: ffff80000c4eb120 x28: ffff662240c46400 x27: 0000000000000001
      [  487.383621] x26: 0000000000000001 x25: ffff662246da0cc0 x24: ffff66224af78000
      [  487.390748] x23: ffffad184f4ce008 x22: ffffad1850185000 x21: ffffad1838d13cec
      [  487.397874] x20: ffff6601c0000000 x19: fffffd9807000000 x18: 0000000000000000
      [  487.405000] x17: ffffb910cdc49000 x16: ffffad184d7d9080 x15: 0000000000004000
      [  487.412126] x14: 0000000000000008 x13: 000000000000ffff x12: 0000000000000000
      [  487.419252] x11: 0000000000000004 x10: 0000000000000001 x9 : ffffad184d7d927c
      [  487.426379] x8 : 0000000000000000 x7 : 0000000ffffffd1d x6 : ffff662240a94900
      [  487.433505] x5 : 0000000000000003 x4 : 0000000000000009 x3 : ffffad184f4ce008
      [  487.440632] x2 : ffff662243eec000 x1 : 0000000100000100 x0 : fffffc0000000000
      [  487.447758] Call trace:
      [  487.450194]  kfree+0xac/0x304
      [  487.453151]  dpaa2_eth_free_tx_fd.isra.0+0x33c/0x3e0 [fsl_dpaa2_eth]
      [  487.459507]  dpaa2_eth_tx_conf+0x100/0x2e0 [fsl_dpaa2_eth]
      [  487.464989]  dpaa2_eth_poll+0xdc/0x380 [fsl_dpaa2_eth]
      
      Fixes: 3dc709e0 ("dpaa2-eth: add support for software TSO")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215886
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • hinic: Avoid some over memory allocation · 15d221d0
      Christophe JAILLET authored
      'prod_idx' (atomic_t) is larger than 'shadow_idx' (u16), so some memory is
      over-allocated.
      
      Fixes: b15a9f37 ("net-next/hinic: Add wq")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fec: Do proper error checking for optional clks · 43252ed1
      Uwe Kleine-König authored
      An error code returned by devm_clk_get() might have other meanings than
      "This clock doesn't exist". So use devm_clk_get_optional() and handle
      all remaining errors as fatal.
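The difference between the two lookups can be modelled in a few lines of Python. This is a sketch, not the clk API: `clk_get`, `clk_get_optional`, `ClkError` and the `provider` dictionary are hypothetical stand-ins, with "no such clock" represented by ENOENT and a deferred probe used as an example of an error that must not be mistaken for absence.

```python
import errno

EPROBE_DEFER = 517  # kernel-internal code, not in Python's errno module


class ClkError(Exception):
    def __init__(self, code: int) -> None:
        super().__init__(f"clk error {code}")
        self.code = code


def clk_get(provider: dict, name: str) -> str:
    """Model lookup: a missing name raises ClkError(ENOENT)."""
    if name not in provider:
        raise ClkError(errno.ENOENT)
    entry = provider[name]
    if isinstance(entry, int):  # an int entry simulates a lookup failure
        raise ClkError(entry)
    return entry


def clk_get_optional(provider: dict, name: str):
    """Return None only for 'does not exist'; re-raise everything else."""
    try:
        return clk_get(provider, name)
    except ClkError as exc:
        if exc.code == errno.ENOENT:
            return None
        raise
```

A caller that treats every failure of `clk_get` as "clock absent" would silently swallow the deferred-probe case, whereas the optional variant lets it propagate.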
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'rxrpc-fixes' · c12b9588
      David S. Miller authored
      David Howells says:
      
      ====================
      rxrpc: Miscellaneous fixes
      
      Here are some fixes for AF_RXRPC:
      
       (1) Fix listen() allowing preallocation to overrun the prealloc buffer.
      
       (2) Prevent resending the request if we've seen the reply starting to
           arrive.
      
       (3) Fix accidental sharing of ACK state between transmission and
           reception.
      
       (4) Ignore ACKs in which ack.previousPacket regresses.  This indicates the
           highest DATA number so far seen, so should not be seen to go
           backwards.
      
       (5) Fix the determination of when to generate an IDLE-type ACK,
           simplifying it so that we generate one if we have more than two DATA
           packets that aren't hard-acked (consumed) or soft-acked (in the rx
           buffer, but could be discarded and re-requested).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Fix decision on when to generate an IDLE ACK · 9a3dedcf
      David Howells authored
      Fix the decision on when to generate an IDLE ACK by keeping a count of the
      number of packets we've received, but not yet soft-ACK'd, and the number of
      packets we've processed, but not yet hard-ACK'd, rather than trying to keep
      track of which DATA sequence numbers correspond to those points.
      
      We then generate an ACK when either counter exceeds 2.  The counters are
      both cleared when we transcribe the information into any sort of ACK packet
      for transmission.  IDLE and DELAY ACKs are skipped if both counters are 0
      (ie. no change).
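The counter scheme can be sketched as follows. This is a Python model built only from the description above: the class and method names are hypothetical, and ACK types are plain strings rather than the protocol's reason codes.

```python
class AckState:
    """Counts of DATA packets not yet covered by an ACK we have sent."""

    def __init__(self) -> None:
        self.nr_unacked = 0   # received but not yet soft-ACK'd
        self.nr_consumed = 0  # processed but not yet hard-ACK'd

    def data_received(self) -> bool:
        """Note a received packet; True means an ACK should be generated."""
        self.nr_unacked += 1
        return self.nr_unacked > 2

    def data_consumed(self) -> bool:
        """Note a processed packet; True means an ACK should be generated."""
        self.nr_consumed += 1
        return self.nr_consumed > 2

    def should_send(self, ack_type: str) -> bool:
        """IDLE and DELAY ACKs are skipped when nothing has changed."""
        if ack_type in ("IDLE", "DELAY"):
            return self.nr_unacked != 0 or self.nr_consumed != 0
        return True

    def ack_transcribed(self) -> None:
        """Any ACK packet built for transmission clears both counters."""
        self.nr_unacked = 0
        self.nr_consumed = 0
```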
      
      Fixes: 805b21b9 ("rxrpc: Send an ACK after every few DATA packets we receive")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Don't let ack.previousPacket regress · 81524b63
      David Howells authored
      The previousPacket field in the rx ACK packet should never go backwards -
      it's now the highest DATA sequence number received, not the last one
      received (it used to be used for out of sequence detection).
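The "highest seen, never backwards" semantics can be modelled with a tiny tracker. This is a Python sketch with hypothetical names, and it ignores sequence-number wraparound, which real code has to handle.

```python
class PreviousPacketTracker:
    """Tracks the highest DATA sequence number seen so far."""

    def __init__(self) -> None:
        self.highest_seq = 0

    def note_data(self, seq: int) -> int:
        """Record a received DATA packet; the stored value never regresses."""
        if seq > self.highest_seq:
            self.highest_seq = seq
        return self.highest_seq
```

Feeding the sequence 1, 2, 5, 3 leaves the tracked value at 5, whereas a "last received" field would have dropped back to 3.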
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Fix overlapping ACK accounting · 8940ba3c
      David Howells authored
      Fix accidental overlapping of Rx-phase ACK accounting with Tx-phase ACK
      accounting through variables shared between the two.  call->acks_* members
      refer to ACKs received in the Tx phase and call->ackr_* members to ACKs
      sent/to be sent during the Rx phase.
      
      Fixes: 1a2391c3 ("rxrpc: Fix detection of out of order acks")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Jeffrey Altman <jaltman@auristor.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Don't try to resend the request if we're receiving the reply · 114af61f
      David Howells authored
      rxrpc has a timer to trigger resending of unacked data packets in a call.
      This is not cancelled when a client call switches to the receive phase on
      the basis that most calls don't last long enough for it to ever expire.
      However, if it *does* expire after we've started to receive the reply, we
      shouldn't then go into trying to retransmit or pinging the server to find
      out if an ack got lost.
      
      Fix this by skipping the resend code if we're into receiving the reply to a
      client call.
      
      Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Fix listen() setting the bar too high for the prealloc rings · 88e22159
      David Howells authored
      AF_RXRPC's listen() handler lets you set the backlog up to 32 (if you bump
      up the sysctl), but whilst the preallocation circular buffers have 32 slots
      in them, one of them has to be a dead slot because we're using CIRC_CNT().
      
      This means that listen(rxrpc_sock, 32) will cause an oops when the socket
      is closed because rxrpc_service_prealloc_one() allocated one too many calls
      and rxrpc_discard_prealloc() won't then be able to get rid of them because
      it'll think the ring is empty.  rxrpc_release_calls_on_socket() then tries
      to abort them, but oopses because call->peer isn't yet set.
      
      Fix this by setting the maximum backlog to RXRPC_BACKLOG_MAX - 1 to match
      the ring capacity.
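The dead-slot arithmetic can be checked with Python equivalents of the kernel's CIRC_CNT() and CIRC_SPACE() macros. The helper names below are lowercase re-implementations for illustration, and head and tail are left unmasked as free-running indices.

```python
RXRPC_BACKLOG_MAX = 32  # ring size; must be a power of two


def circ_cnt(head: int, tail: int, size: int) -> int:
    """Number of occupied slots, as computed by CIRC_CNT()."""
    return (head - tail) & (size - 1)


def circ_space(head: int, tail: int, size: int) -> int:
    """Number of free slots, as computed by CIRC_SPACE()."""
    return circ_cnt(tail, head + 1, size)


# 31 entries: the ring reports itself as full.
full_cnt = circ_cnt(31, 0, RXRPC_BACKLOG_MAX)
full_space = circ_space(31, 0, RXRPC_BACKLOG_MAX)

# 32 entries: the count wraps to zero and the ring looks empty.
overfull_cnt = circ_cnt(32, 0, RXRPC_BACKLOG_MAX)
```

A 32-slot ring therefore holds at most 31 entries, which is why the backlog limit is set to RXRPC_BACKLOG_MAX - 1.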
      
       BUG: kernel NULL pointer dereference, address: 0000000000000086
       ...
       RIP: 0010:rxrpc_send_abort_packet+0x73/0x240 [rxrpc]
       Call Trace:
        <TASK>
        ? __wake_up_common_lock+0x7a/0x90
        ? rxrpc_notify_socket+0x8e/0x140 [rxrpc]
        ? rxrpc_abort_call+0x4c/0x60 [rxrpc]
        rxrpc_release_calls_on_socket+0x107/0x1a0 [rxrpc]
        rxrpc_release+0xc9/0x1c0 [rxrpc]
        __sock_release+0x37/0xa0
        sock_close+0x11/0x20
        __fput+0x89/0x240
        task_work_run+0x59/0x90
        do_exit+0x319/0xaa0
      
      Fixes: 00e90712 ("rxrpc: Preallocate peers, conns and calls for incoming service requests")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lists.infradead.org/pipermail/linux-afs/2022-March/005079.html
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 21 May, 2022 5 commits
  5. 20 May, 2022 3 commits
  6. 19 May, 2022 12 commits
    • net: macb: Fix PTP one step sync support · 5cebb40b
      Harini Katakam authored
      PTP one step sync packets cannot have CSUM padding and insertion in
      SW since time stamp is inserted on the fly by HW.
      In addition, ptp4l version 3.0 and above report an error when skb
      timestamps are reported for packets that are not processed for TX TS
      after transmission.
      Add a helper to identify PTP one step sync and fix the above two
      errors. Add a common mask for PTP header flag field "twoStepFlag".
      Also reset ptp OSS bit when one step is not selected.
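The kind of check such a helper performs can be sketched in Python on a raw PTP common header. The function name is hypothetical, and the sketch assumes the IEEE 1588 layout in which messageType is the low nibble of byte 0 and twoStepFlag is bit 1 of the first flagField octet at byte 6. Whether one-step mode is enabled in hardware is a separate condition that is not modelled here.

```python
PTP_MSGTYPE_SYNC = 0x0     # messageType value of a Sync message
PTP_FLAG_TWOSTEP = 1 << 1  # twoStepFlag in the first flagField octet
PTP_FLAGS_OFFSET = 6       # byte offset of flagField in the common header


def is_one_step_sync(ptp_header: bytes) -> bool:
    """True for a Sync message whose twoStepFlag is clear."""
    if len(ptp_header) <= PTP_FLAGS_OFFSET:
        return False
    msg_type = ptp_header[0] & 0x0F
    two_step = ptp_header[PTP_FLAGS_OFFSET] & PTP_FLAG_TWOSTEP
    return msg_type == PTP_MSGTYPE_SYNC and not two_step
```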
      
      Fixes: ab91f0a9 ("net: macb: Add hardware PTP support")
      Fixes: 653e92a9 ("net: macb: add support for padding and fcs computation")
      Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
      Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
      Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Link: https://lore.kernel.org/r/20220518170756.7752-1-harini.katakam@xilinx.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'net-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · d904c8cc
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from can, xfrm and netfilter subtrees.
      
        Notably this reverts a recent TCP/DCCP netns-related change to address
        a possible UaF.
      
        Current release - regressions:
      
         - tcp: revert "tcp/dccp: get rid of inet_twsk_purge()"
      
         - xfrm: set dst dev to blackhole_netdev instead of loopback_dev in
           ifdown
      
        Previous releases - regressions:
      
         - netfilter: flowtable: fix TCP flow teardown
      
         - can: revert "can: m_can: pci: use custom bit timings for Elkhart
           Lake"
      
         - xfrm: check encryption module availability consistency
      
         - eth: vmxnet3: fix possible use-after-free bugs in
           vmxnet3_rq_alloc_rx_buf()
      
         - eth: mlx5: initialize flow steering during driver probe
      
         - eth: ice: fix crash when writing timestamp on RX rings
      
        Previous releases - always broken:
      
         - mptcp: fix checksum byte order
      
         - eth: lan966x: fix assignment of the MAC address
      
         - eth: mlx5: remove HW-GRO from reported features
      
         - eth: ftgmac100: disable hardware checksum on AST2600"
      
      * tag 'net-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
        net: bridge: Clear offload_fwd_mark when passing frame up bridge interface.
        ptp: ocp: change sysfs attr group handling
        selftests: forwarding: fix missing backslash
        netfilter: nf_tables: disable expression reduction infra
        netfilter: flowtable: move dst_check to packet path
        netfilter: flowtable: fix TCP flow teardown
        net: ftgmac100: Disable hardware checksum on AST2600
        igb: skip phy status check where unavailable
        nfc: pn533: Fix buggy cleanup order
        mptcp: Do TCP fallback on early DSS checksum failure
        mptcp: fix checksum byte order
        net: af_key: check encryption module availability consistency
        net: af_key: add check for pfkey_broadcast in function pfkey_process
        net/mlx5: Drain fw_reset when removing device
        net/mlx5e: CT: Fix setting flow_source for smfs ct tuples
        net/mlx5e: CT: Fix support for GRE tuples
        net/mlx5e: Remove HW-GRO from reported features
        net/mlx5e: Properly block HW GRO when XDP is enabled
        net/mlx5e: Properly block LRO when XDP is enabled
        net/mlx5e: Block rx-gro-hw feature in switchdev mode
        ...
    • net: bridge: Clear offload_fwd_mark when passing frame up bridge interface. · fbb3abdf
      Andrew Lunn authored
      It is possible to stack bridges on top of each other. Consider the
      following which makes use of an Ethernet switch:
      
             br1
           /    \
          /      \
         /        \
       br0.11    wlan0
         |
         br0
       /  |  \
      p1  p2  p3
      
      br0 is offloaded to the switch. Above br0 is a vlan interface, for
      vlan 11. This vlan interface is then a slave of br1. br1 also has a
      wireless interface as a slave. This setup trunks wireless lan traffic
      over the copper network inside a VLAN.
      
      A frame received on p1 which is passed up to the bridge has the
      skb->offload_fwd_mark flag set to true, indicating that the switch has
      dealt with forwarding the frame out ports p2 and p3 as needed. This
      flag instructs the software bridge it does not need to pass the frame
      back down again. However, the flag is not getting reset when the frame
      is passed upwards. As a result br1 sees the flag, wrongly interprets
      it, and fails to forward the frame to wlan0.
      
      When passing a frame upwards, clear the flag. This is the Rx
      equivalent of br_switchdev_frame_unmark() in br_dev_xmit().
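The effect of the stale flag can be illustrated with a minimal Python model of the stacked setup above. The names are hypothetical and the forwarding decision is reduced to the single flag; the real bridge code takes more state into account.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    offload_fwd_mark: bool  # True: hardware already forwarded this frame


def pass_up(frame: Frame, clear_mark: bool) -> Frame:
    """br0 hands the frame to its upper device (br0.11, then br1)."""
    mark = False if clear_mark else frame.offload_fwd_mark
    return Frame(offload_fwd_mark=mark)


def br1_forwards_to_wlan0(frame: Frame) -> bool:
    """br1 skips software forwarding if the mark says hardware did it."""
    return not frame.offload_fwd_mark


rx_on_p1 = Frame(offload_fwd_mark=True)
before_fix = br1_forwards_to_wlan0(pass_up(rx_on_p1, clear_mark=False))
after_fix = br1_forwards_to_wlan0(pass_up(rx_on_p1, clear_mark=True))
```

In this model `before_fix` is False (the frame never reaches wlan0) and `after_fix` is True.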
      
      Fixes: f1c2eddf ("bridge: switchdev: Use an helper to clear forward mark")
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Tested-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20220518005840.771575-1-andrew@lunn.ch
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • ptp: ocp: change sysfs attr group handling · c2239294
      Jonathan Lemon authored
      In the detach path, the driver calls sysfs_remove_group() for the
      groups it believes have been registered.  However, if the group was
      never previously registered, then this causes a splat.
      
      Instead, compute the groups that should be registered in advance,
      and then call sysfs_create_groups(), which registers them all at once.
      
      Update the error handling appropriately.
      
      Fixes: c205d53c ("ptp: ocp: Add firmware capability bits for feature gating")
      Reported-by: Zheyu Ma <zheyuma97@gmail.com>
      Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Link: https://lore.kernel.org/r/20220517214600.10606-1-jonathan.lemon@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: fix missing backslash · 090f9dd0
      Joachim Wiberg authored
      Fix missing backslash, introduced in f62c5acc.  Causes all tests to
      not be installed.
      
      Fixes: f62c5acc ("selftests/net/forwarding: add missing tests to Makefile")
      Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
      Acked-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20220518151630.2747773-1-troglobit@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 7dc02d7f
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      1) Reduce number of hardware offload retries from flowtable datapath
         which might hog system with retries, from Felix Fietkau.
      
      2) Skip neighbour lookup for PPPoE device, fill_forward_path() already
         provides this and set on destination address from fill_forward_path for
         PPPoE device, also from Felix.
      
      4) When combining PPPoE on top of a VLAN device, set info->outdev to the
         PPPoE device so software offload works, from Felix.
      
      5) Fix TCP teardown flowtable state, races with conntrack gc might result
         in resetting the state to ESTABLISHED and the time to one day. Joint
         work with Oz Shlomo and Sven Auhagen.
      
      6) Call dst_check() from flowtable datapath to check if dst is stale
         instead of doing it from garbage collector path.
      
      7) Disable register tracking infrastructure, either user-space or
         kernel need to pre-fetch keys unconditionally, otherwise register
         tracking assumes data is already available in register that might
         well not be there, leading to incorrect reductions.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: disable expression reduction infra
        netfilter: flowtable: move dst_check to packet path
        netfilter: flowtable: fix TCP flow teardown
        netfilter: nft_flow_offload: fix offload with pppoe + vlan
        net: fix dev_fill_forward_path with pppoe + bridge
        netfilter: nft_flow_offload: skip dst neigh lookup for ppp devices
        netfilter: flowtable: fix excessive hw offload attempts after failure
      ====================
      
      Link: https://lore.kernel.org/r/20220518213841.359653-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'block-5.18-2022-05-18' of git://git.kernel.dk/linux-block · f993aed4
      Linus Torvalds authored
      Pull block fix from Jens Axboe:
       "Just a small fix for a missing fifo time assignment for the head
        insertion case in mq-deadline"
      
      * tag 'block-5.18-2022-05-18' of git://git.kernel.dk/linux-block:
        block/mq-deadline: Set the fifo_time member also if inserting at head
    • Merge tag 'io_uring-5.18-2022-05-18' of git://git.kernel.dk/linux-block · 01464a73
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Two small changes fixing issues from the 5.18 merge window:
      
         - Fix wrong ordering of a tracepoint (Dylan)
      
         - Fix MSG_RING on IOPOLL rings (me)"
      
      * tag 'io_uring-5.18-2022-05-18' of git://git.kernel.dk/linux-block:
        io_uring: don't attempt to IOPOLL for MSG_RING requests
        io_uring: fix ordering of args in io_uring_queue_async_work
    • Merge tag 'audit-pr-20220518' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit · 8194a008
      Linus Torvalds authored
      Pull audit fix from Paul Moore:
       "A single audit patch to fix a problem where a task's audit_context was
        not being properly reset with io_uring"
      
      * tag 'audit-pr-20220518' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
        audit,io_uring,io-wq: call __audit_uring_exit for dummy contexts
    • Merge tag 'selinux-pr-20220518' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 6899c161
      Linus Torvalds authored
      Pull selinux fix from Paul Moore:
       "A single SELinux patch to fix an error path that was doing the wrong
        thing with respect to freeing memory"
      
      * tag 'selinux-pr-20220518' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: fix bad cleanup on error in hashtab_duplicate()
    • Merge branch 'arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 5494d0eb
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "The SoC bug fixes have calmed down sufficiently, there is one minor
        update for the MAINTAINERS file, and few bug fixes for dts
        descriptions:
      
         - Updates to the BananaPi R2-Pro (rk3568) dts to match production
           hardware rather than the prototype version.
      
         - Qualcomm sm8250 soundwire gets disabled on some machines to avoid
           crashes
      
         - A number of aspeed SoC specific fixes, addressing incorrect pin
           control settings, some values in the romed8hm board, and a revert
           for an accidental removal of a DT node"
      
      * 'arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
        MAINTAINERS: omap: remove me as a maintainer
        ARM: dts: aspeed: Add video engine to g6
        ARM: dts: aspeed: romed8hm3: Fix GPIOB0 name
        ARM: dts: aspeed: romed8hm3: Add lm25066 sense resistor values
        ARM: dts: aspeed-g6: fix SPI1/SPI2 quad pin group
        ARM: dts: aspeed-g6: add FWQSPI group in pinctrl dtsi
        dt-bindings: pinctrl: aspeed-g6: add FWQSPI function/group
        pinctrl: pinctrl-aspeed-g6: add FWQSPI function-group
        dt-bindings: pinctrl: aspeed-g6: remove FWQSPID group
        pinctrl: pinctrl-aspeed-g6: remove FWQSPID group in pinctrl
        ARM: dts: aspeed-g6: remove FWQSPID group in pinctrl dtsi
        arm64: dts: qcom: sm8250: don't enable rx/tx macro by default
        arm64: dts: rockchip: Add gmac1 and change network settings of bpi-r2-pro
        arm64: dts: rockchip: Change io-domains of bpi-r2-pro
      5494d0eb
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · dbd380bb
      Linus Torvalds authored
      Pull misc fixes from Al Viro:
       "vhost race fix and a percpu_ref_init-caused cgroup double-free fix.
      
        The latter had manifested as buggered struct mount refcounting - those
        are also using percpu data structures, but anything that does percpu
        allocations could be hit"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        Fix double fget() in vhost_net_set_backend()
        percpu_ref_init(): clean ->percpu_count_ref on failure
      dbd380bb
  7. 18 May, 2022 3 commits
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · db1fd3fc
      Linus Torvalds authored
      Pull mlx5 fix from Michael Tsirkin:
       "One last minute fixup
      
        The patch has been on list for a while but as it was posted as part of
        a thread it was missed"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vdpa/mlx5: Use consistent RQT size
      db1fd3fc
    • Al Viro's avatar
      Fix double fget() in vhost_net_set_backend() · fb4554c2
      Al Viro authored
      Descriptor table is a shared resource; two fget() on the same descriptor
      may return different struct file references.  get_tap_ptr_ring() is
      called after we'd found (and pinned) the socket we'll be using and it
      tries to find the private tun/tap data structures associated with it.
      Redoing the lookup by the same file descriptor we'd used to get the
      socket is racy - we need the same struct file.
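
      The race described above can be sketched in userspace. This is only a
      toy model, not the vhost code: every toy_* name is hypothetical, the
      descriptor table is a plain array, and the concurrent close()/open() is
      simulated by a callback that runs between the two lookups. Reference
      counting is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; every name here is hypothetical. */
struct toy_file { int sock_id; int ring_id; };
struct toy_backend { int sock_id; int ring_id; };

#define TOY_MAX_FDS 8
static struct toy_file *toy_fd_table[TOY_MAX_FDS];

/* Loosely like fget(): resolve fd to whatever file it names right now. */
static struct toy_file *toy_fget(int fd)
{
    if (fd < 0 || fd >= TOY_MAX_FDS)
        return NULL;
    return toy_fd_table[fd];
}

/* Simulates another thread re-pointing the descriptor at a new file. */
static void toy_replace_fd(int fd, struct toy_file *f)
{
    assert(fd >= 0 && fd < TOY_MAX_FDS);
    toy_fd_table[fd] = f;
}

static struct toy_file toy_file_a = { 1, 1 };
static struct toy_file toy_file_b = { 2, 2 };

/* Stand-in for the racing thread: fd 3 now names a different file. */
static void toy_swap_fd3(void)
{
    toy_replace_fd(3, &toy_file_b);
}

/* Racy pattern: the socket and the ring come from two separate lookups. */
static struct toy_backend toy_set_backend_racy(int fd, void (*between)(void))
{
    struct toy_backend b = { -1, -1 };
    struct toy_file *f = toy_fget(fd);      /* first lookup: socket */

    if (f)
        b.sock_id = f->sock_id;
    if (between)
        between();                          /* window for the race */
    f = toy_fget(fd);                       /* second lookup: ring */
    if (f)
        b.ring_id = f->ring_id;
    return b;
}

/* Single-lookup pattern: both pieces come from the same file pointer. */
static struct toy_backend toy_set_backend_fixed(int fd, void (*between)(void))
{
    struct toy_backend b = { -1, -1 };
    struct toy_file *f = toy_fget(fd);      /* the only lookup */

    if (between)
        between();
    if (f) {
        b.sock_id = f->sock_id;
        b.ring_id = f->ring_id;
    }
    return b;
}
```

      With fd 3 initially pointing at toy_file_a and toy_swap_fd3 passed as
      the callback, the racy variant pairs the socket of one file with the
      ring of another, while the single-lookup variant stays consistent.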
      
      Thanks to Jason for spotting a braino in the original variant of the patch -
      I'd missed the use of fd == -1 for disabling backend, and in that case
      we can end up with sock == NULL and sock != oldsock.
      
      Cc: stable@kernel.org
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      fb4554c2
    • Eli Cohen's avatar
      vdpa/mlx5: Use consistent RQT size · acde3929
      Eli Cohen authored
      The current code evaluates RQT size based on the configured number of
      virtqueues. This can raise an issue in the following scenario:
      
      Assume MQ was negotiated.
      1. mlx5_vdpa_set_map() gets called.
      2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower
         than the configured max VQs.
      3. A second set_map gets called, but now a smaller number of VQs is used
         to evaluate the size of the RQT.
      4. handle_ctrl_mq() is called with a value larger than what the RQT can
         hold. This will emit errors and the driver state is compromised.
      
      To fix this, we use a new field in struct mlx5_vdpa_net to hold the
      required number of entries in the RQT. This value is evaluated in
      mlx5_vdpa_set_driver_features() where we have the negotiated features
      all set up.
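
      The scenario and the fix can be sketched with a small model. This is
      an illustration under stated assumptions, not the mlx5_vdpa driver:
      the toy_* names, the struct layout, and the numbers 16, 4 and 8 are
      made up, and the model counts plain virtqueues, ignoring the rx/tx
      pairing and hardware capability limits.

```c
#include <assert.h>

/* Hypothetical model of the state involved; not the real driver struct. */
struct toy_net {
    unsigned int max_vqs;       /* configured maximum number of VQs */
    unsigned int cur_num_vqs;   /* number of VQs currently in use */
    unsigned int rqt_size;      /* fixed once features are negotiated */
    unsigned int rqt_capacity;  /* entries the current RQT can hold */
};

/* Old behavior: size the RQT from the VQ count in use at map time. */
static void toy_set_map_old(struct toy_net *n)
{
    n->rqt_capacity = n->cur_num_vqs;
}

/* Fixed behavior: decide the size once, when features are negotiated. */
static void toy_set_driver_features(struct toy_net *n)
{
    n->rqt_size = n->max_vqs;
}

/* Fixed behavior: every set_map reuses the stored size. */
static void toy_set_map_new(struct toy_net *n)
{
    n->rqt_capacity = n->rqt_size;
}

/* Returns 0 on success, -1 if the request does not fit in the RQT. */
static int toy_handle_ctrl_mq(struct toy_net *n, unsigned int new_vqs)
{
    if (new_vqs > n->rqt_capacity)
        return -1;
    n->cur_num_vqs = new_vqs;
    return 0;
}

/* Replays steps 1-4 of the scenario; returns the result of step 4. */
static int toy_run_scenario(int use_fix)
{
    struct toy_net n = { .max_vqs = 16, .cur_num_vqs = 16 };
    int rc;

    toy_set_driver_features(&n);
    if (use_fix)
        toy_set_map_new(&n);
    else
        toy_set_map_old(&n);                /* step 1 */

    rc = toy_handle_ctrl_mq(&n, 4);         /* step 2: shrink below max */
    assert(rc == 0);
    (void)rc;

    if (use_fix)
        toy_set_map_new(&n);
    else
        toy_set_map_old(&n);                /* step 3: RQT is now smaller */

    return toy_handle_ctrl_mq(&n, 8);       /* step 4: grow again */
}
```

      In this model toy_run_scenario(0) fails at step 4 because the second
      set_map shrank the table, while toy_run_scenario(1) succeeds because
      the size no longer depends on the current VQ count.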
      
      In addition to that, we take into consideration the max capability of RQT
      entries early when the device is added so we don't need to consider
      it when creating the RQT.
      
      Last, we remove the use of mlx5_vdpa_max_qps(), which just returns
      max_vqs / 2, to make the code clearer.
      
      Fixes: 52893733 ("vdpa/mlx5: Add multiqueue support")
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      acde3929