1. 04 Jul, 2022 11 commits
  2. 03 Jul, 2022 1 commit
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 280e3a85
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Insufficient validation of element datatype and length in
         nft_setelem_parse_data(). At least commit 7d740264 updates
         maximum element data area up to 64 bytes when only 16 bytes
         where supported at the time. Support for larger element size
         came later in fdb9c405 though. Picking this older commit
         as Fixes: tag to be safe than sorry.
      
      2) Memleak in pipapo destroy path, reproducible when transaction
         in aborted. This is already triggering in the existing netfilter
         test infrastructure since more recent new tests are covering this
         path.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      280e3a85
  3. 02 Jul, 2022 5 commits
  4. 01 Jul, 2022 6 commits
    • Daniel Borkmann's avatar
      bpf, selftests: Add verifier test case for jmp32's jeq/jne · a49b8ce7
      Daniel Borkmann authored
      Add a test case to trigger the verifier's incorrect conclusion in the
      case of jmp32's jeq/jne. Also here, make use of dead code elimination,
      so that we can see the verifier bailing out on unfixed kernels.
      
      Before:
      
        # ./test_verifier 724
        #724/p jeq32/jne32: bounds checking FAIL
        Failed to load prog 'Permission denied'!
        R4 !read_ok
        verification time 8 usec
        stack depth 0
        processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 0
        Summary: 0 PASSED, 0 SKIPPED, 1 FAILED
      
      After:
      
        # ./test_verifier 724
        #724/p jeq32/jne32: bounds checking OK
        Summary: 1 PASSED, 0 SKIPPED, 0 FAILED
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20220701124727.11153-4-daniel@iogearbox.net
      a49b8ce7
    • Daniel Borkmann's avatar
      bpf, selftests: Add verifier test case for imm=0,umin=0,umax=1 scalar · 73c4936f
      Daniel Borkmann authored
      Add a test case to trigger the constant scalar issue which leaves the
      register in scalar(imm=0,umin=0,umax=1,var_off=(0x0; 0x0)) state. Make
      use of dead code elimination, so that we can see the verifier bailing
      out on unfixed kernels. For the condition, we use jle given it checks
      on umax bound.
      
      Before:
      
        # ./test_verifier 743
        #743/p jump & dead code elimination FAIL
        Failed to load prog 'Permission denied'!
        R4 !read_ok
        verification time 11 usec
        stack depth 0
        processed 13 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
        Summary: 0 PASSED, 0 SKIPPED, 1 FAILED
      
      After:
      
        # ./test_verifier 743
        #743/p jump & dead code elimination OK
        Summary: 1 PASSED, 0 SKIPPED, 0 FAILED
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20220701124727.11153-3-daniel@iogearbox.net
      73c4936f
    • Daniel Borkmann's avatar
      bpf: Fix insufficient bounds propagation from adjust_scalar_min_max_vals · 3844d153
      Daniel Borkmann authored
      Kuee reported a corner case where the tnum becomes constant after the call
      to __reg_bound_offset(), but the register's bounds are not, that is, its
      min bounds are still not equal to the register's max bounds.
      
      This in turn allows to leak pointers through turning a pointer register as
      is into an unknown scalar via adjust_ptr_min_max_vals().
      
      Before:
      
        func#0 @0
        0: R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        0: (b7) r0 = 1                        ; R0_w=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0))
        1: (b7) r3 = 0                        ; R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0))
        2: (87) r3 = -r3                      ; R3_w=scalar()
        3: (87) r3 = -r3                      ; R3_w=scalar()
        4: (47) r3 |= 32767                   ; R3_w=scalar(smin=-9223372036854743041,umin=32767,var_off=(0x7fff; 0xffffffffffff8000),s32_min=-2147450881)
        5: (75) if r3 s>= 0x0 goto pc+1       ; R3_w=scalar(umin=9223372036854808575,var_off=(0x8000000000007fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767)
        6: (95) exit
      
        from 5 to 7: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        7: (d5) if r3 s<= 0x8000 goto pc+1    ; R3=scalar(umin=32769,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767)
        8: (95) exit
      
        from 7 to 9: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=32768,var_off=(0x7fff; 0x8000)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        9: (07) r3 += -32767                  ; R3_w=scalar(imm=0,umax=1,var_off=(0x0; 0x0))  <--- [*]
        10: (95) exit
      
      What can be seen here is that R3=scalar(umin=32767,umax=32768,var_off=(0x7fff;
      0x8000)) after the operation R3 += -32767 results in a 'malformed' constant, that
      is, R3_w=scalar(imm=0,umax=1,var_off=(0x0; 0x0)). Intersecting with var_off has
      not been done at that point via __update_reg_bounds(), which would have improved
      the umax to be equal to umin.
      
      Refactor the tnum <> min/max bounds information flow into a reg_bounds_sync()
      helper and use it consistently everywhere. After the fix, bounds have been
      corrected to R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0)) and thus the register
      is regarded as a 'proper' constant scalar of 0.
      
      After:
      
        func#0 @0
        0: R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        0: (b7) r0 = 1                        ; R0_w=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0))
        1: (b7) r3 = 0                        ; R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0))
        2: (87) r3 = -r3                      ; R3_w=scalar()
        3: (87) r3 = -r3                      ; R3_w=scalar()
        4: (47) r3 |= 32767                   ; R3_w=scalar(smin=-9223372036854743041,umin=32767,var_off=(0x7fff; 0xffffffffffff8000),s32_min=-2147450881)
        5: (75) if r3 s>= 0x0 goto pc+1       ; R3_w=scalar(umin=9223372036854808575,var_off=(0x8000000000007fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767)
        6: (95) exit
      
        from 5 to 7: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        7: (d5) if r3 s<= 0x8000 goto pc+1    ; R3=scalar(umin=32769,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767)
        8: (95) exit
      
        from 7 to 9: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=32768,var_off=(0x7fff; 0x8000)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0))
        9: (07) r3 += -32767                  ; R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0))  <--- [*]
        10: (95) exit
      
      Fixes: b03c9f9f ("bpf/verifier: track signed and unsigned min/max values")
      Reported-by: default avatarKuee K1r0a <liulin063@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20220701124727.11153-2-daniel@iogearbox.net
      3844d153
    • Daniel Borkmann's avatar
      bpf: Fix incorrect verifier simulation around jmp32's jeq/jne · a12ca627
      Daniel Borkmann authored
      Kuee reported a quirk in the jmp32's jeq/jne simulation, namely that the
      register value does not match expectations for the fall-through path. For
      example:
      
      Before fix:
      
        0: R1=ctx(off=0,imm=0) R10=fp0
        0: (b7) r2 = 0                        ; R2_w=P0
        1: (b7) r6 = 563                      ; R6_w=P563
        2: (87) r2 = -r2                      ; R2_w=Pscalar()
        3: (87) r2 = -r2                      ; R2_w=Pscalar()
        4: (4c) w2 |= w6                      ; R2_w=Pscalar(umin=563,umax=4294967295,var_off=(0x233; 0xfffffdcc),s32_min=-2147483085) R6_w=P563
        5: (56) if w2 != 0x8 goto pc+1        ; R2_w=P571  <--- [*]
        6: (95) exit
        R0 !read_ok
      
      After fix:
      
        0: R1=ctx(off=0,imm=0) R10=fp0
        0: (b7) r2 = 0                        ; R2_w=P0
        1: (b7) r6 = 563                      ; R6_w=P563
        2: (87) r2 = -r2                      ; R2_w=Pscalar()
        3: (87) r2 = -r2                      ; R2_w=Pscalar()
        4: (4c) w2 |= w6                      ; R2_w=Pscalar(umin=563,umax=4294967295,var_off=(0x233; 0xfffffdcc),s32_min=-2147483085) R6_w=P563
        5: (56) if w2 != 0x8 goto pc+1        ; R2_w=P8  <--- [*]
        6: (95) exit
        R0 !read_ok
      
      As can be seen on line 5 for the branch fall-through path in R2 [*] is that
      given condition w2 != 0x8 is false, verifier should conclude that r2 = 8 as
      upper 32 bit are known to be zero. However, verifier incorrectly concludes
      that r2 = 571 which is far off.
      
      The problem is it only marks false{true}_reg as known in the switch for JE/NE
      case, but at the end of the function, it uses {false,true}_{64,32}off to
      update {false,true}_reg->var_off and they still hold the prior value of
      {false,true}_reg->var_off before it got marked as known. The subsequent
      __reg_combine_32_into_64() then propagates this old var_off and derives new
      bounds. The information between min/max bounds on {false,true}_reg from
      setting the register to known const combined with the {false,true}_reg->var_off
      based on the old information then derives wrong register data.
      
      Fix it by detangling the BPF_JEQ/BPF_JNE cases and updating relevant
      {false,true}_{64,32}off tnums along with the register marking to known
      constant.
      
      Fixes: 3f50f132 ("bpf: Verifier, do explicit ALU32 bounds tracking")
      Reported-by: default avatarKuee K1r0a <liulin063@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20220701124727.11153-1-daniel@iogearbox.net
      a12ca627
    • Li kunyu's avatar
      net: usb: Fix typo in code · 8dfeee9d
      Li kunyu authored
      Remove the repeated ';' from code.
      Signed-off-by: default avatarLi kunyu <kunyu@nfschina.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8dfeee9d
    • David S. Miller's avatar
      Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 71560d98
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-06-30
      
      This series contains updates to i40e driver only.
      
      Lukasz adds reporting of packets dropped for being too large into the Rx
      dropped statistics.
      
      Norbert clears VF filter and MAC address to resolve issue with older VFs
      being unable to change their MAC address.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      71560d98
  5. 30 Jun, 2022 17 commits
    • Linus Torvalds's avatar
      Merge tag 'net-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 5e837935
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from netfilter.
      
        Current release - new code bugs:
      
         - clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()
      
         - mptcp:
            - invoke MP_FAIL response only when needed
            - fix shutdown vs fallback race
            - consistent map handling on failure
      
         - octeon_ep: use bitwise AND
      
        Previous releases - regressions:
      
         - tipc: move bc link creation back to tipc_node_create, fix NPD
      
        Previous releases - always broken:
      
         - tcp: add a missing nf_reset_ct() in 3WHS handling to prevent socket
           buffered skbs from keeping refcount on the conntrack module
      
         - ipv6: take care of disable_policy when restoring routes
      
         - tun: make sure to always disable and unlink NAPI instances
      
         - phy: don't trigger state machine while in suspend
      
         - netfilter: nf_tables: avoid skb access on nf_stolen
      
         - asix: fix "can't send until first packet is send" issue
      
         - usb: asix: do not force pause frames support
      
         - nxp-nci: don't issue a zero length i2c_master_read()
      
        Misc:
      
         - ncsi: allow use of proper "mellanox" DT vendor prefix
      
         - act_api: add a message for user space if any actions were already
           flushed before the error was hit"
      
      * tag 'net-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits)
        net: dsa: felix: fix race between reading PSFP stats and port stats
        selftest: tun: add test for NAPI dismantle
        net: tun: avoid disabling NAPI twice
        net: sparx5: mdb add/del handle non-sparx5 devices
        net: sfp: fix memory leak in sfp_probe()
        mlxsw: spectrum_router: Fix rollback in tunnel next hop init
        net: rose: fix UAF bugs caused by timer handler
        net: usb: ax88179_178a: Fix packet receiving
        net: bonding: fix use-after-free after 802.3ad slave unbind
        ipv6: fix lockdep splat in in6_dump_addrs()
        net: phy: ax88772a: fix lost pause advertisement configuration
        net: phy: Don't trigger state machine while in suspend
        usbnet: fix memory allocation in helpers
        selftests net: fix kselftest net fatal error
        NFC: nxp-nci: don't print header length mismatch on i2c error
        NFC: nxp-nci: Don't issue a zero length i2c_master_read()
        net: tipc: fix possible refcount leak in tipc_sk_create()
        nfc: nfcmrvl: Fix irq_of_parse_and_map() return value
        net: ipv6: unexport __init-annotated seg6_hmac_net_init()
        ipv6/sit: fix ipip6_tunnel_get_prl return value
        ...
      5e837935
    • Amir Goldstein's avatar
      vfs: fix copy_file_range() regression in cross-fs copies · 868f9f2f
      Amir Goldstein authored
      A regression has been reported by Nicolas Boichat, found while using the
      copy_file_range syscall to copy a tracefs file.
      
      Before commit 5dae222a ("vfs: allow copy_file_range to copy across
      devices") the kernel would return -EXDEV to userspace when trying to
      copy a file across different filesystems.  After this commit, the
      syscall doesn't fail anymore and instead returns zero (zero bytes
      copied), as this file's content is generated on-the-fly and thus reports
      a size of zero.
      
      Another regression has been reported by He Zhe - the assertion of
      WARN_ON_ONCE(ret == -EOPNOTSUPP) can be triggered from userspace when
      copying from a sysfs file whose read operation may return -EOPNOTSUPP.
      
      Since we do not have test coverage for copy_file_range() between any two
      types of filesystems, the best way to avoid these sort of issues in the
      future is for the kernel to be more picky about filesystems that are
      allowed to do copy_file_range().
      
      This patch restores some cross-filesystem copy restrictions that existed
      prior to commit 5dae222a ("vfs: allow copy_file_range to copy across
      devices"), namely, cross-sb copy is not allowed for filesystems that do
      not implement ->copy_file_range().
      
      Filesystems that do implement ->copy_file_range() have full control of
      the result - if this method returns an error, the error is returned to
      the user.  Before this change this was only true for fs that did not
      implement the ->remap_file_range() operation (i.e.  nfsv3).
      
      Filesystems that do not implement ->copy_file_range() still fall-back to
      the generic_copy_file_range() implementation when the copy is within the
      same sb.  This helps the kernel can maintain a more consistent story
      about which filesystems support copy_file_range().
      
      nfsd and ksmbd servers are modified to fall-back to the
      generic_copy_file_range() implementation in case vfs_copy_file_range()
      fails with -EOPNOTSUPP or -EXDEV, which preserves behavior of
      server-side-copy.
      
      fall-back to generic_copy_file_range() is not implemented for the smb
      operation FSCTL_DUPLICATE_EXTENTS_TO_FILE, which is arguably a correct
      change of behavior.
      
      Fixes: 5dae222a ("vfs: allow copy_file_range to copy across devices")
      Link: https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chromium.org/
      Link: https://lore.kernel.org/linux-fsdevel/CANMq1KDZuxir2LM5jOTm0xx+BnvW=ZmpsG47CyHFJwnw7zSX6Q@mail.gmail.com/
      Link: https://lore.kernel.org/linux-fsdevel/20210126135012.1.If45b7cdc3ff707bc1efa17f5366057d60603c45f@changeid/
      Link: https://lore.kernel.org/linux-fsdevel/20210630161320.29006-1-lhenriques@suse.de/Reported-by: default avatarNicolas Boichat <drinkcat@chromium.org>
      Reported-by: default avatarkernel test robot <oliver.sang@intel.com>
      Signed-off-by: default avatarLuis Henriques <lhenriques@suse.de>
      Fixes: 64bf5ff5 ("vfs: no fallback for ->copy_file_range")
      Link: https://lore.kernel.org/linux-fsdevel/20f17f64-88cb-4e80-07c1-85cb96c83619@windriver.com/Reported-by: default avatarHe Zhe <zhe.he@windriver.com>
      Tested-by: default avatarNamjae Jeon <linkinjeon@kernel.org>
      Tested-by: default avatarLuis Henriques <lhenriques@suse.de>
      Signed-off-by: default avatarAmir Goldstein <amir73il@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      868f9f2f
    • Norbert Zulinski's avatar
      i40e: Fix VF's MAC Address change on VM · fed0d9f1
      Norbert Zulinski authored
      Clear VF MAC from parent PF and remove VF filter from VSI when both
      conditions are true:
      -VIRTCHNL_VF_OFFLOAD_USO is not used
      -VM MAC was not set from PF level
      
      It affects older version of IAVF and it allow them to change MAC
      Address on VM, newer IAVF won't change their behaviour.
      
      Previously it wasn't possible to change VF's MAC Address on VM
      because there is flag on IAVF driver that won't allow to
      change MAC Address if this address is given from PF driver.
      
      Fixes: 155f0ac2 ("iavf: allow permanent MAC address to change")
      Signed-off-by: default avatarNorbert Zulinski <norbertx.zulinski@intel.com>
      Signed-off-by: default avatarJan Sokolowski <jan.sokolowski@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      fed0d9f1
    • Lukasz Cieplicki's avatar
      i40e: Fix dropped jumbo frames statistics · 1adb1563
      Lukasz Cieplicki authored
      Dropped packets caused by too large frames were not included in
      dropped RX packets statistics.
      Issue was caused by not reading the GL_RXERR1 register. That register
      stores count of packet which was have been dropped due to too large
      size.
      
      Fix it by reading GL_RXERR1 register for each interface.
      
      Repro steps:
      Send a packet larger than the set MTU to SUT
      Observe rx statists: ethtool -S <interface> | grep rx | grep -v ": 0"
      
      Fixes: 41a9e55c ("i40e: add missing VSI statistics")
      Signed-off-by: default avatarLukasz Cieplicki <lukaszx.cieplicki@intel.com>
      Signed-off-by: default avatarJedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      1adb1563
    • Vladimir Oltean's avatar
      net: dsa: felix: fix race between reading PSFP stats and port stats · 58bf4db6
      Vladimir Oltean authored
      Both PSFP stats and the port stats read by ocelot_check_stats_work() are
      indirectly read through the same mechanism - write to STAT_CFG:STAT_VIEW,
      read from SYS:STAT:CNT[n].
      
      It's just that for port stats, we write STAT_VIEW with the index of the
      port, and for PSFP stats, we write STAT_VIEW with the filter index.
      
      So if we allow them to run concurrently, ocelot_check_stats_work() may
      change the view from vsc9959_psfp_counters_get(), and vice versa.
      
      Fixes: 7d4b564d ("net: dsa: felix: support psfp filter on vsc9959")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220629183007.3808130-1-vladimir.oltean@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      58bf4db6
    • Jakub Kicinski's avatar
      selftest: tun: add test for NAPI dismantle · 839b92fe
      Jakub Kicinski authored
      Being lazy does not pay, add the test for various
      ordering of tun queue close / detach / destroy.
      
      Link: https://lore.kernel.org/r/20220629181911.372047-2-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      839b92fe
    • Jakub Kicinski's avatar
      net: tun: avoid disabling NAPI twice · ff1fa208
      Jakub Kicinski authored
      Eric reports that syzbot made short work out of my speculative
      fix. Indeed when queue gets detached its tfile->tun remains,
      so we would try to stop NAPI twice with a detach(), close()
      sequence.
      
      Alternative fix would be to move tun_napi_disable() to
      tun_detach_all() and let the NAPI run after the queue
      has been detached.
      
      Fixes: a8fc8cb5 ("net: tun: stop NAPI when detaching queues")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220629181911.372047-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ff1fa208
    • Casper Andersson's avatar
      net: sparx5: mdb add/del handle non-sparx5 devices · 9c5de246
      Casper Andersson authored
      When adding/deleting mdb entries on other net_devices, eg., tap
      interfaces, it should not crash.
      
      Fixes: 3bacfccd ("net: sparx5: Add mdb handlers")
      Signed-off-by: default avatarCasper Andersson <casper.casan@gmail.com>
      Reviewed-by: default avatarSteen Hegelund <Steen.Hegelund@microchip.com>
      Link: https://lore.kernel.org/r/20220630122226.316812-1-casper.casan@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      9c5de246
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 1a0e93df
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "Three minor bug fixes:
      
         - qedr not setting the QP timeout properly toward userspace
      
         - Memory leak on error path in ib_cm
      
         - Divide by 0 in RDMA interrupt moderation"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        linux/dim: Fix divide by 0 in RDMA DIM
        RDMA/cm: Fix memory leak in ib_cm_insert_listen
        RDMA/qedr: Fix reporting QP timeout attribute
      1a0e93df
    • Linus Torvalds's avatar
      Merge tag 'fsnotify_for_v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 9fb3bb25
      Linus Torvalds authored
      Pull fanotify fix from Jan Kara:
       "A fix for recently added fanotify API to have stricter checks and
        refuse some invalid flag combinations to make our life easier in the
        future"
      
      * tag 'fsnotify_for_v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        fanotify: refine the validation checks on non-dir inode mask
      9fb3bb25
    • Linus Torvalds's avatar
      Merge tag 'v5.19-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · f5da5ddf
      Linus Torvalds authored
      Pull crypto fix from Herbert Xu:
       "Fix a regression that breaks the ccp driver"
      
      * tag 'v5.19-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: ccp - Fix device IRQ counting by using platform_irq_count()
      f5da5ddf
    • Jianglei Nie's avatar
      net: sfp: fix memory leak in sfp_probe() · 0a18d802
      Jianglei Nie authored
      sfp_probe() allocates a memory chunk from sfp with sfp_alloc(). When
      devm_add_action() fails, sfp is not freed, which leads to a memory leak.
      
      We should use devm_add_action_or_reset() instead of devm_add_action().
      Signed-off-by: default avatarJianglei Nie <niejianglei2021@163.com>
      Reviewed-by: default avatarRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Link: https://lore.kernel.org/r/20220629075550.2152003-1-niejianglei2021@163.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      0a18d802
    • Petr Machata's avatar
      mlxsw: spectrum_router: Fix rollback in tunnel next hop init · 665030fd
      Petr Machata authored
      In mlxsw_sp_nexthop6_init(), a next hop is always added to the router
      linked list, and mlxsw_sp_nexthop_type_init() is invoked afterwards. When
      that function results in an error, the next hop will not have been removed
      from the linked list. As the error is propagated upwards and the caller
      frees the next hop object, the linked list ends up holding an invalid
      object.
      
      A similar issue comes up with mlxsw_sp_nexthop4_init(), where rollback
      block does exist, however does not include the linked list removal.
      
      Both IPv6 and IPv4 next hops have a similar issue with next-hop counter
      rollbacks. As these were introduced in the same patchset as the next hop
      linked list, include the cleanup in this patch.
      
      Fixes: dbe4598c ("mlxsw: spectrum_router: Keep nexthops in a linked list")
      Fixes: a5390278 ("mlxsw: spectrum: Add support for setting counters on nexthops")
      Signed-off-by: default avatarPetr Machata <petrm@nvidia.com>
      Reviewed-by: default avatarAmit Cohen <amcohen@nvidia.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Link: https://lore.kernel.org/r/20220629070205.803952-1-idosch@nvidia.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      665030fd
    • Duoming Zhou's avatar
      net: rose: fix UAF bugs caused by timer handler · 9cc02ede
      Duoming Zhou authored
      There are UAF bugs in rose_heartbeat_expiry(), rose_timer_expiry()
      and rose_idletimer_expiry(). The root cause is that del_timer()
      could not stop the timer handler that is running and the refcount
      of sock is not managed properly.
      
      One of the UAF bugs is shown below:
      
          (thread 1)          |        (thread 2)
                              |  rose_bind
                              |  rose_connect
                              |    rose_start_heartbeat
      rose_release            |    (wait a time)
        case ROSE_STATE_0     |
        rose_destroy_socket   |  rose_heartbeat_expiry
          rose_stop_heartbeat |
          sock_put(sk)        |    ...
        sock_put(sk) // FREE  |
                              |    bh_lock_sock(sk) // USE
      
      The sock is deallocated by sock_put() in rose_release() and
      then used by bh_lock_sock() in rose_heartbeat_expiry().
      
      Although rose_destroy_socket() calls rose_stop_heartbeat(),
      it could not stop the timer that is running.
      
      The KASAN report triggered by POC is shown below:
      
      BUG: KASAN: use-after-free in _raw_spin_lock+0x5a/0x110
      Write of size 4 at addr ffff88800ae59098 by task swapper/3/0
      ...
      Call Trace:
       <IRQ>
       dump_stack_lvl+0xbf/0xee
       print_address_description+0x7b/0x440
       print_report+0x101/0x230
       ? irq_work_single+0xbb/0x140
       ? _raw_spin_lock+0x5a/0x110
       kasan_report+0xed/0x120
       ? _raw_spin_lock+0x5a/0x110
       kasan_check_range+0x2bd/0x2e0
       _raw_spin_lock+0x5a/0x110
       rose_heartbeat_expiry+0x39/0x370
       ? rose_start_heartbeat+0xb0/0xb0
       call_timer_fn+0x2d/0x1c0
       ? rose_start_heartbeat+0xb0/0xb0
       expire_timers+0x1f3/0x320
       __run_timers+0x3ff/0x4d0
       run_timer_softirq+0x41/0x80
       __do_softirq+0x233/0x544
       irq_exit_rcu+0x41/0xa0
       sysvec_apic_timer_interrupt+0x8c/0xb0
       </IRQ>
       <TASK>
       asm_sysvec_apic_timer_interrupt+0x1b/0x20
      RIP: 0010:default_idle+0xb/0x10
      RSP: 0018:ffffc9000012fea0 EFLAGS: 00000202
      RAX: 000000000000bcae RBX: ffff888006660f00 RCX: 000000000000bcae
      RDX: 0000000000000001 RSI: ffffffff843a11c0 RDI: ffffffff843a1180
      RBP: dffffc0000000000 R08: dffffc0000000000 R09: ffffed100da36d46
      R10: dfffe9100da36d47 R11: ffffffff83cf0950 R12: 0000000000000000
      R13: 1ffff11000ccc1e0 R14: ffffffff8542af28 R15: dffffc0000000000
      ...
      Allocated by task 146:
       __kasan_kmalloc+0xc4/0xf0
       sk_prot_alloc+0xdd/0x1a0
       sk_alloc+0x2d/0x4e0
       rose_create+0x7b/0x330
       __sock_create+0x2dd/0x640
       __sys_socket+0xc7/0x270
       __x64_sys_socket+0x71/0x80
       do_syscall_64+0x43/0x90
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      Freed by task 152:
       kasan_set_track+0x4c/0x70
       kasan_set_free_info+0x1f/0x40
       ____kasan_slab_free+0x124/0x190
       kfree+0xd3/0x270
       __sk_destruct+0x314/0x460
       rose_release+0x2fa/0x3b0
       sock_close+0xcb/0x230
       __fput+0x2d9/0x650
       task_work_run+0xd6/0x160
       exit_to_user_mode_loop+0xc7/0xd0
       exit_to_user_mode_prepare+0x4e/0x80
       syscall_exit_to_user_mode+0x20/0x40
       do_syscall_64+0x4f/0x90
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      This patch adds refcount of sock when we use functions
      such as rose_start_heartbeat() and so on to start timer,
      and decreases the refcount of sock when timer is finished
      or deleted by functions such as rose_stop_heartbeat()
      and so on. As a result, the UAF bugs could be mitigated.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarDuoming Zhou <duoming@zju.edu.cn>
      Tested-by: default avatarDuoming Zhou <duoming@zju.edu.cn>
      Link: https://lore.kernel.org/r/20220629002640.5693-1-duoming@zju.edu.cnSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      9cc02ede
    • Jose Alonso's avatar
      net: usb: ax88179_178a: Fix packet receiving · f8ebb3ac
      Jose Alonso authored
      This patch corrects packet receiving in ax88179_rx_fixup.
      
      - problem observed:
        ifconfig shows allways a lot of 'RX Errors' while packets
        are received normally.
      
        This occurs because ax88179_rx_fixup does not recognise properly
        the usb urb received.
        The packets are normally processed and at the end, the code exits
        with 'return 0', generating RX Errors.
        (pkt_cnt==-2 and ptk_hdr over field rx_hdr trying to identify
         another packet there)
      
        This is a usb urb received by "tcpdump -i usbmon2 -X" on a
        little-endian CPU:
        0x0000:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
                 ^         packet 1 start (pkt_len = 0x05ec)
                 ^^^^      IP alignment pseudo header
                      ^    ethernet packet start
                 last byte ethernet packet   v
                 padding (8-bytes aligned)     vvvv vvvv
        0x05e0:  c92d d444 1420 8a69 83dd 272f e82b 9811
        0x05f0:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
        ...      ^ packet 2
        0x0be0:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
        ...
        0x1130:  9d41 9171 8a38 0ec5 eeee f8e3 3b19 87a0
        ...
        0x1720:  8cfc 15ff 5e4c e85c eeee f8e3 3b19 87a0
        ...
        0x1d10:  ecfa 2a3a 19ab c78c eeee f8e3 3b19 87a0
        ...
        0x2070:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
        ...      ^ packet 7
        0x2120:  7c88 4ca5 5c57 7dcc 0d34 7577 f778 7e0a
        0x2130:  f032 e093 7489 0740 3008 ec05 0000 0080
                                     ====1==== ====2====
                 hdr_off             ^
                 pkt_len = 0x05ec         ^^^^
                 AX_RXHDR_*=0x00830  ^^^^   ^
                 pkt_len = 0                        ^^^^
                 AX_RXHDR_DROP_ERR=0x80000000  ^^^^   ^
        0x2140:  3008 ec05 0000 0080 3008 5805 0000 0080
        0x2150:  3008 ec05 0000 0080 3008 ec05 0000 0080
        0x2160:  3008 5803 0000 0080 3008 c800 0000 0080
                 ===11==== ===12==== ===13==== ===14====
        0x2170:  0000 0000 0e00 3821
                           ^^^^ ^^^^ rx_hdr
                           ^^^^      pkt_cnt=14
                                ^^^^ hdr_off=0x2138
                 ^^^^ ^^^^           padding
      
        The dump shows that pkt_cnt is the number of entrys in the
        per-packet metadata. It is "2 * packet count".
        Each packet have two entrys. The first have a valid
        value (pkt_len and AX_RXHDR_*) and the second have a
        dummy-header 0x80000000 (pkt_len=0 with AX_RXHDR_DROP_ERR).
        Why exists dummy-header for each packet?!?
        My guess is that this was done probably to align the
        entry for each packet to 64-bits and maintain compatibility
        with old firmware.
        There is also a padding (0x00000000) before the rx_hdr to
        align the end of rx_hdr to 64-bit.
        Note that packets have a alignment of 64-bits (8-bytes).
      
        This patch assumes that the dummy-header and the last
        padding are optional. So it preserves semantics and
        recognises the same valid packets as the current code.
      
        This patch was made using only the dumpfile information and
        tested with only one device:
        0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet
      
      Fixes: 57bc3d3a ("net: usb: ax88179_178a: Fix out-of-bounds accesses in RX fixup")
      Fixes: e2ca90c2 ("ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver")
      Signed-off-by: default avatarJose Alonso <joalonsof@gmail.com>
      Acked-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Link: https://lore.kernel.org/r/d6970bb04bf67598af4d316eaeb1792040b18cfd.camel@gmail.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      f8ebb3ac
    • Yevhen Orlov's avatar
      net: bonding: fix use-after-free after 802.3ad slave unbind · 050133e1
      Yevhen Orlov authored
      commit 0622cab0 ("bonding: fix 802.3ad aggregator reselection"),
      resolve case, when there is several aggregation groups in the same bond.
      bond_3ad_unbind_slave will invalidate (clear) aggregator when
      __agg_active_ports return zero. So, ad_clear_agg can be executed even, when
      num_of_ports!=0. Than bond_3ad_unbind_slave can be executed again for,
      previously cleared aggregator. NOTE: at this time bond_3ad_unbind_slave
      will not update slave ports list, because lag_ports==NULL. So, here we
      got slave ports, pointing to freed aggregator memory.
      
      Fix with checking actual number of ports in group (as was before
      commit 0622cab0 ("bonding: fix 802.3ad aggregator reselection") ),
      before ad_clear_agg().
      
      The KASAN logs are as follows:
      
      [  767.617392] ==================================================================
      [  767.630776] BUG: KASAN: use-after-free in bond_3ad_state_machine_handler+0x13dc/0x1470
      [  767.638764] Read of size 2 at addr ffff00011ba9d430 by task kworker/u8:7/767
      [  767.647361] CPU: 3 PID: 767 Comm: kworker/u8:7 Tainted: G           O 5.15.11 #15
      [  767.655329] Hardware name: DNI AmazonGo1 A7040 board (DT)
      [  767.660760] Workqueue: lacp_1 bond_3ad_state_machine_handler
      [  767.666468] Call trace:
      [  767.668930]  dump_backtrace+0x0/0x2d0
      [  767.672625]  show_stack+0x24/0x30
      [  767.675965]  dump_stack_lvl+0x68/0x84
      [  767.679659]  print_address_description.constprop.0+0x74/0x2b8
      [  767.685451]  kasan_report+0x1f0/0x260
      [  767.689148]  __asan_load2+0x94/0xd0
      [  767.692667]  bond_3ad_state_machine_handler+0x13dc/0x1470
      
      Fixes: 0622cab0 ("bonding: fix 802.3ad aggregator reselection")
      Co-developed-by: default avatarMaksym Glubokiy <maksym.glubokiy@plvision.eu>
      Signed-off-by: default avatarMaksym Glubokiy <maksym.glubokiy@plvision.eu>
      Signed-off-by: default avatarYevhen Orlov <yevhen.orlov@plvision.eu>
      Acked-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Link: https://lore.kernel.org/r/20220629012914.361-1-yevhen.orlov@plvision.euSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      050133e1
    • Eric Dumazet's avatar
      ipv6: fix lockdep splat in in6_dump_addrs() · 4e43e64d
      Eric Dumazet authored
      As reported by syzbot, we should not use rcu_dereference()
      when rcu_read_lock() is not held.
      
      WARNING: suspicious RCU usage
      5.19.0-rc2-syzkaller #0 Not tainted
      
      net/ipv6/addrconf.c:5175 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by syz-executor326/3617:
       #0: ffffffff8d5848e8 (rtnl_mutex){+.+.}-{3:3}, at: netlink_dump+0xae/0xc20 net/netlink/af_netlink.c:2223
      
      stack backtrace:
      CPU: 0 PID: 3617 Comm: syz-executor326 Not tainted 5.19.0-rc2-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       in6_dump_addrs+0x12d1/0x1790 net/ipv6/addrconf.c:5175
       inet6_dump_addr+0x9c1/0xb50 net/ipv6/addrconf.c:5300
       netlink_dump+0x541/0xc20 net/netlink/af_netlink.c:2275
       __netlink_dump_start+0x647/0x900 net/netlink/af_netlink.c:2380
       netlink_dump_start include/linux/netlink.h:245 [inline]
       rtnetlink_rcv_msg+0x73e/0xc90 net/core/rtnetlink.c:6046
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2501
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921
       sock_sendmsg_nosec net/socket.c:714 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:734
       ____sys_sendmsg+0x6eb/0x810 net/socket.c:2492
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2546
       __sys_sendmsg net/socket.c:2575 [inline]
       __do_sys_sendmsg net/socket.c:2584 [inline]
       __se_sys_sendmsg net/socket.c:2582 [inline]
       __x64_sys_sendmsg+0x132/0x220 net/socket.c:2582
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      Fixes: 88e2ca30 ("mld: convert ifmcaddr6 to RCU")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Taehee Yoo <ap420073@gmail.com>
      Link: https://lore.kernel.org/r/20220628121248.858695-1-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      4e43e64d