1. 28 Jun, 2024 8 commits
  2. 27 Jun, 2024 32 commits
    • net: thunderx: Unembed netdev structure · 94833add
      Breno Leitao authored
      Embedding net_device into structures prohibits the usage of flexible
      arrays in the net_device structure. For more details, see the discussion
      at [1].
      
      Un-embed the net_devices from struct lmac by converting them
      into pointers and allocating them dynamically. Use
      alloc_netdev() to allocate the net_device object in
      bgx_lmac_enable().
      
      The free of the device occurs at bgx_lmac_disable().
      
      Do not call free_netdev() if bgx_lmac_enable() fails after lmac->netdev
      is allocated: bgx_lmac_disable() is called when bgx_lmac_enable()
      fails, and lmac->netdev is freed there (similarly to lmac->dmacs).
      
      Link: https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/ [1]
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Link: https://patch.msgid.link/20240626173503.87636-1-leitao@debian.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Revert "net: micro-optimize skb_datagram_iter" · 2d5f6801
      Sagi Grimberg authored
      This reverts commit 934c2999.
      This triggered a usercopy BUG() in systems with HIGHMEM, reported
      by the test robot in:
       https://lore.kernel.org/oe-lkp/202406161539.b5ff7b20-oliver.sang@intel.com

      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Link: https://patch.msgid.link/20240626070153.759257-1-sagi@grimberg.me
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'selftests-net-switch-pmtu-sh-to-use-the-internal-ovs-script' · 3a158e2e
      Jakub Kicinski authored
      Aaron Conole says:
      
      ====================
      selftests: net: Switch pmtu.sh to use the internal ovs script.
      
      Currently, if a user wants to run pmtu.sh and cover all the provided test
      cases, they need to install the Open vSwitch userspace utilities.  This
      dependency is difficult for users as well as CI environments, because the
      userspace build and setup may require lots of support and devel packages
      to be installed, system setup to be correct, and things like permissions
      and selinux policies to be properly configured.
      
      The kernel selftest suite includes an ovs-dpctl.py utility which can
      interact with the openvswitch module directly.  This lets developers and
      CI environments run without needing too many extra dependencies - just
      the pyroute2 python package.
      
      This series enhances the ovs-dpctl utility to provide support for set()
      and tunnel() flow specifiers, better ipv6 handling support, and the
      ability to add tunnel vports, and LWT interfaces.  Finally, it modifies
      the pmtu.sh script to call the ovs-dpctl.py utility rather than the
      typical OVS userspace utilities.  The pmtu.sh can still fall back on
      the Open vSwitch userspace utilities if the ovs-dpctl.py script can't
      be used.
      ====================
      
      Link: https://patch.msgid.link/20240625172245.233874-1-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: add config for openvswitch · 6f437f5c
      Aaron Conole authored
      The pmtu testing will require that the OVS module is installed,
      so enable it in the selftests config.
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Link: https://patch.msgid.link/20240625172245.233874-8-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: Use the provided dpctl rather than the vswitchd for tests. · b7ce46fc
      Aaron Conole authored
      The current pmtu test infrastructure requires an installed copy of the
      ovs-vswitchd userspace.  This means that any automated or constrained
      environments may not have the requisite tools to run the tests.  However,
      the pmtu tests don't require any special classifier processing.  Indeed
      they are only using the vswitchd in the most basic mode - as a NORMAL
      switch.
      
      However, the ovs-dpctl kernel utility can now program all the needed basic
      flows to allow traffic to traverse the tunnels and provide support for at
      least testing some basic pmtu scenarios.  More complicated flow pipelines
      can be added to the internal ovs test infrastructure, but that is work for
      the future.  For now, enable the most common cases - wide mega flows with
      no other prerequisites.
      
      Enhance the pmtu tests to try the internal utility first.
      As a fallback, if the internal utility isn't running, then try with the
      ovs-vswitchd userspace tools.
      
      Additionally, make sure that when the pyroute2 package is not available
      the ovs-dpctl utility will error out to properly signal an error has
      occurred and skip using the internal utility.
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Simon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20240625172245.233874-7-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
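The fallback order described in this commit (internal utility first, userspace OVS second, otherwise skip) can be sketched roughly as below. This is an illustration only, not the logic in pmtu.sh or ovs-dpctl.py: the helper names `pyroute2_available` and `choose_backend` and the returned labels are hypothetical.

```python
import importlib.util


def pyroute2_available() -> bool:
    """Report whether the pyroute2 package can be found for import."""
    return importlib.util.find_spec("pyroute2") is not None


def choose_backend(have_pyroute2: bool, have_vswitchd: bool) -> str:
    """Prefer the in-tree utility, fall back to userspace OVS, else skip."""
    if have_pyroute2:
        return "ovs-dpctl.py"      # internal selftest utility
    if have_vswitchd:
        return "ovs-vswitchd"      # installed userspace tools
    return "skip"                  # neither is usable


if __name__ == "__main__":
    # have_vswitchd is hard-coded here; a real script would probe for it.
    print(choose_backend(pyroute2_available(), have_vswitchd=False))
```

Keeping the probe separate from the decision makes the decision easy to test without pyroute2 installed.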
    • selftests: openvswitch: Support implicit ipv6 arguments. · 51458e10
      Aaron Conole authored
      The current iteration of IPv6 support requires explicit fields to be set
      and does not properly support actual IPv6 addresses.
      With this change, make it so that the ipv6() bare option is usable to
      create wildcarded flows to match broad swaths of ipv6 traffic.
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Link: https://patch.msgid.link/20240625172245.233874-6-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
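Why a bare ipv6() matches broad swaths of traffic can be seen from how a masked flow comparison works. The sketch below is a simplified model, not the parser in ovs-dpctl.py: it assumes a bare ipv6() corresponds to an all-zero key with an all-zero mask, and `flow_matches` is a hypothetical helper.

```python
import ipaddress


def flow_matches(packet_addr: str, key: str, mask: str) -> bool:
    """Masked comparison: only the bits set in the mask have to agree."""
    pkt = int(ipaddress.IPv6Address(packet_addr))
    k = int(ipaddress.IPv6Address(key))
    m = int(ipaddress.IPv6Address(mask))
    return (pkt & m) == (k & m)


# Assumed model of a bare ipv6(): zero key, zero mask, so nothing is compared.
WILDCARD_KEY = "::"
WILDCARD_MASK = "::"

# An exact match, by contrast, sets every mask bit.
EXACT_MASK = "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"
```

With the zero mask every address compares equal, while `EXACT_MASK` only accepts the key itself.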
    • selftests: openvswitch: Add support for tunnel() key. · fefe3b7d
      Aaron Conole authored
      This will be used when setting details about the tunnel to use as
      transport.  There is one difference from the ODP format for tunnel():
      the 'key' flag is not actually a flag field, so we don't support it in the
      same way that the vswitchd userspace supports displaying it.
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Simon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20240625172245.233874-5-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: openvswitch: Add set() and set_masked() support. · a4126f90
      Aaron Conole authored
      These will be used in upcoming commits to set specific attributes for
      interacting with tunnels.  Since set() will use the key parsing routine, we
      also make sure to prepend it with an open paren, for the action parsing to
      properly understand it.
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Link: https://patch.msgid.link/20240625172245.233874-4-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: openvswitch: Refactor actions parsing. · 37de65a7
      Aaron Conole authored
      Until recently, the ovs-dpctl utility was used with a limited actions set
      and didn't need to have support for multiple similar actions.  However,
      when adding support for tunnels, it will be important to support multiple
      set() actions in a single flow.  When printing these actions, the existing
      code will be unable to print all of the sets - it will only print the
      first.
      
      Refactor this code to be easier to read and support multiple actions of the
      same type in an action list.
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Link: https://patch.msgid.link/20240625172245.233874-3-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
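The problem of printing several actions of the same type can be illustrated with a small sketch. It is not the ovs-dpctl.py code: the tuple representation, the `format_actions` helper, and the action values are all hypothetical. The point is only that an ordered list keeps every set(), whereas a lookup keyed by action type can hold just one.

```python
def format_actions(actions):
    """Render every action in order, including repeated action types."""
    return ",".join(
        name if arg is None else f"{name}({arg})" for name, arg in actions
    )


# Two set() actions followed by one more action (values are illustrative).
actions = [
    ("set", "tunnel(tun_id=0x1)"),
    ("set", "ipv4(ttl=64)"),
    ("output", None),
]

# Keying by action type keeps a single entry per type, so one set() is lost.
by_name = dict(actions)

if __name__ == "__main__":
    print(format_actions(actions))
    print(sorted(by_name))
```

Iterating the list in order also preserves the sequence in which actions apply, which matters when one set() must take effect before the next action.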
    • selftests: openvswitch: Support explicit tunnel port creation. · f94ecbc9
      Aaron Conole authored
      The OVS module can operate in conjunction with various types of
      tunnel ports.  These are created as either explicit tunnel vport
      types, OR by creating a tunnel interface which acts as an anchor
      for the lightweight tunnel support.
      
      This patch adds the ability to add tunnel ports to an OVS
      datapath for testing various scenarios with tunnel ports.  With
      this addition, the vswitch "plumbing" will at least be able to
      push packets around using the tunnel vports.  Future patches
      will add support for setting required tunnel metadata for lwts
      in the datapath.  The end goal will be to push packets via these
      tunnels, and will be used in an upcoming commit for testing the
      path MTU.
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Simon Horman <horms@kernel.org>
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Link: https://patch.msgid.link/20240625172245.233874-2-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • s390/lcs: add missing MODULE_DESCRIPTION() macro · 346a03e5
      Jeff Johnson authored
      With ARCH=s390, make allmodconfig && make W=1 C=1 reports:
      WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/s390/net/lcs.o
      
      Add the missing invocation of the MODULE_DESCRIPTION() macro.
      Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
      Acked-by: Alexandra Winter <wintera@linux.ibm.com>
      Link: https://patch.msgid.link/20240625-md-s390-drivers-s390-net-v2-1-5a8a2b2f2ae3@quicinc.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tools: ynl: use display hints for formatting of scalar attrs · 2a901623
      Jakub Kicinski authored
      Use display hints for formatting scalar attrs. This is specifically
      useful for formatting IPv4 addresses carried typically as u32.
      Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
      Link: https://patch.msgid.link/20240626201234.2572964-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
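What a display hint buys for an IPv4 address carried as a u32 can be shown with a short sketch. This is not the ynl implementation: `format_scalar` and the `"ipv4"` hint string are assumptions made for illustration, and the integer is assumed to hold the address with its most significant byte first.

```python
import ipaddress


def format_scalar(value: int, display_hint: str = "") -> str:
    """Format a u32 scalar, honouring an 'ipv4' display hint if given."""
    if display_hint == "ipv4":
        # Assumption: value encodes the address most significant byte first.
        return str(ipaddress.IPv4Address(value))
    # Without a hint the attribute is shown as a plain number.
    return str(value)


if __name__ == "__main__":
    print(format_scalar(0xC0A80101))          # plain integer rendering
    print(format_scalar(0xC0A80101, "ipv4"))  # dotted-quad rendering
```

The same value reads as 3232235777 without the hint and as 192.168.1.1 with it, which is the readability gain the commit is after.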
    • Merge tag 'wireless-next-2024-06-27' of... · 56bf02c2
      Jakub Kicinski authored
      Merge tag 'wireless-next-2024-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
      
      Johannes Berg says:
      
      ====================
      Highlights this time are:
      
       - cfg80211/nl80211:
          * improvements for 6 GHz regulatory flexibility
      
       - mac80211:
          * use generic netdev stats
          * multi-link improvements/fixes
      
       - brcmfmac:
          * MFP support (to enable WPA3)
      
       - wilc1000:
          * suspend/resume improvements
      
       - iwlwifi:
          * remove support for older FW for new devices
          * fast resume (keeping the device configured)
      
       - wl18xx:
          * support newer firmware versions
      
      * tag 'wireless-next-2024-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (100 commits)
        wifi: brcmfmac: of: Support interrupts-extended
        wifi: brcmsmac: advertise MFP_CAPABLE to enable WPA3
        net: rfkill: Correct return value in invalid parameter case
        wifi: mac80211: fix NULL dereference at band check in starting tx ba session
        wifi: iwlwifi: mvm: fix rs.h kernel-doc
        wifi: iwlwifi: fw: api: datapath: fix kernel-doc
        wifi: iwlwifi: fix remaining mistagged kernel-doc comments
        wifi: iwlwifi: fix prototype mismatch kernel-doc warnings
        wifi: iwlwifi: fix kernel-doc in iwl-fh.h
        wifi: iwlwifi: fix kernel-doc in iwl-trans.h
        wifi: iwlwifi: pcie: fix kernel-doc
        wifi: iwlwifi: dvm: fix kernel-doc warnings
        wifi: iwlwifi: mvm: don't log error for failed UATS table read
        wifi: iwlwifi: trans: make bad state warnings
        wifi: iwlwifi: fw: api: fix some kernel-doc
        wifi: iwlwifi: mvm: remove init_dbg module parameter
        wifi: iwlwifi: update the BA notification API
        wifi: iwlwifi: mvm: always unblock EMLSR on ROC end
        wifi: iwlwifi: mvm: use IWL_FW_CHECK for link ID check
        wifi: iwlwifi: mvm: don't flush BSSes on restart with MLD API
        ...
      ====================
      
      Link: https://patch.msgid.link/20240627114135.28507-3-johannes@sipsolutions.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 193b9b20
      Jakub Kicinski authored
      Cross-merge networking fixes after downstream PR.
      
      No conflicts.
      
      Adjacent changes:
        e3f02f32 ("ionic: fix kernel panic due to multi-buffer handling")
        d9c04209 ("ionic: Mark error paths in the data path as unlikely")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'wireless-2024-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless · ffb7aa9f
      Jakub Kicinski authored
      Johannes Berg says:
      
      ====================
      Just a few changes:
       - maintainers: Larry Finger sadly passed away
       - maintainers: ath trees are in their group now
       - TXQ FQ quantum configuration fix
       - TI wl driver: work around stuck FW in AP mode
       - mac80211: disable softirqs in some new code
         needing that
      
      * tag 'wireless-2024-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
        MAINTAINERS: wifi: update ath.git location
        MAINTAINERS: Remembering Larry Finger
        wifi: mac80211: disable softirqs for queued frame handling
        wifi: cfg80211: restrict NL80211_ATTR_TXQ_QUANTUM values
        wifi: wlcore: fix wlcore AP mode
      ====================
      
      Link: https://patch.msgid.link/20240627083627.15312-3-johannes@sipsolutions.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'net-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · fd19d4a4
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from can, bpf and netfilter.
      
        There are a bunch of regressions addressed here, but hopefully nothing
        spectacular. We are still waiting for the driver fix from Intel, mentioned
        by Jakub in the previous networking pull.
      
        Current release - regressions:
      
         - core: add softirq safety to netdev_rename_lock
      
         - tcp: fix tcp_rcv_fastopen_synack() to enter TCP_CA_Loss for failed
           TFO
      
         - batman-adv: fix RCU race at module unload time
      
        Previous releases - regressions:
      
         - openvswitch: get related ct labels from its master if it is not
           confirmed
      
         - eth: bonding: fix incorrect software timestamping report
      
         - eth: mlxsw: fix memory corruptions on spectrum-4 systems
      
         - eth: ionic: use dev_consume_skb_any outside of napi
      
        Previous releases - always broken:
      
         - netfilter: fully validate NFT_DATA_VALUE on store to data registers
      
         - unix: several fixes for OoB data
      
         - tcp: fix race for duplicate reqsk on identical SYN
      
         - bpf:
             - fix may_goto with negative offset
             - fix the corner case with may_goto and jump to the 1st insn
             - fix overrunning reservations in ringbuf
      
         - can:
             - j1939: recover socket queue on CAN bus error during BAM
               transmission
             - mcp251xfd: fix infinite loop when xmit fails
      
         - dsa: microchip: monitor potential faults in half-duplex mode
      
         - eth: vxlan: pull inner IP header in vxlan_xmit_one()
      
         - eth: ionic: fix kernel panic due to multi-buffer handling
      
        Misc:
      
         - selftest: unix tests refactor and a lot of new cases added"
      
      * tag 'net-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
        net: mana: Fix possible double free in error handling path
        selftest: af_unix: Check SIOCATMARK after every send()/recv() in msg_oob.c.
        af_unix: Fix wrong ioctl(SIOCATMARK) when consumed OOB skb is at the head.
        selftest: af_unix: Check EPOLLPRI after every send()/recv() in msg_oob.c
        selftest: af_unix: Check SIGURG after every send() in msg_oob.c
        selftest: af_unix: Add SO_OOBINLINE test cases in msg_oob.c
        af_unix: Don't stop recv() at consumed ex-OOB skb.
        selftest: af_unix: Add non-TCP-compliant test cases in msg_oob.c.
        af_unix: Don't stop recv(MSG_DONTWAIT) if consumed OOB skb is at the head.
        af_unix: Stop recv(MSG_PEEK) at consumed OOB skb.
        selftest: af_unix: Add msg_oob.c.
        selftest: af_unix: Remove test_unix_oob.c.
        tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
        netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
        net: usb: qmi_wwan: add Telit FN912 compositions
        tcp: fix tcp_rcv_fastopen_synack() to enter TCP_CA_Loss for failed TFO
        ionic: use dev_consume_skb_any outside of napi
        net: dsa: microchip: fix wrong register write when masking interrupt
        Fix race for duplicate reqsk on identical SYN
        ibmvnic: Add tx check to prevent skb leak
        ...
    • Merge tag 'sound-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 3c1d29e5
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "This became bigger than usual, as it receives a pile of pending ASoC
        fixes. Most of changes are for device-specific issues while there are
        a few core fixes that are all rather trivial:
      
         - DMA-engine sync fixes
      
         - Continued MIDI2 conversion fixes
      
         - Various ASoC Intel SOF fixes
      
         - A series of ASoC topology fixes for memory handling
      
         - AMD ACP fix, curing a recent regression, too
      
         - Platform / codec-specific fixes for mediatek, atmel, realtek, etc"
      
      * tag 'sound-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (40 commits)
        ASoC: rt5645: fix issue of random interrupt from push-button
        ALSA: seq: Fix missing MSB in MIDI2 SPP conversion
        ASoC: amd: yc: Fix non-functional mic on ASUS M5602RA
        ALSA: hda/realtek: fix mute/micmute LEDs don't work for EliteBook 645/665 G11.
        ALSA: hda/realtek: Fix conflicting quirk for PCI SSID 17aa:3820
        ALSA: dmaengine_pcm: terminate dmaengine before synchronize
        ALSA: hda/relatek: Enable Mute LED on HP Laptop 15-gw0xxx
        ALSA: PCM: Allow resume only for suspended streams
        ALSA: seq: Fix missing channel at encoding RPN/NRPN MIDI2 messages
        ASoC: mediatek: mt8195: Add platform entry for ETDM1_OUT_BE dai link
        ASoC: fsl-asoc-card: set priv->pdev before using it
        ASoC: amd: acp: move chip->flag variable assignment
        ASoC: amd: acp: remove i2s configuration check in acp_i2s_probe()
        ASoC: amd: acp: add a null check for chip_pdev structure
        ASoC: Intel: soc-acpi: mtl: fix speaker no sound on Dell SKU 0C64
        ASoC: q6apm-lpass-dai: close graph on prepare errors
        ASoC: cs35l56: Disconnect ASP1 TX sources when ASP1 DAI is hooked up
        ASoC: topology: Fix route memory corruption
        ASoC: rt722-sdca-sdw: add debounce time for type detection
        ASoC: SOF: sof-audio: Skip unprepare for in-use widgets on error rollback
        ...
    • Merge tag 'nf-24-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · b62cb6a7
      Paolo Abeni authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains two Netfilter fixes for net:
      
      Patch #1 fixes CONFIG_SYSCTL=n for a patch coming in the previous PR
      	 to move the sysctl toggle to enable SRv6 netfilter hooks from
      	 nf_conntrack to the core, from Jianguo Wu.
      
      Patch #2 fixes a possible pointer leak to userspace due to insufficient
      	 validation of NFT_DATA_VALUE.
      
      Linus found this pointer leak to userspace via zdi-disclosures@ and
      forwarded the notice to the Netfilter maintainers. He appears as reporter
      because whoever found this issue never approached the Netfilter
      maintainers, either via security@ or in private.
      
      netfilter pull request 24-06-27
      
      * tag 'nf-24-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
        netfilter: fix undefined reference to 'netfilter_lwtunnel_*' when CONFIG_SYSCTL=n
      ====================
      
      Link: https://patch.msgid.link/20240626233845.151197-1-pablo@netfilter.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: mana: Fix possible double free in error handling path · 1864b822
      Ma Ke authored
      When auxiliary_device_add() returns an error and
      auxiliary_device_uninit() is then called, the callback function
      adev_release calls kfree(madev). We shouldn't call kfree(madev) again
      in the error handling path. Set 'madev' to NULL.
      
      Fixes: a69839d4 ("net: mana: Add support for auxiliary device")
      Signed-off-by: Ma Ke <make24@iscas.ac.cn>
      Link: https://patch.msgid.link/20240625130314.2661257-1-make24@iscas.ac.cn
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'af_unix-fix-bunch-of-msg_oob-bugs-and-add-new-tests' · 3f4d9e4f
      Paolo Abeni authored
      Kuniyuki Iwashima says:
      
      ====================
      af_unix: Fix bunch of MSG_OOB bugs and add new tests.
      
      This series rewrites the selftest for AF_UNIX MSG_OOB and fixes a
      bunch of bugs where AF_UNIX behaves differently from TCP.

      Note that the test discovered a few more bugs on the TCP side, which
      will be fixed in another series.
      ====================
      
      Link: https://lore.kernel.org/r/20240625013645.45034-1-kuniyu@amazon.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • selftest: af_unix: Check SIOCATMARK after every send()/recv() in msg_oob.c. · 91b7186c
      Kuniyuki Iwashima authored
      To catch regressions, let's check ioctl(SIOCATMARK) after every
      send() and recv() call.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Fix wrong ioctl(SIOCATMARK) when consumed OOB skb is at the head. · e400cfa3
      Kuniyuki Iwashima authored
      Even if OOB data is recv()ed, ioctl(SIOCATMARK) must return 1 when the
      OOB skb is at the head of the receive queue and no new OOB data is queued.
      
      Without fix:
      
        #  RUN           msg_oob.no_peek.oob ...
        # msg_oob.c:305:oob:Expected answ[0] (0) == oob_head (1)
        # oob: Test terminated by assertion
        #          FAIL  msg_oob.no_peek.oob
        not ok 2 msg_oob.no_peek.oob
      
      With fix:
      
        #  RUN           msg_oob.no_peek.oob ...
        #            OK  msg_oob.no_peek.oob
        ok 2 msg_oob.no_peek.oob
      
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • selftest: af_unix: Check EPOLLPRI after every send()/recv() in msg_oob.c · 48a99837
      Kuniyuki Iwashima authored
      When OOB data is in recvq, we can detect it with epoll by checking
      EPOLLPRI.
      
      This patch adds checks for EPOLLPRI after every send() and recv() in
      all test cases.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
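The EPOLLPRI check described in this commit can be reproduced from userspace along the following lines. This is a rough sketch rather than the selftest code: `oob_sets_epollpri` is a hypothetical helper, and it returns None instead of failing when epoll is unavailable or the running kernel rejects MSG_OOB on AF_UNIX sockets.

```python
import select
import socket


def oob_sets_epollpri():
    """Return True if EPOLLPRI is reported once OOB data is queued.

    Returns None when the check cannot be performed on this system.
    """
    if not hasattr(select, "epoll"):
        return None  # epoll is Linux-specific
    c1, c2 = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    ep = select.epoll()
    try:
        ep.register(c2.fileno(), select.EPOLLPRI)
        try:
            c1.send(b"x", socket.MSG_OOB)
        except OSError:
            return None  # MSG_OOB not supported on AF_UNIX here
        # Zero timeout: only report events that are already pending.
        events = dict(ep.poll(0))
        return bool(events.get(c2.fileno(), 0) & select.EPOLLPRI)
    finally:
        ep.close()
        c1.close()
        c2.close()


if __name__ == "__main__":
    print(oob_sets_epollpri())
```

Polling with a zero timeout keeps the check non-blocking, so it can be repeated after every send() and recv() as the selftest does.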
    • selftest: af_unix: Check SIGURG after every send() in msg_oob.c · d02689e6
      Kuniyuki Iwashima authored
      When data is sent with MSG_OOB, SIGURG is sent to a process if the
      receiver socket has set its owner to the process by ioctl(FIOSETOWN)
      or fcntl(F_SETOWN).
      
      This patch adds a SIGURG check after every send(MSG_OOB) call.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • selftest: af_unix: Add SO_OOBINLINE test cases in msg_oob.c · 436352e8
      Kuniyuki Iwashima authored
      When SO_OOBINLINE is enabled on a socket, MSG_OOB data can be recv()ed
      without the MSG_OOB flag, and ioctl(SIOCATMARK) will behave differently.
      
      This patch adds some test cases for SO_OOBINLINE.
      
      Note the new test cases found two bugs in TCP.
      
        1) After reading OOB data with non-inline mode, we can re-read
           the data by setting SO_OOBINLINE.
      
        #  RUN           msg_oob.no_peek.inline_oob_ahead_break ...
        # msg_oob.c:146:inline_oob_ahead_break:AF_UNIX :world
        # msg_oob.c:147:inline_oob_ahead_break:TCP     :oworld
        #            OK  msg_oob.no_peek.inline_oob_ahead_break
        ok 14 msg_oob.no_peek.inline_oob_ahead_break
      
        2) The head OOB data is dropped if SO_OOBINLINE is disabled
           and new OOB data is queued.
      
        #  RUN           msg_oob.no_peek.inline_ex_oob_drop ...
        # msg_oob.c:171:inline_ex_oob_drop:AF_UNIX :x
        # msg_oob.c:172:inline_ex_oob_drop:TCP     :y
        # msg_oob.c:146:inline_ex_oob_drop:AF_UNIX :y
        # msg_oob.c:147:inline_ex_oob_drop:TCP     :Resource temporarily unavailable
        #            OK  msg_oob.no_peek.inline_ex_oob_drop
        ok 17 msg_oob.no_peek.inline_ex_oob_drop
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Don't stop recv() at consumed ex-OOB skb. · 36893ef0
      Kuniyuki Iwashima authored
      Currently, recv() is stopped at a consumed OOB skb even if a new
      OOB skb is queued and we can ignore the old OOB skb.
      
        >>> from socket import *
        >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
        >>> c1.send(b'hellowor', MSG_OOB)
        8
        >>> c2.recv(1, MSG_OOB)  # consumed OOB data stays in the middle of recvq.
        b'r'
        >>> c1.send(b'ld', MSG_OOB)
        2
        >>> c2.recv(10)          # recv() stops at the old consumed OOB
        b'hellowo'               # should be 'hellowol'
      
      manage_oob() should not stop recv() at the old consumed OOB skb if
      new OOB data is queued.
      
      Note that TCP behaviour is apparently wrong in this test case because
      we can recv() the same OOB data twice.
      
      Without fix:
      
        #  RUN           msg_oob.no_peek.ex_oob_ahead_break ...
        # msg_oob.c:138:ex_oob_ahead_break:AF_UNIX :hellowo
        # msg_oob.c:139:ex_oob_ahead_break:Expected:hellowol
        # msg_oob.c:141:ex_oob_ahead_break:Expected ret[0] (7) == expected_len (8)
        # ex_oob_ahead_break: Test terminated by assertion
        #          FAIL  msg_oob.no_peek.ex_oob_ahead_break
        not ok 11 msg_oob.no_peek.ex_oob_ahead_break
      
      With fix:
      
        #  RUN           msg_oob.no_peek.ex_oob_ahead_break ...
        # msg_oob.c:146:ex_oob_ahead_break:AF_UNIX :hellowol
        # msg_oob.c:147:ex_oob_ahead_break:TCP     :helloworl
        #            OK  msg_oob.no_peek.ex_oob_ahead_break
        ok 11 msg_oob.no_peek.ex_oob_ahead_break
      
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • selftest: af_unix: Add non-TCP-compliant test cases in msg_oob.c. · f5ea0768
      Kuniyuki Iwashima authored
      While testing, I found some weird behaviour on the TCP side as well.
      
      For example, TCP drops the preceding OOB data when queueing new
      OOB data if the old OOB data is at the head of recvq.
      
        #  RUN           msg_oob.no_peek.ex_oob_drop ...
        # msg_oob.c:146:ex_oob_drop:AF_UNIX :x
        # msg_oob.c:147:ex_oob_drop:TCP     :Resource temporarily unavailable
        # msg_oob.c:146:ex_oob_drop:AF_UNIX :y
        # msg_oob.c:147:ex_oob_drop:TCP     :Invalid argument
        #            OK  msg_oob.no_peek.ex_oob_drop
        ok 9 msg_oob.no_peek.ex_oob_drop
      
        #  RUN           msg_oob.no_peek.ex_oob_drop_2 ...
        # msg_oob.c:146:ex_oob_drop_2:AF_UNIX :x
        # msg_oob.c:147:ex_oob_drop_2:TCP     :Resource temporarily unavailable
        #            OK  msg_oob.no_peek.ex_oob_drop_2
        ok 10 msg_oob.no_peek.ex_oob_drop_2
      
      This patch allows AF_UNIX's MSG_OOB implementation to produce different
      results from TCP when operations are guarded with tcp_incompliant{}.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Don't stop recv(MSG_DONTWAIT) if consumed OOB skb is at the head. · 93c99f21
      Kuniyuki Iwashima authored
      Let's say a socket send()s "hello" with MSG_OOB and "world" without flags,
      
        >>> from socket import *
        >>> c1, c2 = socketpair(AF_UNIX)
        >>> c1.send(b'hello', MSG_OOB)
        5
        >>> c1.send(b'world')
        5
      
      and its peer recv()s "hell" and "o".
      
        >>> c2.recv(10)
        b'hell'
        >>> c2.recv(1, MSG_OOB)
        b'o'
      
      Now the consumed OOB skb stays at the head of recvq so that
      ioctl(SIOCATMARK) can return a correct value; that ioctl is currently
      broken and is fixed by a later patch.
      
      Then, if the peer issues recv() with MSG_DONTWAIT, manage_oob() returns
      NULL, so recv() ends up with -EAGAIN.
      
        >>> c2.setblocking(False)  # This causes -EAGAIN even with available data
        >>> c2.recv(5)
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        BlockingIOError: [Errno 11] Resource temporarily unavailable
      
      However, the next recv() will return the following available data, "world".
      
        >>> c2.recv(5)
        b'world'
      
      When the consumed OOB skb is at the head of the queue, we need to fetch
      the next skb to fix the weird behaviour.
      
      Note that the issue does not happen without MSG_DONTWAIT because we can
      retry after manage_oob().
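
      The difference can be sketched with a toy user-space model. This is not
      the kernel code: Skb, manage_oob_old(), manage_oob_fixed() and
      recv_dontwait() are hypothetical names, and a consumed OOB skb is simply
      modelled as an entry with no data left.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Skb:
    data: bytes  # unread payload; empty models a consumed OOB skb


def manage_oob_old(recvq):
    """Pre-fix model: drop a consumed skb at the head and return nothing."""
    if recvq and not recvq[0].data:
        recvq.popleft()
        return None
    return recvq[0] if recvq else None


def manage_oob_fixed(recvq):
    """Post-fix model: drop the consumed skb, then fetch the next skb."""
    if recvq and not recvq[0].data:
        recvq.popleft()
    return recvq[0] if recvq else None


def recv_dontwait(recvq, size, manage_oob):
    """Non-blocking read: no skb means EAGAIN, there is no retry."""
    skb = manage_oob(recvq)
    if skb is None:
        raise BlockingIOError(11, "Resource temporarily unavailable")
    out, skb.data = skb.data[:size], skb.data[size:]
    if not skb.data:
        recvq.popleft()  # fully read skb leaves the queue
    return out
```

      Starting from deque([Skb(b""), Skb(b"world")]), the old model raises
      BlockingIOError on the first call and returns b"world" on the second,
      while the fixed model returns b"world" right away.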
      
      This patch also adds a test case that covers the issue.
      
      Without fix:
      
        #  RUN           msg_oob.no_peek.ex_oob_break ...
        # msg_oob.c:134:ex_oob_break:AF_UNIX :Resource temporarily unavailable
        # msg_oob.c:135:ex_oob_break:Expected:ld
        # msg_oob.c:137:ex_oob_break:Expected ret[0] (-1) == expected_len (2)
        # ex_oob_break: Test terminated by assertion
        #          FAIL  msg_oob.no_peek.ex_oob_break
        not ok 8 msg_oob.no_peek.ex_oob_break
      
      With fix:
      
        #  RUN           msg_oob.no_peek.ex_oob_break ...
        #            OK  msg_oob.no_peek.ex_oob_break
        ok 8 msg_oob.no_peek.ex_oob_break
      
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      93c99f21
    • Kuniyuki Iwashima's avatar
      af_unix: Stop recv(MSG_PEEK) at consumed OOB skb. · b94038d8
      Kuniyuki Iwashima authored
      After consuming OOB data, recv() reading the preceding data must break at
      the OOB skb regardless of MSG_PEEK.
      
      Currently, MSG_PEEK does not stop recv() for AF_UNIX, and the behaviour is
      not compliant with TCP.
      
        >>> from socket import *
        >>> c1, c2 = socketpair(AF_UNIX)
        >>> c1.send(b'hello', MSG_OOB)
        5
        >>> c1.send(b'world')
        5
        >>> c2.recv(1, MSG_OOB)
        b'o'
        >>> c2.recv(9, MSG_PEEK)  # This should return b'hell'
        b'hellworld'              # even with enough buffer.
      
      Let's fix it by returning NULL for consumed skb and unlinking it only if
      MSG_PEEK is not specified.
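
      A toy user-space model of that rule, for illustration only. It is not
      the kernel implementation; Skb, stream_read() and the "fixed" switch are
      hypothetical, and the consumed OOB skb is modelled as a flagged marker
      in a plain list.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    data: bytes
    consumed_oob: bool = False  # marker left behind by recv(MSG_OOB)


def stream_read(recvq, size, peek=False, fixed=True):
    """Copy up to `size` bytes from recvq, breaking at a consumed OOB skb."""
    out = b""
    i = 0
    while i < len(recvq) and len(out) < size:
        skb = recvq[i]
        if skb.consumed_oob:
            if peek and not fixed:
                i += 1  # pre-fix bug: MSG_PEEK walks past the marker
                continue
            if not peek:
                del recvq[i]  # unlink the marker only on a real read
            break  # stop at the OOB boundary
        chunk = skb.data[: size - len(out)]
        out += chunk
        if peek:
            i += 1  # peeking leaves the queue untouched
        else:
            skb.data = skb.data[len(chunk):]
            if not skb.data:
                del recvq[i]
    return out
```

      With recvq = [Skb(b"hell"), Skb(b"", True), Skb(b"world")], a peek of 9
      bytes yields b"hellworld" in the pre-fix model and b"hell" in the fixed
      one, matching the session above.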
      
      This patch also adds test cases that add recv(MSG_PEEK) before each recv().
      
      Without fix:
      
        #  RUN           msg_oob.peek.oob_ahead_break ...
        # msg_oob.c:134:oob_ahead_break:AF_UNIX :hellworld
        # msg_oob.c:135:oob_ahead_break:Expected:hell
        # msg_oob.c:137:oob_ahead_break:Expected ret[0] (9) == expected_len (4)
        # oob_ahead_break: Test terminated by assertion
        #          FAIL  msg_oob.peek.oob_ahead_break
        not ok 13 msg_oob.peek.oob_ahead_break
      
      With fix:
      
        #  RUN           msg_oob.peek.oob_ahead_break ...
        #            OK  msg_oob.peek.oob_ahead_break
        ok 13 msg_oob.peek.oob_ahead_break
      
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      b94038d8
    • Kuniyuki Iwashima's avatar
      selftest: af_unix: Add msg_oob.c. · d098d772
      Kuniyuki Iwashima authored
      AF_UNIX's MSG_OOB functionality lacked thorough testing, and we found
      some bizarre behaviour.
      
      The new selftest validates every MSG_OOB operation against TCP as a
      reference implementation.
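
      The comparison idea can be sketched in Python as follows. This is only
      an approximation of the approach, not the selftest (which is C);
      tcp_pair(), outcome() and same_behaviour() are hypothetical helpers, and
      the loopback address is an assumption.

```python
import socket


def tcp_pair():
    """Return a connected TCP (sender, receiver) pair over loopback."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as lsn:
        lsn.bind(("127.0.0.1", 0))
        lsn.listen(1)
        tx = socket.create_connection(lsn.getsockname())
        rx, _ = lsn.accept()
    return tx, rx


def outcome(fn):
    """Run fn and return its value, or the errno if it raises OSError."""
    try:
        return fn()
    except OSError as e:
        return e.errno


def same_behaviour(ops):
    """Apply each op(sender, receiver) to an AF_UNIX and a TCP pair."""
    pairs = [socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM), tcp_pair()]
    try:
        results = [[outcome(lambda: op(tx, rx)) for op in ops]
                   for tx, rx in pairs]
    finally:
        for tx, rx in pairs:
            tx.close()
            rx.close()
    # results[0] is AF_UNIX, results[1] is the TCP reference.
    return results[0] == results[1], results
```

      Passing operations that use socket.MSG_OOB would probe the behaviour
      discussed in this series; what they return depends on the running
      kernel, so no particular outcome is claimed here.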
      
      This patch adds only a few tests with basic send() and recv() that
      do not fail.
      
      The following patches will add more test cases for SO_OOBINLINE, SIGURG,
      EPOLLPRI, and SIOCATMARK.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      d098d772
    • Kuniyuki Iwashima's avatar
      selftest: af_unix: Remove test_unix_oob.c. · 7d139181
      Kuniyuki Iwashima authored
      test_unix_oob.c does not fully cover AF_UNIX's MSG_OOB functionality,
      and as a result there are discrepancies from TCP behaviour.

      Also, the test uses fork() to create a message producer, which makes it
      hard to understand and to extend with more test cases.

      Let's remove test_unix_oob.c and write a new test.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      7d139181
    • Yunseong Kim's avatar
      tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset() · bab49231
      Yunseong Kim authored
      In TRACE_EVENT(qdisc_reset), a NULL dereference occurred at
      
       qdisc->dev_queue->dev <NULL> ->name
      
      This situation was reproduced with a bunch of veths and Bluetooth
      disconnection and reconnection.

      During qdisc initialization, the qdisc was set to noop_queue.
      In veth_init_queues, the initial tx_num was reduced back to one,
      causing the qdisc reset to be called with noop, which led to the kernel
      panic.
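
      The shape of the failure, reading a name through
      qdisc->dev_queue->dev when dev is NULL, can be mimicked in a few lines
      of Python. This is a toy analogue and not the patch itself: the class
      names, trace_dev_name() and the "(null)" placeholder string are all
      assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NetDevice:
    name: str


@dataclass
class NetdevQueue:
    dev: Optional[NetDevice] = None  # None models a queue with no device


@dataclass
class Qdisc:
    dev_queue: NetdevQueue


def trace_dev_name(qdisc: Qdisc) -> str:
    """Return the device name for tracing, guarding against a missing dev."""
    dev = qdisc.dev_queue.dev
    # Placeholder string is an assumption for this sketch.
    return dev.name if dev is not None else "(null)"
```

      An unguarded qdisc.dev_queue.dev.name raises AttributeError in this
      model when dev is None, which is the Python counterpart of the NULL
      dereference in the trace below.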
      
      I've attached a GitHub gist link with the syz-execprog program converted
      to C source code and 3 logs of the reproduced vmcore-dmesg.
      
       https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740
      
      Yeoreum and I used two fuzzing tools simultaneously.

      One process ran syz-executor: https://github.com/google/syzkaller
      
       $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \
          -enable=none -collide=false log1
      
      The other process ran the perf fuzzer:
       https://github.com/deater/perf_event_tests/tree/master/fuzzer
      
       $ perf_event_tests/fuzzer/perf_fuzzer
      
      I think this affects the following kernel versions:

       Linux kernel v6.7.10+, v6.8+, v6.9+, and it could happen in v6.10.

      This was introduced by 51270d57. I think this patch is absolutely
      necessary. Previously, it showed an unintended string value for the name.

      I've reproduced it 3 times on my Fedora 40 debug kernel without any other
      modules or patches.
      
       version: 6.10.0-0.rc2.20240608gitdc772f82.29.fc41.aarch64+debug
      
      [ 5287.164555] veth0_vlan: left promiscuous mode
      [ 5287.164929] veth1_macvtap: left promiscuous mode
      [ 5287.164950] veth0_macvtap: left promiscuous mode
      [ 5287.164983] veth1_vlan: left promiscuous mode
      [ 5287.165008] veth0_vlan: left promiscuous mode
      [ 5287.165450] veth1_macvtap: left promiscuous mode
      [ 5287.165472] veth0_macvtap: left promiscuous mode
      [ 5287.165502] veth1_vlan: left promiscuous mode
      …
      [ 5297.598240] bridge0: port 2(bridge_slave_1) entered blocking state
      [ 5297.598262] bridge0: port 2(bridge_slave_1) entered forwarding state
      [ 5297.598296] bridge0: port 1(bridge_slave_0) entered blocking state
      [ 5297.598313] bridge0: port 1(bridge_slave_0) entered forwarding state
      [ 5297.616090] 8021q: adding VLAN 0 to HW filter on device bond0
      [ 5297.620405] bridge0: port 1(bridge_slave_0) entered disabled state
      [ 5297.620730] bridge0: port 2(bridge_slave_1) entered disabled state
      [ 5297.627247] 8021q: adding VLAN 0 to HW filter on device team0
      [ 5297.629636] bridge0: port 1(bridge_slave_0) entered blocking state
      …
      [ 5298.002798] bridge_slave_0: left promiscuous mode
      [ 5298.002869] bridge0: port 1(bridge_slave_0) entered disabled state
      [ 5298.309444] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
      [ 5298.315206] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
      [ 5298.320207] bond0 (unregistering): Released all slaves
      [ 5298.354296] hsr_slave_0: left promiscuous mode
      [ 5298.360750] hsr_slave_1: left promiscuous mode
      [ 5298.374889] veth1_macvtap: left promiscuous mode
      [ 5298.374931] veth0_macvtap: left promiscuous mode
      [ 5298.374988] veth1_vlan: left promiscuous mode
      [ 5298.375024] veth0_vlan: left promiscuous mode
      [ 5299.109741] team0 (unregistering): Port device team_slave_1 removed
      [ 5299.185870] team0 (unregistering): Port device team_slave_0 removed
      …
      [ 5300.155443] Bluetooth: hci3: unexpected cc 0x0c03 length: 249 > 1
      [ 5300.155724] Bluetooth: hci3: unexpected cc 0x1003 length: 249 > 9
      [ 5300.155988] Bluetooth: hci3: unexpected cc 0x1001 length: 249 > 9
      …
      [ 5301.075531] team0: Port device team_slave_1 added
      [ 5301.085515] bridge0: port 1(bridge_slave_0) entered blocking state
      [ 5301.085531] bridge0: port 1(bridge_slave_0) entered disabled state
      [ 5301.085588] bridge_slave_0: entered allmulticast mode
      [ 5301.085800] bridge_slave_0: entered promiscuous mode
      [ 5301.095617] bridge0: port 1(bridge_slave_0) entered blocking state
      [ 5301.095633] bridge0: port 1(bridge_slave_0) entered disabled state
      …
      [ 5301.149734] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
      [ 5301.173234] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
      [ 5301.180517] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
      [ 5301.193481] hsr_slave_0: entered promiscuous mode
      [ 5301.204425] hsr_slave_1: entered promiscuous mode
      [ 5301.210172] debugfs: Directory 'hsr0' with parent 'hsr' already present!
      [ 5301.210185] Cannot create hsr debugfs directory
      [ 5301.224061] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
      [ 5301.246901] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
      [ 5301.255934] team0: Port device team_slave_0 added
      [ 5301.256480] team0: Port device team_slave_1 added
      [ 5301.256948] team0: Port device team_slave_0 added
      …
      [ 5301.435928] hsr_slave_0: entered promiscuous mode
      [ 5301.446029] hsr_slave_1: entered promiscuous mode
      [ 5301.455872] debugfs: Directory 'hsr0' with parent 'hsr' already present!
      [ 5301.455884] Cannot create hsr debugfs directory
      [ 5301.502664] hsr_slave_0: entered promiscuous mode
      [ 5301.513675] hsr_slave_1: entered promiscuous mode
      [ 5301.526155] debugfs: Directory 'hsr0' with parent 'hsr' already present!
      [ 5301.526164] Cannot create hsr debugfs directory
      [ 5301.563662] hsr_slave_0: entered promiscuous mode
      [ 5301.576129] hsr_slave_1: entered promiscuous mode
      [ 5301.580259] debugfs: Directory 'hsr0' with parent 'hsr' already present!
      [ 5301.580270] Cannot create hsr debugfs directory
      [ 5301.590269] 8021q: adding VLAN 0 to HW filter on device bond0
      
      [ 5301.595872] KASAN: null-ptr-deref in range [0x0000000000000130-0x0000000000000137]
      [ 5301.595877] Mem abort info:
      [ 5301.595881]   ESR = 0x0000000096000006
      [ 5301.595885]   EC = 0x25: DABT (current EL), IL = 32 bits
      [ 5301.595889]   SET = 0, FnV = 0
      [ 5301.595893]   EA = 0, S1PTW = 0
      [ 5301.595896]   FSC = 0x06: level 2 translation fault
      [ 5301.595900] Data abort info:
      [ 5301.595903]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
      [ 5301.595907]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
      [ 5301.595911]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
      [ 5301.595915] [dfff800000000026] address between user and kernel address ranges
      [ 5301.595971] Internal error: Oops: 0000000096000006 [#1] SMP
      …
      [ 5301.596076] CPU: 2 PID: 102769 Comm:
      syz-executor.3 Kdump: loaded Tainted:
       G        W         -------  ---  6.10.0-0.rc2.20240608gitdc772f82.29.fc41.aarch64+debug #1
      [ 5301.596080] Hardware name: VMware, Inc. VMware20,1/VBSA,
       BIOS VMW201.00V.21805430.BA64.2305221830 05/22/2023
      [ 5301.596082] pstate: 01400005 (nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
      [ 5301.596085] pc : strnlen+0x40/0x88
      [ 5301.596114] lr : trace_event_get_offsets_qdisc_reset+0x6c/0x2b0
      [ 5301.596124] sp : ffff8000beef6b40
      [ 5301.596126] x29: ffff8000beef6b40 x28: dfff800000000000 x27: 0000000000000001
      [ 5301.596131] x26: 6de1800082c62bd0 x25: 1ffff000110aa9e0 x24: ffff800088554f00
      [ 5301.596136] x23: ffff800088554ec0 x22: 0000000000000130 x21: 0000000000000140
      [ 5301.596140] x20: dfff800000000000 x19: ffff8000beef6c60 x18: ffff7000115106d8
      [ 5301.596143] x17: ffff800121bad000 x16: ffff800080020000 x15: 0000000000000006
      [ 5301.596147] x14: 0000000000000002 x13: ffff0001f3ed8d14 x12: ffff700017ddeda5
      [ 5301.596151] x11: 1ffff00017ddeda4 x10: ffff700017ddeda4 x9 : ffff800082cc5eec
      [ 5301.596155] x8 : 0000000000000004 x7 : 00000000f1f1f1f1 x6 : 00000000f2f2f200
      [ 5301.596158] x5 : 00000000f3f3f3f3 x4 : ffff700017dded80 x3 : 00000000f204f1f1
      [ 5301.596162] x2 : 0000000000000026 x1 : 0000000000000000 x0 : 0000000000000130
      [ 5301.596166] Call trace:
      [ 5301.596175]  strnlen+0x40/0x88
      [ 5301.596179]  trace_event_get_offsets_qdisc_reset+0x6c/0x2b0
      [ 5301.596182]  perf_trace_qdisc_reset+0xb0/0x538
      [ 5301.596184]  __traceiter_qdisc_reset+0x68/0xc0
      [ 5301.596188]  qdisc_reset+0x43c/0x5e8
      [ 5301.596190]  netif_set_real_num_tx_queues+0x288/0x770
      [ 5301.596194]  veth_init_queues+0xfc/0x130 [veth]
      [ 5301.596198]  veth_newlink+0x45c/0x850 [veth]
      [ 5301.596202]  rtnl_newlink_create+0x2c8/0x798
      [ 5301.596205]  __rtnl_newlink+0x92c/0xb60
      [ 5301.596208]  rtnl_newlink+0xd8/0x130
      [ 5301.596211]  rtnetlink_rcv_msg+0x2e0/0x890
      [ 5301.596214]  netlink_rcv_skb+0x1c4/0x380
      [ 5301.596225]  rtnetlink_rcv+0x20/0x38
      [ 5301.596227]  netlink_unicast+0x3c8/0x640
      [ 5301.596231]  netlink_sendmsg+0x658/0xa60
      [ 5301.596234]  __sock_sendmsg+0xd0/0x180
      [ 5301.596243]  __sys_sendto+0x1c0/0x280
      [ 5301.596246]  __arm64_sys_sendto+0xc8/0x150
      [ 5301.596249]  invoke_syscall+0xdc/0x268
      [ 5301.596256]  el0_svc_common.constprop.0+0x16c/0x240
      [ 5301.596259]  do_el0_svc+0x48/0x68
      [ 5301.596261]  el0_svc+0x50/0x188
      [ 5301.596265]  el0t_64_sync_handler+0x120/0x130
      [ 5301.596268]  el0t_64_sync+0x194/0x198
      [ 5301.596272] Code: eb15001f 54000120 d343fc02 12000801 (38f46842)
      [ 5301.596285] SMP: stopping secondary CPUs
      [ 5301.597053] Starting crashdump kernel...
      [ 5301.597057] Bye!
      
      After applying our patch, I didn't find any kernel panic errors.
      
      We've found a simple reproducer:
      
       # echo 1 > /sys/kernel/debug/tracing/events/qdisc/qdisc_reset/enable
      
       # ip link add veth0 type veth peer name veth1
      
       Error: Unknown device type.
      
      However, without our patch applied, I tested the upstream 6.10.0-rc3 kernel
      using the qdisc_reset event and the ip command on my qemu virtual machine.

      These 2 commands always cause a kernel panic.
      
      Linux version: 6.10.0-rc3
      
      [    0.000000] Linux version 6.10.0-rc3-00164-g44ef20ba-dirty
      (paran@fedora) (gcc (GCC) 14.1.1 20240522 (Red Hat 14.1.1-4), GNU ld
      version 2.41-34.fc40) #20 SMP PREEMPT Sat Jun 15 16:51:25 KST 2024
      
      Kernel panic message:
      
      [  615.236484] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
      [  615.237250] Dumping ftrace buffer:
      [  615.237679]    (ftrace buffer empty)
      [  615.238097] Modules linked in: veth crct10dif_ce virtio_gpu
      virtio_dma_buf drm_shmem_helper drm_kms_helper zynqmp_fpga xilinx_can
      xilinx_spi xilinx_selectmap xilinx_core xilinx_pr_decoupler versal_fpga
      uvcvideo uvc videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev
      videobuf2_common mc usbnet deflate zstd ubifs ubi rcar_canfd rcar_can
      omap_mailbox ntb_msi_test ntb_hw_epf lattice_sysconfig_spi
      lattice_sysconfig ice40_spi gpio_xilinx dwmac_altr_socfpga mdio_regmap
      stmmac_platform stmmac pcs_xpcs dfl_fme_region dfl_fme_mgr dfl_fme_br
      dfl_afu dfl fpga_region fpga_bridge can can_dev br_netfilter bridge stp
      llc atl1c ath11k_pci mhi ath11k_ahb ath11k qmi_helpers ath10k_sdio
      ath10k_pci ath10k_core ath mac80211 libarc4 cfg80211 drm fuse backlight ipv6
      Jun 22 02:36:53 … kernel: Unable to handle kernel paging request at
      virtual address dfff8…
      [  615.…] CPU: 1 PID: 468 Comm: ip Not tainted
      6.10.0-rc3-00164-g44ef20baed0e-dirty #20
      [  615.252376] Hardware name: linux,dummy-virt (DT)
      [  615.253220] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS
      BTYPE=--)
      [  615.254433] pc : strnlen+0x6c/0xe0
      [  615.255096] lr : trace_event_get_offsets_qdisc_reset+0x94/0x3d0
      [  615.256088] sp : ffff800080b269a0
      [  615.256615] x29: ffff800080b269a0 x28: ffffc070f3f98500 x27:
      0000000000000001
      [  615.257831] x26: 0000000000000010 x25: ffffc070f3f98540 x24:
      ffffc070f619cf60
      [  615.259020] x23: 0000000000000128 x22: 0000000000000138 x21:
      dfff800000000000
      [  615.260241] x20: ffffc070f631ad00 x19: 0000000000000128 x18:
      ffffc070f448b800
      [  615.261454] x17: 0000000000000000 x16: 0000000000000001 x15:
      ffffc070f4ba2a90
      [  615.262635] x14: ffff700010164d73 x13: 1ffff80e1e8d5eb3 x12:
      1ffff00010164d72
      [  615.263877] x11: ffff700010164d72 x10: dfff800000000000 x9 :
      ffffc070e85d6184
      [  615.265047] x8 : ffffc070e4402070 x7 : 000000000000f1f1 x6 :
      000000001504a6d3
      [  615.266336] x5 : ffff28ca21122140 x4 : ffffc070f5043ea8 x3 :
      0000000000000000
      [  615.267528] x2 : 0000000000000025 x1 : 0000000000000000 x0 :
      0000000000000000
      [  615.268747] Call trace:
      [  615.269180]  strnlen+0x6c/0xe0
      [  615.269767]  trace_event_get_offsets_qdisc_reset+0x94/0x3d0
      [  615.270716]  trace_event_raw_event_qdisc_reset+0xe8/0x4e8
      [  615.271667]  __traceiter_qdisc_reset+0xa0/0x140
      [  615.272499]  qdisc_reset+0x554/0x848
      [  615.273134]  netif_set_real_num_tx_queues+0x360/0x9a8
      [  615.274050]  veth_init_queues+0x110/0x220 [veth]
      [  615.275110]  veth_newlink+0x538/0xa50 [veth]
      [  615.276172]  __rtnl_newlink+0x11e4/0x1bc8
      [  615.276944]  rtnl_newlink+0xac/0x120
      [  615.277657]  rtnetlink_rcv_msg+0x4e4/0x1370
      [  615.278409]  netlink_rcv_skb+0x25c/0x4f0
      [  615.279122]  rtnetlink_rcv+0x48/0x70
      [  615.279769]  netlink_unicast+0x5a8/0x7b8
      [  615.280462]  netlink_sendmsg+0xa70/0x1190
      
      Yeoreum and I don't know if the patch we wrote will fix the underlying
      cause, but we think the priority is to prevent the kernel panic from
      happening. So, we're sending this patch.
      
      Fixes: 51270d57 ("tracing/net_sched: Fix tracepoints that save qdisc_dev() as a string")
      Link: https://lore.kernel.org/lkml/20240229143432.273b4871@gandalf.local.home/t/
      Cc: netdev@vger.kernel.org
      Tested-by: Yunseong Kim <yskelg@gmail.com>
      Signed-off-by: Yunseong Kim <yskelg@gmail.com>
      Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
      Link: https://lore.kernel.org/r/20240624173320.24945-4-yskelg@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      bab49231