1. 23 May, 2024 1 commit
  2. 22 May, 2024 2 commits
  3. 21 May, 2024 3 commits
  4. 18 May, 2024 14 commits
    • Mohammad Shehar Yaar Tausif's avatar
      bpf: Fix order of args in call to bpf_map_kvcalloc · 6f130e4d
      Mohammad Shehar Yaar Tausif authored
      The original function call passed size of smap->bucket before the number of
      buckets which raises the error 'calloc-transposed-args' on compilation.
      Signed-off-by: default avatarMohammad Shehar Yaar Tausif <sheharyaar48@gmail.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240516072411.42016-1-sheharyaar48@gmail.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      6f130e4d
    • Alan Maguire's avatar
      kbuild, bpf: Use test-ge check for v1.25-only pahole · 34021cae
      Alan Maguire authored
      There is no need to set the pahole v1.25-only flags in an
      "ifeq" version clause; we are already in a <= v1.25 branch
      of "ifeq", so that combined with a "test-ge" v1.25 ensures the
      flags will be applied for v1.25 only.
      Suggested-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240514162716.2448265-1-alan.maguire@oracle.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      34021cae
    • Artem Savkov's avatar
      bpftool: Fix make dependencies for vmlinux.h · e7b64f9d
      Artem Savkov authored
      With pre-generated vmlinux.h there is no dependency on neither vmlinux
      nor bootstrap bpftool. Define dependencies separately for both modes.
      This avoids needless rebuilds in some corner cases.
      Suggested-by: default avatarJan Stancek <jstancek@redhat.com>
      Signed-off-by: default avatarArtem Savkov <asavkov@redhat.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Acked-by: default avatarQuentin Monnet <qmo@kernel.org>
      Link: https://lore.kernel.org/bpf/20240513112658.43691-1-asavkov@redhat.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      e7b64f9d
    • Mykyta Yatsenko's avatar
      bpftool: Introduce btf c dump sorting · 94133cf2
      Mykyta Yatsenko authored
      Sort bpftool c dump output; aiming to simplify vmlinux.h diffing and
      forcing more natural type definitions ordering.
      
      Definitions are sorted first by their BTF kind ranks, then by their base
      type name and by their own name.
      
      Type ranks
      
      Assign ranks to btf kinds (defined in function btf_type_rank) to set
      next order:
      1. Anonymous enums/enums64
      2. Named enums/enums64
      3. Trivial types typedefs (ints, then floats)
      4. Structs/Unions
      5. Function prototypes
      6. Forward declarations
      
      Type rank is set to maximum for unnamed reference types, structs and
      unions to avoid emitting those types early. They will be emitted as
      part of the type chain starting with named type.
      
      Lexicographical ordering
      
      Each type is assigned a sort_name and own_name.
      sort_name is the resolved name of the final base type for reference
      types (typedef, pointer, array etc). Sorting by sort_name allows to
      group typedefs of the same base type. sort_name for non-reference type
      is the same as own_name. own_name is a direct name of particular type,
      is used as final sorting step.
      Signed-off-by: default avatarMykyta Yatsenko <yatsenko@meta.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Tested-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Reviewed-by: default avatarQuentin Monnet <qmo@kernel.org>
      Acked-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240514131221.20585-1-yatsenko@meta.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      94133cf2
    • Linus Torvalds's avatar
      kprobe/ftrace: fix build error due to bad function definition · 4b377b48
      Linus Torvalds authored
      Commit 1a7d0890 ("kprobe/ftrace: bail out if ftrace was killed")
      introduced a bad K&R function definition, which we haven't accepted in a
      long long time.
      
      Gcc seems to let it slide, but clang notices with the appropriate error:
      
        kernel/kprobes.c:1140:24: error: a function declaration without a prototype is deprecated in all >
         1140 | void kprobe_ftrace_kill()
              |                        ^
              |                         void
      
      but this commit was apparently never in linux-next before it was sent
      upstream, so it didn't get the appropriate build test coverage.
      
      Fixes: 1a7d0890 kprobe/ftrace: bail out if ftrace was killed
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4b377b48
    • Linus Torvalds's avatar
      Merge tag 'net-6.10-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · f08a1e91
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Current release - regressions:
      
         - virtio_net: fix missed error path rtnl_unlock after control queue
           locking rework
      
        Current release - new code bugs:
      
         - bpf: fix KASAN slab-out-of-bounds in percpu_array_map_gen_lookup,
           caused by missing nested map handling
      
         - drv: dsa: correct initialization order for KSZ88x3 ports
      
        Previous releases - regressions:
      
         - af_packet: do not call packet_read_pending() from
           tpacket_destruct_skb() fix performance regression
      
         - ipv6: fix route deleting failure when metric equals 0, don't assume
           0 means not set / default in this case
      
        Previous releases - always broken:
      
         - bridge: couple of syzbot-driven fixes"
      
      * tag 'net-6.10-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (30 commits)
        selftests: net: local_termination: annotate the expected failures
        net: dsa: microchip: Correct initialization order for KSZ88x3 ports
        MAINTAINERS: net: Update reviewers for TI's Ethernet drivers
        dt-bindings: net: ti: Update maintainers list
        l2tp: fix ICMP error handling for UDP-encap sockets
        net: txgbe: fix to control VLAN strip
        net: wangxun: match VLAN CTAG and STAG features
        net: wangxun: fix to change Rx features
        af_packet: do not call packet_read_pending() from tpacket_destruct_skb()
        virtio_net: Fix missed rtnl_unlock
        netrom: fix possible dead-lock in nr_rt_ioctl()
        idpf: don't skip over ethtool tcp-data-split setting
        dt-bindings: net: qcom: ethernet: Allow dma-coherent
        bonding: fix oops during rmmod
        net/ipv6: Fix route deleting failure when metric equals 0
        selftests/net: reduce xfrm_policy test time
        selftests/bpf: Adjust btf_dump test to reflect recent change in file_operations
        selftests/bpf: Adjust test_access_variable_array after a kernel function name change
        selftests/net/lib: no need to record ns name if it already exist
        net: qrtr: ns: Fix module refcnt
        ...
      f08a1e91
    • Linus Torvalds's avatar
      Merge tag 'trace-tools-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 26aa834f
      Linus Torvalds authored
      Pull tracing tool updates from Steven Rostedt:
       "Specific for timerlat:
      
         - Improve the output of timerlat top by adding a missing \n, and by
           avoiding printing color-formatting characters where they are
           translated to regular characters.
      
         - Improve timerlat auto-analysis output by replacing '\t' with spaces
           to avoid copy-and-paste issues when reporting problems.
      
         - Make the user-space (-u) option the default, as it is the most
           complete test. Add a -k option to use the in-kernel workload.
      
         - On timerlat top and hist, add a summary with the overall results.
           For instance, the minimum value for all CPUs, the overall average
           and the maximum value from all CPUs.
      
         - timerlat hist was printing initial values (i.e., 0 as max, and ~0
           as min) if the trace stopped before the first Ret-User event. This
           problem was fixed by printing the " - " no value string to the
           output if that was the case.
      
        For all RTLA tools:
      
         - Add a --warm-up <seconds> option, allowing the workload to run for
           <seconds> before starting to collect results.
      
         - Add a --trace-buffer-size option, allowing the user to set the
           tracing buffer size for -t option. This option is mainly useful for
           reducing the trace file. Now rtla depends on libtracefs >= 1.6.
      
         - Fix the -t [trace_file] parsing, now it does not require the '='
           before the option parameter, and better handles the multiple ways a
           user can pass the trace_file.txt"
      
      * tag 'trace-tools-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        rtla: Documentation: Fix -t, --trace
        rtla: Fix -t\--trace[=file]
        rtla/timerlat: Fix histogram report when a cpu count is 0
        rtla: Add --trace-buffer-size option
        rtla/timerlat: Make user-space threads the default
        rtla: Add the --warm-up option
        rtla/timerlat: Add a summary for hist mode
        rtla/timerlat: Add a summary for top mode
        rtla/timerlat: Use pretty formatting only on interactive tty
        rtla/auto-analysis: Replace \t with spaces
        rtla/timerlat: Simplify "no value" printing on top
      26aa834f
    • Linus Torvalds's avatar
      Merge tag 'trace-user-events-v6.10' of... · fa3889d9
      Linus Torvalds authored
      Merge tag 'trace-user-events-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
      
      Pull tracing user-event updates from Steven Rostedt:
      
       - Minor update to the user_events interface
      
        The ABI of creating a user event states that the fields are separated
        by semicolons, and spaces should be ignored.
      
        But the parsing expected at least one space to be there (which was
        incorrect). Fix the reading of the string to handle fields separated
        by semicolons but no space between them.
      
        This does extend the API sightly as now "field;field" will now be
        parsed and not cause an error. But it should not cause any regressions
        as no logic should expect it to fail.
      
        Note, that the logic that parses the event fields to create the
        trace_event works with no spaces after the semi-colon. It is
        the logic that tests against existing events that is inconsistent.
        This causes registering an event without using spaces to succeed
        if it doesn't exist, but makes the same call that tries to register
        to the same event, but doesn't use spaces, fail.
      
      * tag 'trace-user-events-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        selftests/user_events: Add non-spacing separator check
        tracing/user_events: Fix non-spaced field matching
      fa3889d9
    • Linus Torvalds's avatar
      Merge tag 'trace-ringbuffer-v6.10' of... · 53683e40
      Linus Torvalds authored
      Merge tag 'trace-ringbuffer-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
      
      Pull tracing ring buffer updates from Steven Rostedt:
       "Add ring_buffer memory mappings.
      
        The tracing ring buffer was created based on being mostly used with
        the splice system call. It is broken up into page ordered sub-buffers
        and the reader swaps a new sub-buffer with an existing sub-buffer
        that's part of the write buffer. It then has total access to the
        swapped out sub-buffer and can do copyless movements of the memory
        into other mediums (file system, network, etc).
      
        The buffer is great for passing around the ring buffer contents in the
        kernel, but is not so good for when the consumer is the user space
        task itself.
      
        A new interface is added that allows user space to memory map the ring
        buffer. It will get all the write sub-buffers as well as reader
        sub-buffer (that is not written to). It can send an ioctl to change
        which sub-buffer is the new reader sub-buffer.
      
        The ring buffer is read only to user space. It only needs to call the
        ioctl when it is finished with a sub-buffer and needs a new sub-buffer
        that the writer will not write over.
      
        A self test program was also created for testing and can be used as an
        example for the interface to user space. The libtracefs (external to
        the kernel) also has code that interacts with this, although it is
        disabled until the interface is in a official release. It can be
        enabled by compiling the library with a special flag. This was used
        for testing applications that perform better with the buffer being
        mapped.
      
        Memory mapped buffers have limitations. The main one is that it can
        not be used with the snapshot logic. If the buffer is mapped,
        snapshots will be disabled. If any logic is set to trigger snapshots
        on a buffer, that buffer will not be allowed to be mapped"
      
      * tag 'trace-ringbuffer-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        ring-buffer: Add cast to unsigned long addr passed to virt_to_page()
        ring-buffer: Have mmapped ring buffer keep track of missed events
        ring-buffer/selftest: Add ring-buffer mapping test
        Documentation: tracing: Add ring-buffer mapping
        tracing: Allow user-space mapping of the ring-buffer
        ring-buffer: Introducing ring-buffer mapping functions
        ring-buffer: Allocate sub-buffers with __GFP_COMP
      53683e40
    • Linus Torvalds's avatar
      Merge tag 'trace-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 594d2815
      Linus Torvalds authored
      Pull tracing updates from Steven Rostedt:
      
       - Remove unused ftrace_direct_funcs variables
      
       - Fix a possible NULL pointer dereference race in eventfs
      
       - Update do_div() usage in trace event benchmark test
      
       - Speedup direct function registration with asynchronous RCU callback.
      
         The synchronization was done in the registration code and this caused
         delays when registering direct callbacks. Move the freeing to a
         call_rcu() that will prevent delaying of the registering.
      
       - Replace simple_strtoul() usage with kstrtoul()
      
      * tag 'trace-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        eventfs: Fix a possible null pointer dereference in eventfs_find_events()
        ftrace: Fix possible use-after-free issue in ftrace_location()
        ftrace: Remove unused global 'ftrace_direct_func_count'
        ftrace: Remove unused list 'ftrace_direct_funcs'
        tracing: Improve benchmark test performance by using do_div()
        ftrace: Use asynchronous grace period for register_ftrace_direct()
        ftrace: Replaces simple_strtoul in ftrace
      594d2815
    • Linus Torvalds's avatar
      Merge tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 70a66320
      Linus Torvalds authored
      Pull probes updates from Masami Hiramatsu:
      
       - tracing/probes: Add new pseudo-types %pd and %pD support for dumping
         dentry name from 'struct dentry *' and file name from 'struct file *'
      
       - uprobes performance optimizations:
          - Speed up the BPF uprobe event by delaying the fetching of the
            uprobe event arguments that are not used in BPF
          - Avoid locking by speculatively checking whether uprobe event is
            valid
          - Reduce lock contention by using read/write_lock instead of
            spinlock for uprobe list operation. This improved BPF uprobe
            benchmark result 43% on average
      
       - rethook: Remove non-fatal warning messages when tracing stack from
         BPF and skip rcu_is_watching() validation in rethook if possible
      
       - objpool: Optimize objpool (which is used by kretprobes and fprobe as
         rethook backend storage) by inlining functions and avoid caching
         nr_cpu_ids because it is a const value
      
       - fprobe: Add entry/exit callbacks types (code cleanup)
      
       - kprobes: Check ftrace was killed in kprobes if it uses ftrace
      
      * tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        kprobe/ftrace: bail out if ftrace was killed
        selftests/ftrace: Fix required features for VFS type test case
        objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids
        objpool: enable inlining objpool_push() and objpool_pop() operations
        rethook: honor CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING in rethook_try_get()
        ftrace: make extra rcu_is_watching() validation check optional
        uprobes: reduce contention on uprobes_tree access
        rethook: Remove warning messages printed for finding return address of a frame.
        fprobe: Add entry/exit callbacks types
        selftests/ftrace: add fprobe test cases for VFS type "%pd" and "%pD"
        selftests/ftrace: add kprobe test cases for VFS type "%pd" and "%pD"
        Documentation: tracing: add new type '%pd' and '%pD' for kprobe
        tracing/probes: support '%pD' type for print struct file's name
        tracing/probes: support '%pd' type for print struct dentry's name
        uprobes: add speculative lockless system-wide uprobe filter check
        uprobes: prepare uprobe args buffer lazily
        uprobes: encapsulate preparation of uprobe args buffer
      70a66320
    • Linus Torvalds's avatar
      Merge tag 'bootconfig-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · e9d68251
      Linus Torvalds authored
      Pull bootconfig updates from Masami Hiramatsu:
      
       - Do not put unneeded quotes on the extra command line items which was
         inserted from the bootconfig.
      
       - Remove redundant spaces from the extra command line.
      
      * tag 'bootconfig-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        init/main.c: Minor cleanup for the setup_command_line() function
        init/main.c: Remove redundant space from saved_command_line
        bootconfig: do not put quotes on cmdline items unless necessary
      e9d68251
    • Linus Torvalds's avatar
      Merge tag 'sysctl-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl · 91b6163b
      Linus Torvalds authored
      Pull sysctl updates from Joel Granados:
      
       - Remove sentinel elements from ctl_table structs in kernel/*
      
         Removing sentinels in ctl_table arrays reduces the build time size
         and runtime memory consumed by ~64 bytes per array. Removals for
         net/, io_uring/, mm/, ipc/ and security/ are set to go into mainline
         through their respective subsystems making the next release the most
         likely place where the final series that removes the check for
         proc_name == NULL will land.
      
         This adds to removals already in arch/, drivers/ and fs/.
      
       - Adjust ctl_table definitions and references to allow constification
           - Remove unused ctl_table function arguments
           - Move non-const elements from ctl_table to ctl_table_header
           - Make ctl_table pointers const in ctl_table_root structure
      
         Making the static ctl_table structs const will increase safety by
         keeping the pointers to proc_handler functions in .rodata. Though no
         ctl_tables where made const in this PR, the ground work for making
         that possible has started with these changes sent by Thomas
         Weißschuh.
      
      * tag 'sysctl-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
        sysctl: drop now unnecessary out-of-bounds check
        sysctl: move sysctl type to ctl_table_header
        sysctl: drop sysctl_is_perm_empty_ctl_table
        sysctl: treewide: constify argument ctl_table_root::permissions(table)
        sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table)
        bpf: Remove the now superfluous sentinel elements from ctl_table array
        delayacct: Remove the now superfluous sentinel elements from ctl_table array
        kprobes: Remove the now superfluous sentinel elements from ctl_table array
        printk: Remove the now superfluous sentinel elements from ctl_table array
        scheduler: Remove the now superfluous sentinel elements from ctl_table array
        seccomp: Remove the now superfluous sentinel elements from ctl_table array
        timekeeping: Remove the now superfluous sentinel elements from ctl_table array
        ftrace: Remove the now superfluous sentinel elements from ctl_table array
        umh: Remove the now superfluous sentinel elements from ctl_table array
        kernel misc: Remove the now superfluous sentinel elements from ctl_table array
      91b6163b
    • Linus Torvalds's avatar
      Merge tag 'devicetree-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 06f054b1
      Linus Torvalds authored
      Pull devicetree updates from Rob Herring:
       "DT Bindings:
      
         - Convert samsung,exynos5-dp, atmel,lcdc, aspeed,ast2400-wdt bindings
           to schemas
      
         - Add bindings for Allwinner H616 NMI controller, Renesas r8a779g0
           irqc, Renesas R-Car V4M TMU and CMT timers, Freescale S32G3
           linflexuart, and Mediatek MT7988 XHCI
      
         - Add 'reg' constraints on DSI and SPI display panels
      
         - More dropping of unnecessary quotes in schemas
      
         - Use full paths rather than relative paths in schema $refs
      
         - Drop redundant storing of phandle for reserved memory
      
        DT Core:
      
         - Use scope based cleanups for kfree() and of_node_put()
      
         - Track interrupt-map and power-supplies for fw_devlink
      
         - Add buffer overflow check in of_modalias()
      
         - Add and use __of_prop_free() helper for freeing struct property"
      
      * tag 'devicetree-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (25 commits)
        of: property: Add fw_devlink support for interrupt-map property
        dt-bindings: display: panel: constrain 'reg' in DSI panels
        dt-bindings: display: panel: constrain 'reg' in SPI panels
        dt-bindings: display: samsung,ams495qa01: add missing SPI properties ref
        dt-bindings: Use full path to other schemas
        dt-bindings: PCI: qcom,pcie-sm8350: Drop redundant 'oneOf' sub-schema
        of: module: add buffer overflow check in of_modalias()
        dt-bindings: PCI: microchip: increase number of items in ranges property
        dt-bindings: Drop unnecessary quotes on keys
        dt-bindings: interrupt-controller: mediatek,mt6577-sysirq: Drop unnecessary quotes
        of: property: Use scope based cleanup on port_node
        of: reserved_mem: Remove the use of phandle from the reserved_mem APIs
        of: property: fw_devlink: Add support for "power-supplies" binding
        dt-bindings: watchdog: aspeed,ast2400-wdt: Convert to DT schema
        dt-bindings: irq: sun7i-nmi: Add binding for the H616 NMI controller
        dt-bindings: interrupt-controller: renesas,irqc: Add r8a779g0 support
        dt-bindings: timer: renesas,tmu: Add R-Car V4M support
        dt-bindings: timer: renesas,cmt: Add R-Car V4M support
        of: Use scope based of_node_put() cleanups
        of: Use scope based kfree() cleanups
        ...
      06f054b1
  5. 17 May, 2024 20 commits
    • Jakub Kicinski's avatar
      selftests: net: local_termination: annotate the expected failures · fe56d6e4
      Jakub Kicinski authored
      Vladimir said when adding this test:
      
        The bridge driver fares particularly badly [...] mainly because
        it does not implement IFF_UNICAST_FLT.
      
      See commit 90b9566a ("selftests: forwarding: add a test for
      local_termination.sh").
      
      We don't want to hide the known gaps, but having a test which
      always fails prevents us from catching regressions. Report
      the cases we know may fail as XFAIL.
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Reviewed-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: default avatarPetr Machata <petrm@nvidia.com>
      Link: https://lore.kernel.org/r/20240516152513.1115270-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      fe56d6e4
    • Oleksij Rempel's avatar
      net: dsa: microchip: Correct initialization order for KSZ88x3 ports · f0fa8411
      Oleksij Rempel authored
      Adjust the initialization sequence of KSZ88x3 switches to enable
      802.1p priority control on Port 2 before configuring Port 1. This
      change ensures the apptrust functionality on Port 1 operates
      correctly, as it depends on the priority settings of Port 2. The
      prior initialization sequence incorrectly configured Port 1 first,
      which could lead to functional discrepancies.
      
      Fixes: a1ea5771 ("net: dsa: microchip: dcb: add special handling for KSZ88X3 family")
      Signed-off-by: default avatarOleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: default avatarHariprasad Kelam <hkelam@marvell.com>
      Acked-by: default avatarArun Ramadoss <arun.ramadoss@microchip.com>
      Link: https://lore.kernel.org/r/20240517050121.2174412-1-o.rempel@pengutronix.deSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      f0fa8411
    • Ravi Gunasekaran's avatar
    • Ravi Gunasekaran's avatar
      dt-bindings: net: ti: Update maintainers list · ce08eeb5
      Ravi Gunasekaran authored
      Update the list with the current maintainers of TI's CPSW ethernet
      peripheral.
      Signed-off-by: default avatarRavi Gunasekaran <r-gunasekaran@ti.com>
      Acked-by: default avatarConor Dooley <conor.dooley@microchip.com>
      Acked-by: default avatarRoger Quadros <rogerq@kernel.org>
      Link: https://lore.kernel.org/r/20240516054932.27597-1-r-gunasekaran@ti.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ce08eeb5
    • Tom Parkin's avatar
      l2tp: fix ICMP error handling for UDP-encap sockets · 6e828dc6
      Tom Parkin authored
      Since commit a36e185e
      ("udp: Handle ICMP errors for tunnels with same destination port on both endpoints")
      UDP's handling of ICMP errors has allowed for UDP-encap tunnels to
      determine socket associations in scenarios where the UDP hash lookup
      could not.
      
      Subsequently, commit d26796ae
      ("udp: check udp sock encap_type in __udp_lib_err")
      subtly tweaked the approach such that UDP ICMP error handling would be
      skipped for any UDP socket which has encapsulation enabled.
      
      In the case of L2TP tunnel sockets using UDP-encap, this latter
      modification effectively broke ICMP error reporting for the L2TP
      control plane.
      
      To a degree this isn't catastrophic inasmuch as the L2TP control
      protocol defines a reliable transport on top of the underlying packet
      switching network which will eventually detect errors and time out.
      
      However, paying attention to the ICMP error reporting allows for more
      timely detection of errors in L2TP userspace, and aids in debugging
      connectivity issues.
      
      Reinstate ICMP error handling for UDP encap L2TP tunnels:
      
       * implement struct udp_tunnel_sock_cfg .encap_err_rcv in order to allow
         the L2TP code to handle ICMP errors;
      
       * only implement error-handling for tunnels which have a managed
         socket: unmanaged tunnels using a kernel socket have no userspace to
         report errors back to;
      
       * flag the error on the socket, which allows for userspace to get an
         error such as -ECONNREFUSED back from sendmsg/recvmsg;
      
       * pass the error into ip[v6]_icmp_error() which allows for userspace to
         get extended error information via. MSG_ERRQUEUE.
      
      Fixes: d26796ae ("udp: check udp sock encap_type in __udp_lib_err")
      Signed-off-by: default avatarTom Parkin <tparkin@katalix.com>
      Link: https://lore.kernel.org/r/20240513172248.623261-1-tparkin@katalix.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      6e828dc6
    • Linus Torvalds's avatar
      Merge tag 'parisc-for-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 7ee332c9
      Linus Torvalds authored
      Pull parisc updates from Helge Deller:
      
       -  define sigset_t in parisc uapi header to fix build of util-linux
      
       -  define HAVE_ARCH_HUGETLB_UNMAPPED_AREA to avoid compiler warning
      
       -  drop unused 'exc_reg' struct in math-emu code
      
      * tag 'parisc-for-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Define HAVE_ARCH_HUGETLB_UNMAPPED_AREA
        parisc/math-emu: Remove unused struct 'exc_reg'
        parisc: Define sigset_t in parisc uapi header
      7ee332c9
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · ff2632d7
      Linus Torvalds authored
      Pull powerpc updates from Michael Ellerman:
      
       - Enable BPF Kernel Functions (kfuncs) in the powerpc BPF JIT.
      
       - Allow per-process DEXCR (Dynamic Execution Control Register) settings
         via prctl, notably NPHIE which controls hashst/hashchk for ROP
         protection.
      
       - Install powerpc selftests in sub-directories. Note this changes the
         way run_kselftest.sh needs to be invoked for powerpc selftests.
      
       - Change fadump (Firmware Assisted Dump) to better handle memory
         add/remove.
      
       - Add support for passing additional parameters to the fadump kernel.
      
       - Add support for updating the kdump image on CPU/memory add/remove
         events.
      
       - Other small features, cleanups and fixes.
      
      Thanks to Andrew Donnellan, Andy Shevchenko, Aneesh Kumar K.V, Arnd
      Bergmann, Benjamin Gray, Bjorn Helgaas, Christian Zigotzky, Christophe
      Jaillet, Christophe Leroy, Colin Ian King, Cédric Le Goater, Dr. David
      Alan Gilbert, Erhard Furtner, Frank Li, GUO Zihua, Ganesh Goudar, Geoff
      Levand, Ghanshyam Agrawal, Greg Kurz, Hari Bathini, Joel Stanley, Justin
      Stitt, Kunwu Chan, Li Yang, Lidong Zhong, Madhavan Srinivasan, Mahesh
      Salgaonkar, Masahiro Yamada, Matthias Schiffer, Naresh Kamboju, Nathan
      Chancellor, Nathan Lynch, Naveen N Rao, Nicholas Miehlbradt, Ran Wang,
      Randy Dunlap, Ritesh Harjani, Sachin Sant, Shirisha Ganta, Shrikanth
      Hegde, Sourabh Jain, Stephen Rothwell, sundar, Thorsten Blum, Vaibhav
      Jain, Xiaowei Bao, Yang Li, and Zhao Chenhui.
      
      * tag 'powerpc-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (85 commits)
        powerpc/fadump: Fix section mismatch warning
        powerpc/85xx: fix compile error without CONFIG_CRASH_DUMP
        powerpc/fadump: update documentation about bootargs_append
        powerpc/fadump: pass additional parameters when fadump is active
        powerpc/fadump: setup additional parameters for dump capture kernel
        powerpc/pseries/fadump: add support for multiple boot memory regions
        selftests/powerpc/dexcr: Fix spelling mistake "predicition" -> "prediction"
        KVM: PPC: Book3S HV nestedv2: Fix an error handling path in gs_msg_ops_kvmhv_nestedv2_config_fill_info()
        KVM: PPC: Fix documentation for ppc mmu caps
        KVM: PPC: code cleanup for kvmppc_book3s_irqprio_deliver
        KVM: PPC: Book3S HV nestedv2: Cancel pending DEC exception
        powerpc/xmon: Check cpu id in commands "c#", "dp#" and "dx#"
        powerpc/code-patching: Use dedicated memory routines for patching
        powerpc/code-patching: Test patch_instructions() during boot
        powerpc64/kasan: Pass virtual addresses to kasan_init_phys_region()
        powerpc: rename SPRN_HID2 define to SPRN_HID2_750FX
        powerpc: Fix typos
        powerpc/eeh: Fix spelling of the word "auxillary" and update comment
        macintosh/ams: Fix unused variable warning
        powerpc/Makefile: Remove bits related to the previous use of -mcmodel=large
        ...
      ff2632d7
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux · 4853f1f6
      Linus Torvalds authored
      Pull ARM updates from Russell King:
      
       - Updates to AMBA bus subsystem to drop .owner struct device_driver
         initialisations, moving that to code instead.
      
       - Add LPAE privileged-access-never support
      
       - Add support for Clang CFI
      
       - clkdev: report over-sized device or connection strings
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: (36 commits)
        ARM: 9398/1: Fix userspace enter on LPAE with CC_OPTIMIZE_FOR_SIZE=y
        clkdev: report over-sized strings when creating clkdev entries
        ARM: 9393/1: mm: Use conditionals for CFI branches
        ARM: 9392/2: Support CLANG CFI
        ARM: 9391/2: hw_breakpoint: Handle CFI breakpoints
        ARM: 9390/2: lib: Annotate loop delay instructions for CFI
        ARM: 9389/2: mm: Define prototypes for all per-processor calls
        ARM: 9388/2: mm: Type-annotate all per-processor assembly routines
        ARM: 9387/2: mm: Rewrite cacheflush vtables in CFI safe C
        ARM: 9386/2: mm: Use symbol alias for cache functions
        ARM: 9385/2: mm: Type-annotate all cache assembly routines
        ARM: 9384/2: mm: Make tlbflush routines CFI safe
        ARM: 9382/1: ftrace: Define ftrace_stub_graph
        ARM: 9358/2: Implement PAN for LPAE by TTBR0 page table walks disablement
        ARM: 9357/2: Reduce the number of #ifdef CONFIG_CPU_SW_DOMAIN_PAN
        ARM: 9356/2: Move asm statements accessing TTBCR into C functions
        ARM: 9355/2: Add TTBCR_* definitions to pgtable-3level-hwdef.h
        ARM: 9379/1: coresight: tpda: drop owner assignment
        ARM: 9378/1: coresight: etm4x: drop owner assignment
        ARM: 9377/1: hwrng: nomadik: drop owner assignment
        ...
      4853f1f6
    • David S. Miller's avatar
      Merge branch 'wangxun-fixes' · f6f25eeb
      David S. Miller authored
      Jiawen Wu says:
      
      ====================
      Wangxun fixes
      
      Fixed some bugs when using ethtool to operate network devices.
      
      v4 -> v5:
      - Simplify if...else... to fix features.
      
      v3 -> v4:
      - Require both ctag and stag to be enabled or disabled.
      
      v2 -> v3:
      - Drop the first patch.
      
      v1 -> v2:
      - Factor out the same code.
      - Remove statistics printing with more than 64 queues.
      - Detail the commit logs to describe issues.
      - Remove reset flag check in wx_update_stats().
      - Change to set VLAN CTAG and STAG to be consistent.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f6f25eeb
    • Jiawen Wu's avatar
      net: txgbe: fix to control VLAN strip · 1d3c6414
      Jiawen Wu authored
      When VLAN tag strip is changed to enable or disable, the hardware requires
      the Rx ring to be in a disabled state, otherwise the feature cannot be
      changed.
      
      Fixes: f3b03c65 ("net: wangxun: Implement vlan add and kill functions")
      Signed-off-by: default avatarJiawen Wu <jiawenwu@trustnetic.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1d3c6414
    • Jiawen Wu's avatar
      net: wangxun: match VLAN CTAG and STAG features · ac71ab78
      Jiawen Wu authored
      Hardware requires VLAN CTAG and STAG configuration always matches. And
      whether VLAN CTAG or STAG changes, the configuration needs to be changed
      as well.
      
      Fixes: 6670f1ec ("net: txgbe: Add netdev features support")
      Signed-off-by: default avatarJiawen Wu <jiawenwu@trustnetic.com>
      Reviewed-by: default avatarSai Krishna <saikrishnag@marvell.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ac71ab78
    • Jiawen Wu's avatar
      net: wangxun: fix to change Rx features · 68067f06
      Jiawen Wu authored
      Fix the issue where some Rx features cannot be changed.
      
      When using ethtool -K to turn off rx offload, it returns error and
      displays "Could not change any device features". And netdev->features
      is not assigned a new value to actually configure the hardware.
      
      Fixes: 6dbedcff ("net: libwx: Implement xx_set_features ops")
      Signed-off-by: default avatarJiawen Wu <jiawenwu@trustnetic.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      68067f06
    • Eric Dumazet's avatar
      af_packet: do not call packet_read_pending() from tpacket_destruct_skb() · 581073f6
      Eric Dumazet authored
      trafgen performance considerably sank on hosts with many cores
      after the blamed commit.
      
      packet_read_pending() is very expensive, and calling it
      in af_packet fast path defeats Daniel intent in commit
      b0138408 ("packet: use percpu mmap tx frame pending refcount")
      
      tpacket_destruct_skb() makes room for one packet, we can immediately
      wakeup a producer, no need to completely drain the tx ring.
      
      Fixes: 89ed5b51 ("af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: default avatarWillem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20240515163358.4105915-1-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      581073f6
    • Daniel Jurgens's avatar
      virtio_net: Fix missed rtnl_unlock · fa033def
      Daniel Jurgens authored
      The rtnl_lock would stay locked if allocating promisc_allmulti failed.
      Also changed the allocation to GFP_KERNEL.
      
      Fixes: ff7c7d9f ("virtio_net: Remove command data from control_buf")
      Reported-by: default avatarEric Dumazet <edumaset@google.com>
      Link: https://lore.kernel.org/netdev/CANn89iLazVaUCvhPm6RPJJ0owra_oFnx7Fhc8d60gV-65ad3WQ@mail.gmail.com/Signed-off-by: default avatarDaniel Jurgens <danielj@nvidia.com>
      Reviewed-by: default avatarBrett Creeley <brett.creeley@amd.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Link: https://lore.kernel.org/r/20240515163125.569743-1-danielj@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      fa033def
    • Eric Dumazet's avatar
      netrom: fix possible dead-lock in nr_rt_ioctl() · e03e7f20
      Eric Dumazet authored
      syzbot loves netrom, and found a possible deadlock in nr_rt_ioctl [1]
      
      Make sure we always acquire nr_node_list_lock before nr_node_lock(nr_node)
      
      [1]
      WARNING: possible circular locking dependency detected
      6.9.0-rc7-syzkaller-02147-g654de42f #0 Not tainted
      ------------------------------------------------------
      syz-executor350/5129 is trying to acquire lock:
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: nr_node_lock include/net/netrom.h:152 [inline]
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: nr_dec_obs net/netrom/nr_route.c:464 [inline]
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: nr_rt_ioctl+0x1bb/0x1090 net/netrom/nr_route.c:697
      
      but task is already holding lock:
       ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
       ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_dec_obs net/netrom/nr_route.c:462 [inline]
       ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_rt_ioctl+0x10a/0x1090 net/netrom/nr_route.c:697
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (nr_node_list_lock){+...}-{2:2}:
              lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
              __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
              _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
              spin_lock_bh include/linux/spinlock.h:356 [inline]
              nr_remove_node net/netrom/nr_route.c:299 [inline]
              nr_del_node+0x4b4/0x820 net/netrom/nr_route.c:355
              nr_rt_ioctl+0xa95/0x1090 net/netrom/nr_route.c:683
              sock_do_ioctl+0x158/0x460 net/socket.c:1222
              sock_ioctl+0x629/0x8e0 net/socket.c:1341
              vfs_ioctl fs/ioctl.c:51 [inline]
              __do_sys_ioctl fs/ioctl.c:904 [inline]
              __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
              do_syscall_x64 arch/x86/entry/common.c:52 [inline]
              do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
             entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      -> #0 (&nr_node->node_lock){+...}-{2:2}:
              check_prev_add kernel/locking/lockdep.c:3134 [inline]
              check_prevs_add kernel/locking/lockdep.c:3253 [inline]
              validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
              __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
              lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
              __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
              _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
              spin_lock_bh include/linux/spinlock.h:356 [inline]
              nr_node_lock include/net/netrom.h:152 [inline]
              nr_dec_obs net/netrom/nr_route.c:464 [inline]
              nr_rt_ioctl+0x1bb/0x1090 net/netrom/nr_route.c:697
              sock_do_ioctl+0x158/0x460 net/socket.c:1222
              sock_ioctl+0x629/0x8e0 net/socket.c:1341
              vfs_ioctl fs/ioctl.c:51 [inline]
              __do_sys_ioctl fs/ioctl.c:904 [inline]
              __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
              do_syscall_x64 arch/x86/entry/common.c:52 [inline]
              do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
             entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(nr_node_list_lock);
                                     lock(&nr_node->node_lock);
                                     lock(nr_node_list_lock);
        lock(&nr_node->node_lock);
      
       *** DEADLOCK ***
      
      1 lock held by syz-executor350/5129:
        #0: ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
        #0: ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_dec_obs net/netrom/nr_route.c:462 [inline]
        #0: ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_rt_ioctl+0x10a/0x1090 net/netrom/nr_route.c:697
      
      stack backtrace:
      CPU: 0 PID: 5129 Comm: syz-executor350 Not tainted 6.9.0-rc7-syzkaller-02147-g654de42f #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
      Call Trace:
       <TASK>
        __dump_stack lib/dump_stack.c:88 [inline]
        dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
        check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
        check_prev_add kernel/locking/lockdep.c:3134 [inline]
        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
        validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
        __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
        __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
        _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
        spin_lock_bh include/linux/spinlock.h:356 [inline]
        nr_node_lock include/net/netrom.h:152 [inline]
        nr_dec_obs net/netrom/nr_route.c:464 [inline]
        nr_rt_ioctl+0x1bb/0x1090 net/netrom/nr_route.c:697
        sock_do_ioctl+0x158/0x460 net/socket.c:1222
        sock_ioctl+0x629/0x8e0 net/socket.c:1341
        vfs_ioctl fs/ioctl.c:51 [inline]
        __do_sys_ioctl fs/ioctl.c:904 [inline]
        __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240515142934.3708038-1-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e03e7f20
    • Michal Schmidt's avatar
      idpf: don't skip over ethtool tcp-data-split setting · 67708158
      Michal Schmidt authored
      Disabling tcp-data-split on idpf silently fails:
        # ethtool -G $NETDEV tcp-data-split off
        # ethtool -g $NETDEV | grep 'TCP data split'
        TCP data split:        on
      
      But it works if you also change 'tx' or 'rx':
        # ethtool -G $NETDEV tcp-data-split off tx 256
        # ethtool -g $NETDEV | grep 'TCP data split'
        TCP data split:        off
      
      The bug is in idpf_set_ringparam, where it takes a shortcut out if the
      TX and RX sizes are not changing. Fix it by checking also if the
      tcp-data-split setting remains unchanged. Only then can the soft reset
      be skipped.
      
      Fixes: 9b1aa3ef ("idpf: add get/set for Ethtool's header split ringparam")
      Reported-by: default avatarXu Du <xudu@redhat.com>
      Closes: https://issues.redhat.com/browse/RHEL-36182Signed-off-by: default avatarMichal Schmidt <mschmidt@redhat.com>
      Reviewed-by: default avatarAlexander Lobakin <aleksander.lobakin@intel.com>
      Link: https://lore.kernel.org/r/20240515092414.158079-1-mschmidt@redhat.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      67708158
    • Sagar Cheluvegowda's avatar
    • Tony Battersby's avatar
      bonding: fix oops during rmmod · a45835a0
      Tony Battersby authored
      "rmmod bonding" causes an oops ever since commit cc317ea3 ("bonding:
      remove redundant NULL check in debugfs function").  Here are the relevant
      functions being called:
      
      bonding_exit()
        bond_destroy_debugfs()
          debugfs_remove_recursive(bonding_debug_root);
          bonding_debug_root = NULL; <--------- SET TO NULL HERE
        bond_netlink_fini()
          rtnl_link_unregister()
            __rtnl_link_unregister()
              unregister_netdevice_many_notify()
                bond_uninit()
                  bond_debug_unregister()
                    (commit removed check for bonding_debug_root == NULL)
                    debugfs_remove()
                    simple_recursive_removal()
                      down_write() -> OOPS
      
      However, reverting the bad commit does not solve the problem completely
      because the original code contains a race that could cause the same
      oops, although it was much less likely to be triggered unintentionally:
      
      CPU1
        rmmod bonding
          bonding_exit()
            bond_destroy_debugfs()
              debugfs_remove_recursive(bonding_debug_root);
      
      CPU2
        echo -bond0 > /sys/class/net/bonding_masters
          bond_uninit()
            bond_debug_unregister()
              if (!bonding_debug_root)
      
      CPU1
              bonding_debug_root = NULL;
      
      So do NOT revert the bad commit (since the removed checks were racy
      anyway), and instead change the order of actions taken during module
      removal.  The same oops can also happen if there is an error during
      module init, so apply the same fix there.
      
      Fixes: cc317ea3 ("bonding: remove redundant NULL check in debugfs function")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarTony Battersby <tonyb@cybernetics.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Acked-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Link: https://lore.kernel.org/r/641f914f-3216-4eeb-87dd-91b78aa97773@cybernetics.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      a45835a0
    • xu xin's avatar
      net/ipv6: Fix route deleting failure when metric equals 0 · bb487272
      xu xin authored
      Problem
      =========
      After commit 67f69513 ("ipv6: Move setting default metric for routes"),
      we noticed that the logic of assigning the default value of fc_metirc
      changed in the ioctl process. That is, when users use ioctl(fd, SIOCADDRT,
      rt) with a non-zero metric to add a route,  then they may fail to delete a
      route with passing in a metric value of 0 to the kernel by ioctl(fd,
      SIOCDELRT, rt). But iproute can succeed in deleting it.
      
      As a reference, when using iproute tools by netlink to delete routes with
      a metric parameter equals 0, like the command as follows:
      
      	ip -6 route del fe80::/64 via fe81::5054:ff:fe11:3451 dev eth0 metric 0
      
      the user can still succeed in deleting the route entry with the smallest
      metric.
      
      Root Reason
      ===========
      After commit 67f69513 ("ipv6: Move setting default metric for routes"),
      When ioctl() pass in SIOCDELRT with a zero metric, rtmsg_to_fib6_config()
      will set a defalut value (1024) to cfg->fc_metric in kernel, and in
      ip6_route_del() and the line 4074 at net/ipv3/route.c, it will check by
      
      	if (cfg->fc_metric && cfg->fc_metric != rt->fib6_metric)
      		continue;
      
      and the condition is true and skip the later procedure (deleting route)
      because cfg->fc_metric != rt->fib6_metric. But before that commit,
      cfg->fc_metric is still zero there, so the condition is false and it
      will do the following procedure (deleting).
      
      Solution
      ========
      In order to keep a consistent behaviour across netlink() and ioctl(), we
      should allow to delete a route with a metric value of 0. So we only do
      the default setting of fc_metric in route adding.
      
      CC: stable@vger.kernel.org # 5.4+
      Fixes: 67f69513 ("ipv6: Move setting default metric for routes")
      Co-developed-by: default avatarFan Yu <fan.yu9@zte.com.cn>
      Signed-off-by: default avatarFan Yu <fan.yu9@zte.com.cn>
      Signed-off-by: default avatarxu xin <xu.xin16@zte.com.cn>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240514201102055dD2Ba45qKbLlUMxu_DTHP@zte.com.cnSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      bb487272
    • Hangbin Liu's avatar
      selftests/net: reduce xfrm_policy test time · 988af276
      Hangbin Liu authored
      The check_random_order test add/get plenty of xfrm rules, which consume
      a lot time on debug kernel and always TIMEOUT. Let's reduce the test
      loop and see if it works.
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20240514095227.2597730-1-liuhangbin@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      988af276