1. 09 Sep, 2022 7 commits
    • Merge tag 'powerpc-6.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 2fc1171d
      Linus Torvalds authored
      Pull powerpc fix from Michael Ellerman:
      
       - Fix crashes on bare metal due to the new plpks driver trying to probe
         and call the hypervisor on non-pseries machines.
      
      Thanks to Nathan Chancellor and Dan Horák.
      
      * tag 'powerpc-6.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/pseries: Fix plpks crash on non-pseries
    • Merge tag 'for-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 9b450949
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "A few more fixes to zoned mode and one regression fix for chunk limit:
      
         - Zoned mode fixes:
            - fix how wait/wake up is done when finishing zone
            - fix zone append limit in emulated mode
            - fix mount on devices with conventional zones
      
         - fix regression, user settable data chunk limit got accidentally
           lowered and causes allocation problems on some profiles (raid0,
           raid1)"
      
      * tag 'for-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: fix the max chunk size and stripe length calculation
        btrfs: zoned: fix mounting with conventional zones
        btrfs: zoned: set pseudo max append zone limit in zone emulation mode
        btrfs: zoned: fix API misuse of zone finish waiting
    • Merge tag 'vfio-v6.0-rc5' of https://github.com/awilliam/linux-vfio · 725f3f3b
      Linus Torvalds authored
      Pull VFIO fix from Alex Williamson:
      
       - Fix zero page refcount leak (Alex Williamson)
      
      * tag 'vfio-v6.0-rc5' of https://github.com/awilliam/linux-vfio:
        vfio/type1: Unpin zero pages
    • Merge tag 'sound-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 83dfc0e2
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Lots of small fixes for various drivers at this time, hopefully it
        will be the last big bump before 6.0 release.
      
        The significant changes are regression fixes for (yet again) HD-audio
        memory allocations and USB-audio PCM parameter handling, while there
        are many small ASoC device-specific fixes as well as a few
        out-of-bounds and race issues spotted by fuzzers"
      
      * tag 'sound-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits)
        ALSA: usb-audio: Clear fixed clock rate at closing EP
        ALSA: emu10k1: Fix out of bounds access in snd_emu10k1_pcm_channel_alloc()
        ALSA: hda: Once again fix regression of page allocations with IOMMU
        ALSA: usb-audio: Fix an out-of-bounds bug in __snd_usb_parse_audio_interface()
        ALSA: hda/tegra: Align BDL entry to 4KB boundary
        ALSA: hda/sigmatel: Fix unused variable warning for beep power change
        ALSA: pcm: oss: Fix race at SNDCTL_DSP_SYNC
        ALSA: hda/sigmatel: Keep power up while beep is enabled
        ALSA: aloop: Fix random zeros in capture data when using jiffies timer
        ALSA: usb-audio: Split endpoint setups for hw_params and prepare
        ALSA: usb-audio: Register card again for iface over delayed_register option
        ALSA: usb-audio: Inform the delayed registration more properly
        ASoC: fsl_aud2htx: Add error handler for pm_runtime_enable
        ASoC: fsl_aud2htx: register platform component before registering cpu dai
        ASoC: SOF: ipc4-topology: fix alh_group_ida max value
        ASoC: mchp-spdiftx: Fix clang -Wbitfield-constant-conversion
        ASoC: SOF: Kconfig: Make IPC_MESSAGE_INJECTOR depend on SND_SOC_SOF
        ASoC: SOF: Kconfig: Make IPC_FLOOD_TEST depend on SND_SOC_SOF
        ASoC: fsl_mqs: Fix supported clock DAI format
        ASoC: nau8540: Implement hw constraint for rates
        ...
    • Merge tag 'perf-tools-fixes-for-v6.0-2022-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux · d8a450a8
      Linus Torvalds authored
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Fix per-thread mmaps for multi-threaded targets, noticed with
         'perf top --pid' with multithreaded targets
      
       - Fix synthesis failure warnings in 'perf record'
      
       - Fix L2 Topdown metrics disappearance for raw events in 'perf stat'
      
       - Fix out of bound access in some CPU masks
      
       - Fix segfault if there is no CPU PMU table and a metric is sought,
         noticed when building with NO_JEVENTS=1
      
       - Skip dummy event attr check in 'perf script' fixing nonsensical
         warning about UREGS attribute not set, as 'dummy' events have no
         samples
      
       - Fix 'iregs' field handling with dummy events on hybrid systems in
         'perf script'
      
       - Prevent potential memory leak in c2c_he_zalloc() in 'perf c2c'
      
       - Don't install data files with x permissions
      
       - Fix types for print format in dlfilter-show-cycles
      
       - Switch deprecated openssl MD5_* functions to new EVP API in 'genelf'
      
       - Remove redundant word 'contention' in 'perf lock' help message
      
      * tag 'perf-tools-fixes-for-v6.0-2022-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf record: Fix synthesis failure warnings
        perf tools: Don't install data files with x permissions
        perf script: Fix Cannot print 'iregs' field for hybrid systems
        perf lock: Remove redundant word 'contention' in help message
        perf dlfilter dlfilter-show-cycles: Fix types for print format
        libperf evlist: Fix per-thread mmaps for multi-threaded targets
        perf c2c: Prevent potential memory leak in c2c_he_zalloc()
        perf genelf: Switch deprecated openssl MD5_* functions to new EVP API
        tools/perf: Fix out of bound access to cpu mask array
        perf affinity: Fix out of bound access to "sched_cpus" mask
        perf stat: Fix L2 Topdown metrics disappear for raw events
        perf script: Skip dummy event attr check
        perf metric: Return early if no CPU PMU table exists
    • Merge tag 'trace-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 460a75a6
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Do not stop trace events in modules if TAINT_TEST is set
      
       - Do not clobber mount options when tracefs is mounted a second time
      
       - Prevent crash of kprobes in gate area
      
       - Add static annotation to some non global functions
      
       - Add some entries into the MAINTAINERS file
      
       - Fix check of event_mutex held when accessing trigger list
      
       - Add some __init/__exit annotations
      
       - Fix reporting of what called hardirq_{enable,disable}_ip function
      
      * tag 'trace-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracefs: Only clobber mode/uid/gid on remount if asked
        kprobes: Prohibit probes in gate area
        rv/reactor: add __init/__exit annotations to module init/exit funcs
        tracing: Fix to check event_mutex is held while accessing trigger list
        tracing: hold caller_addr to hardirq_{enable,disable}_ip
        tracepoint: Allow trace events in modules with TAINT_TEST
        MAINTAINERS: add scripts/tracing/ to TRACING
        MAINTAINERS: Add Runtime Verification (RV) entry
        rv/monitors: Make monitor's automata definition static
    • Merge tag 'asm-generic-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic · f448dda8
      Linus Torvalds authored
      
      Pull SOFTIRQ_ON_OWN_STACK rework from Arnd Bergmann:
       "Just one fixup patch, reworking the softirq_on_own_stack logic for
        preempt-rt kernels as discussed in
      
          https://lore.kernel.org/all/CAHk-=wgZSD3W2y6yczad2Am=EfHYyiPzTn3CfXxrriJf9i5W5w@mail.gmail.com/"
      
      * tag 'asm-generic-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
        asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.
  2. 08 Sep, 2022 18 commits
    • tracefs: Only clobber mode/uid/gid on remount if asked · 47311db8
      Brian Norris authored
      Users may have explicitly configured their tracefs permissions; we
      shouldn't overwrite those just because a second mount appeared.
      
      Only clobber if the options were provided at mount time.
      
      Note: the previous behavior was especially surprising in the presence of
      automounted /sys/kernel/debug/tracing/.
      
      Existing behavior:
      
        ## Pre-existing status: tracefs is 0755.
        # stat -c '%A' /sys/kernel/tracing/
        drwxr-xr-x
      
        ## (Re)trigger the automount.
        # umount /sys/kernel/debug/tracing
        # stat -c '%A' /sys/kernel/debug/tracing/.
        drwx------
      
        ## Unexpected: the automount changed mode for other mount instances.
        # stat -c '%A' /sys/kernel/tracing/
        drwx------
      
      New behavior (after this change):
      
        ## Pre-existing status: tracefs is 0755.
        # stat -c '%A' /sys/kernel/tracing/
        drwxr-xr-x
      
        ## (Re)trigger the automount.
        # umount /sys/kernel/debug/tracing
        # stat -c '%A' /sys/kernel/debug/tracing/.
        drwxr-xr-x
      
        ## Expected: the automount does not change other mount instances.
        # stat -c '%A' /sys/kernel/tracing/
        drwxr-xr-x
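
The rule described in this commit (only override mode/uid/gid when the option was actually given at mount time) can be modelled in a few lines. This is a minimal sketch, not the tracefs code: `MountOpts`, `RootInode` and `apply_options` are hypothetical names, and "option not given" is represented by `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MountOpts:
    """Options parsed from one mount call; None means 'not specified'."""
    mode: Optional[int] = None
    uid: Optional[int] = None
    gid: Optional[int] = None


@dataclass
class RootInode:
    """Hypothetical stand-in for the shared root directory attributes."""
    mode: int = 0o700
    uid: int = 0
    gid: int = 0


def apply_options(root: RootInode, opts: MountOpts) -> None:
    # Only overwrite the attributes the caller explicitly asked for, so a
    # second mount without options leaves existing permissions untouched.
    if opts.mode is not None:
        root.mode = opts.mode
    if opts.uid is not None:
        root.uid = opts.uid
    if opts.gid is not None:
        root.gid = opts.gid


root = RootInode(mode=0o755)       # admin configured 0755 earlier
apply_options(root, MountOpts())   # automount passes no options
print(oct(root.mode))
```

With this shape, a later mount that does pass `mode=` still takes effect, which matches the "only clobber if asked" wording above.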
      
      Link: https://lkml.kernel.org/r/20220826174353.2.Iab6e5ea57963d6deca5311b27fb7226790d44406@changeid
      
      Cc: stable@vger.kernel.org
      Fixes: 4282d606 ("tracefs: Add new tracefs file system")
      Signed-off-by: Brian Norris <briannorris@chromium.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • kprobes: Prohibit probes in gate area · 1efda38d
      Christian A. Ehrhardt authored
      The system call gate area counts as kernel text but trying
      to install a kprobe in this area fails with an Oops later on.
      To fix this explicitly disallow the gate area for kprobes.
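
As a rough illustration of the check described above, the following sketch rejects probe addresses that fall inside a gate area even when they count as kernel text. It is a model only, not the kernel implementation: `in_gate_area`, `may_probe` and the `is_kernel_text` callback are hypothetical names, and the gate area is assumed to be the single page starting at 0xffffffffff600000, the address that appears in the syzkaller reproducer in this commit message.

```python
PAGE_SIZE = 0x1000
# Assumption: one-page gate area at the address used by the reproducer.
GATE_START = 0xFFFFFFFFFF600000


def in_gate_area(addr: int) -> bool:
    """True if addr lies inside the assumed gate page."""
    return GATE_START <= addr < GATE_START + PAGE_SIZE


def may_probe(addr: int, is_kernel_text) -> bool:
    """Allow a probe only on kernel text that is not in the gate area.

    is_kernel_text is a caller-supplied predicate standing in for the
    kernel's own text-address check.
    """
    return is_kernel_text(addr) and not in_gate_area(addr)


# The gate area "counts as kernel text", so model the text check as always true.
print(may_probe(0xFFFFFFFFFF600000, lambda a: True))
```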
      
      Found by syzkaller with the following reproducer:
      perf_event_open$cgroup(&(0x7f00000001c0)={0x6, 0x80, 0x0, 0x0, 0x0, 0x0, 0x80ffff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_config_ext={0x0, 0xffffffffff600000}}, 0xffffffffffffffff, 0x0, 0xffffffffffffffff, 0x0)
      
      Sample report:
      BUG: unable to handle page fault for address: fffffbfff3ac6000
      PGD 6dfcb067 P4D 6dfcb067 PUD 6df8f067 PMD 6de4d067 PTE 0
      Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
      CPU: 0 PID: 21978 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3-dirty #6
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      RIP: 0010:__insn_get_emulate_prefix arch/x86/lib/insn.c:91 [inline]
      RIP: 0010:insn_get_emulate_prefix arch/x86/lib/insn.c:106 [inline]
      RIP: 0010:insn_get_prefixes.part.0+0xa8/0x1110 arch/x86/lib/insn.c:134
      Code: 49 be 00 00 00 00 00 fc ff df 48 8b 40 60 48 89 44 24 08 e9 81 00 00 00 e8 e5 4b 39 ff 4c 89 fa 4c 89 f9 48 c1 ea 03 83 e1 07 <42> 0f b6 14 32 38 ca 7f 08 84 d2 0f 85 06 10 00 00 48 89 d8 48 89
      RSP: 0018:ffffc900088bf860 EFLAGS: 00010246
      RAX: 0000000000040000 RBX: ffffffff9b9bebc0 RCX: 0000000000000000
      RDX: 1ffffffff3ac6000 RSI: ffffc90002d82000 RDI: ffffc900088bf9e8
      RBP: ffffffff9d630001 R08: 0000000000000000 R09: ffffc900088bf9e8
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
      R13: ffffffff9d630000 R14: dffffc0000000000 R15: ffffffff9d630000
      FS:  00007f63eef63640(0000) GS:ffff88806d000000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: fffffbfff3ac6000 CR3: 0000000029d90005 CR4: 0000000000770ef0
      PKRU: 55555554
      Call Trace:
       <TASK>
       insn_get_prefixes arch/x86/lib/insn.c:131 [inline]
       insn_get_opcode arch/x86/lib/insn.c:272 [inline]
       insn_get_modrm+0x64a/0x7b0 arch/x86/lib/insn.c:343
       insn_get_sib+0x29a/0x330 arch/x86/lib/insn.c:421
       insn_get_displacement+0x350/0x6b0 arch/x86/lib/insn.c:464
       insn_get_immediate arch/x86/lib/insn.c:632 [inline]
       insn_get_length arch/x86/lib/insn.c:707 [inline]
       insn_decode+0x43a/0x490 arch/x86/lib/insn.c:747
       can_probe+0xfc/0x1d0 arch/x86/kernel/kprobes/core.c:282
       arch_prepare_kprobe+0x79/0x1c0 arch/x86/kernel/kprobes/core.c:739
       prepare_kprobe kernel/kprobes.c:1160 [inline]
       register_kprobe kernel/kprobes.c:1641 [inline]
       register_kprobe+0xb6e/0x1690 kernel/kprobes.c:1603
       __register_trace_kprobe kernel/trace/trace_kprobe.c:509 [inline]
       __register_trace_kprobe+0x26a/0x2d0 kernel/trace/trace_kprobe.c:477
       create_local_trace_kprobe+0x1f7/0x350 kernel/trace/trace_kprobe.c:1833
       perf_kprobe_init+0x18c/0x280 kernel/trace/trace_event_perf.c:271
       perf_kprobe_event_init+0xf8/0x1c0 kernel/events/core.c:9888
       perf_try_init_event+0x12d/0x570 kernel/events/core.c:11261
       perf_init_event kernel/events/core.c:11325 [inline]
       perf_event_alloc.part.0+0xf7f/0x36a0 kernel/events/core.c:11619
       perf_event_alloc kernel/events/core.c:12059 [inline]
       __do_sys_perf_event_open+0x4a8/0x2a00 kernel/events/core.c:12157
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7f63ef7efaed
      Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f63eef63028 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
      RAX: ffffffffffffffda RBX: 00007f63ef90ff80 RCX: 00007f63ef7efaed
      RDX: 0000000000000000 RSI: ffffffffffffffff RDI: 00000000200001c0
      RBP: 00007f63ef86019c R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000002 R14: 00007f63ef90ff80 R15: 00007f63eef43000
       </TASK>
      Modules linked in:
      CR2: fffffbfff3ac6000
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:__insn_get_emulate_prefix arch/x86/lib/insn.c:91 [inline]
      RIP: 0010:insn_get_emulate_prefix arch/x86/lib/insn.c:106 [inline]
      RIP: 0010:insn_get_prefixes.part.0+0xa8/0x1110 arch/x86/lib/insn.c:134
      Code: 49 be 00 00 00 00 00 fc ff df 48 8b 40 60 48 89 44 24 08 e9 81 00 00 00 e8 e5 4b 39 ff 4c 89 fa 4c 89 f9 48 c1 ea 03 83 e1 07 <42> 0f b6 14 32 38 ca 7f 08 84 d2 0f 85 06 10 00 00 48 89 d8 48 89
      RSP: 0018:ffffc900088bf860 EFLAGS: 00010246
      RAX: 0000000000040000 RBX: ffffffff9b9bebc0 RCX: 0000000000000000
      RDX: 1ffffffff3ac6000 RSI: ffffc90002d82000 RDI: ffffc900088bf9e8
      RBP: ffffffff9d630001 R08: 0000000000000000 R09: ffffc900088bf9e8
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
      R13: ffffffff9d630000 R14: dffffc0000000000 R15: ffffffff9d630000
      FS:  00007f63eef63640(0000) GS:ffff88806d000000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: fffffbfff3ac6000 CR3: 0000000029d90005 CR4: 0000000000770ef0
      PKRU: 55555554
      ==================================================================
      
      Link: https://lkml.kernel.org/r/20220907200917.654103-1-lk@c--e.de
      
      cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
      cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      cc: "David S. Miller" <davem@davemloft.net>
      Cc: stable@vger.kernel.org
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • perf record: Fix synthesis failure warnings · faf59ec8
      Adrian Hunter authored
      Some calls to synthesis functions set err < 0 but only warn about the
      failure and continue.  However they do not set err back to zero, relying
      on subsequent code to do that.
      
      That changed with the introduction of option --synth. When --synth=no
      subsequent functions that set err back to zero are not called.
      
      Fix by setting err = 0 in those cases.
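
The failure mode described above can be reproduced with a small model. This is a sketch under stated assumptions, not perf's code: `record`, `synthesize_bpf_events` and `synthesize_threads` are hypothetical names, and the flags only exist to contrast the old and new behaviour.

```python
def synthesize_bpf_events() -> int:
    """Hypothetical non-fatal step that fails with a negative error code."""
    return -1


def synthesize_threads() -> int:
    """Hypothetical later step that succeeds and returns 0."""
    return 0


def record(synth_enabled: bool, reset_after_warning: bool) -> int:
    err = synthesize_bpf_events()
    if err < 0:
        print("Couldn't synthesize bpf events.")  # warn only, keep going
        if reset_after_warning:
            err = 0  # the fix: do not carry a non-fatal error forward
    if synth_enabled:
        # Before --synth existed this always ran and masked the stale error.
        err = synthesize_threads()
    return err


print(record(synth_enabled=False, reset_after_warning=False))
print(record(synth_enabled=False, reset_after_warning=True))
```

The first call models `--synth=no` before the fix, where the stale negative value is returned as if recording had failed.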
      
      Example:
      
       Before:
      
         $ perf record --no-bpf-event --synth=all -o /tmp/huh uname
         Couldn't synthesize bpf events.
         Linux
         [ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 0.014 MB /tmp/huh (7 samples) ]
         $ perf record --no-bpf-event --synth=no -o /tmp/huh uname
         Couldn't synthesize bpf events.
      
       After:
      
         $ perf record --no-bpf-event --synth=no -o /tmp/huh uname
         Couldn't synthesize bpf events.
         Linux
         [ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 0.014 MB /tmp/huh (7 samples) ]
      
      Fixes: 41b740b6 ("perf record: Add --synth option")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220907162458.72817-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Don't install data files with x permissions · 0a9eaf61
      Jiri Slaby authored
      install(1), by default, installs with rwxr-xr-x permissions. Modify
      perf's Makefile to pass '-m 644' when installing:
      
        * Documentation/tips.txt
        * examples/bpf/*
        * perf-completion.sh
        * perf_dlfilter.h header
        * scripts/perl/Perf-Trace-Util/lib/Perf/Trace/*
        * scripts/perl/*.pl
        * tests/attr/*
        * tests/attr.py
        * tests/shell/lib/*.sh
        * trace/strace/groups/*
      
      All those are supposed to be non-executable. Either they are not scripts
      at all, or they don't have shebang.
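
For illustration, the effect of passing an explicit mode can be sketched with a small helper that copies a file and then sets its permissions, much as `install -m 644` does. This is not part of perf's Makefile; `install_data` and `has_exec_bit` are hypothetical helpers and the file names are made up.

```python
import os
import shutil
import stat
import tempfile


def install_data(src: str, dst: str, mode: int = 0o644) -> None:
    """Copy src to dst and set an explicit mode, like `install -m 644`."""
    shutil.copyfile(src, dst)
    os.chmod(dst, mode)


def has_exec_bit(path: str) -> bool:
    """True if any of the user/group/other execute bits is set."""
    return bool(stat.S_IMODE(os.stat(path).st_mode) & 0o111)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "tips.txt")
    with open(src, "w") as f:
        f.write("example data file\n")
    dst = os.path.join(tmp, "installed-tips.txt")
    install_data(src, dst)
    print(oct(stat.S_IMODE(os.stat(dst).st_mode)), has_exec_bit(dst))
```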
      
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220908060426.9619-1-jslaby@suse.cz
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Fix Cannot print 'iregs' field for hybrid systems · 82b2425f
      Zhengjun Xing authored
      Commit b91e5492 ("perf record: Add a dummy event on hybrid
      systems to collect metadata records") adds a dummy event on hybrid
      systems to fix the symbol "unknown" issue when the workload is created
      in a P-core but runs on an E-core. The added dummy event causes
      "perf script -F iregs" to fail: dummy events do not have the "iregs"
      attribute set, so the "iregs" check in evsel__check_attr() fails.
      
      Commit [1] fixed a similar issue by skipping the attr check for the
      dummy event, because it does not have any samples anyway. That works in
      normal mode, but the issue still occurs in pipe mode, where
      process_attr() is called and still checks the attr for the dummy event.
      Fix this by skipping the attr check for the dummy event inside
      evsel__check_attr() itself; otherwise every caller of
      evsel__check_attr() would have to be patched.
      
      Before:
      
        # ./perf record -o - --intr-regs=di,r8,dx,cx -e br_inst_retired.near_call:p -c 1000 --per-thread true 2>/dev/null|./perf script -F iregs |head -5
        Samples for 'dummy:HG' event do not have IREGS attribute set. Cannot print 'iregs' field.
        0x120 [0x90]: failed to process type: 64
        #
      
      After:
      
        # ./perf record -o - --intr-regs=di,r8,dx,cx -e br_inst_retired.near_call:p -c 1000 --per-thread true 2>/dev/null|./perf script -F iregs |head -5
        ABI:2    CX:0x55b8efa87000    DX:0x55b8efa7e000    DI:0xffffba5e625efbb0    R8:0xffff90e51f8ae100
        ABI:2    CX:0x7f1dae1e4000    DX:0xd0    DI:0xffff90e18c675ac0    R8:0x71
        ABI:2    CX:0xcc0    DX:0x1    DI:0xffff90e199880240    R8:0x0
        ABI:2    CX:0xffff90e180dd7500    DX:0xffff90e180dd7500    DI:0xffff90e180043500    R8:0x1
        ABI:2    CX:0x50    DX:0xffff90e18c583bd0    DI:0xffff90e1998803c0    R8:0x58
        #
      
      [1]https://lore.kernel.org/lkml/20220831124041.219925-1-jolsa@kernel.org/
      
      Fixes: b91e5492 ("perf record: Add a dummy event on hybrid systems to collect metadata records")
      Suggested-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220908070030.3455164-1-zhengjun.xing@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf lock: Remove redundant word 'contention' in help message · 3705a6ef
      Yang Jihong authored
      Before:
        # perf lock -h
      
         Usage: perf lock [<options>] {record|report|script|info|contention|contention}
      
            -D, --dump-raw-trace  dump raw trace in ASCII
            -f, --force           don't complain, do it
            -i, --input <file>    input file name
            -v, --verbose         be more verbose (show symbol address, etc)
                --kallsyms <file>
                                  kallsyms pathname
                --vmlinux <file>  vmlinux pathname
      
      After:
        # perf lock -h
      
         Usage: perf lock [<options>] {record|report|script|info|contention}
      
            -D, --dump-raw-trace  dump raw trace in ASCII
            -f, --force           don't complain, do it
            -i, --input <file>    input file name
            -v, --verbose         be more verbose (show symbol address, etc)
                --kallsyms <file>
                                  kallsyms pathname
                --vmlinux <file>  vmlinux pathname
      
      Fixes: 528b9cab ("perf lock: Add 'contention' subcommand")
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220908014854.151203-1-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge tag 'spi-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 50635787
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "Several fixes that came in since the merge window, the major one being
        a fix for the spi-mux driver which was broken by the performance
        optimisations due to it peering inside the core's data structures more
        than it should"
      
      * tag 'spi-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: spi: Fix queue hang if previous transfer failed
        spi: mux: Fix mux interaction with fast path optimisations
        spi: cadence-quadspi: Disable irqs during indirect reads
        spi: bitbang: Fix lsb-first Rx
    • Merge tag 'regulator-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator · c5e68c4f
      Linus Torvalds authored
      
      Pull regulator fixes from Mark Brown:
       "One core fix here improving the error handling on enable failure, plus
        smaller fixes for the pfuze100 driver and the SPMI DT bindings"
      
      * tag 'regulator-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: Fix qcom,spmi-regulator schema
        regulator: pfuze100: Fix the global-out-of-bounds access in pfuze100_regulator_probe()
        regulator: core: Clean up on enable failure
    • Merge tag 'regmap-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · b1d27aa3
      Linus Torvalds authored
      Pull regmap fix from Mark Brown:
       "A fix for how we handle controller constraints on SPI message sizes,
        only impacting systems with SPI controllers with very low limits like
        the AMD controller used in the Steam Deck"
      
      * tag 'regmap-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: spi: Reserve space for register address/padding
    • perf dlfilter dlfilter-show-cycles: Fix types for print format · 1706623e
      Adrian Hunter authored
      Avoid compiler warning about format %llu that expects long long unsigned
      int but argument has type __u64.
      
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Fixes: c3afd6e5 ("perf dlfilter: Add dlfilter-show-cycles")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220905074735.4513-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • libperf evlist: Fix per-thread mmaps for multi-threaded targets · 7864d8f7
      Adrian Hunter authored
      The offending commit removed mmap_per_thread() without accounting for
      the different set-output rules for per-thread mmaps, i.e. in the per-thread
      case set-output is used for file descriptors of the same thread, not the
      same cpu.
      
      This was not immediately noticed because it only happens with
      multi-threaded targets and we do not have a test for that yet.
      
      Reinstate mmap_per_thread() expanding it to cover also system-wide per-cpu
      events i.e. to continue to allow the mixing of per-thread and per-cpu
      mmaps.
      
      Debug messages (with -vv) show the file descriptors that are opened with
      sys_perf_event_open. New debug messages are added (needs -vvv) that show
      also which file descriptors are mmapped and which are redirected with
      set-output.
      
      In the per-cpu case (cpu != -1) file descriptors for the same CPU are
      set-output to the first file descriptor for that CPU.
      
      In the per-thread case (cpu == -1) file descriptors for the same thread are
      set-output to the first file descriptor for that thread.
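
The two grouping rules above can be modelled as follows. This is a simplified sketch, not libperf's implementation: `plan_mmaps` is a hypothetical function, and events are represented as `(pid, cpu, fd)` tuples in the order they were opened.

```python
def plan_mmaps(events):
    """Return the mmap/set-output plan for (pid, cpu, fd) tuples.

    Per-cpu events (cpu != -1) are grouped by CPU, per-thread events
    (cpu == -1) are grouped by thread. The first fd of each group is
    mmapped; later fds in the same group are redirected to it.
    """
    first_fd = {}
    plan = []
    for pid, cpu, fd in events:
        key = ("thread", pid) if cpu == -1 else ("cpu", cpu)
        if key not in first_fd:
            first_fd[key] = fd
            plan.append(("mmap", fd))
        else:
            plan.append(("set_output", fd, first_fd[key]))
    return plan


# Per-thread case from the example in this message: two threads, cpu == -1.
for step in plan_mmaps([(17489, -1, 5), (17490, -1, 6)]):
    print(step)
```

Grouping per-thread events by CPU instead would put both fds under the same key (-1) and redirect fd 6 to fd 5, which is the redirect shown in the "Before" output of this message.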
      
      Example (process 17489 has 2 threads):
      
       Before (but with new debug prints):
      
         $ perf record --no-bpf-event -vvv --per-thread -p 17489
         <SNIP>
         sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
         sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
         <SNIP>
         libperf: idx 0: mmapping fd 5
         libperf: idx 0: set output fd 6 -> 5
         failed to mmap with 22 (Invalid argument)
      
       After:
      
         $ perf record --no-bpf-event -vvv --per-thread -p 17489
         <SNIP>
         sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
         sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
         <SNIP>
         libperf: mmap_per_thread: nr cpu values (may include -1) 1 nr threads 2
         libperf: idx 0: mmapping fd 5
         libperf: idx 1: mmapping fd 6
         <SNIP>
         [ perf record: Woken up 2 times to write data ]
         [ perf record: Captured and wrote 0.018 MB perf.data (15 samples) ]
      
      Per-cpu example (process 20341 has 2 threads, same as above):
      
         $ perf record --no-bpf-event -vvv -p 20341
         <SNIP>
         sys_perf_event_open: pid 20341  cpu 0  group_fd -1  flags 0x8 = 5
         sys_perf_event_open: pid 20342  cpu 0  group_fd -1  flags 0x8 = 6
         sys_perf_event_open: pid 20341  cpu 1  group_fd -1  flags 0x8 = 7
         sys_perf_event_open: pid 20342  cpu 1  group_fd -1  flags 0x8 = 8
         sys_perf_event_open: pid 20341  cpu 2  group_fd -1  flags 0x8 = 9
         sys_perf_event_open: pid 20342  cpu 2  group_fd -1  flags 0x8 = 10
         sys_perf_event_open: pid 20341  cpu 3  group_fd -1  flags 0x8 = 11
         sys_perf_event_open: pid 20342  cpu 3  group_fd -1  flags 0x8 = 12
         sys_perf_event_open: pid 20341  cpu 4  group_fd -1  flags 0x8 = 13
         sys_perf_event_open: pid 20342  cpu 4  group_fd -1  flags 0x8 = 14
         sys_perf_event_open: pid 20341  cpu 5  group_fd -1  flags 0x8 = 15
         sys_perf_event_open: pid 20342  cpu 5  group_fd -1  flags 0x8 = 16
         sys_perf_event_open: pid 20341  cpu 6  group_fd -1  flags 0x8 = 17
         sys_perf_event_open: pid 20342  cpu 6  group_fd -1  flags 0x8 = 18
         sys_perf_event_open: pid 20341  cpu 7  group_fd -1  flags 0x8 = 19
         sys_perf_event_open: pid 20342  cpu 7  group_fd -1  flags 0x8 = 20
         <SNIP>
         libperf: mmap_per_cpu: nr cpu values 8 nr threads 2
         libperf: idx 0: mmapping fd 5
         libperf: idx 0: set output fd 6 -> 5
         libperf: idx 1: mmapping fd 7
         libperf: idx 1: set output fd 8 -> 7
         libperf: idx 2: mmapping fd 9
         libperf: idx 2: set output fd 10 -> 9
         libperf: idx 3: mmapping fd 11
         libperf: idx 3: set output fd 12 -> 11
         libperf: idx 4: mmapping fd 13
         libperf: idx 4: set output fd 14 -> 13
         libperf: idx 5: mmapping fd 15
         libperf: idx 5: set output fd 16 -> 15
         libperf: idx 6: mmapping fd 17
         libperf: idx 6: set output fd 18 -> 17
         libperf: idx 7: mmapping fd 19
         libperf: idx 7: set output fd 20 -> 19
         <SNIP>
         [ perf record: Woken up 7 times to write data ]
         [ perf record: Captured and wrote 0.020 MB perf.data (17 samples) ]
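The per-cpu trace above follows a regular pattern: for each CPU index the first event fd is mmapped, and the other thread's fd on that CPU has its output redirected to it. The following is a minimal Python sketch that reproduces those log lines. It assumes fds are numbered sequentially from 5 in cpu-major order, as in the trace; `plan_per_cpu_mmaps` is a hypothetical helper, not a libperf function.

```python
def plan_per_cpu_mmaps(nr_cpus, nr_threads, first_fd=5):
    """Return log-like lines for a per-cpu mmap layout.

    Assumption: fds were opened cpu-major, thread-minor and numbered
    sequentially from first_fd, as in the trace above.
    """
    lines = []
    for cpu in range(nr_cpus):
        base_fd = first_fd + cpu * nr_threads  # first thread's fd on this cpu
        lines.append(f"idx {cpu}: mmapping fd {base_fd}")
        for thread in range(1, nr_threads):
            fd = base_fd + thread
            # Remaining fds on the same cpu share the first fd's ring buffer.
            lines.append(f"idx {cpu}: set output fd {fd} -> {base_fd}")
    return lines


for line in plan_per_cpu_mmaps(nr_cpus=8, nr_threads=2):
    print("libperf: " + line)
```

With 8 CPUs and 2 threads this yields one mmap per CPU (8 in total) rather than one per fd (16).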
      
      Fixes: ae4f8ae1 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
      Reported-by: Tomáš Trnka <trnka@scm.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216441
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220905114209.8389-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7864d8f7
    • Takashi Iwai's avatar
      Merge tag 'asoc-fix-v6.0-rc4' of... · 09e3e315
      Takashi Iwai authored
      Merge tag 'asoc-fix-v6.0-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
      
      ASoC: Fixes for v6.0
      
      Quite a few fixes here, all driver specific and fairly small.
      09e3e315
    • Linus Torvalds's avatar
      Merge tag 'net-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 26b12249
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from rxrpc, netfilter, wireless and bluetooth
        subtrees.
      
        Current release - regressions:
      
         - skb: export skb drop reaons to user by TRACE_DEFINE_ENUM
      
         - bluetooth: fix regression preventing ACL packet transmission
      
        Current release - new code bugs:
      
         - dsa: microchip: fix kernel oops on ksz8 switches
      
         - dsa: qca8k: fix NULL pointer dereference for
           of_device_get_match_data
      
        Previous releases - regressions:
      
         - netfilter: clean up hook list when offload flags check fails
      
         - wifi: mt76: fix crash in chip reset fail
      
         - rxrpc: fix ICMP/ICMP6 error handling
      
         - ice: fix DMA mappings leak
      
         - i40e: fix kernel crash during module removal
      
        Previous releases - always broken:
      
         - ipv6: sr: fix out-of-bounds read when setting HMAC data.
      
         - tcp: TX zerocopy should not sense pfmemalloc status
      
         - sch_sfb: don't assume the skb is still around after
           enqueueing to child
      
         - netfilter: drop dst references before setting
      
         - wifi: wilc1000: fix DMA on stack objects
      
         - rxrpc: fix an insufficiently large sglist in
           rxkad_verify_packet_2()
      
         - fec: use a spinlock to guard `fep->ptp_clk_on`
      
        Misc:
      
         - usb: qmi_wwan: add Quectel RM520N"
      
      * tag 'net-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
        sch_sfb: Also store skb len before calling child enqueue
        net: phy: lan87xx: change interrupt src of link_up to comm_ready
        net/smc: Fix possible access to freed memory in link clear
        net: ethernet: mtk_eth_soc: check max allowed hash in mtk_ppe_check_skb
        net: skb: export skb drop reaons to user by TRACE_DEFINE_ENUM
        net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear
        net: dsa: felix: access QSYS_TAG_CONFIG under tas_lock in vsc9959_sched_speed_set
        net: dsa: felix: disable cut-through forwarding for frames oversized for tc-taprio
        net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet
        net: usb: qmi_wwan: add Quectel RM520N
        net: dsa: qca8k: fix NULL pointer dereference for of_device_get_match_data
        tcp: fix early ETIMEDOUT after spurious non-SACK RTO
        stmmac: intel: Simplify intel_eth_pci_remove()
        net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()
        ipv6: sr: fix out-of-bounds read when setting HMAC data.
        bonding: accept unsolicited NA message
        bonding: add all node mcast address when slave up
        bonding: use unspecified address if no available link local address
        wifi: use struct_group to copy addresses
        wifi: mac80211_hwsim: check length for virtio packets
        ...
      26b12249
    • Linus Torvalds's avatar
      fs: only do a memory barrier for the first set_buffer_uptodate() · 2f79cdfe
      Linus Torvalds authored
      Commit d4252071 ("add barriers to buffer_uptodate and
      set_buffer_uptodate") added proper memory barriers to the buffer head
      BH_Uptodate bit, so that anybody who tests a buffer for being up-to-date
      will be guaranteed to actually see initialized state.
      
      However, that commit didn't _just_ add the memory barrier, it also ended
      up dropping the "was it already set" logic that the BUFFER_FNS() macro
      had.
      
      That's conceptually the right thing for a generic "this is a memory
      barrier" operation, but in the case of the buffer contents, we really
      only care about the memory barrier for the _first_ time we set the bit,
      in that the only memory ordering protection we need is to avoid anybody
      seeing uninitialized memory contents.
      
      Any other access ordering wouldn't be about the BH_Uptodate bit anyway,
      and would require some other proper lock (typically BH_Lock or the folio
      lock).  A reader that races with somebody invalidating the buffer head
      isn't an issue wrt the memory ordering, it's a serialization issue.
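The "only pay for the barrier the first time" idea can be illustrated with a toy model. This is a sketch only: Python cannot express real memory ordering, so the barrier is replaced by a counter, and the class and method names are hypothetical rather than kernel API.

```python
class BufferHead:
    """Toy stand-in for a buffer head; only the BH_Uptodate bit is modelled."""

    def __init__(self):
        self.uptodate = False
        self.barriers_issued = 0  # counts the simulated (expensive) barriers

    def _barrier_and_set(self):
        # Stands in for "order the content writes, then set the bit".
        self.barriers_issued += 1
        self.uptodate = True

    def set_uptodate_always(self):
        # Models paying the barrier cost on every call.
        self._barrier_and_set()

    def set_uptodate_first_only(self):
        # Models the optimization: cheap test first, barrier only once.
        if self.uptodate:
            return
        self._barrier_and_set()


always, first_only = BufferHead(), BufferHead()
for _ in range(1000):
    always.set_uptodate_always()
    first_only.set_uptodate_first_only()
print(always.barriers_issued, first_only.barriers_issued)
```

Both objects end with the bit set; only the number of simulated barriers differs.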
      
      Now, you'd think that the buffer head operations don't matter in this
      day and age (and I certainly thought so), but apparently some loads
      still end up being heavy users of buffer heads.  In particular, the
      kernel test robot reported that not having this bit access optimization
      in place caused a noticeable direct IO performance regression on ext4:
      
        fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec -26.5% regression
      
      although you presumably need a fast disk and a lot of cores to actually
      notice.
      
      Link: https://lore.kernel.org/all/Yw8L7HTZ%2FdE2%2Fo9C@xsang-OptiPlex-9020/
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Tested-by: Fengwei Yin <fengwei.yin@intel.com>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2f79cdfe
    • Linus Torvalds's avatar
      Merge tag 'efi-urgent-for-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · f280b987
      Linus Torvalds authored
      Pull EFI fixes from Ard Biesheuvel:
       "A couple of low-priority EFI fixes:
      
         - prevent the randstruct plugin from re-ordering EFI protocol
           definitions
      
         - fix a use-after-free in the capsule loader
      
         - drop unused variable"
      
      * tag 'efi-urgent-for-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        efi: capsule-loader: Fix use-after-free in efi_capsule_write
        efi/x86: libstub: remove unused variable
        efi: libstub: Disable struct randomization
      f280b987
    • Toke Høiland-Jørgensen's avatar
      sch_sfb: Also store skb len before calling child enqueue · 2f09707d
      Toke Høiland-Jørgensen authored
      Cong Wang noticed that the previous fix for sch_sfb accessing the queued
      skb after enqueueing it to a child qdisc was incomplete: the SFB enqueue
      function was also calling qdisc_qstats_backlog_inc() after enqueue, which
      reads the pkt len from the skb cb field. Fix this by also storing the skb
      len, and using the stored value to increment the backlog after enqueueing.
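The pattern described here, reading what you need from an object before handing ownership to someone who may free it, can be sketched as follows. This is an illustrative Python model, not the sch_sfb code; `Packet`, `ConsumingChild` and `Parent` are hypothetical names.

```python
class Packet:
    def __init__(self, length):
        self.length = length


class ConsumingChild:
    """Hypothetical child queue that may free the packet during enqueue."""

    def enqueue(self, pkt):
        pkt.length = None  # contents are no longer valid after this point
        return True


class Parent:
    def __init__(self, child):
        self.child = child
        self.backlog = 0

    def enqueue(self, pkt):
        length = pkt.length           # save before ownership passes on
        ok = self.child.enqueue(pkt)
        if ok:
            self.backlog += length    # use the saved value, never pkt.length
        return ok


parent = Parent(ConsumingChild())
parent.enqueue(Packet(1500))
print(parent.backlog)
```

Reading `pkt.length` after the child's `enqueue()` would raise a `TypeError` in this model; in the kernel the equivalent mistake is a use-after-free.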
      
      Fixes: 9efd2329 ("sch_sfb: Don't assume the skb is still around after enqueueing to child")
      Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
      Acked-by: Cong Wang <cong.wang@bytedance.com>
      Link: https://lore.kernel.org/r/20220905192137.965549-1-toke@toke.dk
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      2f09707d
    • Arun Ramadoss's avatar
      net: phy: lan87xx: change interrupt src of link_up to comm_ready · 5382033a
      Arun Ramadoss authored
      Currently phy link up/down interrupt is enabled using the
      LAN87xx_INTERRUPT_MASK register. In the lan87xx_read_status function,
      phy link is determined using the T1_MODE_STAT_REG register comm_ready bit.
      comm_ready bit is set using the loc_rcvr_status & rem_rcvr_status.
      Whenever the phy link is up, LAN87xx_INTERRUPT_SOURCE link_up bit is set
      first but comm_ready bit takes some time to set based on local and
      remote receiver status.
      With the current implementation, the interrupt is triggered by link_up
      while the comm_ready bit is still clear when read_status runs, so the
      link is always reported as down.  Initial testing used the shared
      interrupt mechanism with the switch and internal phy, which works; the
      problem appeared after the interrupt controller was implemented.
      It can be fixed either by updating the read_status function to read from
      the LAN87XX_INTERRUPT_SOURCE register or by enabling the interrupt mask
      for the comm_ready bit. The validation team recommends using comm_ready
      for link detection.
      This patch fixes it by enabling the comm_ready bit for link_up in the
      LAN87XX_INTERRUPT_MASK_2 register (MISC Bank) and link_down in the
      LAN87xx_INTERRUPT_MASK register.
      
      Fixes: 8a1b415d ("net: phy: added ethtool master-slave configuration support")
      Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220905152750.5079-1-arun.ramadoss@microchip.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      5382033a
    • Michael Ellerman's avatar
      powerpc/pseries: Fix plpks crash on non-pseries · a66de528
      Michael Ellerman authored
      As reported[1] by Nathan, the recently added plpks driver will crash if
      it's built into the kernel and booted on a non-pseries machine, eg
      powernv:
      
        kernel BUG at arch/powerpc/kernel/syscall.c:39!
        Oops: Exception in kernel mode, sig: 5 [#1]
        LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
        ...
        NIP system_call_exception+0x90/0x3d0
        LR  system_call_common+0xec/0x250
        Call Trace:
          0xc0000000035c3e10 (unreliable)
          system_call_common+0xec/0x250
        --- interrupt: c00 at plpar_hcall+0x38/0x60
        NIP:  c0000000000e4300 LR: c00000000202945c CTR: 0000000000000000
        REGS: c0000000035c3e80 TRAP: 0c00   Not tainted  (6.0.0-rc4)
        MSR:  9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 28000284  XER: 00000000
        ...
        NIP plpar_hcall+0x38/0x60
        LR  pseries_plpks_init+0x64/0x23c
        --- interrupt: c00
      
      On powernv Linux is the hypervisor, so a hypercall just ends up going to
      the syscall path, which BUGs if the syscall (hypercall) didn't come from
      userspace.
      
      The fix is simply to not probe the plpks driver on non-pseries machines.
      
      [1] https://lore.kernel.org/linuxppc-dev/Yxe06fbq18Wv9y3W@dev-arch.thelio-3990X/
      
      Fixes: 2454a7af ("powerpc/pseries: define driver for Platform KeyStore")
      Reported-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Tested-by: Dan Horák <dan@danny.cz>
      Reviewed-by: Dan Horák <dan@danny.cz>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Link: https://lore.kernel.org/r/20220907065038.1604504-1-mpe@ellerman.id.au
      a66de528
  3. 07 Sep, 2022 15 commits
    • Hyunwoo Kim's avatar
      efi: capsule-loader: Fix use-after-free in efi_capsule_write · 9cb636b5
      Hyunwoo Kim authored
      A race condition may occur if the user calls close() on another thread
      during a write() operation on the device node of the efi capsule.
      
      This is a race condition that occurs between the efi_capsule_write() and
      efi_capsule_flush() functions of efi_capsule_fops, which ultimately
      results in UAF.
      
      So, the page freeing process is modified to be done in
      efi_capsule_release() instead of efi_capsule_flush().
      
      Cc: <stable@vger.kernel.org> # v4.9+
      Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
      Link: https://lore.kernel.org/all/20220907102920.GA88602@ubuntu/
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      9cb636b5
    • Yacan Liu's avatar
      net/smc: Fix possible access to freed memory in link clear · e9b1a4f8
      Yacan Liu authored
      After modifying the QP to the Error state, all RX WRs are completed
      with a WC in IB_WC_WR_FLUSH_ERR status. The current implementation does
      not wait for that to finish, but destroys the QP and frees the link group
      directly. So there is a risk of accessing freed memory in tasklet context.
      
      Here is a crash example:
      
       BUG: unable to handle page fault for address: ffffffff8f220860
       #PF: supervisor write access in kernel mode
       #PF: error_code(0x0002) - not-present page
       PGD f7300e067 P4D f7300e067 PUD f7300f063 PMD 8c4e45063 PTE 800ffff08c9df060
       Oops: 0002 [#1] SMP PTI
       CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G S         OE     5.10.0-0607+ #23
       Hardware name: Inspur NF5280M4/YZMB-00689-101, BIOS 4.1.20 07/09/2018
       RIP: 0010:native_queued_spin_lock_slowpath+0x176/0x1b0
       Code: f3 90 48 8b 32 48 85 f6 74 f6 eb d5 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 c8 02 00 48 03 04 f5 00 09 98 8e <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 32
       RSP: 0018:ffffb3b6c001ebd8 EFLAGS: 00010086
       RAX: ffffffff8f220860 RBX: 0000000000000246 RCX: 0000000000080000
       RDX: ffff91db1f86c800 RSI: 000000000000173c RDI: ffff91db62bace00
       RBP: ffff91db62bacc00 R08: 0000000000000000 R09: c00000010000028b
       R10: 0000000000055198 R11: ffffb3b6c001ea58 R12: ffff91db80e05010
       R13: 000000000000000a R14: 0000000000000006 R15: 0000000000000040
       FS:  0000000000000000(0000) GS:ffff91db1f840000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffffffff8f220860 CR3: 00000001f9580004 CR4: 00000000003706e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        <IRQ>
        _raw_spin_lock_irqsave+0x30/0x40
        mlx5_ib_poll_cq+0x4c/0xc50 [mlx5_ib]
        smc_wr_rx_tasklet_fn+0x56/0xa0 [smc]
        tasklet_action_common.isra.21+0x66/0x100
        __do_softirq+0xd5/0x29c
        asm_call_irq_on_stack+0x12/0x20
        </IRQ>
        do_softirq_own_stack+0x37/0x40
        irq_exit_rcu+0x9d/0xa0
        sysvec_call_function_single+0x34/0x80
        asm_sysvec_call_function_single+0x12/0x20
      
      Fixes: bd4ad577 ("smc: initialize IB transport incl. PD, MR, QP, CQ, event, WR")
      Signed-off-by: Yacan Liu <liuyacan@corp.netease.com>
      Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e9b1a4f8
    • Lorenzo Bianconi's avatar
      net: ethernet: mtk_eth_soc: check max allowed hash in mtk_ppe_check_skb · f27b405e
      Lorenzo Bianconi authored
      Even though the max hash configured in hw in mtk_ppe_hash_entry is
      MTK_PPE_ENTRIES - 1, check for theoretical OOB accesses in the
      mtk_ppe_check_skb routine.
      
      Fixes: c4f033d9 ("net: ethernet: mtk_eth_soc: rework hardware flow table management")
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f27b405e
    • Menglong Dong's avatar
      net: skb: export skb drop reaons to user by TRACE_DEFINE_ENUM · 9cb252c4
      Menglong Dong authored
      As Eric reported, the 'reason' field is not shown when tracing the
      kfree_skb event with perf:
      
      $ perf record -e skb:kfree_skb -a sleep 10
      $ perf script
        ip_defrag 14605 [021]   221.614303:   skb:kfree_skb:
        skbaddr=0xffff9d2851242700 protocol=34525 location=0xffffffffa39346b1
        reason:
      
      The cause seems to be passing kernel address directly to TP_printk(),
      which is not right. As the enum 'skb_drop_reason' is not exported to
      user space through TRACE_DEFINE_ENUM(), perf can't get the drop reason
      string from the 'reason' field, which is a number.
      
      Therefore, we introduce the macro DEFINE_DROP_REASON(), which is used
      to define the trace enum by TRACE_DEFINE_ENUM(). With the help of
      DEFINE_DROP_REASON(), we can now remove the auto-generation that we
      introduced in the commit ec43908d
      ("net: skb: use auto-generation to convert skb drop reason to string"),
      and define the string array 'drop_reasons'.
      
      This brings us back to having to maintain the drop reasons in both
      enum skb_drop_reason and DEFINE_DROP_REASON. But they
      are both in dropreason.h, which makes it easier.
      
      After this commit, now the format of kfree_skb is like this:
      
      $ cat /tracing/events/skb/kfree_skb/format
      name: kfree_skb
      ID: 1524
      format:
              field:unsigned short common_type;       offset:0;       size:2; signed:0;
              field:unsigned char common_flags;       offset:2;       size:1; signed:0;
              field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
              field:int common_pid;   offset:4;       size:4; signed:1;
      
              field:void * skbaddr;   offset:8;       size:8; signed:0;
              field:void * location;  offset:16;      size:8; signed:0;
              field:unsigned short protocol;  offset:24;      size:2; signed:0;
              field:enum skb_drop_reason reason;      offset:28;      size:4; signed:0;
      
      print fmt: "skbaddr=%p protocol=%u location=%p reason: %s", REC->skbaddr, REC->protocol, REC->location, __print_symbolic(REC->reason, { 1, "NOT_SPECIFIED" }, { 2, "NO_SOCKET" } ......
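To show why the exported table matters, here is a small Python sketch of how a user-space consumer could turn the numeric 'reason' field into a string using the value/name pairs from the print fmt. Only the two pairs visible above are included (the rest is elided in the source). The hex fallback for unknown values and both function names are assumptions made for illustration.

```python
# Only the first two pairs appear in the format output above; the
# remainder of the table is elided there and is not reproduced here.
DROP_REASONS = {1: "NOT_SPECIFIED", 2: "NO_SOCKET"}


def print_symbolic(value, table):
    """Return the name for value, or its hex form when no name is known."""
    return table.get(value, hex(value))


def format_kfree_skb(skbaddr, protocol, location, reason, table):
    """Render one event in the shape of the print fmt shown above."""
    return (f"skbaddr={skbaddr:#x} protocol={protocol} "
            f"location={location:#x} reason: {print_symbolic(reason, table)}")


print(format_kfree_skb(0xffff9d2851242700, 34525, 0xffffffffa39346b1,
                       2, DROP_REASONS))
```

Without a table the consumer has only the number to show, which is the situation the report describes.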
      
      Fixes: ec43908d ("net: skb: use auto-generation to convert skb drop reason to string")
      Link: https://lore.kernel.org/netdev/CANn89i+bx0ybvE55iMYf5GJM48WwV1HNpdm9Q6t-HaEstqpCSA@mail.gmail.com/
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9cb252c4
    • Lorenzo Bianconi's avatar
      net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear · 0e80707d
      Lorenzo Bianconi authored
      Set ib1 state to MTK_FOE_STATE_UNBIND in __mtk_foe_entry_clear routine.
      
      Fixes: 33fc42de ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries")
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0e80707d
    • David S. Miller's avatar
      Merge branch 'dsa-felix-fixes' · 0f51fa2a
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      Fixes for Felix DSA driver calculation of tc-taprio guard bands
      
      This series fixes some bugs which are not quite new, but date from v5.13
      when static guard bands were enabled by Michael Walle to prevent
      tc-taprio overruns.
      
      The investigation started when Xiaoliang asked privately what is the
      expected max SDU for a traffic class when its minimum gate interval is
      10 us. The answer, as it turns out, is not an L1 size of 1250 octets,
      but 1245 octets, since otherwise, the switch will not consider frames
      for egress scheduling, because the static guard band is exactly as large
      as the time interval. The switch needs a minimum of 33 ns outside of the
      guard band to consider a frame for scheduling, and the reduction of the
      max SDU by 5 provides exactly for that.
      
      The fix for that (patch 1/3) is relatively small, but during testing, it
      became apparent that cut-through forwarding prevents oversized frame
      dropping from working properly. This is solved through the larger patch
      2/3. Finally, patch 3/3 fixes one more tc-taprio locking problem found
      through code inspection.
      ====================
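The 1250 versus 1245 octet figures in the cover letter can be checked with a short calculation. This is a sketch that assumes a gigabit link (8 ns per octet, which is what makes 10 us equal 1250 octets); the function name is hypothetical.

```python
NS_PER_OCTET_GIGABIT = 8   # assumption: 1 Gbps, so one octet takes 8 ns
MIN_SCHED_WINDOW_NS = 33   # time the switch needs outside the guard band


def max_l1_octets(gate_interval_ns, ns_per_octet=NS_PER_OCTET_GIGABIT):
    """Largest L1 frame size whose guard band still leaves 33 ns free."""
    usable_ns = gate_interval_ns - MIN_SCHED_WINDOW_NS
    return usable_ns // ns_per_octet


naive = 10_000 // NS_PER_OCTET_GIGABIT   # guard band fills the whole interval
fixed = max_l1_octets(10_000)
print(naive, fixed, naive - fixed)
```

For a 10 us interval the naive value is 1250 octets and the corrected one is 1245, a reduction of 5 octets (40 ns), which covers the 33 ns the switch needs.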
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0f51fa2a
    • Vladimir Oltean's avatar
      net: dsa: felix: access QSYS_TAG_CONFIG under tas_lock in vsc9959_sched_speed_set · a4bb481a
      Vladimir Oltean authored
      The read-modify-write of QSYS_TAG_CONFIG from vsc9959_sched_speed_set()
      runs unlocked with respect to the other functions that access it, which
      are vsc9959_tas_guard_bands_update(), vsc9959_qos_port_tas_set() and
      vsc9959_tas_clock_adjust(). All the others are under ocelot->tas_lock,
      so move the vsc9959_sched_speed_set() access under that lock as well, to
      resolve the concurrency.
      
      Fixes: 55a515b1 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a4bb481a
    • Vladimir Oltean's avatar
      net: dsa: felix: disable cut-through forwarding for frames oversized for tc-taprio · 843794bb
      Vladimir Oltean authored
      Experimentally, it looks like when QSYS_QMAXSDU_CFG_7 is set to 605,
      frames even way larger than 601 octets are transmitted even though these
      should be considered as oversized, according to the documentation, and
      dropped.
      
      Since oversized frame dropping depends on frame size, which is only
      known at the EOF stage, and therefore not at SOF when cut-through
      forwarding begins, it means that the switch cannot take QSYS_QMAXSDU_CFG_*
      into consideration for traffic classes that are cut-through.
      
      Since cut-through forwarding has no UAPI to control it, and the driver
      enables it based on the mantra "if we can, then why not", the strategy
      is to alter vsc9959_cut_through_fwd() to take into consideration which
      tc's have oversize frame dropping enabled, and disable cut-through for
      them. Then, from vsc9959_tas_guard_bands_update(), we re-trigger the
      cut-through determination process.
      
      There are 2 strategies for vsc9959_cut_through_fwd() to determine
      whether a tc has oversized dropping enabled or not. One is to keep a bit
      mask of traffic classes per port, and the other is to read back from the
      hardware registers (a non-zero value of QSYS_QMAXSDU_CFG_* means the
      feature is enabled). We choose reading back from registers, because
      struct ocelot_port is shared with drivers (ocelot, seville) that
      support neither cut-through nor tc-taprio, and we don't have a felix
      specific extension of struct ocelot_port. Furthermore, reading registers
      from the Felix hardware is quite cheap, since they are memory-mapped.
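The register read-back strategy can be sketched as a mask computation: any traffic class whose QSYS_QMAXSDU_CFG_* value is non-zero loses cut-through. This is a simplified Python model, not the driver code; the list of values stands in for the memory-mapped registers and `cut_through_mask` is a hypothetical name.

```python
NUM_TC = 8


def cut_through_mask(candidate_mask, qmaxsdu_cfg):
    """Clear cut-through for traffic classes with oversized dropping enabled.

    candidate_mask: bitmask of tcs that would otherwise use cut-through.
    qmaxsdu_cfg: per-tc values read back from the (modelled) registers;
                 a non-zero value means oversized frame dropping is enabled.
    """
    mask = candidate_mask
    for tc in range(NUM_TC):
        if qmaxsdu_cfg[tc] != 0:
            mask &= ~(1 << tc)
    return mask


# TC 7 has a max SDU configured (605, as in the experiment described above).
registers = [0, 0, 0, 0, 0, 0, 0, 605]
print(hex(cut_through_mask(0xFF, registers)))
```

Deriving the mask from the register values avoids keeping a second copy of the per-tc state in a structure shared with other drivers.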
      
      Fixes: 55a515b1 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      843794bb
    • Vladimir Oltean's avatar
      net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet · 11afdc65
      Vladimir Oltean authored
      The blamed commit broke tc-taprio schedules such as this one:
      
      tc qdisc replace dev $swp1 root taprio \
              num_tc 8 \
              map 0 1 2 3 4 5 6 7 \
              queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
              base-time 0 \
              sched-entry S 0x7f 990000 \
              sched-entry S 0x80  10000 \
              flags 0x2
      
      because the gate entry for TC 7 (S 0x80 10000 ns) now has a static guard
      band added earlier than its 'gate close' event, such that packet
      overruns won't occur in the worst case of the largest packet possible.
      
      Since guard bands are statically determined based on the per-tc
      QSYS_QMAXSDU_CFG_* with a fallback on the port-based QSYS_PORT_MAX_SDU,
      we need to discuss what happens with TC 7 depending on kernel version,
      since the driver, prior to commit 55a515b1 ("net: dsa: felix: drop
      oversized frames with tc-taprio instead of hanging the port"), did not
      touch QSYS_QMAXSDU_CFG_*, and therefore relied on QSYS_PORT_MAX_SDU.
      
      1 (before vsc9959_tas_guard_bands_update): QSYS_PORT_MAX_SDU defaults to
        1518, and at gigabit this introduces a static guard band (independent
        of packet sizes) of 12144 ns, plus QSYS::HSCH_MISC_CFG.FRM_ADJ (bit
        time of 20 octets => 160 ns). But this is larger than the time window
        itself, of 10000 ns. So, the queue system never considers a frame with
        TC 7 as eligible for transmission, since the gate practically never
        opens, and these frames are forever stuck in the TX queues and hang
        the port.
      
      2 (after vsc9959_tas_guard_bands_update): Under the sole goal of
        enabling oversized frame dropping, we make an effort to set
        QSYS_QMAXSDU_CFG_7 to 1230 bytes. But QSYS_QMAXSDU_CFG_7 plays
        one more role, which we did not take into account: per-tc static guard
        band, expressed in L2 byte time (auto-adjusted for FCS and L1 overhead).
        There is a discrepancy between what the driver thinks (that there is
        no guard band, and 100% of min_gate_len[tc] is available for egress
        scheduling) and what the hardware actually does (crops the equivalent
        of QSYS_QMAXSDU_CFG_7 ns out of min_gate_len[tc]). In practice, this
        means that the hardware thinks it has exactly 0 ns for scheduling tc 7.
      
      In both cases, even minimum sized Ethernet frames are stuck on egress
      rather than being considered for scheduling on TC 7, even if they would
      fit given a proper configuration. Considering the current situation,
      with vsc9959_tas_guard_bands_update(), frames between 60 octets and 1230
      octets in size are not eligible for oversized dropping (because they are
      smaller than QSYS_QMAXSDU_CFG_7), but won't be considered as eligible
      for scheduling either, because the min_gate_len[7] (10000 ns) minus the
      guard band determined by QSYS_QMAXSDU_CFG_7 (1230 octets * 8 ns per
      octet == 9840 ns) minus the guard band auto-added for L1 overhead by
      QSYS::HSCH_MISC_CFG.FRM_ADJ (20 octets * 8 ns per octet == 160 ns)
      leaves 0 ns for scheduling in the queue system proper.
      
      Investigating the hardware behavior, it becomes apparent that the queue
      system needs precisely 33 ns of 'gate open' time in order to consider a
      frame as eligible for scheduling to a tc. So the solution to this
      problem is to amend vsc9959_tas_guard_bands_update(), by giving the
      per-tc guard bands less space by exactly 33 ns, just enough for one
      frame to be scheduled in that interval. This allows the queue system to
      make forward progress for that port-tc, and prevents it from hanging.
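The two cases above reduce to the same subtraction. The sketch below uses only the numbers quoted in this message (gigabit timing, a 20 octet FRM_ADJ, a 10000 ns gate, 33 ns minimum open time). The 1225 octet value in the last call is an illustrative choice, consistent with the 1245 L1 octets mentioned in the series cover letter; the function names are hypothetical.

```python
NS_PER_OCTET = 8      # gigabit: one octet takes 8 ns on the wire
FRM_ADJ_OCTETS = 20   # L1 overhead auto-added by QSYS::HSCH_MISC_CFG.FRM_ADJ
MIN_OPEN_NS = 33      # 'gate open' time the queue system needs


def remaining_ns(gate_len_ns, max_sdu_octets):
    """Time left for scheduling after the static guard band is removed."""
    guard_ns = (max_sdu_octets + FRM_ADJ_OCTETS) * NS_PER_OCTET
    return gate_len_ns - guard_ns


def can_schedule(gate_len_ns, max_sdu_octets):
    return remaining_ns(gate_len_ns, max_sdu_octets) >= MIN_OPEN_NS


print(remaining_ns(10_000, 1518))  # case 1: guard band exceeds the window
print(remaining_ns(10_000, 1230))  # case 2: exactly nothing left
print(can_schedule(10_000, 1225))  # 5 octets less leaves 40 ns
```

In case 1 the result is negative (-2304 ns) and in case 2 it is exactly 0 ns, so in neither case is there the 33 ns needed to consider a frame.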
      
      Fixes: 297c4de6 ("net: dsa: felix: re-enable TAS guard band mode")
      Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      11afdc65
    • Takashi Iwai's avatar
      ALSA: usb-audio: Clear fixed clock rate at closing EP · 809f44a0
      Takashi Iwai authored
      The recent commit c11117b6 ("ALSA: usb-audio: Refcount multiple
      accesses on the single clock") tries to manage the clock rate shared
      by several endpoints.  This was intended to prevent a different endpoint
      from setting a mismatched rate, but unfortunately it also introduced a
      regression for PulseAudio and pipewire: those applications probe
      multiple possible rates (44.1kHz and 48kHz), after which setting up the
      normal rate fails and only the last probed rate is applied.
      
      The cause is that the last sample rate is still left to the clock
      reference even after closing the endpoint, and this value is still
      used at the next open.  It happens only when applications set up via
      PCM prepare but don't start/stop the stream; the rate is reset when
      the stream is stopped, but it's not cleared at close.
      
      This patch addresses the issue above, simply by clearing the rate set
      in the clock reference at the last close of each endpoint.
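The stale-rate problem can be modelled with a small reference-counted object. This is a toy Python sketch, not the usb-audio code. `ClockRef` and its methods are hypothetical, and the rule that a mismatching rate is rejected while a fixed rate is set is an assumption made to keep the model short.

```python
class ClockRef:
    """Toy model of a clock reference shared by several endpoints."""

    def __init__(self):
        self.opened = 0  # number of endpoints currently using the clock
        self.rate = 0    # 0 means "no fixed rate"

    def open(self):
        self.opened += 1

    def set_rate(self, rate):
        # Assumption: a fixed rate left behind blocks a different rate.
        if self.rate and self.rate != rate:
            return False
        self.rate = rate
        return True

    def close(self, clear_on_last_close=True):
        self.opened -= 1
        if self.opened == 0 and clear_on_last_close:
            self.rate = 0  # the fix: forget the rate at the last close


clock = ClockRef()
for probe_rate in (44100, 48000):   # probing without starting a stream
    clock.open()
    print(probe_rate, clock.set_rate(probe_rate))
    clock.close()
```

Passing `clear_on_last_close=False` reproduces the regression in this model: the rate from the first probe survives the close and the second probe is rejected.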
      
      Fixes: c11117b6 ("ALSA: usb-audio: Refcount multiple accesses on the single clock")
      Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Tested-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/all/YxXIWv8dYmg1tnXP@zx2c4.com/
      Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/2620
      Link: https://lore.kernel.org/r/20220907100421.6443-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      809f44a0
    • chen zhang's avatar
      efi/x86: libstub: remove unused variable · 7a1ec84f
      chen zhang authored
      The variable "has_system_memory" is unused in function
      ‘adjust_memory_range_protection’, remove it.
      
      Signed-off-by: chen zhang <chenzhang@kylinos.cn>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      7a1ec84f
    • Tasos Sahanidis's avatar
      ALSA: emu10k1: Fix out of bounds access in snd_emu10k1_pcm_channel_alloc() · d29f5905
      Tasos Sahanidis authored
      The voice allocator sometimes begins allocating from near the end of the
      array and then wraps around, however snd_emu10k1_pcm_channel_alloc()
      accesses the newly allocated voices as if it never wrapped around.
      
      This results in out of bounds access if the first voice has a high enough
      index so that first_voice + requested_voice_count > NUM_G (64).
      The more voices are requested, the more likely it is for this to occur.
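The out-of-bounds condition is easy to reproduce in a model. The Python sketch below contrasts indices computed as if the allocation never wrapped with indices taken modulo the array size. The modulo form illustrates the wrap-around only and is not claimed to be the actual fix; both function names are hypothetical.

```python
NUM_G = 64  # size of the voice array, as stated above


def voice_indices_unwrapped(first_voice, count):
    """Indices as accessed when wrap-around is ignored."""
    return [first_voice + i for i in range(count)]


def voice_indices_wrapped(first_voice, count):
    """Indices taken modulo the array size, matching a wrapping allocator."""
    return [(first_voice + i) % NUM_G for i in range(count)]


first, count = 50, 16   # 16 channels starting near the end of the array
bad = voice_indices_unwrapped(first, count)
good = voice_indices_wrapped(first, count)
print(max(bad), first + count > NUM_G)
print(good[-3:])
```

Starting at voice 50 with 16 voices, the unwrapped access reaches index 65, the same index the UBSAN report below complains about.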
      
      This was initially discovered using PipeWire, however it can be reproduced
      by calling aplay multiple times with 16 channels:
      aplay -r 48000 -D plughw:CARD=Live,DEV=3 -c 16 /dev/zero
      
      UBSAN: array-index-out-of-bounds in sound/pci/emu10k1/emupcm.c:127:40
      index 65 is out of range for type 'snd_emu10k1_voice [64]'
      CPU: 1 PID: 31977 Comm: aplay Tainted: G        W IOE      6.0.0-rc2-emu10k1+ #7
      Hardware name: ASUSTEK COMPUTER INC P5W DH Deluxe/P5W DH Deluxe, BIOS 3002    07/22/2010
      Call Trace:
      <TASK>
      dump_stack_lvl+0x49/0x63
      dump_stack+0x10/0x16
      ubsan_epilogue+0x9/0x3f
      __ubsan_handle_out_of_bounds.cold+0x44/0x49
      snd_emu10k1_playback_hw_params+0x3bc/0x420 [snd_emu10k1]
      snd_pcm_hw_params+0x29f/0x600 [snd_pcm]
      snd_pcm_common_ioctl+0x188/0x1410 [snd_pcm]
      ? exit_to_user_mode_prepare+0x35/0x170
      ? do_syscall_64+0x69/0x90
      ? syscall_exit_to_user_mode+0x26/0x50
      ? do_syscall_64+0x69/0x90
      ? exit_to_user_mode_prepare+0x35/0x170
      snd_pcm_ioctl+0x27/0x40 [snd_pcm]
      __x64_sys_ioctl+0x95/0xd0
      do_syscall_64+0x5c/0x90
      ? do_syscall_64+0x69/0x90
      ? do_syscall_64+0x69/0x90
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      Signed-off-by: Tasos Sahanidis <tasos@tasossah.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/3707dcab-320a-62ff-63c0-73fc201ef756@tasossah.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      d29f5905
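
      The wrap-around problem described in the emu10k1 commit above can be
      illustrated with a small user-space sketch. This is not the code from
      the patch: the struct, the helper name claim_voices() and its return
      convention are hypothetical, and only NUM_G (64) comes from the commit
      message. It shows one way to touch `count` voices starting at `first`
      without indexing past the end of the array, by reducing each index
      modulo NUM_G.

      ```c
      #include <assert.h>
      #include <stddef.h>

      #define NUM_G 64 /* number of voices, as stated in the commit message */

      /* Hypothetical stand-in for the driver's voice structure. */
      struct voice {
          int in_use;
          int owner;
      };

      /*
       * Hypothetical helper: mark `count` consecutive voices, starting at
       * index `first`, as owned by `owner`. The range is allowed to wrap
       * past the end of the array, mirroring an allocator that may start
       * near the end and continue from index 0.
       * Returns 0 on success, -1 on invalid arguments.
       */
      static int claim_voices(struct voice voices[NUM_G], int first, int count,
                              int owner)
      {
          if (first < 0 || first >= NUM_G || count < 0 || count > NUM_G)
              return -1;

          for (int i = 0; i < count; i++) {
              /* The modulo keeps the index in range when first + i >= NUM_G. */
              struct voice *v = &voices[(first + i) % NUM_G];

              v->in_use = 1;
              v->owner = owner;
          }
          return 0;
      }
      ```

      With first = 60 and count = 16, a plain voices[first + i] access would
      reach index 75; the sketch instead touches indices 60 to 63 and then
      0 to 11.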
    • Xiu Jianfeng's avatar
      rv/reactor: add __init/__exit annotations to module init/exit funcs · 93d71986
      Xiu Jianfeng authored
      Add missing __init/__exit annotations to module init/exit funcs.
      
      Link: https://lkml.kernel.org/r/20220906141210.132607-1-xiujianfeng@huawei.com
      
      Fixes: 135b881e ("rv/reactor: Add the printk reactor")
      Fixes: e88043c0 ("rv/reactor: Add the panic reactor")
      Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
      Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      93d71986
    • Masami Hiramatsu (Google)'s avatar
      tracing: Fix to check event_mutex is held while accessing trigger list · cecf8e12
      Masami Hiramatsu (Google) authored
      Since check_user_trigger() is called outside of an RCU read-side
      critical section, its list_for_each_entry_rcu() triggers a suspicious
      RCU usage warning.
      
       # echo hist:keys=pid > events/sched/sched_stat_runtime/trigger
       # cat events/sched/sched_stat_runtime/trigger
      [   43.167032]
      [   43.167418] =============================
      [   43.167992] WARNING: suspicious RCU usage
      [   43.168567] 5.19.0-rc5-00029-g19ebe4651abf #59 Not tainted
      [   43.169283] -----------------------------
      [   43.169863] kernel/trace/trace_events_trigger.c:145 RCU-list traversed in non-reader section!!
      ...
      
      However, the file->triggers list is safe to access while
      event_mutex is held.
      To fix this warning, add a lockdep_is_held() check to the
      list_for_each_entry_rcu().
      
      Link: https://lkml.kernel.org/r/166226474977.223837.1992182913048377113.stgit@devnote2
      
      Cc: stable@vger.kernel.org
      Fixes: 7491e2c4 ("tracing: Add a probe that attaches to trace events")
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      cecf8e12
    • Yipeng Zou's avatar
      tracing: hold caller_addr to hardirq_{enable,disable}_ip · 54c39319
      Yipeng Zou authored
      Currently, the argument passed to lockdep_hardirqs_{on,off} is fixed
      to CALLER_ADDR0.
      The function trace_hardirqs_on_caller was intended to use
      caller_addr to represent the address that the caller wants to be traced.
      
      For example, the lockdep log on riscv always shows the last
      {enabled,disabled} location as __trace_hardirqs_{on,off} (if called by):
      [   57.853175] hardirqs last  enabled at (2519): __trace_hardirqs_on+0xc/0x14
      [   57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14
      
      After using trace_hardirqs_xx_caller, we get more useful information:
      [   53.781428] hardirqs last  enabled at (2595): restore_all+0xe/0x66
      [   53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10
      
      Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com
      
      Cc: stable@vger.kernel.org
      Fixes: c3bc8fd6 ("tracing: Centralize preemptirq tracepoints and unify their usage")
      Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      54c39319