1. 04 Aug, 2021 1 commit
    • Paolo Bonzini's avatar
      KVM: Do not leak memory for duplicate debugfs directories · 85cd39af
      Paolo Bonzini authored
      KVM creates a debugfs directory for each VM in order to store statistics
      about the virtual machine.  The directory name is built from the process
      pid and a VM fd.  While generally unique, it is possible to keep a
      file descriptor alive in a way that causes duplicate directories, which
      manifests as these messages:
      
        [  471.846235] debugfs: Directory '20245-4' with parent 'kvm' already present!
      
      Even though this should not happen in practice, it is more or less
      expected in the case of KVM for testcases that call KVM_CREATE_VM and
      close the resulting file descriptor repeatedly and in parallel.
      
      When this happens, debugfs_create_dir() returns an error but
      kvm_create_vm_debugfs() goes on to allocate stat data structs which are
      later leaked.  The slow memory leak was spotted by syzkaller, where it
      caused OOM reports.
      
      Since the issue only affects debugfs, do a lookup before calling
      debugfs_create_dir, so that the message is downgraded and rate-limited.
      While at it, ensure kvm->debugfs_dentry is NULL rather than an error
      if it is not created.  This fixes kvm_destroy_vm_debugfs, which was not
      checking IS_ERR_OR_NULL correctly.
      
      Cc: stable@vger.kernel.org
      Fixes: 536a6f88 ("KVM: Create debugfs dir and stat files for each VM")
      Reported-by: default avatarAlexey Kardashevskiy <aik@ozlabs.ru>
      Suggested-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      85cd39af
  2. 03 Aug, 2021 4 commits
  3. 30 Jul, 2021 1 commit
    • Paolo Bonzini's avatar
      KVM: x86: accept userspace interrupt only if no event is injected · fa7a549d
      Paolo Bonzini authored
      Once an exception has been injected, any side effects related to
      the exception (such as setting CR2 or DR6) have been taked place.
      Therefore, once KVM sets the VM-entry interruption information
      field or the AMD EVENTINJ field, the next VM-entry must deliver that
      exception.
      
      Pending interrupts are processed after injected exceptions, so
      in theory it would not be a problem to use KVM_INTERRUPT when
      an injected exception is present.  However, DOSEMU is using
      run->ready_for_interrupt_injection to detect interrupt windows
      and then using KVM_SET_SREGS/KVM_SET_REGS to inject the
      interrupt manually.  For this to work, the interrupt window
      must be delayed after the completion of the previous event
      injection.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarStas Sergeev <stsp2@yandex.ru>
      Tested-by: default avatarStas Sergeev <stsp2@yandex.ru>
      Fixes: 71cc849b ("KVM: x86: Fix split-irqchip vs interrupt injection window request")
      Reviewed-by: default avatarSean Christopherson <seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      fa7a549d
  4. 27 Jul, 2021 10 commits
    • Paolo Bonzini's avatar
      KVM: add missing compat KVM_CLEAR_DIRTY_LOG · 8750f9bb
      Paolo Bonzini authored
      The arguments to the KVM_CLEAR_DIRTY_LOG ioctl include a pointer,
      therefore it needs a compat ioctl implementation.  Otherwise,
      32-bit userspace fails to invoke it on 64-bit kernels; for x86
      it might work fine by chance if the padding is zero, but not
      on big-endian architectures.
      
      Reported-by: Thomas Sattler
      Cc: stable@vger.kernel.org
      Fixes: 2a31b9db ("kvm: introduce manual dirty log reprotect")
      Reviewed-by: default avatarPeter Xu <peterx@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      8750f9bb
    • Li RongQing's avatar
      KVM: use cpu_relax when halt polling · 74775654
      Li RongQing authored
      SMT siblings share caches and other hardware, and busy halt polling
      will degrade its sibling performance if its sibling is working
      
      Sean Christopherson suggested as below:
      
      "Rather than disallowing halt-polling entirely, on x86 it should be
      sufficient to simply have the hardware thread yield to its sibling(s)
      via PAUSE.  It probably won't get back all performance, but I would
      expect it to be close.
      This compiles on all KVM architectures, and AFAICT the intended usage
      of cpu_relax() is identical for all architectures."
      Suggested-by: default avatarSean Christopherson <seanjc@google.com>
      Signed-off-by: default avatarLi RongQing <lirongqing@baidu.com>
      Message-Id: <20210727111247.55510-1-lirongqing@baidu.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      74775654
    • Maxim Levitsky's avatar
      KVM: SVM: use vmcb01 in svm_refresh_apicv_exec_ctrl · 5868b822
      Maxim Levitsky authored
      Currently when SVM is enabled in guest CPUID, AVIC is inhibited as soon
      as the guest CPUID is set.
      
      AVIC happens to be fully disabled on all vCPUs by the time any guest
      entry starts (if after migration the entry can be nested).
      
      The reason is that currently we disable avic right away on vCPU from which
      the kvm_request_apicv_update was called and for this case, it happens to be
      called on all vCPUs (by svm_vcpu_after_set_cpuid).
      
      After we stop doing this, AVIC will end up being disabled only when
      KVM_REQ_APICV_UPDATE is processed which is after we done switching to the
      nested guest.
      
      Fix this by just using vmcb01 in svm_refresh_apicv_exec_ctrl for avic
      (which is a right thing to do anyway).
      Signed-off-by: default avatarMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210713142023.106183-4-mlevitsk@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      5868b822
    • Maxim Levitsky's avatar
      KVM: SVM: tweak warning about enabled AVIC on nested entry · feea0136
      Maxim Levitsky authored
      It is possible that AVIC was requested to be disabled but
      not yet disabled, e.g if the nested entry is done right
      after svm_vcpu_after_set_cpuid.
      Signed-off-by: default avatarMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210713142023.106183-3-mlevitsk@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      feea0136
    • Maxim Levitsky's avatar
      KVM: SVM: svm_set_vintr don't warn if AVIC is active but is about to be deactivated · f1577ab2
      Maxim Levitsky authored
      It is possible for AVIC inhibit and AVIC active state to be mismatched.
      Currently we disable AVIC right away on vCPU which started the AVIC inhibit
      request thus this warning doesn't trigger but at least in theory,
      if svm_set_vintr is called at the same time on multiple vCPUs,
      the warning can happen.
      Signed-off-by: default avatarMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210713142023.106183-2-mlevitsk@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      f1577ab2
    • Christian Borntraeger's avatar
      KVM: s390: restore old debugfs names · bb000f64
      Christian Borntraeger authored
      commit bc9e9e67 ("KVM: debugfs: Reuse binary stats descriptors")
      did replace the old definitions with the binary ones. While doing that
      it missed that some files are names different than the counters. This
      is especially important for kvm_stat which does have special handling
      for counters named instruction_*.
      
      Fixes: commit bc9e9e67 ("KVM: debugfs: Reuse binary stats descriptors")
      CC: Jing Zhang <jingzhangos@google.com>
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Message-Id: <20210726150108.5603-1-borntraeger@de.ibm.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      bb000f64
    • Paolo Bonzini's avatar
      KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initialized · 3fa5e8fd
      Paolo Bonzini authored
      Right now, svm_hv_vmcb_dirty_nested_enlightenments has an incorrect
      dereference of vmcb->control.reserved_sw before the vmcb is checked
      for being non-NULL.  The compiler is usually sinking the dereference
      after the check; instead of doing this ourselves in the source,
      ensure that svm_hv_vmcb_dirty_nested_enlightenments is only called
      with a non-NULL VMCB.
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Cc: Vineeth Pillai <viremana@linux.microsoft.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [Untested for now due to issues with my AMD machine. - Paolo]
      3fa5e8fd
    • David Matlack's avatar
      KVM: selftests: Introduce access_tracking_perf_test · c33e05d9
      David Matlack authored
      This test measures the performance effects of KVM's access tracking.
      Access tracking is driven by the MMU notifiers test_young, clear_young,
      and clear_flush_young. These notifiers do not have a direct userspace
      API, however the clear_young notifier can be triggered by marking a
      pages as idle in /sys/kernel/mm/page_idle/bitmap. This test leverages
      that mechanism to enable access tracking on guest memory.
      
      To measure performance this test runs a VM with a configurable number of
      vCPUs that each touch every page in disjoint regions of memory.
      Performance is measured in the time it takes all vCPUs to finish
      touching their predefined region.
      
      Example invocation:
      
        $ ./access_tracking_perf_test -v 8
        Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
        guest physical test memory offset: 0xffdfffff000
      
        Populating memory             : 1.337752570s
        Writing to populated memory   : 0.010177640s
        Reading from populated memory : 0.009548239s
        Mark memory idle              : 23.973131748s
        Writing to idle memory        : 0.063584496s
        Mark memory idle              : 24.924652964s
        Reading from idle memory      : 0.062042814s
      
      Breaking down the results:
      
       * "Populating memory": The time it takes for all vCPUs to perform the
         first write to every page in their region.
      
       * "Writing to populated memory" / "Reading from populated memory": The
         time it takes for all vCPUs to write and read to every page in their
         region after it has been populated. This serves as a control for the
         later results.
      
       * "Mark memory idle": The time it takes for every vCPU to mark every
         page in their region as idle through page_idle.
      
       * "Writing to idle memory" / "Reading from idle memory": The time it
         takes for all vCPUs to write and read to every page in their region
         after it has been marked idle.
      
      This test should be portable across architectures but it is only enabled
      for x86_64 since that's all I have tested.
      Reviewed-by: default avatarBen Gardon <bgardon@google.com>
      Signed-off-by: default avatarDavid Matlack <dmatlack@google.com>
      Message-Id: <20210713220957.3493520-7-dmatlack@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      c33e05d9
    • David Matlack's avatar
      KVM: selftests: Fix missing break in dirty_log_perf_test arg parsing · 15b7b737
      David Matlack authored
      There is a missing break statement which causes a fallthrough to the
      next statement where optarg will be null and a segmentation fault will
      be generated.
      
      Fixes: 9e965bb7 ("KVM: selftests: Add backing src parameter to dirty_log_perf_test")
      Reviewed-by: default avatarBen Gardon <bgardon@google.com>
      Signed-off-by: default avatarDavid Matlack <dmatlack@google.com>
      Message-Id: <20210713220957.3493520-6-dmatlack@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      15b7b737
    • Juergen Gross's avatar
      x86/kvm: fix vcpu-id indexed array sizes · 76b4f357
      Juergen Gross authored
      KVM_MAX_VCPU_ID is the maximum vcpu-id of a guest, and not the number
      of vcpu-ids. Fix array indexed by vcpu-id to have KVM_MAX_VCPU_ID+1
      elements.
      
      Note that this is currently no real problem, as KVM_MAX_VCPU_ID is
      an odd number, resulting in always enough padding being available at
      the end of those arrays.
      
      Nevertheless this should be fixed in order to avoid rare problems in
      case someone is using an even number for KVM_MAX_VCPU_ID.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Message-Id: <20210701154105.23215-2-jgross@suse.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      76b4f357
  5. 26 Jul, 2021 5 commits
  6. 19 Jul, 2021 1 commit
  7. 18 Jul, 2021 13 commits
    • Linus Torvalds's avatar
      Linux 5.14-rc2 · 2734d6c1
      Linus Torvalds authored
      2734d6c1
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-fixes-for-v5.14-2021-07-18' of... · 8c25c447
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v5.14-2021-07-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Skip invalid hybrid PMU on hybrid systems when the atom (little) CPUs
         are offlined.
      
       - Fix 'perf test' problems related to the recently added hybrid
         (BIG/little) code.
      
       - Split ARM's coresight (hw tracing) decode by aux records to avoid
         fatal decoding errors.
      
       - Fix add event failure in 'perf probe' when running 32-bit perf in a
         64-bit kernel.
      
       - Fix 'perf sched record' failure when CONFIG_SCHEDSTATS is not set.
      
       - Fix memory and refcount leaks detected by ASAn when running 'perf
         test', should be clean of warnings now.
      
       - Remove broken definition of __LITTLE_ENDIAN from tools'
         linux/kconfig.h, which was breaking the build in some systems.
      
       - Cast PTHREAD_STACK_MIN to int as it may turn into 'long
         sysconf(__SC_THREAD_STACK_MIN_VALUE), breaking the build in some
         systems.
      
       - Fix libperf build error with LIBPFM4=1.
      
       - Sync UAPI files changed by the memfd_secret new syscall.
      
      * tag 'perf-tools-fixes-for-v5.14-2021-07-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (35 commits)
        perf sched: Fix record failure when CONFIG_SCHEDSTATS is not set
        perf probe: Fix add event failure when running 32-bit perf in a 64-bit kernel
        perf data: Close all files in close_dir()
        perf probe-file: Delete namelist in del_events() on the error path
        perf test bpf: Free obj_buf
        perf trace: Free strings in trace__parse_events_option()
        perf trace: Free syscall tp fields in evsel->priv
        perf trace: Free syscall->arg_fmt
        perf trace: Free malloc'd trace fields on exit
        perf lzma: Close lzma stream on exit
        perf script: Fix memory 'threads' and 'cpus' leaks on exit
        perf script: Release zstd data
        perf session: Cleanup trace_event
        perf inject: Close inject.output on exit
        perf report: Free generated help strings for sort option
        perf env: Fix memory leak of cpu_pmu_caps
        perf test maps__merge_in: Fix memory leak of maps
        perf dso: Fix memory leak in dso__new_map()
        perf test event_update: Fix memory leak of unit
        perf test event_update: Fix memory leak of evlist
        ...
      8c25c447
    • Linus Torvalds's avatar
      Merge tag 'xfs-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · f0eb870a
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "A few fixes for issues in the new online shrink code, additional
        corrections for my recent bug-hunt w.r.t. extent size hints on
        realtime, and improved input checking of the GROWFSRT ioctl.
      
        IOW, the usual 'I somehow got bored during the merge window and
        resumed auditing the farther reaches of xfs':
      
         - Fix shrink eligibility checking when sparse inode clusters enabled
      
         - Reset '..' directory entries when unlinking directories to prevent
           verifier errors if fs is shrinked later
      
         - Don't report unusable extent size hints to FSGETXATTR
      
         - Don't warn when extent size hints are unusable because the sysadmin
           configured them that way
      
         - Fix insufficient parameter validation in GROWFSRT ioctl
      
         - Fix integer overflow when adding rt volumes to filesystem"
      
      * tag 'xfs-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: detect misaligned rtinherit directory extent size hints
        xfs: fix an integer overflow error in xfs_growfs_rt
        xfs: improve FSGROWFSRT precondition checking
        xfs: don't expose misaligned extszinherit hints to userspace
        xfs: correct the narrative around misaligned rtinherit/extszinherit dirs
        xfs: reset child dir '..' entry when unlinking child
        xfs: check for sparse inode clusters that cross new EOAG when shrinking
      f0eb870a
    • Linus Torvalds's avatar
      Merge tag 'iomap-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · fbf1bddc
      Linus Torvalds authored
      Pull iomap fixes from Darrick Wong:
       "A handful of bugfixes for the iomap code.
      
        There's nothing especially exciting here, just fixes for UBSAN (not
        KASAN as I erroneously wrote in the tag message) warnings about
        undefined behavior in the SEEK_DATA/SEEK_HOLE code, and some
        reshuffling of per-page block state info to fix some problems with
        gfs2.
      
         - Fix KASAN warnings due to integer overflow in SEEK_DATA/SEEK_HOLE
      
         - Fix assertion errors when using inlinedata files on gfs2"
      
      * tag 'iomap-5.14-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        iomap: Don't create iomap_page objects in iomap_page_mkwrite_actor
        iomap: Don't create iomap_page objects for inline files
        iomap: Permit pages without an iop to enter writeback
        iomap: remove the length variable in iomap_seek_hole
        iomap: remove the length variable in iomap_seek_data
      fbf1bddc
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v5.14' of... · 6750691a
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Restore the original behavior of scripts/setlocalversion when
         LOCALVERSION is set to empty.
      
       - Show Kconfig prompts even for 'make -s'
      
       - Fix the combination of COFNIG_LTO_CLANG=y and CONFIG_MODVERSIONS=y
         for older GNU Make versions
      
      * tag 'kbuild-fixes-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        Documentation: Fix intiramfs script name
        Kbuild: lto: fix module versionings mismatch in GNU make 3.X
        kbuild: do not suppress Kconfig prompts for silent build
        scripts/setlocalversion: fix a bug when LOCALVERSION is empty
      6750691a
    • Robert Richter's avatar
      Documentation: Fix intiramfs script name · 5e60f363
      Robert Richter authored
      Documentation was not changed when renaming the script in commit
      80e715a0 ("initramfs: rename gen_initramfs_list.sh to
      gen_initramfs.sh"). Fixing this.
      
      Basically does:
      
       $ sed -i -e s/gen_initramfs_list.sh/gen_initramfs.sh/g $(git grep -l gen_initramfs_list.sh)
      
      Fixes: 80e715a0 ("initramfs: rename gen_initramfs_list.sh to gen_initramfs.sh")
      Signed-off-by: default avatarRobert Richter <rrichter@amd.com>
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      5e60f363
    • Lecopzer Chen's avatar
      Kbuild: lto: fix module versionings mismatch in GNU make 3.X · 1d11053d
      Lecopzer Chen authored
      When building modules(CONFIG_...=m), I found some of module versions
      are incorrect and set to 0.
      This can be found in build log for first clean build which shows
      
      WARNING: EXPORT symbol "XXXX" [drivers/XXX/XXX.ko] version generation failed,
      symbol will not be versioned.
      
      But in second build(incremental build), the WARNING disappeared and the
      module version becomes valid CRC and make someone who want to change
      modules without updating kernel image can't insert their modules.
      
      The problematic code is
      +	$(foreach n, $(filter-out FORCE,$^),				\
      +		$(if $(wildcard $(n).symversions),			\
      +			; cat $(n).symversions >> $@.symversions))
      
      For example:
        rm -f fs/notify/built-in.a.symversions    ; rm -f fs/notify/built-in.a; \
      llvm-ar cDPrST fs/notify/built-in.a fs/notify/fsnotify.o \
      fs/notify/notification.o fs/notify/group.o ...
      
      `foreach n` shows nothing to `cat` into $(n).symversions because
      `if $(wildcard $(n).symversions)` return nothing, but actually
      they do exist during this line was executed.
      
      -rw-r--r-- 1 root root 168580 Jun 13 19:10 fs/notify/fsnotify.o
      -rw-r--r-- 1 root root    111 Jun 13 19:10 fs/notify/fsnotify.o.symversions
      
      The reason is the $(n).symversions are generated at runtime, but
      Makefile wildcard function expends and checks the file exist or not
      during parsing the Makefile.
      
      Thus fix this by use `test` shell command to check the file
      existence in runtime.
      
      Rebase from both:
      1. [https://lore.kernel.org/lkml/20210616080252.32046-1-lecopzer.chen@mediatek.com/]
      2. [https://lore.kernel.org/lkml/20210702032943.7865-1-lecopzer.chen@mediatek.com/]
      
      Fixes: 38e89184 ("kbuild: lto: fix module versioning")
      Co-developed-by: default avatarSami Tolvanen <samitolvanen@google.com>
      Signed-off-by: default avatarLecopzer Chen <lecopzer.chen@mediatek.com>
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      1d11053d
    • Masahiro Yamada's avatar
      kbuild: do not suppress Kconfig prompts for silent build · d952cfaf
      Masahiro Yamada authored
      When a new CONFIG option is available, Kbuild shows a prompt to get
      the user input.
      
        $ make
        [ snip ]
        Core Scheduling for SMT (SCHED_CORE) [N/y/?] (NEW)
      
      This is the only interactive place in the build process.
      
      Commit 174a1dcc ("kbuild: sink stdout from cmd for silent build")
      suppressed Kconfig prompts as well because syncconfig is invoked by
      the 'cmd' macro. You cannot notice the fact that Kconfig is waiting
      for the user input.
      
      Use 'kecho' to show the equivalent short log without suppressing stdout
      from sub-make.
      
      Fixes: 174a1dcc ("kbuild: sink stdout from cmd for silent build")
      Reported-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      Tested-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      d952cfaf
    • Mikulas Patocka's avatar
      scripts/setlocalversion: fix a bug when LOCALVERSION is empty · 5df99bec
      Mikulas Patocka authored
      The commit 042da426 ("scripts/setlocalversion: simplify the short
      version part") reduces indentation. Unfortunately, it also changes behavior
      in a subtle way - if the user has empty "LOCALVERSION" variable, the plus
      sign is appended to the kernel version. It wasn't appended before.
      
      This patch reverts to the old behavior - we append the plus sign only if
      the LOCALVERSION variable is not set.
      
      Fixes: 042da426 ("scripts/setlocalversion: simplify the short version part")
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      5df99bec
    • Yang Jihong's avatar
      perf sched: Fix record failure when CONFIG_SCHEDSTATS is not set · b0f00855
      Yang Jihong authored
      The tracepoints trace_sched_stat_{wait, sleep, iowait} are not exposed to user
      if CONFIG_SCHEDSTATS is not set, "perf sched record" records the three events.
      As a result, the command fails.
      
      Before:
      
        #perf sched record sleep 1
        event syntax error: 'sched:sched_stat_wait'
                             \___ unknown tracepoint
      
        Error:  File /sys/kernel/tracing/events/sched/sched_stat_wait not found.
        Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
      
        Run 'perf list' for a list of valid events
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
      
      Solution:
        Check whether schedstat tracepoints are exposed. If no, these events are not recorded.
      
      After:
        # perf sched record sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.163 MB perf.data (1091 samples) ]
        # perf sched report
        run measurement overhead: 4736 nsecs
        sleep measurement overhead: 9059979 nsecs
        the run test took 999854 nsecs
        the sleep test took 8945271 nsecs
        nr_run_events:        716
        nr_sleep_events:      785
        nr_wakeup_events:     0
        ...
        ------------------------------------------------------------
      
      Fixes: 2a09b5de ("sched/fair: do not expose some tracepoints to user if CONFIG_SCHEDSTATS is not set")
      Signed-off-by: default avatarYang Jihong <yangjihong1@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Yafang Shao <laoar.shao@gmail.com>
      Link: http://lore.kernel.org/lkml/20210713112358.194693-1-yangjihong1@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b0f00855
    • Yang Jihong's avatar
      perf probe: Fix add event failure when running 32-bit perf in a 64-bit kernel · 22a66551
      Yang Jihong authored
      The "address" member of "struct probe_trace_point" uses long data type.
      If kernel is 64-bit and perf program is 32-bit, size of "address"
      variable is 32 bits.
      
      As a result, upper 32 bits of address read from kernel are truncated, an
      error occurs during address comparison in kprobe_warn_out_range().
      
      Before:
      
        # perf probe -a schedule
        schedule is out of .text, skip it.
          Error: Failed to add events.
      
      Solution:
        Change data type of "address" variable to u64 and change corresponding
      address printing and value assignment.
      
      After:
      
        # perf.new.new probe -a schedule
        Added new event:
          probe:schedule       (on schedule)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:schedule -aR sleep 1
      
        # perf probe -l
          probe:schedule       (on schedule@kernel/sched/core.c)
        # perf record -e probe:schedule -aR sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.156 MB perf.data (1366 samples) ]
        # perf report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 1K of event 'probe:schedule'
        # Event count (approx.): 1366
        #
        # Overhead  Command          Shared Object      Symbol
        # ........  ...............  .................  ............
        #
             6.22%  migration/0      [kernel.kallsyms]  [k] schedule
             6.22%  migration/1      [kernel.kallsyms]  [k] schedule
             6.22%  migration/2      [kernel.kallsyms]  [k] schedule
             6.22%  migration/3      [kernel.kallsyms]  [k] schedule
             6.15%  migration/10     [kernel.kallsyms]  [k] schedule
             6.15%  migration/11     [kernel.kallsyms]  [k] schedule
             6.15%  migration/12     [kernel.kallsyms]  [k] schedule
             6.15%  migration/13     [kernel.kallsyms]  [k] schedule
             6.15%  migration/14     [kernel.kallsyms]  [k] schedule
             6.15%  migration/15     [kernel.kallsyms]  [k] schedule
             6.15%  migration/4      [kernel.kallsyms]  [k] schedule
             6.15%  migration/5      [kernel.kallsyms]  [k] schedule
             6.15%  migration/6      [kernel.kallsyms]  [k] schedule
             6.15%  migration/7      [kernel.kallsyms]  [k] schedule
             6.15%  migration/8      [kernel.kallsyms]  [k] schedule
             6.15%  migration/9      [kernel.kallsyms]  [k] schedule
             0.22%  rcu_sched        [kernel.kallsyms]  [k] schedule
        ...
        #
        # (Cannot load tips.txt file, please install perf!)
        #
      Signed-off-by: default avatarYang Jihong <yangjihong1@huawei.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jianlin Lv <jianlin.lv@arm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Li Huafei <lihuafei1@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Link: http://lore.kernel.org/lkml/20210715063723.11926-1-yangjihong1@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      22a66551
    • Riccardo Mancini's avatar
      perf data: Close all files in close_dir() · d4b3eedc
      Riccardo Mancini authored
      When using 'perf report' in directory mode, the first file is not closed
      on exit, causing a memory leak.
      
      The problem is caused by the iterating variable never reaching 0.
      
      Fixes: 14552063 ("perf data: Add perf_data__(create_dir|close_dir) functions")
      Signed-off-by: default avatarRiccardo Mancini <rickyman7@gmail.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhen Lei <thunder.leizhen@huawei.com>
      Link: http://lore.kernel.org/lkml/20210716141122.858082-1-rickyman7@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d4b3eedc
    • Riccardo Mancini's avatar
      perf probe-file: Delete namelist in del_events() on the error path · e0fa7ab4
      Riccardo Mancini authored
      ASan reports some memory leaks when running:
      
        # perf test "42: BPF filter"
      
      This second leak is caused by a strlist not being dellocated on error
      inside probe_file__del_events.
      
      This patch adds a goto label before the deallocation and makes the error
      path jump to it.
      Signed-off-by: default avatarRiccardo Mancini <rickyman7@gmail.com>
      Fixes: e7895e42 ("perf probe: Split del_perf_probe_events()")
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/174963c587ae77fa108af794669998e4ae558338.1626343282.git.rickyman7@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e0fa7ab4
  8. 17 Jul, 2021 5 commits
    • Linus Torvalds's avatar
      Merge tag 'soc-fixes-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 1d67c8d9
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "Here are the patches for this week that came as the fallout of the
        merge window:
      
         - Two fixes for the NVidia memory controller driver
      
         - multiple defconfig files get patched to turn CONFIG_FB back on
           after that is no longer selected by CONFIG_DRM
      
         - ffa and scmpi firmware drivers fixes, mostly addressing compiler
           and documentation warnings
      
         - Platform specific fixes for device tree files on ASpeed, Renesas
           and NVidia SoC, mostly for recent regressions.
      
         - A workaround for a regression on the USB PHY with devlink when the
           usb-nop-xceiv driver is not available until the rootfs is mounted.
      
         - Device tree compiler warnings in Arm Versatile-AB"
      
      * tag 'soc-fixes-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (35 commits)
        ARM: dts: versatile: Fix up interrupt controller node names
        ARM: multi_v7_defconfig: Make NOP_USB_XCEIV driver built-in
        ARM: configs: Update u8500_defconfig
        ARM: configs: Update Vexpress defconfig
        ARM: configs: Update Versatile defconfig
        ARM: configs: Update RealView defconfig
        ARM: configs: Update Integrator defconfig
        arm: Typo s/PCI_IXP4XX_LEGACY/IXP4XX_PCI_LEGACY/
        firmware: arm_scmi: Fix range check for the maximum number of pending messages
        firmware: arm_scmi: Avoid padding in sensor message structure
        firmware: arm_scmi: Fix kernel doc warnings about return values
        firmware: arm_scpi: Fix kernel doc warnings
        firmware: arm_scmi: Fix kernel doc warnings
        ARM: shmobile: defconfig: Restore graphical consoles
        firmware: arm_ffa: Fix a possible ffa_linux_errmap buffer overflow
        firmware: arm_ffa: Fix the comment style
        firmware: arm_ffa: Simplify probe function
        firmware: arm_ffa: Ensure drivers provide a probe function
        firmware: arm_scmi: Fix possible scmi_linux_errmap buffer overflow
        firmware: arm_scmi: Ensure drivers provide a probe function
        ...
      1d67c8d9
    • Linus Torvalds's avatar
      Revert "mm/slub: use stackdepot to save stack trace in objects" · ae14c63a
      Linus Torvalds authored
      This reverts commit 78869146.
      
      It's not clear why, but it causes unexplained problems in entirely
      unrelated xfs code.  The most likely explanation is some slab
      corruption, possibly triggered due to CONFIG_SLUB_DEBUG_ON.  See [1].
      
      It ends up having a few other problems too, like build errors on
      arch/arc, and Geert reporting it using much more memory on m68k [3] (it
      probably does so elsewhere too, but it is probably just more noticeable
      on m68k).
      
      The architecture issues (both build and memory use) are likely just
      because this change effectively force-enabled STACKDEPOT (along with a
      very bad default value for the stackdepot hash size).  But together with
      the xfs issue, this all smells like "this commit was not ready" to me.
      
      Link: https://lore.kernel.org/linux-xfs/YPE3l82acwgI2OiV@infradead.org/ [1]
      Link: https://lore.kernel.org/lkml/202107150600.LkGNb4Vb-lkp@intel.com/ [2]
      Link: https://lore.kernel.org/lkml/CAMuHMdW=eoVzM1Re5FVoEN87nKfiLmM2+Ah7eNu2KXEhCvbZyA@mail.gmail.com/ [3]
      Reported-by: default avatarChristoph Hellwig <hch@infradead.org>
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Reported-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ae14c63a
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 5d766d55
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "One core fix for an oops which can occur if the error handling thread
        fails to start for some reason and the driver is removed.
      
        The other fixes are all minor ones in drivers"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufs: core: Add missing host_lock in ufshcd_vops_setup_xfer_req()
        scsi: mpi3mr: Fix W=1 compilation warnings
        scsi: pm8001: Clean up kernel-doc and comments
        scsi: zfcp: Report port fc_security as unknown early during remote cable pull
        scsi: core: Fix bad pointer dereference when ehandler kthread is invalid
        scsi: fas216: Fix a build error
        scsi: core: Fix the documentation of the scsi_execute() time parameter
      5d766d55
    • Linus Torvalds's avatar
      Merge tag '5.14-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 44cb60b4
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Eight cifs/smb3 fixes, including three for stable.
      
        Three are DFS related fixes, and two to fix problems pointed out by
        static checkers"
      
      * tag '5.14-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: do not share tcp sessions of dfs connections
        SMB3.1.1: fix mount failure to some servers when compression enabled
        cifs: added WARN_ON for all the count decrements
        cifs: fix missing null session check in mount
        cifs: handle reconnect of tcon when there is no cached dfs referral
        cifs: fix the out of range assignment to bit fields in parse_server_interfaces
        cifs: Do not use the original cruid when following DFS links for multiuser mounts
        cifs: use the expiry output of dns_query to schedule next resolution
      44cb60b4
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-kunit-fixes-5.14-rc2' of... · ccbb22b9
      Linus Torvalds authored
      Merge tag 'linux-kselftest-kunit-fixes-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kunit fixes from Shuah Khan:
       "Fixes to kunit tool and documentation:
      
         - fix asserts on older python versions
      
         - fixes to misleading error messages when TAP header format is
           incorrect or when file is missing
      
         - documentation fix: drop obsolete information about uml_abort
           coverage
      
         - remove unnecessary annotations"
      
      * tag 'linux-kselftest-kunit-fixes-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        kunit: tool: Assert the version requirement
        kunit: tool: remove unnecessary "annotations" import
        Documentation: kunit: drop obsolete note about uml_abort for coverage
        kunit: tool: Fix error messages for cases of no tests and wrong TAP header
      ccbb22b9