1. 17 Jan, 2020 3 commits
    • perf/x86/amd: Add support for Large Increment per Cycle Events · 57388912
      Kim Phillips authored
      Description of hardware operation
      ---------------------------------
      
      The core AMD PMU has a 4-bit wide per-cycle increment for each
      performance monitor counter.  That works for most events, but
      now with AMD Family 17h and above processors, some events can
      occur more than 15 times in a cycle.  Those events are called
      "Large Increment per Cycle" events. In order to count these
      events, two adjacent h/w PMCs get their count signals merged
      to form 8 bits per cycle total.  In addition, the PERF_CTR count
      registers are merged to be able to count up to 64 bits.
      
      Normally, events like instructions retired get programmed on a single
      counter like so:
      
      PERF_CTL0 (MSR 0xc0010200) 0x000000000053ff0c # event 0x0c, umask 0xff
      PERF_CTR0 (MSR 0xc0010201) 0x0000800000000001 # r/w 48-bit count
      
      The next counter at MSRs 0xc0010202-3 remains unused, or can be used
      independently to count something else.
      
      When counting Large Increment per Cycle events, such as FLOPs,
      however, we now have to reserve the next counter and program the
      PERF_CTL (config) register with the Merge event (0xFFF), like so:
      
      PERF_CTL0 (msr 0xc0010200) 0x000000000053ff03 # FLOPs event, umask 0xff
      PERF_CTR0 (msr 0xc0010201) 0x0000800000000001 # rd 64-bit cnt, wr lo 48b
      PERF_CTL1 (msr 0xc0010202) 0x0000000f004000ff # Merge event, enable bit
      PERF_CTR1 (msr 0xc0010203) 0x0000000000000000 # wr hi 16-bits count
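
      The PERF_CTL values above can be picked apart with a few shifts. The
      sketch below is illustrative only: it assumes the usual AMD core
      PERF_CTL layout (event select in bits 7:0 plus bits 35:32, unit mask
      in bits 15:8, enable in bit 22), and the helper name decode_perf_ctl
      is hypothetical, not something taken from the patch.

```python
def decode_perf_ctl(value):
    """Split a raw PERF_CTL value into event, umask and enable (assumed layout)."""
    event_low = value & 0xFF            # EventSelect[7:0]
    event_high = (value >> 32) & 0xF    # EventSelect[11:8]
    return {
        "event": (event_high << 8) | event_low,
        "umask": (value >> 8) & 0xFF,
        "enabled": bool((value >> 22) & 1),
    }


# Values copied from the register dump above.
flops = decode_perf_ctl(0x000000000053FF03)
merge = decode_perf_ctl(0x0000000F004000FF)

print(hex(flops["event"]), hex(flops["umask"]), flops["enabled"])  # 0x3 0xff True
print(hex(merge["event"]), hex(merge["umask"]), merge["enabled"])  # 0xfff 0x0 True
```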
      
      The count is widened from the normal 48 bits to 64 bits by having the
      second counter carry the higher 16 bits of the count in the lower 16
      bits of its counter register.
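
      As a rough illustration of that split, the sketch below divides a
      64-bit count into the low 48 bits (held by the even counter) and the
      high 16 bits (held in the low bits of the odd counter), then joins
      them again. The function names are hypothetical and this is plain
      arithmetic, not the kernel's code.

```python
LOW_BITS = 48
LOW_MASK = (1 << LOW_BITS) - 1
HIGH_MASK = 0xFFFF


def split_count(count64):
    """Return (low 48 bits for the even counter, high 16 bits for the odd counter)."""
    return count64 & LOW_MASK, (count64 >> LOW_BITS) & HIGH_MASK


def merge_count(lo48, hi16):
    """Rebuild the 64-bit count from the two per-counter pieces."""
    return ((hi16 & HIGH_MASK) << LOW_BITS) | (lo48 & LOW_MASK)


lo, hi = split_count(0x1234800000000001)
print(hex(lo), hex(hi))            # 0x800000000001 0x1234
print(hex(merge_count(lo, hi)))    # 0x1234800000000001
```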
      
      The odd counter, e.g., PERF_CTL1, is programmed with the enabled Merge
      event before the even counter, PERF_CTL0.
      
      The Large Increment feature is available starting with Family 17h.
      For more details, search any Family 17h PPR for the "Large Increment
      per Cycle Events" section, e.g., section 2.1.15.3 on p. 173 in this
      version:
      
      https://www.amd.com/system/files/TechDocs/56176_ppr_Family_17h_Model_71h_B0_pub_Rev_3.06.zip
      
      Description of software operation
      ---------------------------------
      
      The following steps are taken in order to support reserving and
      enabling the extra counter for Large Increment per Cycle events:
      
      1. In the main x86 scheduler, we reduce the number of available
      counters by the number of Large Increment per Cycle events being
      scheduled, tracked by a new cpuc variable 'n_pair' and a new
      amd_put_event_constraints_f17h().  This improves the counter
      scheduler success rate.
      
      2. In perf_assign_events(), if a counter is assigned to a Large
      Increment event, we increment the current counter variable, so the
      counter used for the Merge event is removed from assignment
      consideration by upcoming event assignments.
      
      3. In find_counter(), if a counter has been found for the Large
      Increment event, we set the next counter as used, to prevent other
      events from using it.
      
      4. We perform steps 2 & 3 also in the x86 scheduler fastpath, i.e.,
      we add Merge event accounting to the existing used_mask logic.
      
      5. Finally, we add on the programming of Merge event to the
      neighbouring PMC counters in the counter enable/disable{_all}
      code paths.
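
      The reservation idea in steps 2 and 3 can be sketched with a toy
      allocator. This is not the kernel's perf_assign_events() or its
      constraint solver: it is a simple in-order greedy pass, the function
      name assign_counters is hypothetical, and it assumes six counters
      with Large Increment events restricted to even counters.

```python
def assign_counters(events, num_counters=6):
    """Greedy toy allocator.

    events: list of (name, is_large_increment) tuples.
    Returns {name: counter index}, or None if some event cannot be placed.
    """
    used = [False] * num_counters
    assignment = {}
    for name, is_large in events:
        step = 2 if is_large else 1      # large events only try even counters
        for idx in range(0, num_counters, step):
            if used[idx]:
                continue
            if is_large and (idx + 1 >= num_counters or used[idx + 1]):
                continue                 # the odd neighbour must be free for Merge
            used[idx] = True
            if is_large:
                used[idx + 1] = True     # reserve the neighbour for the Merge event
            assignment[name] = idx
            break
        else:
            return None                  # no counter left for this event
    return assignment


print(assign_counters([("flops", True), ("instructions", False)]))
# {'flops': 0, 'instructions': 2}
```

      With six counters, at most three Large Increment events fit; a fourth
      makes the toy allocator return None.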
      
      Currently, software does not support a single PMU with mixed 48- and
      64-bit counting, so Large Increment event counts are limited to 48
      bits.  In set_period, we zero-out the upper 16 bits of the count, so
      the hardware doesn't copy them to the even counter's higher bits.
      
      Simple invocation example showing counting 8 FLOPs per 256-bit/%ymm
      vaddps instruction executed in a loop 100 million times:
      
      perf stat -e cpu/fp_ret_sse_avx_ops.all/,cpu/instructions/ <workload>
      
       Performance counter stats for '<workload>':
      
             800,000,000      cpu/fp_ret_sse_avx_ops.all/u
             300,042,101      cpu/instructions/u
      
      Prior to this patch, the reported SSE/AVX FLOPs retired count would
      be wrong.
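
      The expected FLOPs figure in the example above is simple arithmetic,
      reproduced here as a sanity check. It assumes single-precision
      (32-bit) lanes, which is what vaddps operates on; the constant names
      are only for illustration.

```python
YMM_BITS = 256            # width of a %ymm register
FLOAT_BITS = 32           # vaddps adds packed single-precision floats
ITERATIONS = 100_000_000  # loop count from the example

flops_per_vaddps = YMM_BITS // FLOAT_BITS
expected_flops = flops_per_vaddps * ITERATIONS

print(flops_per_vaddps)        # 8
print(f"{expected_flops:,}")   # 800,000,000
```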
      
      [peterz: lots of renames and edits to the code]
      Signed-off-by: Kim Phillips <kim.phillips@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    • perf/x86/amd: Constrain Large Increment per Cycle events · 471af006
      Kim Phillips authored
      AMD Family 17h processors and above gain support for Large Increment
      per Cycle events.  Unfortunately there is no CPUID or equivalent bit
      that indicates whether the feature exists or not, so we continue to
      determine eligibility based on a CPU family number comparison.
      
      For Large Increment per Cycle events, we add a f17h-and-compatibles
      get_event_constraints_f17h() that returns an even counter bitmask:
      Large Increment per Cycle events can only be placed on PMCs 0, 2,
      and 4 out of the currently available 0-5.  The only currently
      public event that requires this feature to report valid counts
      is PMCx003 "Retired SSE/AVX Operations".
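
      For illustration, an even-counter bitmask over PMCs 0-5 can be built
      as below. The helper name is hypothetical and the sketch only shows
      how the bits line up with PMCs 0, 2, and 4; it is not the constraint
      definition from the patch.

```python
def even_counter_mask(num_counters):
    """Set one bit for every even-numbered counter index."""
    mask = 0
    for idx in range(0, num_counters, 2):
        mask |= 1 << idx
    return mask


mask = even_counter_mask(6)
print(bin(mask), hex(mask))  # 0b10101 0x15
print([idx for idx in range(6) if mask & (1 << idx)])  # [0, 2, 4]
```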
      
      Note that the CPU family logic in amd_core_pmu_init() is changed
      so as to be able to selectively add initialization for features
      available in ranges of backward-compatible CPU families.  This
      Large Increment per Cycle feature is expected to be retained
      in future families.
      
      A side-effect of assigning a new get_constraints function for f17h is
      that it disables calling the old (prior to f15h) amd_get_event_constraints
      implementation left enabled by commit e40ed154 ("perf/x86: Add perf
      support for AMD family-17h processors"), which is no longer
      necessary since those North Bridge event codes are obsoleted.
      
      Also fix a spelling mistake whilst in the area (calulating ->
      calculating).
      
      Fixes: e40ed154 ("perf/x86: Add perf support for AMD family-17h processors")
      Signed-off-by: Kim Phillips <kim.phillips@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20191114183720.19887-2-kim.phillips@amd.com
    • perf/x86/intel/rapl: Add Comet Lake support · 1e0f1772
      Harry Pan authored
      Comet Lake supports the same RAPL counters as Kaby Lake and Skylake.
      After this, on a CML machine the energy counters appear in perf list.
      Signed-off-by: Harry Pan <harry.pan@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20191227171944.1.Id6f3ab98474d7d1dba5b95390b24e0a67368d364@changeid
  2. 10 Jan, 2020 1 commit
    • Merge tag 'perf-core-for-mingo-5.6-20200106' of... · 53f3feeb
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-5.6-20200106' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      perf record:
      
        Alexey Budankov:
      
        - Adapt affinity for machines with #CPUs > 1K to overcome current 1024 CPUs
          mask size limitation of cpu_set_t type.
      
      perf report/top TUI:
      
        Arnaldo Carvalho de Melo:
      
        - Make ENTER consistently present the pop up menu with and without call
          chains, to eliminate confusion. The menu remains available at all times
          via 'm'; '+' can be used to toggle just one call chain level, 'e' for all
          the call chains for a top level histogram entry, and 'E' to expand all
          call chains in all top level entries. Extra info about these options was
          added to the pop up menu entries. Pressing 'k' serves as a special hotkey
          to go straight to the main vmlinux entries, to avoid having to press
          enter and then select "Zoom into the kernel DSO".
      
      perf sched timehist:
      
        David Ahern:
      
        - Add support for filtering on CPU.
      
      perf tests:
      
        Arnaldo Carvalho de Melo:
      
        - Show expected versus obtained values in bp_signal test.
      
      libperf:
      
        Jiri Olsa:
      
        - Move to tools/lib/perf.
      
        - Add man pages.
      
      libapi:
      
        Andrey Zhizhikin:
      
        - Fix gcc9 stringop-truncation compilation error.
      
      tools lib:
      
        Vitaly Chikunov:
      
        - Fix builds when glibc contains strlcpy(), which is the case for ALT Linux.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 06 Jan, 2020 20 commits
  4. 25 Dec, 2019 2 commits
  5. 23 Dec, 2019 3 commits
  6. 22 Dec, 2019 11 commits
    • Merge tag 'xfs-5.5-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · c6017471
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Fix a few bugs that could lead to corrupt files, fsck complaints, and
        filesystem crashes:
      
         - Minor documentation fixes
      
         - Fix a file corruption due to read racing with an insert range
           operation.
      
         - Fix log reservation overflows when allocating large rt extents
      
         - Fix a buffer log item flags check
      
         - Don't allow administrators to mount with sunit= options that will
           cause later xfs_repair complaints about the root directory being
           suspicious because the fs geometry appeared inconsistent
      
         - Fix a non-static helper that should have been static"
      
      * tag 'xfs-5.5-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: Make the symbol 'xfs_rtalloc_log_count' static
        xfs: don't commit sunit/swidth updates to disk if that would cause repair failures
        xfs: split the sunit parameter update into two parts
        xfs: refactor agfl length computation function
        libxfs: resync with the userspace libxfs
        xfs: use bitops interface for buf log item AIL flag check
        xfs: fix log reservation overflows when allocating large rt extents
        xfs: stabilize insert range start boundary to avoid COW writeback race
        xfs: fix Sphinx documentation warning
    • Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · a3965607
      Linus Torvalds authored
      Pull ext4 bug fixes from Ted Ts'o:
       "Ext4 bug fixes, including a regression fix"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: clarify impact of 'commit' mount option
        ext4: fix unused-but-set-variable warning in ext4_add_entry()
        jbd2: fix kernel-doc notation warning
        ext4: use RCU API in debug_print_tree
        ext4: validate the debug_want_extra_isize mount option at parse time
        ext4: reserve revoke credits in __ext4_new_inode
        ext4: unlock on error in ext4_expand_extra_isize()
        ext4: optimize __ext4_check_dir_entry()
        ext4: check for directory entries too close to block end
        ext4: fix ext4_empty_dir() for directories with holes
    • Merge tag 'block-5.5-20191221' of git://git.kernel.dk/linux-block · 44579f35
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Let's try this one again, this time without the compat_ioctl changes.
        We've got those fixed up, but that can go out next week.
      
        This contains:
      
         - block queue flush lockdep annotation (Bart)
      
         - Type fix for bsg_queue_rq() (Bart)
      
         - Three dasd fixes (Stefan, Jan)
      
         - nbd deadlock fix (Mike)
      
         - Error handling bio user map fix (Yang)
      
         - iocost fix (Tejun)
      
         - sbitmap waitqueue addition fix that affects the kyber IO scheduler
           (David)"
      
      * tag 'block-5.5-20191221' of git://git.kernel.dk/linux-block:
        sbitmap: only queue kyber's wait callback if not already active
        block: fix memleak when __blk_rq_map_user_iov() is failed
        s390/dasd: fix typo in copyright statement
        s390/dasd: fix memleak in path handling error case
        s390/dasd/cio: Interpret ccw_device_get_mdc return value correctly
        block: Fix a lockdep complaint triggered by request queue flushing
        block: Fix the type of 'sts' in bsg_queue_rq()
        block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT
        nbd: fix shutdown and recv work deadlock v2
        iocost: over-budget forced IOs should schedule async delay
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · a313c8e0
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "PPC:
         - Fix a bug where we try to do an ultracall on a system without an
           ultravisor
      
        KVM:
         - Fix uninitialised sysreg accessor
         - Fix handling of demand-paged device mappings
         - Stop spamming the console on IMPDEF sysregs
         - Relax mappings of writable memslots
         - Assorted cleanups
      
        MIPS:
         - Now orphan, James Hogan is stepping down
      
        x86:
         - MAINTAINERS change, so long Radim and thanks for all the fish
         - supported CPUID fixes for AMD machines without SPEC_CTRL"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        MAINTAINERS: remove Radim from KVM maintainers
        MAINTAINERS: Orphan KVM for MIPS
        kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD
        kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD
        KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisor
        KVM: arm/arm64: Properly handle faulting of device mappings
        KVM: arm64: Ensure 'params' is initialised when looking up sys register
        KVM: arm/arm64: Remove excessive permission check in kvm_arch_prepare_memory_region
        KVM: arm64: Don't log IMP DEF sysreg traps
        KVM: arm64: Sanely ratelimit sysreg messages
        KVM: arm/arm64: vgic: Use wrapper function to lock/unlock all vcpus in kvm_vgic_create()
        KVM: arm/arm64: vgic: Fix potential double free dist->spis in __kvm_vgic_destroy()
        KVM: arm/arm64: Get rid of unused arg in cpu_init_hyp_mode()
    • Merge tag 'riscv/for-v5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 7214618c
      Linus Torvalds authored
      Pull RISC-V fixes from Paul Walmsley:
       "Several fixes, and one cleanup, for RISC-V.
      
        Fixes:
      
         - Fix an error in a Kconfig file that resulted in an undefined
           Kconfig option "CONFIG_CONFIG_MMU"
      
         - Fix undefined Kconfig option "CONFIG_CONFIG_MMU"
      
         - Fix scratch register clearing in M-mode (affects nommu users)
      
         - Fix a mismerge on my part that broke the build for
           CONFIG_SPARSEMEM_VMEMMAP users
      
        Cleanup:
      
         - Move SiFive L2 cache-related code to drivers/soc, per request"
      
      * tag 'riscv/for-v5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: move sifive_l2_cache.c to drivers/soc
        riscv: define vmemmap before pfn_to_page calls
        riscv: fix scratch register clearing in M-mode.
        riscv: Fix use of undefined config option CONFIG_CONFIG_MMU
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 78bac77b
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Several nf_flow_table_offload fixes from Pablo Neira Ayuso,
          including adding a missing ipv6 match description.
      
       2) Several heap overflow fixes in mwifiex from qize wang and Ganapathi
          Bhat.
      
       3) Fix uninit value in bond_neigh_init(), from Eric Dumazet.
      
       4) Fix non-ACPI probing of nxp-nci, from Stephan Gerhold.
      
       5) Fix use after free in tipc_disc_rcv(), from Tuong Lien.
      
       6) Enforce limit of 33 tail calls in mips and riscv JIT, from Paul
          Chaignon.
      
       7) Multicast MAC limit test is off by one in qede, from Manish Chopra.
      
       8) Fix established socket lookup race when socket goes from
          TCP_ESTABLISHED to TCP_LISTEN, because there is no intervening
          RCU grace period. From Eric Dumazet.
      
       9) Don't send empty SKBs from tcp_write_xmit(), also from Eric Dumazet.
      
      10) Fix active backup transition after link failure in bonding, from
          Mahesh Bandewar.
      
      11) Avoid zero sized hash table in gtp driver, from Taehee Yoo.
      
      12) Fix wrong interface passed to ->mac_link_up(), from Russell King.
      
      13) Fix DSA egress flooding settings in b53, from Florian Fainelli.
      
      14) Memory leak in gmac_setup_txqs(), from Navid Emamdoost.
      
      15) Fix double free in dpaa2-ptp code, from Ioana Ciornei.
      
      16) Reject invalid MTU values in stmmac, from Jose Abreu.
      
      17) Fix refcount leak in error path of u32 classifier, from Davide
          Caratti.
      
      18) Fix regression causing iwlwifi firmware crashes on boot, from Anders
          Kaseorg.
      
      19) Fix inverted return value logic in llc2 code, from Chan Shu Tak.
      
      20) Disable hardware GRO when XDP is attached to qede, from Manish
          Chopra.
      
      21) Since we encode state in the low pointer bits, dst metrics must be
          at least 4 byte aligned, which is not necessarily true on m68k. Add
          annotations to fix this, from Geert Uytterhoeven.
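
      Item 21 relies on a general pointer-tagging trick, sketched below
      with plain integers standing in for addresses. The two-bit flag mask
      and the helper names are assumptions for illustration, not the
      kernel's dst_metrics definitions.

```python
FLAG_MASK = 0x3  # two low bits reserved for state (assumed for this sketch)


def pack(addr, flags):
    """Store flags in the low bits of an address-like integer."""
    return addr | (flags & FLAG_MASK)


def unpack(word):
    """Recover (address, flags) by masking the low bits."""
    return word & ~FLAG_MASK, word & FLAG_MASK


# A 4-byte-aligned address has both low bits clear, so the round trip is exact.
print([hex(v) for v in unpack(pack(0x1000, 0x1))])  # ['0x1000', '0x1']

# A 2-byte-aligned address already uses bit 1, so address and flags both come back wrong.
print([hex(v) for v in unpack(pack(0x1002, 0x1))])  # ['0x1000', '0x3']
```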
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (160 commits)
        sfc: Include XDP packet headroom in buffer step size.
        sfc: fix channel allocation with brute force
        net: dst: Force 4-byte alignment of dst_metrics
        selftests: pmtu: fix init mtu value in description
        hv_netvsc: Fix unwanted rx_table reset
        net: phy: ensure that phy IDs are correctly typed
        mod_devicetable: fix PHY module format
        qede: Disable hardware gro when xdp prog is installed
        net: ena: fix issues in setting interrupt moderation params in ethtool
        net: ena: fix default tx interrupt moderation interval
        net/smc: unregister ib devices in reboot_event
        net: stmmac: platform: Fix MDIO init for platforms without PHY
        llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c)
        net: hisilicon: Fix a BUG trigered by wrong bytes_compl
        net: dsa: ksz: use common define for tag len
        s390/qeth: don't return -ENOTSUPP to userspace
        s390/qeth: fix promiscuous mode after reset
        s390/qeth: handle error due to unsupported transport mode
        cxgb4: fix refcount init for TC-MQPRIO offload
        tc-testing: initial tdc selftests for cls_u32
        ...
    • pipe: fix empty pipe check in pipe_write() · 0dd1e377
      Jan Stancek authored
      LTP pipeio_1 test is hanging with v5.5-rc2-385-gb8e382a1,
      with read side observing empty pipe and sleeping and write
      side running out of space and then sleeping as well. In this
      scenario there are 5 writers and 1 reader.
      
      Problem is that after pipe_write() reacquires pipe lock, it
      re-checks for empty pipe with potentially stale 'head' and
      doesn't wake up read side anymore. pipe->tail can advance
      beyond 'head', because there are multiple writers.
      
      Use pipe->head for empty pipe check after reacquiring lock
      to observe current state.
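
      A toy model of the stale check follows. It ignores locking, the ring
      size, and wakeups entirely; the Pipe class and the variable names
      are hypothetical, and pipe_empty() here simply compares the two
      indexes.

```python
from dataclasses import dataclass


@dataclass
class Pipe:
    head: int = 0  # next slot a writer fills
    tail: int = 0  # next slot the reader drains


def pipe_empty(head, tail):
    return head == tail


pipe = Pipe(head=4, tail=0)
stale_head = pipe.head  # this writer snapshots head, then sleeps without the lock

# While it sleeps, other writers add two buffers and the reader drains everything.
pipe.head += 2
pipe.tail = pipe.head

stale_check = pipe_empty(stale_head, pipe.tail)  # compares 4 with 6
fresh_check = pipe_empty(pipe.head, pipe.tail)   # compares 6 with 6
print(stale_check, fresh_check)  # False True
```

      With the stale snapshot the writer concludes the pipe was not empty
      and skips waking the reader, even though the pipe is in fact empty.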
      
      Testing: With patch, LTP pipeio_1 ran successfully in a loop for 1 hour.
               Without patch it hung within a minute.
      
      Fixes: 1b6b26ae ("pipe: fix and clarify pipe write wakeup logic")
      Reported-by: Rachel Sibley <rasibley@redhat.com>
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'kvm-ppc-fixes-5.5-1' of... · d68321de
      Paolo Bonzini authored
      Merge tag 'kvm-ppc-fixes-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master
      
      PPC KVM fix for 5.5
      
      - Fix a bug where we try to do an ultracall on a system without an
        ultravisor.
    • MAINTAINERS: remove Radim from KVM maintainers · 19a049f1
      Paolo Bonzini authored
      Radim's kernel.org email is bouncing, which I take as a signal that
      he is not really able to deal with KVM at this time.  Make MAINTAINERS
      match the effective value of KVM's bus factor.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MAINTAINERS: Orphan KVM for MIPS · 088e11d4
      James Hogan authored
      I haven't been active for 18 months, and don't have the hardware set up
      to test KVM for MIPS, so mark it as orphaned and remove myself as
      maintainer. Hopefully somebody from MIPS can pick this up.
      Signed-off-by: James Hogan <jhogan@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Paul Burton <paulburton@kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • ext4: clarify impact of 'commit' mount option · 23f6b024
      Jan Kara authored
      The description of 'commit' mount option dates back to ext3 times.
      Update the description to match current meaning for ext4.
      Reported-by: Paul Richards <paul.richards@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20191218111210.14161-1-jack@suse.cz
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>