1. 06 Dec, 2022 11 commits
  2. 05 Dec, 2022 2 commits
  3. 02 Dec, 2022 2 commits
  4. 01 Dec, 2022 1 commit
  5. 29 Nov, 2022 3 commits
  6. 23 Nov, 2022 8 commits
  7. 16 Nov, 2022 4 commits
  8. 10 Nov, 2022 1 commit
      s390: select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP · 00a34d5a
      Gerald Schaefer authored
      Enable HUGETLB_PAGE_OPTIMIZE_VMEMMAP for s390.
      
      With this, vmemmap pages used to back struct pages for compound tail
      pages of hugetlb pages are freed and remapped to compound head page
      frame as RO, see also Documentation/vm/vmemmap_dedup.rst.
      
      For 1M hugetlb pages, this results in freeing 3 of 4 vmemmap pages,
      saving 12K of memory for each 1M hugetlb page (~1.2%).
      /sys/kernel/debug/kernel_page_tables will show the impact:
      
      ---[ vmemmap Area Start ]---
      [...]
      0x0000037202d84000-0x0000037202d85000         4K PTE RW NX
      0x0000037202d85000-0x0000037202d88000        12K PTE RO NX
      
      For 2G hugetlb pages, this results in freeing 8191 of 8192 vmemmap
      pages, saving 32764K of memory for each 2G hugetlb page (~1.6%).
      /sys/kernel/debug/kernel_page_tables will show the impact:
      
      ---[ vmemmap Area Start ]---
      [...]
      0x000003720a000000-0x000003720a001000         4K PTE RW NX
      0x000003720a001000-0x000003720c000000     32764K PTE RO NX
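
The figures above can be reproduced with a few lines of shell arithmetic. This is only a sketch: it assumes 4K base pages and a 64-byte struct page (neither is stated in the commit message, but both are consistent with the "3 of 4" and "8191 of 8192" numbers), and the function name `vmemmap_saved_kib` is made up for illustration.

```shell
# Assumptions (not from the commit message): 4 KiB base pages and a
# 64-byte struct page. One vmemmap page per hugetlb page is kept,
# the remaining ones are freed.
vmemmap_saved_kib() {
    hugepage_kib=$1
    base_page_kib=4
    struct_page_bytes=64
    # Number of base pages (and therefore struct pages) per hugetlb page.
    nr_base_pages=$((hugepage_kib / base_page_kib))
    # Number of vmemmap pages needed to hold those struct pages.
    vmemmap_pages=$((nr_base_pages * struct_page_bytes / (base_page_kib * 1024)))
    # All but one vmemmap page are freed; report the savings in KiB.
    echo $(((vmemmap_pages - 1) * base_page_kib))
}

vmemmap_saved_kib 1024       # 1M hugetlb page
vmemmap_saved_kib 2097152    # 2G hugetlb page
```

Under these assumptions the two calls print 12 and 32764, matching the sizes of the RO ranges in the page table dumps.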
      
      The memory savings come with some costs:
      - vmemmap mapping for compound hugetlb pages is not a PMD mapping any
        more, but split to 4K PTE mappings, and it will not be coalesced back
        to PMD mapping after freeing hugetlb pages from the pool.
        Apart from the theoretical performance impact, this will also
        (slightly) reduce the memory savings because of additional 2K PTE
        pagetable allocations.
      - Workloads using "on the fly" hugetlb allocations via
        "nr_overcommit_hugepages" instead of using the hugetlb pool via
        "nr_hugepages" will suffer from considerably increased fault handling
        time, see also description from commit 78f39084
        ("mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl").
      - Freeing hugetlb pages from the pool will require re-allocation of the
        freed struct pages, and therefore needs some memory available to the
        kernel. This might fail in memory constrained scenarios.
      - For the same reason, memory offline might fail even for ZONE_MOVABLE
        when hugetlb pages are present (but not for s390, since we do not
        support ARCH_ENABLE_HUGEPAGE_MIGRATION, and therefore cannot have
        hugetlb pages in ZONE_MOVABLE).
      - General increased complexity and overhead in kernel handling of
        compound (head) pages.
      
      Therefore, this feature is disabled by default, and has to be enabled
      explicitly either by adding "hugetlb_free_vmemmap=on" kernel parameter,
      or during run-time via "/proc/sys/vm/hugetlb_optimize_vmemmap" sysctl.
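
The run-time switch mentioned above could be inspected roughly as follows. This is a minimal sketch: the sysctl path is taken from the text, the helper name `hvo_status` is hypothetical, and the assumption that writing 1 enables and 0 disables the feature should be checked against the kernel's sysctl documentation.

```shell
# Path as given in the commit message.
HVO_SYSCTL=/proc/sys/vm/hugetlb_optimize_vmemmap

# Hypothetical helper: print the current setting, or "unavailable" when
# the running kernel does not expose the sysctl.
hvo_status() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo "unavailable"
    fi
}

hvo_status "$HVO_SYSCTL"

# To enable at run-time (as root; assumption: 1 = on, 0 = off):
#   echo 1 > "$HVO_SYSCTL"
```

Since the feature is off by default, the helper is expected to report 0 (or "unavailable") on a system where neither the boot parameter nor the sysctl has been set.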
      Acked-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
  9. 26 Oct, 2022 6 commits
      s390/pai: rename structure member users to active_events · 58354c7d
      Thomas Richter authored
      Rename structure member users to active_events to make it consistent
      with PMU pai_ext. Also use the same prefix syntax for increment and
      decrement operators in both PMUs.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      s390/pai: rework pai_crypto mapped buffer reference count · d3db4ac3
      Thomas Richter authored
      Rework the mapped buffer reference count in PMU pai_crypto
      to match the same technique as in PMU pai_ext.
      This simplifies the logic.
      Do not count the individual number of counter and sampling
      processes. Remember the type of access and the total number of
      references to the buffer.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      s390/pai: move enum definition to header file · 4c787963
      Thomas Richter authored
      Move the enum definition to a header file. This is done in preparation
      for a follow-on patch where this enum will be used in another source
      file.
      Also change the enum name from paiext_mode to paievt_mode
      to indicate this enum is now used for several events.
      Make naming consistent and rename PAI_MODE_COUNTER to PAI_MODE_COUNTING.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      s390/con3215: Fix white space errors · 55af33fd
      Thomas Richter authored
      Adjust white space according to coding guidelines.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      s390/con3215: Drop console data printout when buffer full · 1f3307cf
      Thomas Richter authored
      Using z/VM the 3270 terminal emulator also emulates an IBM 3215 console
      which outputs line by line. When the screen is full, the console enters
      the MORE... state and waits for the operator to confirm the data
      on the screen by pressing a clear key. If this does not happen in the
      default time frame (currently 50 seconds) the console enters the HOLDING
      state.
      It then waits another time frame (currently 10 seconds) before the output
      continues on the next screen. When the operator presses the clear key
      during these wait times, the output continues immediately.
      
      This may lead to a very long boot time when the console has to print
      many messages. The system may also hang, because the console's buffer
      space is limited and the system waits for the console output to drain.
      This problem can only occur when a terminal emulator is actually
      connected to the 3215 console driver. If not, z/VM simply drops the
      console output.
      
      Remedy this rare situation and add a kernel boot command line parameter
      con3215_drop. It can be set to 0 (do not drop) or 1 (do drop), which is
      the default. This instructs the kernel to drop console data when the
      console buffer is full. This speeds up the boot time considerably and
      also prevents the system from hanging.
      
      Add a sysfs attribute file for console IBM 3215 named con_drop.
      This allows for changing the behavior after boot, for example when
      a panic/crash is expected during interactive debugging.
      
      Here is a test of the new behavior using the following test program:
       #!/bin/bash
       declare -i cnt=4
      
       mode=$(cat /sys/bus/ccw/drivers/3215/con_drop)
       [ "$mode" = yes ] && cnt=25
      
       echo "cons_drop $(cat /sys/bus/ccw/drivers/3215/con_drop)"
       echo "vmcp term more 5 2"
       vmcp term more 5 2
       echo "Run $cnt iterations of "'echo t > /proc/sysrq-trigger'
      
       for i in $(seq $cnt)
       do
      	echo "$i. command 'echo t > /proc/sysrq-trigger' at $(date +%F,%T)"
      	echo t > /proc/sysrq-trigger
      	sleep 1
       done
       echo "droptest done" > /dev/kmsg
       #
      
      Output with sysfs attribute con_drop set to 1:
       # ./droptest.sh
       cons_drop yes
       vmcp term more 5 2
       Run 25 iterations of echo t > /proc/sysrq-trigger
       1. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:09
       2. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:10
       3. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:11
       4. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:12
       5. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:13
       6. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:14
       7. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:15
       8. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:16
       9. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:17
       10. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:18
       11. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:19
       12. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:20
       13. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:21
       14. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:22
       15. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:23
       16. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:24
       17. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:25
       18. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:26
       19. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:27
       20. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:28
       21. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:29
       22. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:30
       23. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:31
       24. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:32
       25. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:33
       #
      
      There are no hangs anymore.
      
      Output with sysfs attribute con_drop set to 0 and identical
      setting for z/VM console 'term more 5 2'. The clear key was
      occasionally pressed at the x3270 console to let the output progress.
      
       # ./droptest.sh
       cons_drop no
       vmcp term more 5 2
       Run 4 iterations of echo t > /proc/sysrq-trigger
       1. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:20:58
       2. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:24:32
       3. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:28:04
       4. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:31:37
       #
      
      Details:
      Enable function raw3215_write() to handle tab expansion and newlines
      and feed it with input not larger than the console buffer of 65536
      bytes. Function raw3215_putchar() just forwards its character for
      output to raw3215_write().
      
      This moves tab to blank conversion to one function, raw3215_write(),
      which also calls raw3215_make_room() to wait for enough free
      buffer space.
      
      Function handle_write() loops over all its input and segments input
      into chunks of console buffer size (should the input be larger).
      
      Rework tab expansion handling logic to avoid code duplication.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Acked-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      s390/con3215: Simplify console write operation · 655ae931
      Thomas Richter authored
      The functions con3215_write() and tty3215_write() have nearly
      identical function bodies and a slightly different function prototype.
      Create function handle_write() to handle the common function
      body and maintain the function prototypes.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
  10. 23 Oct, 2022 2 commits
      Linux 6.1-rc2 · 247f34f7
      Linus Torvalds authored
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 05b4ebd2
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "RISC-V:
      
         - Fix compilation without RISCV_ISA_ZICBOM
      
         - Fix kvm_riscv_vcpu_timer_pending() for Sstc
      
        ARM:
      
         - Fix a bug preventing restoring an ITS containing mappings for very
           large and very sparse device topology
      
         - Work around a relocation handling error when compiling the nVHE
           object with profile optimisation
      
         - Fix for stage-2 invalidation holding the VM MMU lock for too long
           by limiting the walk to the largest block mapping size
      
         - Enable stack protection and branch profiling for VHE
      
         - Two selftest fixes
      
        x86:
      
         - add compat implementation for KVM_X86_SET_MSR_FILTER ioctl
      
        selftests:
      
         - synchronize includes between include/uapi and tools/include/uapi"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        tools: include: sync include/api/linux/kvm.h
        KVM: x86: Add compat handler for KVM_X86_SET_MSR_FILTER
        KVM: x86: Copy filter arg outside kvm_vm_ioctl_set_msr_filter()
        kvm: Add support for arch compat vm ioctls
        RISC-V: KVM: Fix kvm_riscv_vcpu_timer_pending() for Sstc
        RISC-V: Fix compilation without RISCV_ISA_ZICBOM
        KVM: arm64: vgic: Fix exit condition in scan_its_table()
        KVM: arm64: nvhe: Fix build with profile optimization
        KVM: selftests: Fix number of pages for memory slot in memslot_modification_stress_test
        KVM: arm64: selftests: Fix multiple versions of GIC creation
        KVM: arm64: Enable stack protection and branch profiling for VHE
        KVM: arm64: Limit stage2_apply_range() batch size to largest block
        KVM: arm64: Work out supported block level at compile time