1. 30 Nov, 2020 2 commits
  2. 23 Nov, 2020 6 commits
    • s390/vdso: reimplement getcpu vdso syscall · 80f06306
      Heiko Carstens authored
      Implement the previously removed getcpu vdso syscall by using the
      TOD programmable field to pass the cpu number to user space.
      Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
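A minimal user-space sketch of how a CPU number could be read back from the TOD programmable field. It assumes the 16-byte result layout of STORE CLOCK EXTENDED, in which the last two bytes hold the programmable field, and it assumes the kernel has stored the CPU number there. The buffer is filled by hand rather than by the instruction, and `cpu_from_stcke_buf` is a hypothetical helper name, not the kernel's vdso code.

```c
#include <stdint.h>

/* Size of a STORE CLOCK EXTENDED result, in bytes. */
#define STCKE_BUF_SIZE 16

/*
 * Hypothetical helper: extract the 16-bit TOD programmable field from
 * bytes 14 and 15 of a STCKE result buffer (big-endian byte order).
 * Assumption: the kernel placed the CPU number in that field.
 */
static inline uint16_t cpu_from_stcke_buf(const uint8_t buf[STCKE_BUF_SIZE])
{
	return (uint16_t)((buf[14] << 8) | buf[15]);
}
```

Reading the field byte by byte keeps the sketch independent of the host's endianness.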
    • s390/mm: add debug user asce support · 062e5279
      Heiko Carstens authored
      Verify on every exit to user space that
      - the primary ASCE (cr1) is set to the kernel ASCE
      - the secondary ASCE (cr7) is set to the user ASCE

      If this is not the case, panic, since something went terribly wrong.
      Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
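The check described in the debug user asce commit can be modelled in plain C as follows. This is only an illustration under stated assumptions: `struct ctl_regs_model` and `exit_asce_state_ok` are hypothetical names, and the real kernel code reads the control registers directly instead of a struct.

```c
#include <stdbool.h>

/* Simplified stand-in for the two control registers the check inspects. */
struct ctl_regs_model {
	unsigned long cr1;	/* primary ASCE */
	unsigned long cr7;	/* secondary ASCE */
};

/*
 * Returns true when the registers hold the values the commit expects on
 * exit to user space: cr1 == kernel ASCE and cr7 == user ASCE. A caller
 * modelling the commit's behaviour would panic when this returns false.
 */
static inline bool exit_asce_state_ok(const struct ctl_regs_model *regs,
				      unsigned long kernel_asce,
				      unsigned long user_asce)
{
	return regs->cr1 == kernel_asce && regs->cr7 == user_asce;
}
```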
    • s390/mm: use invalid asce instead of kernel asce · 0290c9e3
      Heiko Carstens authored
      Create a region 3 page table which contains only invalid entries, and
      use it via "s390_invalid_asce" instead of the kernel ASCE
      - whenever no user address space is available, e.g. during early startup
      - as an intermediate ASCE when address spaces are switched

      This guarantees that user space accesses in such situations fail.
      Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
      Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/mm: remove set_fs / rework address space handling · 87d59863
      Heiko Carstens authored
      Remove set_fs support from s390. While doing so, rework and simplify
      address space handling. As a result, address spaces are now set up
      like this:
      
      CPU running in              | %cr1 ASCE | %cr7 ASCE | %cr13 ASCE
      ----------------------------|-----------|-----------|-----------
      user space                  |  user     |  user     |  kernel
      kernel, normal execution    |  kernel   |  user     |  kernel
      kernel, kvm guest execution |  gmap     |  user     |  kernel
      
      To achieve this the getcpu vdso syscall is removed in order to avoid
      secondary address mode and a separate vdso address space for user
      space. The getcpu vdso syscall will be reimplemented differently in a
      subsequent patch.
      
      The kernel accesses user space always via secondary address space.
      This happens in different ways:
      - with mvcos in home space mode and directly read/write to secondary
        address space
      - with mvcs/mvcp in primary space mode and copy from primary space to
        secondary space or vice versa
      - with e.g. cs in secondary space mode and access secondary space
      
      Switching translation modes happens with sacf before and after
      instructions which access user space, like before.
      
      Lazy handling of control register reloading is removed in the hope of
      making everything simpler, but at the cost of making kernel entry and
      exit a bit slower. That is: on kernel entry the primary asce is always
      changed to contain the kernel asce, and on kernel exit the primary
      asce is changed again so it contains the user asce.
      
      In kernel mode there is only one exception to the primary asce: when
      kvm guests are executed the primary asce contains the gmap asce (which
      describes the guest address space). The primary asce is reset to
      kernel asce whenever kvm guest execution is interrupted, so that this
      doesn't have to be taken into account for any user space accesses.
      Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
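The address space table from the set_fs removal commit can be written down as a small lookup structure. This is a sketch for illustration only: the enum and struct names are hypothetical, and the kernel does not keep such a table. The register roles (cr1 primary, cr7 secondary, cr13 home) follow the commit message and the architecture.

```c
#include <stddef.h>

/* Which address space an ASCE control register describes. */
enum asce_kind { ASCE_USER, ASCE_KERNEL, ASCE_GMAP };

/* Execution contexts, one per row of the table in the commit message. */
enum cpu_context {
	CTX_USER_SPACE,
	CTX_KERNEL_NORMAL,
	CTX_KERNEL_KVM_GUEST,
	CTX_COUNT
};

struct asce_setup {
	enum asce_kind cr1;	/* primary ASCE */
	enum asce_kind cr7;	/* secondary ASCE */
	enum asce_kind cr13;	/* home ASCE */
};

static const struct asce_setup asce_table[CTX_COUNT] = {
	[CTX_USER_SPACE]       = { ASCE_USER,   ASCE_USER, ASCE_KERNEL },
	[CTX_KERNEL_NORMAL]    = { ASCE_KERNEL, ASCE_USER, ASCE_KERNEL },
	[CTX_KERNEL_KVM_GUEST] = { ASCE_GMAP,   ASCE_USER, ASCE_KERNEL },
};

/* Look up the expected register contents; NULL for an unknown context. */
static inline const struct asce_setup *asce_for(enum cpu_context ctx)
{
	return (unsigned int)ctx < CTX_COUNT ? &asce_table[ctx] : NULL;
}
```

Laid out this way, the table makes the invariant easy to see: cr7 always holds the user ASCE and cr13 always holds the kernel ASCE, so only cr1 changes between contexts.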
    • Merge branch 'fixes' into features · 77663819
      Heiko Carstens authored
      * fixes:
        s390: fix fpu restore in entry.S
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390: fix fpu restore in entry.S · 1179f170
      Sven Schnelle authored
      We need to disable interrupts in load_fpu_regs(). Otherwise an
      interrupt might come in after the registers are loaded, but before
      CIF_FPU is cleared in load_fpu_regs(). When the interrupt returns,
      CIF_FPU will be cleared and the registers will never be restored.
      
      The entry.S code usually saves the interrupt state in __SF_EMPTY on the
      stack when disabling/restoring interrupts. sie64a however saves the pointer
      to the sie control block in __SF_SIE_CONTROL, which references the same
      location.  This is non-obvious to the reader. To avoid clobbering the sie
      control block pointer in load_fpu_regs(), move the __SIE_* offsets eight
      bytes after __SF_EMPTY on the stack.
      
      Cc: <stable@vger.kernel.org> # 5.8
      Fixes: 0b0ed657 ("s390: remove critical section cleanup from entry.S")
      Reported-by: Pierre Morel <pmorel@linux.ibm.com>
      Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
  3. 20 Nov, 2020 15 commits
  4. 18 Nov, 2020 4 commits
  5. 12 Nov, 2020 3 commits
  6. 09 Nov, 2020 10 commits
    • s390/zcrypt/pkey: introduce zcrypt_wait_api_operational() function · 43cb5a7c
      Harald Freudenberger authored
      The zcrypt api provides a new function to wait until the zcrypt
      api is operational:
      
        int zcrypt_wait_api_operational(void);
      
      The AP bus scan and the binding of ap devices to device drivers are
      asynchronous jobs. This function waits until these initial jobs
      are done, at which point the zcrypt api should be ready to serve crypto
      requests, provided there are resources available. The function uses an
      internal timeout of 60s. The very first caller waits until either the
      ap bus bindings are complete or the timeout expires. The outcome is
      remembered for further callers, which are only blocked until a
      decision is made (timeout or bindings complete).
      Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
      Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
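The "first caller waits, later callers reuse the decision" behaviour of zcrypt_wait_api_operational() can be sketched with a single-threaded model. Everything here is an assumption made for illustration: the names are hypothetical, the blocking wait is replaced by a stub callback, and the return values (0 on success, -ETIME on timeout) are chosen for the sketch because the commit message does not state them.

```c
#include <errno.h>
#include <stdbool.h>

enum api_decision { API_UNDECIDED, API_OPERATIONAL, API_TIMED_OUT };

static enum api_decision decision = API_UNDECIDED;
static int blocking_waits;	/* how often the stand-in wait actually ran */

/* Stub for the blocking wait; reports that bindings completed in time. */
static bool stub_wait_bindings_complete(void)
{
	blocking_waits++;
	return true;
}

/*
 * Model of the caching behaviour: only the first call performs the
 * (stubbed) wait, every later call returns the remembered outcome.
 */
static int wait_api_operational_model(bool (*wait_for_bindings)(void))
{
	if (decision == API_UNDECIDED)
		decision = wait_for_bindings() ? API_OPERATIONAL
					       : API_TIMED_OUT;
	return decision == API_OPERATIONAL ? 0 : -ETIME;
}
```

The model leaves out locking, which a real implementation would need so that concurrent callers block until the first caller's decision is available.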
    • s390/ap: ap bus userspace notifications for some bus conditions · 837cd105
      Harald Freudenberger authored
      This patch adds notifications to userspace for two important
      conditions of the ap bus:
      
      I) Initial ap bus scan done. This indicates that the initial
         scan of all the ap devices (cards, queues) is complete and
         ap devices have been built up for all the hardware found.
         This condition is signaled with
         1) An ap bus change uevent sent to userspace with an environment
            key/value pair "INITSCAN=done":
      	# udevadm monitor -k -p
      	...
      	KERNEL[97.830919] change   /devices/ap (ap)
      	ACTION=change
      	DEVPATH=/devices/ap
      	SUBSYSTEM=ap
      	INITSCAN=done
      	SEQNUM=10421
         2) A sysfs attribute /sys/bus/ap/scans which shows the
            number of completed ap bus scans done since bus init.
            So a value of 1 or greater signals that the initial
            ap bus scan is complete.
         Note: The initial ap bus scan complete condition is fulfilled
         and will be signaled even if there was no ap resource found.
      
      II) APQN driver bindings complete. This indicates that all
          APQNs have been bound to a zcrypt or alternate device
          driver. Only with the help of a device driver can an APQN
          be used for crypto load. So the binding complete
          condition is the starting point for user space to be
          sure all crypto resources on the ap bus are available
          for use.
          This condition is signaled with
          1) An ap bus change uevent sent to userspace with an environment
             key/value pair "BINDINGS=complete":
      	 # udevadm monitor -k -p
      	 ...
      	 KERNEL[97.830975] change   /devices/ap (ap)
      	 ACTION=change
      	 DEVPATH=/devices/ap
      	 SUBSYSTEM=ap
      	 BINDINGS=complete
      	 SEQNUM=10422
          2) A sysfs attribute /sys/bus/ap/bindings showing
             "<nr of bound apqns>/<total nr of apqns> (complete)"
             when all available apqns have been bound to device drivers, or
             "<nr of bound apqns>/<total nr of apqns>"
             when there are some apqns not bound to a device driver.
          Note: The binding complete condition is also fulfilled when
          there are no apqns available to bind to any device driver. In
          this case binding complete is signaled AFTER the init
          scan is done.
          Note: This condition may arise multiple times when the
          bindings are modified after the initial scan. For
          example, a manual unbind of an APQN switches the binding
          complete condition off. When the unbound APQNs are later
          bound to a device driver, the binding is (again) complete,
          resulting in another uevent and marking the bindings sysfs
          attribute with '(complete)'.
      
      There is also a new function to be used within the kernel:
      
        int ap_wait_init_apqn_bindings_complete(unsigned long timeout)
      
      Interface to wait for the AP bus to have done one initial ap bus
      scan and for all detected APQNs to have been bound to device drivers.
      If these two conditions are not both fulfilled, this function blocks
      on a condition with wait_for_completion_interruptible_timeout().
      If both conditions are fulfilled (before the timeout hits)
      the return value is 0. If the timeout (in jiffies) hits instead,
      -ETIME is returned. On failures negative return values are
      returned to the caller. Please note that further unbind/bind
      actions after the initial binding complete has been reached do not
      cause this function to block again.
      Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
      Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
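From user space, the bindings attribute described in the ap bus commit could be consumed roughly like this. The sketch assumes the attribute holds a single line in the format quoted in the commit message, "<bound>/<total>" with an optional " (complete)" suffix. `struct ap_bindings`, `parse_ap_bindings` and `read_ap_bindings` are hypothetical names, not part of any existing library.

```c
#include <stdio.h>
#include <string.h>

struct ap_bindings {
	unsigned int bound;	/* number of APQNs bound to a driver */
	unsigned int total;	/* total number of APQNs */
	int complete;		/* 1 if the "(complete)" marker is present */
};

/* Parse one line of the bindings attribute. Returns 0 on success, -1 if malformed. */
static int parse_ap_bindings(const char *s, struct ap_bindings *out)
{
	if (sscanf(s, "%u/%u", &out->bound, &out->total) != 2)
		return -1;
	out->complete = strstr(s, "(complete)") != NULL;
	return 0;
}

/* Read and parse the attribute at the given path, e.g. /sys/bus/ap/bindings. */
static int read_ap_bindings(const char *path, struct ap_bindings *out)
{
	char line[64];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(line, sizeof(line), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_ap_bindings(line, out);
}
```

A caller that needs to react to changes rather than poll could combine this with the "BINDINGS=complete" uevent shown in the commit message.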
    • s390/trng: set quality to 1024 · d041315e
      Christian Borntraeger authored
      The s390-trng does provide 100% entropy. The quality value is supposed
      to be between 1 and 1024 and not 1..1000.  Use 1024 to make this driver
      the preferred one. If we ever have a better driver that has the same
      quality but is faster we can change this again when merging the new
      driver. No need to be conservative.
      
      This makes sure that the hw variant is preferred over things like
      virtio-rng, where the hypervisor has a potential to be misconfigured
      and thus should have a slightly lower confidence.
      
      Cc: Harald Freudenberger <freude@linux.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
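The arithmetic behind the trng quality change is small enough to show directly. The sketch assumes a linear scale on which 1024 means full entropy, which matches the 1..1024 range named in the commit message. `HWRNG_QUALITY_MAX` and `quality_percent` are hypothetical names used only here.

```c
/* Assumed top of the quality scale, per the 1..1024 range in the commit. */
#define HWRNG_QUALITY_MAX 1024

/* Convert a quality value on the assumed 0..1024 scale to a percentage. */
static inline double quality_percent(unsigned int quality)
{
	return 100.0 * quality / HWRNG_QUALITY_MAX;
}
```

On this scale a value of 1000 corresponds to about 97.7%, so it does not express the 100% entropy that the commit attributes to the s390-trng.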
    • s390/pci: remove races against pte updates · a67a88b0
      Daniel Vetter authored
      Way back it was a reasonable assumption that iomem mappings never
      change the pfn range they point at. But this has changed:
      
      - gpu drivers dynamically manage their memory nowadays, invalidating
      ptes with unmap_mapping_range when buffers get moved
      
      - contiguous dma allocations have moved from dedicated carveouts to
      cma regions. This means if we miss the unmap the pfn might contain
      pagecache or anon memory (well anything allocated with GFP_MOVEABLE)
      
      - even /dev/mem now invalidates mappings when the kernel requests that
      iomem region when CONFIG_IO_STRICT_DEVMEM is set, see
      commit 3234ac66 ("/dev/mem: Revoke mappings when a driver claims the
      region")
      
      Accessing pfns obtained from ptes without holding all the locks is
      therefore no longer a good idea. Fix this.
      
      Since zpci_memcpy_from|toio seems to not do anything nefarious with
      locks we just need to open code get_pfn and follow_pfn and make sure
      we drop the locks only after we're done. The write function also needs
      the copy_from_user move, since we can't take userspace faults while
      holding the mmap sem.
      Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-samsung-soc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Cc: linux-s390@vger.kernel.org
      Cc: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/early: rewrite program parameter setup in C · d7e7fbba
      Vasily Gorbik authored
      And move it earlier in the decompressor.
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/kasan: move memory needs estimation into a function · 0c4ec024
      Vasily Gorbik authored
      Also correct rounding downs in estimation calculations.
      Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/kasan: make kasan header self-contained · e385b550
      Vasily Gorbik authored
      It is relying on _REGION1_SHIFT / _REGION2_SHIFT values which come from
      asm/pgtable.h, so include it.
      Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/kasan: remove obvious parameter with the only possible value · 54b52981
      Vasily Gorbik authored
      Kasan early code is only working on init_mm, remove unneeded pgd
      parameter from kasan_copy_shadow and rename it to
      kasan_copy_shadow_mapping.
      Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/kasan: avoid confusing naming · 92bca2fe
      Vasily Gorbik authored
      Kasan has nothing to do with vmemmap, strip vmemmap from function names
      to avoid confusing people.
      Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/decompressor: fix build warning · 39f2899b
      Vasily Gorbik authored
      Fixes the following warning with CONFIG_KERNEL_UNCOMPRESSED=y
      
      arch/s390/boot/compressed/decompressor.h:6:46: warning: non-void function
      does not return a value [-Wreturn-type]
      static inline void *decompress_kernel(void) {}
                                                   ^
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
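The -Wreturn-type warning quoted in the decompressor commit comes from a stub that is declared to return a pointer but has an empty body. One way to silence it, shown here as a sketch and not necessarily the exact change the commit made, is to return NULL from the stub. This assumes callers do not use the result when the kernel is built uncompressed.

```c
#include <stddef.h>

/*
 * Stub for the CONFIG_KERNEL_UNCOMPRESSED=y case: there is nothing to
 * decompress, so return NULL instead of falling off the end of a
 * non-void function.
 */
static inline void *decompress_kernel(void)
{
	return NULL;
}
```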