1. 21 Jun, 2018 7 commits
    • Marc Zyngier's avatar
      KVM: Enforce error in ioctl for compat tasks when !KVM_COMPAT · 7ddfd3e0
      Marc Zyngier authored
      The current behaviour of the compat ioctls is a bit odd.
      We provide a compat_ioctl method when KVM_COMPAT is set, and NULL
      otherwise. But NULL means that the normal, non-compat ioctl should
      be used directly for compat tasks, and there is no way to actually
      prevent a compat task from issueing KVM ioctls.
      
      This patch changes this behaviour, by always registering a compat_ioctl
      method, even if KVM_COMPAT is not selected. In that case, the callback
      will always return -EINVAL.
      
      Fixes: de8e5d74 ("KVM: Disable compat ioctl for s390")
      Reported-by: default avatarMark Rutland <mark.rutland@arm.com>
      Acked-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      7ddfd3e0
    • Jia He's avatar
      KVM: arm/arm64: add WARN_ON if size is not PAGE_SIZE aligned in unmap_stage2_range · 47a91b72
      Jia He authored
      There is a panic in armv8a server(QDF2400) under memory pressure tests
      (start 20 guests and run memhog in the host).
      
      ---------------------------------begin--------------------------------
      [35380.800950] BUG: Bad page state in process qemu-kvm  pfn:dd0b6
      [35380.805825] page:ffff7fe003742d80 count:-4871 mapcount:-2126053375
      mapping:          (null) index:0x0
      [35380.815024] flags: 0x1fffc00000000000()
      [35380.818845] raw: 1fffc00000000000 0000000000000000 0000000000000000
      ffffecf981470000
      [35380.826569] raw: dead000000000100 dead000000000200 ffff8017c001c000
      0000000000000000
      [35380.805825] page:ffff7fe003742d80 count:-4871 mapcount:-2126053375
      mapping:          (null) index:0x0
      [35380.815024] flags: 0x1fffc00000000000()
      [35380.818845] raw: 1fffc00000000000 0000000000000000 0000000000000000
      ffffecf981470000
      [35380.826569] raw: dead000000000100 dead000000000200 ffff8017c001c000
      0000000000000000
      [35380.834294] page dumped because: nonzero _refcount
      [...]
      --------------------------------end--------------------------------------
      
      The root cause might be what was fixed at [1]. But from the KVM points of
      view, it would be better if the issue was caught earlier.
      
      If the size is not PAGE_SIZE aligned, unmap_stage2_range might unmap the
      wrong(more or less) page range. Hence it caused the "BUG: Bad page
      state"
      
      Let's WARN in that case, so that the issue is obvious.
      
      [1] https://lkml.org/lkml/2018/5/3/1042Reviewed-by: default avatarSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: jia.he@hxt-semitech.com
      [maz: tidied up commit message]
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      47a91b72
    • Dave Martin's avatar
      KVM: arm64: Avoid mistaken attempts to save SVE state for vcpus · 2955bcc8
      Dave Martin authored
      Commit e6b673b7 ("KVM: arm64: Optimise FPSIMD handling to reduce
      guest/host thrashing") uses fpsimd_save() to save the FPSIMD state
      for a vcpu when scheduling the vcpu out.  However, currently
      current's value of TIF_SVE is restored before calling fpsimd_save()
      which means that fpsimd_save() may erroneously attempt to save SVE
      state from the vcpu.  This enables current's vector state to be
      polluted with guest data.  current->thread.sve_state may be
      unallocated or not large enough, so this can also trigger a NULL
      dereference or buffer overrun.
      
      Instead of this, TIF_SVE should be configured properly for the
      guest when calling fpsimd_save() with the vcpu context loaded.
      
      This patch ensures this by delaying restoration of current's
      TIF_SVE until after the call to fpsimd_save().
      
      Fixes: e6b673b7 ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing")
      Signed-off-by: default avatarDave Martin <Dave.Martin@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      2955bcc8
    • Dave Martin's avatar
      KVM: arm64/sve: Fix SVE trap restoration for non-current tasks · b3eb56b6
      Dave Martin authored
      Commit e6b673b7 ("KVM: arm64: Optimise FPSIMD handling to reduce
      guest/host thrashing") attempts to restore the configuration of
      userspace SVE trapping via a call to fpsimd_bind_task_to_cpu(), but
      the logic for determining when to do this is not correct.
      
      The patch makes the errnoenous assumption that the only task that
      may try to enter userspace with the currently loaded FPSIMD/SVE
      register content is current.  This may not be the case however:  if
      some other user task T is scheduled on the CPU during the execution
      of the KVM run loop, and the vcpu does not try to use the registers
      in the meantime, then T's state may be left there intact.  If T
      happens to be the next task to enter userspace on this CPU then the
      hooks for reloading the register state and configuring traps will
      be skipped.
      
      (Also, current never has SVE state at this point anyway and should
      always have the trap enabled, as a side-effect of the ioctl()
      syscall needed to reach the KVM run loop in the first place.)
      
      This patch instead restores the state of the EL0 trap from the
      state observed at the most recent vcpu_load(), ensuring that the
      trap is set correctly for the loaded context (if any).
      
      Fixes: e6b673b7 ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing")
      Signed-off-by: default avatarDave Martin <Dave.Martin@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      b3eb56b6
    • Dave Martin's avatar
      KVM: arm64: Don't mask softirq with IRQs disabled in vcpu_put() · b045e4d0
      Dave Martin authored
      Commit e6b673b7 ("KVM: arm64: Optimise FPSIMD handling to reduce
      guest/host thrashing") introduces a specific helper
      kvm_arch_vcpu_put_fp() for saving the vcpu FPSIMD state during
      vcpu_put().
      
      This function uses local_bh_disable()/_enable() to protect the
      FPSIMD context manipulation from interruption by softirqs.
      
      This approach is not correct, because vcpu_put() can be invoked
      either from the KVM host vcpu thread (when exiting the vcpu run
      loop), or via a preempt notifier.  In the former case, only
      preemption is disabled.  In the latter case, the function is called
      from inside __schedule(), which means that IRQs are disabled.
      
      Use of local_bh_disable()/_enable() with IRQs disabled is considerd
      an error, resulting in lockdep splats while running VMs if lockdep
      is enabled.
      
      This patch disables IRQs instead of attempting to disable softirqs,
      avoiding the problem of calling local_bh_enable() with IRQs
      disabled in the __schedule() path.  This creates an additional
      interrupt blackout during vcpu run loop exit, but this is the rare
      case and the blackout latency is still less than that of
      __schedule().
      
      Fixes: e6b673b7 ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing")
      Reported-by: default avatarAndre Przywara <andre.przywara@arm.com>
      Signed-off-by: default avatarDave Martin <Dave.Martin@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      b045e4d0
    • Mark Rutland's avatar
      arm64: Introduce sysreg_clear_set() · 6ebdf4db
      Mark Rutland authored
      Currently we have a couple of helpers to manipulate bits in particular
      sysregs:
      
       * config_sctlr_el1(u32 clear, u32 set)
      
       * change_cpacr(u64 val, u64 mask)
      
      The parameters of these differ in naming convention, order, and size,
      which is unfortunate. They also differ slightly in behaviour, as
      change_cpacr() skips the sysreg write if the bits are unchanged, which
      is a useful optimization when sysreg writes are expensive.
      
      Before we gain yet another sysreg manipulation function, let's
      unify these with a common helper, providing a consistent order for
      clear/set operands, and the write skipping behaviour from
      change_cpacr(). Code will be migrated to the new helper in subsequent
      patches.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarDave Martin <dave.martin@arm.com>
      Acked-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      6ebdf4db
    • Ard Biesheuvel's avatar
      KVM: arm/arm64: Drop resource size check for GICV window · ba56bc3a
      Ard Biesheuvel authored
      When booting a 64 KB pages kernel on a ACPI GICv3 system that
      implements support for v2 emulation, the following warning is
      produced
      
        GICV size 0x2000 not a multiple of page size 0x10000
      
      and support for v2 emulation is disabled, preventing GICv2 VMs
      from being able to run on such hosts.
      
      The reason is that vgic_v3_probe() performs a sanity check on the
      size of the window (it should be a multiple of the page size),
      while the ACPI MADT parsing code hardcodes the size of the window
      to 8 KB. This makes sense, considering that ACPI does not bother
      to describe the size in the first place, under the assumption that
      platforms implementing ACPI will follow the architecture and not
      put anything else in the same 64 KB window.
      
      So let's just drop the sanity check altogether, and assume that
      the window is at least 64 KB in size.
      
      Fixes: 90977732 ("KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init")
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      ba56bc3a
  2. 16 Jun, 2018 8 commits
    • Linus Torvalds's avatar
      Linux 4.18-rc1 · ce397d21
      Linus Torvalds authored
      ce397d21
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20180616' of git://git.kernel.dk/linux-block · 265c5596
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A collection of fixes that should go into -rc1. This contains:
      
         - bsg_open vs bsg_unregister race fix (Anatoliy)
      
         - NVMe pull request from Christoph, with fixes for regressions in
           this window, FC connect/reconnect path code unification, and a
           trace point addition.
      
         - timeout fix (Christoph)
      
         - remove a few unused functions (Christoph)
      
         - blk-mq tag_set reinit fix (Roman)"
      
      * tag 'for-linus-20180616' of git://git.kernel.dk/linux-block:
        bsg: fix race of bsg_open and bsg_unregister
        block: remov blk_queue_invalidate_tags
        nvme-fabrics: fix and refine state checks in __nvmf_check_ready
        nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
        nvme-fabrics: refactor queue ready check
        blk-mq: remove blk_mq_tagset_iter
        nvme: remove nvme_reinit_tagset
        nvme-fc: fix nulling of queue data on reconnect
        nvme-fc: remove reinit_request routine
        blk-mq: don't time out requests again that are in the timeout handler
        nvme-fc: change controllers first connect to use reconnect path
        nvme: don't rely on the changed namespace list log
        nvmet: free smart-log buffer after use
        nvme-rdma: fix error flow during mapping request data
        nvme: add bio remapping tracepoint
        nvme: fix NULL pointer dereference in nvme_init_subsystem
        blk-mq: reinit q->tag_set_list entry only after grace period
      265c5596
    • Linus Torvalds's avatar
      Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental · 5e7b9212
      Linus Torvalds authored
      Pull documentation fixes from Mauro Carvalho Chehab:
       "This solves a series of broken links for files under Documentation,
        and improves a script meant to detect such broken links (see
        scripts/documentation-file-ref-check).
      
        The changes on this series are:
      
         - can.rst: fix a footnote reference;
      
         - crypto_engine.rst: Fix two parsing warnings;
      
         - Fix a lot of broken references to Documentation/*;
      
         - improve the scripts/documentation-file-ref-check script, in order
           to help detecting/fixing broken references, preventing
           false-positives.
      
        After this patch series, only 33 broken references to doc files are
        detected by scripts/documentation-file-ref-check"
      
      * tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits)
        fix a series of Documentation/ broken file name references
        Documentation: rstFlatTable.py: fix a broken reference
        ABI: sysfs-devices-system-cpu: remove a broken reference
        devicetree: fix a series of wrong file references
        devicetree: fix name of pinctrl-bindings.txt
        devicetree: fix some bindings file names
        MAINTAINERS: fix location of DT npcm files
        MAINTAINERS: fix location of some display DT bindings
        kernel-parameters.txt: fix pointers to sound parameters
        bindings: nvmem/zii: Fix location of nvmem.txt
        docs: Fix more broken references
        scripts/documentation-file-ref-check: check tools/*/Documentation
        scripts/documentation-file-ref-check: get rid of false-positives
        scripts/documentation-file-ref-check: hint: dash or underline
        scripts/documentation-file-ref-check: add a fix logic for DT
        scripts/documentation-file-ref-check: accept more wildcards at filenames
        scripts/documentation-file-ref-check: fix help message
        media: max2175: fix location of driver's companion documentation
        media: v4l: fix broken video4linux docs locations
        media: dvb: point to the location of the old README.dvb-usb file
        ...
      5e7b9212
    • Linus Torvalds's avatar
      Merge tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · dbb2816f
      Linus Torvalds authored
      Pull fsnotify updates from Jan Kara:
       "fsnotify cleanups unifying handling of different watch types.
      
        This is the shortened fsnotify series from Amir with the last five
        patches pulled out. Amir has modified those patches to not change
        struct inode but obviously it's too late for those to go into this
        merge window"
      
      * tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        fsnotify: add fsnotify_add_inode_mark() wrappers
        fanotify: generalize fanotify_should_send_event()
        fsnotify: generalize send_to_group()
        fsnotify: generalize iteration of marks by object type
        fsnotify: introduce marks iteration helpers
        fsnotify: remove redundant arguments to handle_event()
        fsnotify: use type id to identify connector object type
      dbb2816f
    • Linus Torvalds's avatar
      Merge tag 'fbdev-v4.18' of git://github.com/bzolnier/linux · 644f2639
      Linus Torvalds authored
      Pull fbdev updates from Bartlomiej Zolnierkiewicz:
       "There is nothing really major here, few small fixes, some cleanups and
        dead drivers removal:
      
         - mark omapfb drivers as orphans in MAINTAINERS file (Tomi Valkeinen)
      
         - add missing module license tags to omap/omapfb driver (Arnd
           Bergmann)
      
         - add missing GPIOLIB dependendy to omap2/omapfb driver (Arnd
           Bergmann)
      
         - convert savagefb, aty128fb & radeonfb drivers to use msleep & co.
           (Jia-Ju Bai)
      
         - allow COMPILE_TEST build for viafb driver (media part was reviewed
           by media subsystem Maintainer)
      
         - remove unused MERAM support from sh_mobile_lcdcfb and shmob-drm
           drivers (drm parts were acked by shmob-drm driver Maintainer)
      
         - remove unused auo_k190xfb drivers
      
         - misc cleanups (Souptick Joarder, Wolfram Sang, Markus Elfring, Andy
           Shevchenko, Colin Ian King)"
      
      * tag 'fbdev-v4.18' of git://github.com/bzolnier/linux: (26 commits)
        fb_omap2: add gpiolib dependency
        video/omap: add module license tags
        MAINTAINERS: make omapfb orphan
        video: fbdev: pxafb: match_string() conversion fixup
        video: fbdev: nvidia: fix spelling mistake: "scaleing" -> "scaling"
        video: fbdev: fix spelling mistake: "frambuffer" -> "framebuffer"
        video: fbdev: pxafb: Convert to use match_string() helper
        video: fbdev: via: allow COMPILE_TEST build
        video: fbdev: remove unused sh_mobile_meram driver
        drm: shmobile: remove unused MERAM support
        video: fbdev: sh_mobile_lcdcfb: remove unused MERAM support
        video: fbdev: remove unused auo_k190xfb drivers
        video: omap: Improve a size determination in omapfb_do_probe()
        video: sm501fb: Improve a size determination in sm501fb_probe()
        video: fbdev-MMP: Improve a size determination in path_init()
        video: fbdev-MMP: Delete an error message for a failed memory allocation in two functions
        video: auo_k190x: Delete an error message for a failed memory allocation in auok190x_common_probe()
        video: sh_mobile_lcdcfb: Delete an error message for a failed memory allocation in two functions
        video: sh_mobile_meram: Delete an error message for a failed memory allocation in sh_mobile_meram_probe()
        video: fbdev: sh_mobile_meram: Drop SUPERH platform dependency
        ...
      644f2639
    • Linus Torvalds's avatar
      Merge branch 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 35773c93
      Linus Torvalds authored
      Pull AFS updates from Al Viro:
       "Assorted AFS stuff - ended up in vfs.git since most of that consists
        of David's AFS-related followups to Christoph's procfs series"
      
      * 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        afs: Optimise callback breaking by not repeating volume lookup
        afs: Display manually added cells in dynamic root mount
        afs: Enable IPv6 DNS lookups
        afs: Show all of a server's addresses in /proc/fs/afs/servers
        afs: Handle CONFIG_PROC_FS=n
        proc: Make inline name size calculation automatic
        afs: Implement network namespacing
        afs: Mark afs_net::ws_cell as __rcu and set using rcu functions
        afs: Fix a Sparse warning in xdr_decode_AFSFetchStatus()
        proc: Add a way to make network proc files writable
        afs: Rearrange fs/afs/proc.c to remove remaining predeclarations.
        afs: Rearrange fs/afs/proc.c to move the show routines up
        afs: Rearrange fs/afs/proc.c by moving fops and open functions down
        afs: Move /proc management functions to the end of the file
      35773c93
    • Linus Torvalds's avatar
      Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 29d6849d
      Linus Torvalds authored
      Pull compat updates from Al Viro:
       "Some biarch patches - getting rid of assorted (mis)uses of
        compat_alloc_user_space().
      
        Not much in that area this cycle..."
      
      * 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        orangefs: simplify compat ioctl handling
        signalfd: lift sigmask copyin and size checks to callers of do_signalfd4()
        vmsplice(): lift importing iovec into vmsplice(2) and compat counterpart
      29d6849d
    • Linus Torvalds's avatar
      Merge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · a5b729ea
      Linus Torvalds authored
      Pull aio fixes from Al Viro:
       "Assorted AIO followups and fixes"
      
      * 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        eventpoll: switch to ->poll_mask
        aio: only return events requested in poll_mask() for IOCB_CMD_POLL
        eventfd: only return events requested in poll_mask()
        aio: mark __aio_sigset::sigmask const
      a5b729ea
  3. 15 Jun, 2018 25 commits