1. 15 Nov, 2019 6 commits
  2. 14 Nov, 2019 7 commits
    • drm/amdgpu: fix null pointer deref in firmware header printing · a84fddb1
      Xiaojie Yuan authored
      v2: declare as (struct common_firmware_header *) type because
          struct xxx_firmware_header inherits from it
      
      When CE's ucode_id(8) is used to get sdma_hdr, we will be accessing an
      unallocated amdgpu_firmware_info instance.
      
      This issue appears on rhel7.7 with gcc 4.8.5. Newer compilers might have
      optimized out such 'defined but not referenced' variable.
      
      [ 1120.798564] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a
      [ 1120.806703] IP: [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
      [ 1120.813693] PGD 80000002603ff067 PUD 271b8d067 PMD 0
      [ 1120.818931] Oops: 0000 [#1] SMP
      [ 1120.822245] Modules linked in: amdgpu(OE+) amdkcl(OE) amd_iommu_v2 amdttm(OE) amd_sched(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun bridge stp llc devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dm_mirror dm_region_hash dm_log dm_mod intel_pmc_core intel_powerclamp coretemp intel_rapl joydev kvm_intel eeepc_wmi asus_wmi kvm sparse_keymap iTCO_wdt irqbypass rfkill crc32_pclmul snd_hda_codec_realtek mxm_wmi ghash_clmulni_intel intel_wmi_thunderbolt iTCO_vendor_support snd_hda_codec_generic snd_hda_codec_hdmi aesni_intel lrw gf128mul glue_helper ablk_helper sg cryptd pcspkr snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd pinctrl_sunrisepoint pinctrl_intel soundcore acpi_pad mei_me wmi mei i2c_i801 pcc_cpufreq ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit iosf_mbi drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm ptp libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw pps_core drm_panel_orientation_quirks video i2c_hid
      [ 1120.954136] CPU: 4 PID: 2426 Comm: modprobe Tainted: G           OE  ------------   3.10.0-1062.el7.x86_64 #1
      [ 1120.964390] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 1302 11/09/2015
      [ 1120.973321] task: ffff991ef1e3c1c0 ti: ffff991ee625c000 task.ti: ffff991ee625c000
      [ 1120.981020] RIP: 0010:[<ffffffffc0e3c9b3>]  [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
      [ 1120.990483] RSP: 0018:ffff991ee625f950  EFLAGS: 00010202
      [ 1120.995935] RAX: 0000000000000002 RBX: ffff991edf6b2d38 RCX: ffff991edf6a0000
      [ 1121.003391] RDX: 0000000000000000 RSI: ffff991f01d13898 RDI: ffffffffc110afb3
      [ 1121.010706] RBP: ffff991ee625f9b0 R08: 0000000000000000 R09: 0000000000000000
      [ 1121.018029] R10: 00000000000004c4 R11: ffff991ee625f64e R12: ffff991edf6b3220
      [ 1121.025353] R13: ffff991edf6a0000 R14: 0000000000000008 R15: ffff991edf6b2d30
      [ 1121.032666] FS:  00007f97b0c0b740(0000) GS:ffff991f01d00000(0000) knlGS:0000000000000000
      [ 1121.041000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1121.046880] CR2: 000000000000000a CR3: 000000025e604000 CR4: 00000000003607e0
      [ 1121.054239] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1121.061631] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1121.068938] Call Trace:
      [ 1121.071494]  [<ffffffffc0e3dba8>] psp_hw_init+0x218/0x270 [amdgpu]
      [ 1121.077886]  [<ffffffffc0da3188>] amdgpu_device_fw_loading+0xe8/0x160 [amdgpu]
      [ 1121.085296]  [<ffffffffc0e3b34c>] ? vega10_ih_irq_init+0x4bc/0x730 [amdgpu]
      [ 1121.092534]  [<ffffffffc0da5c75>] amdgpu_device_init+0x1495/0x1c90 [amdgpu]
      [ 1121.099675]  [<ffffffffc0da9cab>] amdgpu_driver_load_kms+0x8b/0x2f0 [amdgpu]
      [ 1121.106888]  [<ffffffffc01b25cf>] drm_dev_register+0x12f/0x1d0 [drm]
      [ 1121.113419]  [<ffffffffa4dcdfd8>] ? pci_enable_device_flags+0xe8/0x140
      [ 1121.120183]  [<ffffffffc0da260a>] amdgpu_pci_probe+0xca/0x170 [amdgpu]
      [ 1121.126919]  [<ffffffffa4dcf97a>] local_pci_probe+0x4a/0xb0
      [ 1121.132622]  [<ffffffffa4dd10c9>] pci_device_probe+0x109/0x160
      [ 1121.138607]  [<ffffffffa4eb4205>] driver_probe_device+0xc5/0x3e0
      [ 1121.144766]  [<ffffffffa4eb4603>] __driver_attach+0x93/0xa0
      [ 1121.150507]  [<ffffffffa4eb4570>] ? __device_attach+0x50/0x50
      [ 1121.156422]  [<ffffffffa4eb1da5>] bus_for_each_dev+0x75/0xc0
      [ 1121.162213]  [<ffffffffa4eb3b7e>] driver_attach+0x1e/0x20
      [ 1121.167771]  [<ffffffffa4eb3620>] bus_add_driver+0x200/0x2d0
      [ 1121.173590]  [<ffffffffa4eb4c94>] driver_register+0x64/0xf0
      [ 1121.179345]  [<ffffffffa4dd0905>] __pci_register_driver+0xa5/0xc0
      [ 1121.185593]  [<ffffffffc099f000>] ? 0xffffffffc099efff
      [ 1121.190914]  [<ffffffffc099f0a4>] amdgpu_init+0xa4/0xb0 [amdgpu]
      [ 1121.197101]  [<ffffffffa4a0210a>] do_one_initcall+0xba/0x240
      [ 1121.202901]  [<ffffffffa4b1c90a>] load_module+0x271a/0x2bb0
      [ 1121.208598]  [<ffffffffa4dad740>] ? ddebug_proc_write+0x100/0x100
      [ 1121.214894]  [<ffffffffa4b1ce8f>] SyS_init_module+0xef/0x140
      [ 1121.220698]  [<ffffffffa518bede>] system_call_fastpath+0x25/0x2a
      [ 1121.226870] Code: b4 01 60 a2 00 00 31 c0 e8 83 60 33 e4 41 8b 47 08 48 8b 4d d0 48 c7 c7 b3 af 10 c1 48 69 c0 68 07 00 00 48 8b 84 01 60 a2 00 00 <48> 8b 70 08 31 c0 48 89 75 c8 e8 56 60 33 e4 48 8b 4d d0 48 c7
      [ 1121.247422] RIP  [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
      [ 1121.254432]  RSP <ffff991ee625f950>
      [ 1121.258017] CR2: 000000000000000a
      [ 1121.261427] ---[ end trace e98b35387ede75bd ]---
      
      Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
      Fixes: c5fb9126 ("drm/amdgpu: add firmware header printing for psp fw loading (v2)")
      Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      a84fddb1
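The v2 note above relies on every specific firmware header starting with the common header. A minimal sketch of that idea follows; it is not the actual patch. The struct layouts are simplified, and `fw_slot` and `fw_common_header()` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layout; the real amdgpu header carries more fields. */
struct common_firmware_header {
    uint32_t size_bytes;
    uint16_t header_version_major;
    uint16_t header_version_minor;
};

/* "Inherits" from the common header by embedding it as the first member. */
struct sdma_firmware_header {
    struct common_firmware_header header;
    uint32_t ucode_feature_version;
};

/* Hypothetical per-ucode slot: fw_data stays NULL when nothing was loaded. */
struct fw_slot {
    const void *fw_data;
};

/*
 * Return the common-header view of a slot, or NULL when the slot is empty.
 * Only a caller that knows the concrete firmware type should cast further.
 */
const struct common_firmware_header *fw_common_header(const struct fw_slot *slot)
{
    if (slot == NULL || slot->fw_data == NULL)
        return NULL;
    return (const struct common_firmware_header *)slot->fw_data;
}
```

Because the common header is the first member, the cast is valid for any of the specific header types, and an empty slot is reported instead of dereferenced.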
    • ALSA: usb-audio: Fix incorrect size check for processing/extension units · 976a68f0
      Takashi Iwai authored
      The recently introduced unit descriptor validation had a bug for
      processing and extension units: it counted the bControlSize byte twice,
      so it expected a bigger size than it should have.  This seems to
      result in a probe error on a few devices.
      
      Fix the calculation for proper checks of PU and EU.
      
      Fixes: 57f87706 ("ALSA: usb-audio: More validations of descriptor units")
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20191114165613.7422-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      976a68f0
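The double-counting described above can be illustrated with a small length check. This is a sketch under assumptions, not the ALSA code: the descriptor layout (a fixed header of `hdr_len` bytes, one bControlSize byte, then that many control bytes) and the name `unit_len_ok()` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: hdr_len fixed bytes, then a one-byte bControlSize,
 * then bControlSize bytes of controls. Returns true when `len` bytes are
 * enough to hold all three parts.
 */
bool unit_len_ok(const uint8_t *desc, size_t len, size_t hdr_len)
{
    size_t need = hdr_len + 1;      /* header plus the bControlSize byte */

    if (len < need)
        return false;               /* cannot even read bControlSize */
    need += desc[hdr_len];          /* add the controls, and nothing more */
    return len >= need;
}
```

A check that added `desc[hdr_len] + 1` in the last step would count the bControlSize byte a second time and reject descriptors that are exactly long enough.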
    • Merge tag 'kbuild-fixes-v5.4-3' of... · 96b95eff
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - fix build error when compiling SPARC VDSO with CONFIG_COMPAT=y
      
       - pass correct --arch option to Sparse
      
      * tag 'kbuild-fixes-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: tell sparse about the $ARCH
        sparc: vdso: fix build error of vdso32
      96b95eff
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 4e84608c
      Linus Torvalds authored
      Pull RDMA fixes from Jason Gunthorpe:
       "Bug fixes for old bugs in the hns and hfi1 drivers:
      
         - Calculate various values in hns properly to avoid over/underflows
           in some cases
      
         - Fix an oops, PCI negotiation on Gen4 systems, and bugs related to
           retries"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/hns: Correct the value of srq_desc_size
        RDMA/hns: Correct the value of HNS_ROCE_HEM_CHUNK_LEN
        IB/hfi1: TID RDMA WRITE should not return IB_WC_RNR_RETRY_EXC_ERR
        IB/hfi1: Calculate flow weight based on QP MTU for TID RDMA
        IB/hfi1: Ensure r_tid_ack is valid before building TID RDMA ACK packet
        IB/hfi1: Ensure full Gen3 speed in a Gen4 system
      4e84608c
    • kbuild: tell sparse about the $ARCH · 80591e61
      Luc Van Oostenryck authored
      Sparse uses the same executable for all archs and uses flags
      like -m64, -mbig-endian or -D__arm__ for arch-specific parameters.
      But Sparse also uses values from the host machine used to build
      Sparse as default values for the target machine.
      
      This works well, of course, for native builds but can create
      problems when cross-compiling, like defining both '__i386__'
      and '__arm__' when cross-compiling for arm on an x86-64 machine.
      
      Fix this by explicitly telling sparse the target architecture.
      
      Reported-by: Ben Dooks <ben.dooks@codethink.co.uk>
      Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      80591e61
    • sparc: vdso: fix build error of vdso32 · 53472914
      Masahiro Yamada authored
      Since commit 54b8ae66 ("kbuild: change *FLAGS_<basetarget>.o to
      take the path relative to $(obj)"), sparc allmodconfig fails to build
      as follows:
      
        CC      arch/sparc/vdso/vdso32/vclock_gettime.o
      unrecognized e_machine 18 arch/sparc/vdso/vdso32/vclock_gettime.o
      arch/sparc/vdso/vdso32/vclock_gettime.o: failed
      
      The cause of the breakage is that the -pg flag is not being dropped.
      
      The vdso32 files are located in the vdso32/ subdirectory, but I failed
      to update the Makefile.
      
      I removed the meaningless CFLAGS_REMOVE_vdso-note.o since it is only
      effective for C files.
      
      vdso-note.o is compiled from assembly files:
      
        arch/sparc/vdso/vdso-note.S
        arch/sparc/vdso/vdso32/vdso-note.S
      
      Fixes: 54b8ae66 ("kbuild: change *FLAGS_<basetarget>.o to take the path relative to $(obj)")
      Reported-by: Anatoly Pugachev <matorola@gmail.com>
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Tested-by: Anatoly Pugachev <matorola@gmail.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      53472914
    • ALSA: usb-audio: Fix incorrect NULL check in create_yamaha_midi_quirk() · cc9dbfa9
      Takashi Iwai authored
      Commit 60849562 ("ALSA: usb-audio: Fix possible NULL
      dereference at create_yamaha_midi_quirk()") added NULL checks in
      create_yamaha_midi_quirk(), but with an oversight.  The code
      allows either injd or outjd to be NULL, but the second if check
      returned -ENODEV if any of them was NULL.  Fix it in a proper
      form.
      
      Fixes: 60849562 ("ALSA: usb-audio: Fix possible NULL dereference at create_yamaha_midi_quirk()")
      Reported-by: Pavel Machek <pavel@denx.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20191113111259.24123-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      cc9dbfa9
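The intended logic (fail only when both descriptors are missing, or when a descriptor that is present fails validation) might look like the sketch below. `struct jack_desc`, `jack_desc_valid()` and `check_jacks()` are hypothetical stand-ins, not the driver's own names.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a MIDI jack descriptor. */
struct jack_desc {
    int length;
};

/* Hypothetical validator: a descriptor is usable when its length is positive. */
bool jack_desc_valid(const struct jack_desc *jd)
{
    return jd->length > 0;
}

int check_jacks(const struct jack_desc *injd, const struct jack_desc *outjd)
{
    if (!injd && !outjd)
        return -ENODEV;             /* nothing to work with */
    if ((injd && !jack_desc_valid(injd)) ||
        (outjd && !jack_desc_valid(outjd)))
        return -ENODEV;             /* a present descriptor is malformed */
    return 0;                       /* one side may legitimately be NULL */
}
```

Writing the second test as `!(injd && valid(injd)) || !(outjd && valid(outjd))` would instead reject the case where only one of the two descriptors exists.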
  3. 13 Nov, 2019 9 commits
  4. 12 Nov, 2019 12 commits
    • Remove VirtualBox guest shared folders filesystem · 0e3f1ad8
      Linus Torvalds authored
      This went into staging in rc7.  It turns out that was a mistake, and
      apparently it wasn't even supposed to go there at all, but be introduced
      as a regular filesystem.
      
      We don't try to sneak in whole new filesystems this late in the rc, just
      delete the whole thing, and it can be re-introduced as a proper patch
      with proper acks from actual filesystem people instead of some odd
      late-rc staging back-door.
      
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0e3f1ad8
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 8c5bd25b
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Fix unwinding of KVM_CREATE_VM failure, VT-d posted interrupts,
        DAX/ZONE_DEVICE, and module unload/reload"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved
        KVM: VMX: Introduce pi_is_pir_empty() helper
        KVM: VMX: Do not change PID.NDST when loading a blocked vCPU
        KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts
        KVM: VMX: Fix comment to specify PID.ON instead of PIR.ON
        KVM: X86: Fix initialization of MSR lists
        KVM: fix placement of refcount initialization
        KVM: Fix NULL-ptr deref after kvm_create_vm fails
      8c5bd25b
    • Merge tag 'gvt-fixes-2019-11-12' of https://github.com/intel/gvt-linux into drm-intel-fixes · 31e8d629
      Rodrigo Vivi authored
      gvt-fixes-2019-11-12
      
      - Fix dmabuf reference drop (Pan)
      
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      From: Zhenyu Wang <zhenyuw@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191112061834.GN4196@zhen-hp.sh.intel.com
      31e8d629
    • drm/i915: update rawclk also on resume · 2f216a85
      Jani Nikula authored
      Since CNP it's possible for rawclk to have two different values, 19.2
      and 24 MHz. If the value indicated by SFUSE_STRAP register is different
      from the power on default for PCH_RAWCLK_FREQ, we'll end up having a
      mismatch between the rawclk hardware and software states after
      suspend/resume. On previous platforms this used to work by accident,
      because the power on defaults worked just fine.
      
      Update the rawclk also on resume. The natural place to do this would be
      intel_modeset_init_hw(), however VLV/CHV need it done before
      intel_power_domains_init_hw(). Thus put it there even if it feels
      slightly out of place.
      
      v2: Call intel_update_rawclk() in intel_power_domains_init_hw() for all
          platforms (Ville).
      
      Reported-by: Shawn Lee <shawn.c.lee@intel.com>
      Cc: Shawn Lee <shawn.c.lee@intel.com>
      Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Tested-by: Shawn Lee <shawn.c.lee@intel.com>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191101142024.13877-1-jani.nikula@intel.com
      (cherry picked from commit 59ed05cc)
      Cc: <stable@vger.kernel.org> # v4.15+
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      2f216a85
    • Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · eb094f06
      Linus Torvalds authored
      Pull x86 TSX Async Abort and iTLB Multihit mitigations from Thomas Gleixner:
       "The performance deterioration department is not proud at all of
        presenting the seventh installment of speculation mitigations and
        hardware misfeature workarounds:
      
         1) TSX Async Abort (TAA) - 'The Annoying Affair'
      
            TAA is a hardware vulnerability that allows unprivileged
            speculative access to data which is available in various CPU
            internal buffers by using asynchronous aborts within an Intel TSX
            transactional region.
      
            The mitigation depends on a microcode update providing a new MSR
            which allows to disable TSX in the CPU. CPUs which have no
            microcode update can be mitigated by disabling TSX in the BIOS if
            the BIOS provides a tunable.
      
            Newer CPUs will have a bit set which indicates that the CPU is not
            vulnerable, but the MSR to disable TSX will be available
            nevertheless as it is an architected MSR. That means the kernel
            provides the ability to disable TSX on the kernel command line,
            which is useful as TSX is a truly useful mechanism to accelerate
            side channel attacks of all sorts.
      
         2) iTLB Multihit (NX) - 'No eXcuses'
      
            iTLB Multihit is an erratum where some Intel processors may incur
            a machine check error, possibly resulting in an unrecoverable CPU
            lockup, when an instruction fetch hits multiple entries in the
            instruction TLB. This can occur when the page size is changed
            along with either the physical address or cache type. A malicious
            guest running on a virtualized system can exploit this erratum to
            perform a denial of service attack.
      
            The workaround is that KVM marks huge pages in the extended page
            tables as not executable (NX). If the guest attempts to execute in
            such a page, the page is broken down into 4k pages which are
            marked executable. The workaround comes with a mechanism to
            recover these shattered huge pages over time.
      
        Both issues come with full documentation in the hardware
        vulnerabilities section of the Linux kernel user's and administrator's
        guide.
      
        Thanks to all patch authors and reviewers who had the extraordinary
        privilege to be exposed to this nuisance.
      
        Special thanks to Borislav Petkov for polishing the final TAA patch
        set and to Paolo Bonzini for shepherding the KVM iTLB workarounds and
        providing also the backports to stable kernels for those!"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/speculation/taa: Fix printing of TAA_MSG_SMT on IBRS_ALL CPUs
        Documentation: Add ITLB_MULTIHIT documentation
        kvm: x86: mmu: Recovery of shattered NX large pages
        kvm: Add helper function for creating VM worker threads
        kvm: mmu: ITLB_MULTIHIT mitigation
        cpu/speculation: Uninline and export CPU mitigations helpers
        x86/cpu: Add Tremont to the cpu vulnerability whitelist
        x86/bugs: Add ITLB_MULTIHIT bug infrastructure
        x86/tsx: Add config options to set tsx=on|off|auto
        x86/speculation/taa: Add documentation for TSX Async Abort
        x86/tsx: Add "auto" option to the tsx= cmdline parameter
        kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
        x86/speculation/taa: Add sysfs reporting for TSX Async Abort
        x86/speculation/taa: Add mitigation for TSX Async Abort
        x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
        x86/cpu: Add a helper function x86_read_arch_cap_msr()
        x86/msr: Add the IA32_TSX_CTRL MSR
      eb094f06
    • KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved · a78986aa
      Sean Christopherson authored
      Explicitly exempt ZONE_DEVICE pages from kvm_is_reserved_pfn() and
      instead manually handle ZONE_DEVICE on a case-by-case basis.  For things
      like page refcounts, KVM needs to treat ZONE_DEVICE pages like normal
      pages, e.g. put pages grabbed via gup().  But for flows such as setting
      A/D bits or shifting refcounts for transparent huge pages, KVM needs
      to avoid processing ZONE_DEVICE pages as the flows in question lack the
      underlying machinery for proper handling of ZONE_DEVICE pages.
      
      This fixes a hang reported by Adam Borowski[*] in dev_pagemap_cleanup()
      when running a KVM guest backed with /dev/dax memory, as KVM straight up
      doesn't put any references to ZONE_DEVICE pages acquired by gup().
      
      Note, Dan Williams proposed an alternative solution of doing put_page()
      on ZONE_DEVICE pages immediately after gup() in order to simplify the
      auditing needed to ensure is_zone_device_page() is called if and only if
      the backing device is pinned (via gup()).  But that approach would break
      kvm_vcpu_{un}map() as KVM requires the page to be pinned from map() 'til
      unmap() when accessing guest memory, unlike KVM's secondary MMU, which
      coordinates with mmu_notifier invalidations to avoid creating stale
      page references, i.e. doesn't rely on pages being pinned.
      
      [*] http://lkml.kernel.org/r/20190919115547.GA17963@angband.pl
      
      Reported-by: Adam Borowski <kilobyte@angband.pl>
      Analyzed-by: David Hildenbrand <david@redhat.com>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Cc: stable@vger.kernel.org
      Fixes: 3565fce3 ("mm, x86: get_user_pages() for dax mappings")
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a78986aa
    • KVM: VMX: Introduce pi_is_pir_empty() helper · 29881b6e
      Joao Martins authored
      Streamline the PID.PIR check and change its call sites to use
      the newly added helper.
      
      Suggested-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      29881b6e
    • KVM: VMX: Do not change PID.NDST when loading a blocked vCPU · 132194ff
      Joao Martins authored
      When a vCPU enters the block phase, pi_pre_block() inserts it into a
      per-pCPU linked list of all vCPUs that are blocked on this pCPU.
      Afterwards, it changes PID.NV to POSTED_INTR_WAKEUP_VECTOR, whose handler
      (wakeup_handler()) is responsible for kicking (unblocking) any vCPU on
      that linked list that now has pending posted interrupts.
      
      While a vCPU is blocked (in kvm_vcpu_block()), it may be preempted, which
      causes vmx_vcpu_pi_put() to set PID.SN.  If the vCPU is later scheduled
      to run on a different pCPU, vmx_vcpu_pi_load() will clear PID.SN but
      will also *overwrite PID.NDST to this different pCPU* instead of
      keeping the original pCPU on which the vCPU entered the block phase.
      
      This is a problem because, when a posted interrupt is delivered,
      wakeup_handler() is executed and fails to find the blocked vCPU on its
      per-pCPU linked list of blocked vCPUs: the vCPU was placed on a
      *different* per-pCPU linked list, i.e. that of the original pCPU on
      which it entered the block phase.
      
      
      The regression is introduced by commit c112b5f5 ("KVM: x86:
      Recompute PID.ON when clearing PID.SN"). Therefore, partially revert
      it and reintroduce the condition in vmx_vcpu_pi_load() responsible for
      avoiding changing PID.NDST when loading a blocked vCPU.
      
      Fixes: c112b5f5 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
      Tested-by: Nathan Ni <nathan.ni@oracle.com>
      Co-developed-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      132194ff
    • KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts · 9482ae45
      Joao Martins authored
      Commit 17e433b5 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
      introduced vmx_dy_apicv_has_pending_interrupt() in order to determine
      if a vCPU has a pending posted interrupt. This routine is used by
      kvm_vcpu_on_spin() when searching for a new runnable vCPU to schedule
      on pCPU instead of a vCPU doing busy loop.
      
      vmx_dy_apicv_has_pending_interrupt() determines if a
      vCPU has a pending posted interrupt solely based on PID.ON. However,
      when a vCPU is preempted, vmx_vcpu_pi_put() sets PID.SN, which causes
      raised posted interrupts to only set a bit in PID.PIR without setting
      PID.ON (and without sending notification vector), as depicted in VT-d
      manual section 5.2.3 "Interrupt-Posting Hardware Operation".
      
      Therefore, checking PID.ON is insufficient to determine if a vCPU has
      pending posted interrupts and instead we should also check if there is
      some bit set on PID.PIR if PID.SN=1.
      
      Fixes: 17e433b5 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
      Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
      Co-developed-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      9482ae45
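The check described above (PID.ON alone is not enough; PID.PIR must also be inspected when PID.SN is set) can be sketched as follows. The descriptor here is a simplified, hypothetical model with plain booleans rather than the hardware bit layout, and the function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified posted-interrupt descriptor. */
struct pi_desc_sketch {
    uint64_t pir[4];    /* 256-bit bitmap of pending vectors */
    bool on;            /* Outstanding Notification */
    bool sn;            /* Suppress Notification */
};

/* True when no vector bit is set anywhere in the PIR bitmap. */
bool pir_is_empty(const struct pi_desc_sketch *pid)
{
    return (pid->pir[0] | pid->pir[1] | pid->pir[2] | pid->pir[3]) == 0;
}

/*
 * With SN set, a raised interrupt only sets its PIR bit and leaves ON
 * clear, so ON by itself would miss it.
 */
bool has_pending_posted_interrupt(const struct pi_desc_sketch *pid)
{
    return pid->on || (pid->sn && !pir_is_empty(pid));
}
```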
    • KVM: VMX: Fix comment to specify PID.ON instead of PIR.ON · d9ff2744
      Liran Alon authored
      The Outstanding Notification (ON) bit is part of the Posted Interrupt
      Descriptor (PID) as opposed to the Posted Interrupts Register (PIR).
      The latter is a bitmap for pending vectors.
      
      Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d9ff2744
    • KVM: X86: Fix initialization of MSR lists · 7a5ee6ed
      Chenyi Qiang authored
      The three MSR lists (msrs_to_save[], emulated_msrs[] and
      msr_based_features[]) are global arrays of kvm.ko, which are
      adjusted (copy supported MSRs forward to override the unsupported MSRs)
      when insmod kvm-{intel,amd}.ko, but it doesn't reset these three arrays
      to their initial value when rmmod kvm-{intel,amd}.ko. Thus, at the next
      installation, kvm-{intel,amd}.ko will do operations on the modified
      arrays with some MSRs lost and some MSRs duplicated.
      
      So define three constant arrays to hold the initial MSR lists and
      initialize msrs_to_save[], emulated_msrs[] and msr_based_features[]
      based on the constant arrays.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
      [Remove now useless conditionals. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7a5ee6ed
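The approach described above, keeping a constant master list and rebuilding the working list from it on every initialization, can be sketched like this. The MSR indices, the `_sketch` names and the two probe functions are hypothetical and only illustrate the pattern.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSR indices; the constant list is never modified. */
static const uint32_t msrs_all[] = { 0x10, 0x20, 0x30, 0x40 };
#define NUM_MSRS_ALL (sizeof(msrs_all) / sizeof(msrs_all[0]))

/* Working list, rebuilt from msrs_all on every initialization. */
uint32_t msrs_to_save_sketch[NUM_MSRS_ALL];
size_t num_msrs_to_save_sketch;

/* Hypothetical probes standing in for two different vendor modules. */
bool host_a_supports(uint32_t msr) { return msr != 0x20; }
bool host_b_supports(uint32_t msr) { (void)msr; return true; }

void init_msr_list(bool (*supported)(uint32_t msr))
{
    size_t i;

    num_msrs_to_save_sketch = 0;
    for (i = 0; i < NUM_MSRS_ALL; i++) {
        /* Filter from the pristine list, never from the previous result. */
        if (supported(msrs_all[i]))
            msrs_to_save_sketch[num_msrs_to_save_sketch++] = msrs_all[i];
    }
}
```

Calling `init_msr_list(host_a_supports)` and then `init_msr_list(host_b_supports)` yields the full list the second time, whereas compacting a single array in place would have lost the entry dropped by the first call.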
    • Merge Intel Gen8/Gen9 graphics fixes from Jon Bloomfield. · 100d46bd
      Linus Torvalds authored
      This fixes two different classes of bugs in the Intel graphics hardware:
      
      MMIO register read hang:
       "On Intel's Gen8 and Gen9 Graphics hardware, a read of specific graphics
        MMIO registers when the product is in certain low power states causes
        a system hang.
      
        There are two potential triggers for DoS:
          a) H/W corruption of the RC6 save/restore vector
          b) Hard hang within the MIPI hardware
      
        This prevents the DoS in two areas of the hardware:
          1) Detect corruption of RC6 address on exit from low-power state,
             and if we find it corrupted, disable RC6 and RPM
          2) Permanently lower the MIPI MMIO timeout"
      
      Blitter command streamer unrestricted memory accesses:
       "On Intel's Gen9 Graphics hardware the Blitter Command Streamer (BCS)
        allows writing to Memory Mapped Input Output (MMIO) that should be
        blocked. With modifications of page tables, this can lead to privilege
        escalation. This exposure is limited to the Guest Physical Address
        space and does not allow for access outside of the graphics virtual
        machine.
      
        This series establishes a software parser into the Blitter command
        stream to scan for, and prevent, reads or writes to MMIO's that should
        not be accessible to non-privileged contexts.
      
        Much of the command parser infrastructure has existed for some time,
        and is used on Ivybridge/Haswell/Valleyview derived products to allow
        the use of features normally blocked by hardware. In this legacy
        context, the command parser is employed to allow normally unprivileged
        submissions to be run with elevated privileges in order to grant
        access to a limited set of extra capabilities. In this mode the parser
        is optional; In the event that the parser finds any construct that it
        cannot properly validate (e.g. nested command buffers), it simply
        aborts the scan and submits the buffer in non-privileged mode.
      
        For Gen9 Graphics, this series makes the parser mandatory for all
        Blitter submissions. The incoming user buffer is first copied to a
        kernel owned buffer, and parsed. If all checks are successful the
        kernel owned buffer is mapped READ-ONLY and submitted on behalf of the
        user. If any checks fail, or the parser is unable to complete the scan
        (nested buffers), it is forcibly rejected. The successfully scanned
        buffer is executed with NORMAL user privileges (key difference from
        legacy usage).
      
        Modern usermode does not use the Blitter on later hardware, having
        switched over to using the 3D engine instead for performance reasons.
        There are however some legacy usermode apps that rely on Blitter,
        notably the SNA X-Server. There are no known usermode applications
        that require nested command buffers on the Blitter, so the forcible
        rejection of such buffers in this patch series is considered an
        acceptable limitation"
      
      * Intel graphics fixes in emailed bundle from Jon Bloomfield <jon.bloomfield@intel.com>:
        drm/i915/cmdparser: Fix jump whitelist clearing
        drm/i915/gen8+: Add RC6 CTX corruption WA
        drm/i915: Lower RM timeout to avoid DSI hard hangs
        drm/i915/cmdparser: Ignore Length operands during command matching
        drm/i915/cmdparser: Add support for backward jumps
        drm/i915/cmdparser: Use explicit goto for error paths
        drm/i915: Add gen9 BCS cmdparsing
        drm/i915: Allow parsing of unsized batches
        drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
        drm/i915: Add support for mandatory cmdparsing
        drm/i915: Remove Master tables from cmdparser
        drm/i915: Disable Secure Batches for gen6+
        drm/i915: Rename gen7 cmdparser tables
      100d46bd
  5. 11 Nov, 2019 6 commits
    • Merge branch 'for-5.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · de620fb9
      Linus Torvalds authored
      Pull cgroup fix from Tejun Heo:
       "There's an inadvertent preemption point in ptrace_stop() which was
        reliably triggering for a test scenario, significantly slowing it down.
      
        This contains Oleg's fix to remove the unwanted preemption point"
      
      * 'for-5.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: freezer: call cgroup_enter_frozen() with preemption disabled in ptrace_stop()
    • Btrfs: fix log context list corruption after rename exchange operation · e6c61710
      Filipe Manana authored
      During rename exchange we might have successfully logged the new name in
      the source root's log tree, in which case we leave our log context
      (allocated on the stack) in the root's list of log contexts. However, we
      might fail to log the new name in the destination root, in which case we
      fall back to a transaction commit later and never sync the log of the
      source root, which causes the source root log context to remain in the
      list of log contexts. This later causes invalid memory accesses because
      the context was allocated on the stack, and after rename exchange
      finishes the stack gets reused and overwritten for other purposes.
      
      The kernel's linked list corruption detector (CONFIG_DEBUG_LIST=y) can
      detect this and report something like the following:
      
        [  691.489929] ------------[ cut here ]------------
        [  691.489947] list_add corruption. prev->next should be next (ffff88819c944530), but was ffff8881c23f7be4. (prev=ffff8881c23f7a38).
        [  691.489967] WARNING: CPU: 2 PID: 28933 at lib/list_debug.c:28 __list_add_valid+0x95/0xe0
        (...)
        [  691.489998] CPU: 2 PID: 28933 Comm: fsstress Not tainted 5.4.0-rc6-btrfs-next-62 #1
        [  691.490001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
        [  691.490003] RIP: 0010:__list_add_valid+0x95/0xe0
        (...)
        [  691.490007] RSP: 0018:ffff8881f0b3faf8 EFLAGS: 00010282
        [  691.490010] RAX: 0000000000000000 RBX: ffff88819c944530 RCX: 0000000000000000
        [  691.490011] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffffa2c497e0
        [  691.490013] RBP: ffff8881f0b3fe68 R08: ffffed103eaa4115 R09: ffffed103eaa4114
        [  691.490015] R10: ffff88819c944000 R11: ffffed103eaa4115 R12: 7fffffffffffffff
        [  691.490016] R13: ffff8881b4035610 R14: ffff8881e7b84728 R15: 1ffff1103e167f7b
        [  691.490019] FS:  00007f4b25ea2e80(0000) GS:ffff8881f5500000(0000) knlGS:0000000000000000
        [  691.490021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [  691.490022] CR2: 00007fffbb2d4eec CR3: 00000001f2a4a004 CR4: 00000000003606e0
        [  691.490025] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [  691.490027] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [  691.490029] Call Trace:
        [  691.490058]  btrfs_log_inode_parent+0x667/0x2730 [btrfs]
        [  691.490083]  ? join_transaction+0x24a/0xce0 [btrfs]
        [  691.490107]  ? btrfs_end_log_trans+0x80/0x80 [btrfs]
        [  691.490111]  ? dget_parent+0xb8/0x460
        [  691.490116]  ? lock_downgrade+0x6b0/0x6b0
        [  691.490121]  ? rwlock_bug.part.0+0x90/0x90
        [  691.490127]  ? do_raw_spin_unlock+0x142/0x220
        [  691.490151]  btrfs_log_dentry_safe+0x65/0x90 [btrfs]
        [  691.490172]  btrfs_sync_file+0x9f1/0xc00 [btrfs]
        [  691.490195]  ? btrfs_file_write_iter+0x1800/0x1800 [btrfs]
        [  691.490198]  ? rcu_read_lock_any_held.part.11+0x20/0x20
        [  691.490204]  ? __do_sys_newstat+0x88/0xd0
        [  691.490207]  ? cp_new_stat+0x5d0/0x5d0
        [  691.490218]  ? do_fsync+0x38/0x60
        [  691.490220]  do_fsync+0x38/0x60
        [  691.490224]  __x64_sys_fdatasync+0x32/0x40
        [  691.490228]  do_syscall_64+0x9f/0x540
        [  691.490233]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [  691.490235] RIP: 0033:0x7f4b253ad5f0
        (...)
        [  691.490239] RSP: 002b:00007fffbb2d6078 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
        [  691.490242] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f4b253ad5f0
        [  691.490244] RDX: 00007fffbb2d5fe0 RSI: 00007fffbb2d5fe0 RDI: 0000000000000003
        [  691.490245] RBP: 000000000000000d R08: 0000000000000001 R09: 00007fffbb2d608c
        [  691.490247] R10: 00000000000002e8 R11: 0000000000000246 R12: 00000000000001f4
        [  691.490248] R13: 0000000051eb851f R14: 00007fffbb2d6120 R15: 00005635a498bda0
      
      This started happening recently when running some test cases from fstests
      like btrfs/004 for example, because support for rename exchange was added
      last week to fsstress from fstests.
      
      So fix this by deleting the log context for the source root from the list
      if we have logged the new name in the source root.
      Reported-by: Su Yue <Damenly_Su@gmx.com>
      Fixes: d4682ba0 ("Btrfs: sync log after logging new name")
      CC: stable@vger.kernel.org # 4.19+
      Tested-by: Su Yue <Damenly_Su@gmx.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
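
      The failure mode described in this commit message can be illustrated
      with a small userspace sketch. It is not the btrfs code: the list
      helpers, struct log_ctx and rename_exchange_sketch() are hypothetical
      stand-ins, and the sketch only reports whether a stack-allocated context
      would still be linked into the root's list when the function returns.

```c
#include <assert.h>

/* Minimal circular doubly linked list, loosely modeled on the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

static void list_add(struct list_node *n, struct list_node *head)
{
	assert(n != head);
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n and leave it self-linked, so a second call is harmless. */
static void list_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static int list_empty(const struct list_node *head)
{
	return head->next == head;
}

/* Hypothetical stand-in for a log context; the real one carries more state. */
struct log_ctx {
	struct list_node list;
};

/*
 * Returns 1 if the root's context list no longer references the stack
 * context at the point where the function is about to return, else 0.
 */
static int rename_exchange_sketch(struct list_node *root_ctxs,
				  int dest_log_fails, int apply_fix)
{
	struct log_ctx ctx;	/* lives on this function's stack frame */
	int clean;

	list_init(&ctx.list);
	/* Logging the new name in the source root queued the context. */
	list_add(&ctx.list, root_ctxs);

	if (!dest_log_fails)
		list_del_init(&ctx.list);	/* log sync path unlinks it */
	else if (apply_fix)
		list_del_init(&ctx.list);	/* fallback path: explicit unlink */

	clean = list_empty(root_ctxs);

	/* Demo only: always unlink so this sketch never leaves a dangling node. */
	list_del_init(&ctx.list);
	return clean;
}
```

      Without the explicit unlink on the fallback path, the list would keep
      pointing at a stack frame that is reused after the function returns,
      which is the condition CONFIG_DEBUG_LIST reports in the trace above.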
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 72d5ac67
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Three small changes: two in the core and one in the qla2xxx driver.
      
        The sg_tablesize fix addresses a thinko in the migration of certain
        legacy drivers to blk-mq, which could cause an oops. The sd core
        change should only affect zoned block devices, which were wrongly
        suppressing error messages for reset all zones"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: core: Handle drivers which set sg_tablesize to zero
        scsi: qla2xxx: fix NPIV tear down process
        scsi: sd_zbc: Fix sd_zbc_complete()
    • drm/i915/cmdparser: Fix jump whitelist clearing · ea0b163b
      Ben Hutchings authored
      When a jump_whitelist bitmap is reused, it needs to be cleared.
      Currently this is done with memset() and the size calculation assumes
      bitmaps are made of 32-bit words, not longs.  So on 64-bit
      architectures, only the first half of the bitmap is cleared.
      
      If some whitelist bits are carried over between successive batches
      submitted on the same context, this will presumably allow embedding
      the rogue instructions that we're trying to reject.
      
      Use bitmap_zero() instead, which gets the calculation right.
      
      Fixes: f8c08d8f ("drm/i915/cmdparser: Add support for backward jumps")
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
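
      The size arithmetic behind this bug can be reproduced in userspace. The
      sketch below is an illustration, not the i915 code: the helper names are
      hypothetical, BITS_TO_LONGS is redefined locally, and
      clear_whole_longs() merely stands in for what bitmap_zero() does in the
      kernel. It fills a 256-bit map with ones, clears it either way, and
      counts the bits that survive.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
/* Number of longs needed to hold n bits, rounded up. */
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Buggy clear: right word count, but each word is assumed to be 4 bytes. */
static void clear_assuming_u32(unsigned long *map, size_t nbits)
{
	memset(map, 0, BITS_TO_LONGS(nbits) * sizeof(uint32_t));
}

/* Userspace stand-in for bitmap_zero(): clears every long that holds bits. */
static void clear_whole_longs(unsigned long *map, size_t nbits)
{
	memset(map, 0, BITS_TO_LONGS(nbits) * sizeof(unsigned long));
}

/* Count the bits still set among the first nbits of map. */
static size_t count_set_bits(const unsigned long *map, size_t nbits)
{
	size_t i, set = 0;

	for (i = 0; i < nbits; i++)
		if (map[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
			set++;
	return set;
}

/* Fill a 256-bit map with ones, clear it one way, return the stale bit count. */
static size_t stale_bits_after_clear(int use_buggy_clear)
{
	enum { NBITS = 256 };
	unsigned long map[BITS_TO_LONGS(NBITS)];

	assert(sizeof(map) * CHAR_BIT >= NBITS);
	memset(map, 0xff, sizeof(map));	/* pretend an earlier batch set every bit */

	if (use_buggy_clear)
		clear_assuming_u32(map, NBITS);
	else
		clear_whole_longs(map, NBITS);

	return count_set_bits(map, NBITS);
}
```

      Where unsigned long is 8 bytes, the buggy variant leaves the upper half
      of the map untouched; where it is 4 bytes, both variants clear the same
      number of bytes, which is why the problem only shows up on 64-bit
      architectures.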
    • KVM: fix placement of refcount initialization · e2d3fcaf
      Paolo Bonzini authored
      Reported by syzkaller:
      
         =============================
         WARNING: suspicious RCU usage
         -----------------------------
         ./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage!
      
         other info that might help us debug this:
      
         rcu_scheduler_active = 2, debug_locks = 1
         no locks held by repro_11/12688.
      
         stack backtrace:
         Call Trace:
          dump_stack+0x7d/0xc5
          lockdep_rcu_suspicious+0x123/0x170
          kvm_dev_ioctl+0x9a9/0x1260 [kvm]
          do_vfs_ioctl+0x1a1/0xfb0
          ksys_ioctl+0x6d/0x80
          __x64_sys_ioctl+0x73/0xb0
          do_syscall_64+0x108/0xaa0
          entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Commit a97b0e77 ("kvm: call kvm_arch_destroy_vm if vm creation fails")
      sets users_count to 1 before kvm_arch_init_vm(). However, if
      kvm_arch_init_vm() fails, we need to decrease this count. By moving the
      initialization earlier, we can push the decrease to
      out_err_no_arch_destroy_vm without introducing yet another error label.
      
      syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=15209b84e00000
      
      Reported-by: syzbot+75475908cd0910f141ee@syzkaller.appspotmail.com
      Fixes: a97b0e77 ("kvm: call kvm_arch_destroy_vm if vm creation fails")
      Cc: Jim Mattson <jmattson@google.com>
      Analyzed-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
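
      The error-label reasoning in this message can be sketched as follows.
      This is a simplified userspace model, not the KVM code: struct
      vm_sketch, arch_init_sketch() and the early_step_fails parameter are
      hypothetical, and a plain int replaces the kernel's refcount type. Only
      the label name is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>

struct vm_sketch {
	int users_count;
	int arch_initialized;
};

/* Hypothetical stand-in for the architecture-specific init step. */
static int arch_init_sketch(struct vm_sketch *vm, int fail)
{
	if (fail)
		return -EINVAL;
	vm->arch_initialized = 1;
	return 0;
}

static int create_vm_sketch(struct vm_sketch *vm, int early_step_fails,
			    int arch_init_fails)
{
	int r;

	vm->arch_initialized = 0;
	vm->users_count = 1;	/* taken first, before any step that can fail */

	if (early_step_fails) {
		r = -ENOMEM;
		goto out_err_no_arch_destroy_vm;
	}

	r = arch_init_sketch(vm, arch_init_fails);
	if (r)
		goto out_err_no_arch_destroy_vm;

	return 0;

out_err_no_arch_destroy_vm:
	/* One shared label drops the reference for every failure above. */
	vm->users_count--;
	assert(vm->users_count == 0);
	return r;
}
```

      Because the count is set before the first step that can fail, every
      failure path can jump to the same label and the count is balanced
      without a dedicated label for the arch-init failure.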
    • KVM: Fix NULL-ptr deref after kvm_create_vm fails · 8a44119a
      Paolo Bonzini authored
      Reported by syzkaller:
      
          kasan: CONFIG_KASAN_INLINE enabled
          kasan: GPF could be caused by NULL-ptr deref or user memory access
          general protection fault: 0000 [#1] PREEMPT SMP KASAN
          CPU: 0 PID: 14727 Comm: syz-executor.3 Not tainted 5.4.0-rc4+ #0
          RIP: 0010:kvm_coalesced_mmio_init+0x5d/0x110 arch/x86/kvm/../../../virt/kvm/coalesced_mmio.c:121
          Call Trace:
           kvm_dev_ioctl_create_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:3446 [inline]
           kvm_dev_ioctl+0x781/0x1490 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3494
           vfs_ioctl fs/ioctl.c:46 [inline]
           file_ioctl fs/ioctl.c:509 [inline]
           do_vfs_ioctl+0x196/0x1150 fs/ioctl.c:696
           ksys_ioctl+0x62/0x90 fs/ioctl.c:713
           __do_sys_ioctl fs/ioctl.c:720 [inline]
           __se_sys_ioctl fs/ioctl.c:718 [inline]
           __x64_sys_ioctl+0x6e/0xb0 fs/ioctl.c:718
           do_syscall_64+0xca/0x5d0 arch/x86/entry/common.c:290
           entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Commit 9121923c ("kvm: Allocate memslots and buses before calling kvm_arch_init_vm")
      moved the memslots and buses allocations around. However, if initialization
      of kvm->srcu or kvm->irq_srcu fails, NULL is returned instead of an error
      code. kvm_dev_ioctl_create_vm() does not intercept the NULL, which is then
      dereferenced by kvm_coalesced_mmio_init(). This patch fixes that.
      
      Moving the initialization is required anyway to avoid an incorrect synchronize_srcu that
      was also reported by syzkaller:
      
       wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
       __synchronize_srcu+0x197/0x250 kernel/rcu/srcutree.c:921
       synchronize_srcu_expedited kernel/rcu/srcutree.c:946 [inline]
       synchronize_srcu+0x239/0x3e8 kernel/rcu/srcutree.c:997
       kvm_page_track_unregister_notifier+0xe7/0x130 arch/x86/kvm/page_track.c:212
       kvm_mmu_uninit_vm+0x1e/0x30 arch/x86/kvm/mmu.c:5828
       kvm_arch_destroy_vm+0x4a2/0x5f0 arch/x86/kvm/x86.c:9579
       kvm_create_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:702 [inline]
      
      so do it.
      
      Reported-by: syzbot+89a8060879fa0bd2db4f@syzkaller.appspotmail.com
      Reported-by: syzbot+e27e7027eb2b80e44225@syzkaller.appspotmail.com
      Fixes: 9121923c ("kvm: Allocate memslots and buses before calling kvm_arch_init_vm")
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
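
      Why a NULL return slips past the caller can be shown with a userspace
      re-creation of the kernel's error-pointer convention. The ERR_PTR(),
      PTR_ERR() and IS_ERR() helpers below are local imitations of the kernel
      macros, and the remaining names are hypothetical. The sketch also
      assumes, for illustration only, that the NULL arises from an error path
      whose error variable is still 0; the commit message does not state the
      mechanism.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Local imitations of the kernel's error-pointer helpers. */
static void *ERR_PTR(long err)
{
	return (void *)(intptr_t)err;
}

static long PTR_ERR(const void *p)
{
	return (long)(intptr_t)p;
}

static int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct vm { int id; };

enum outcome { VM_USED, ERROR_REPORTED, NULL_SLIPPED_THROUGH };

/* On failure, encode whatever value the error variable holds at that point. */
static struct vm *create_vm_sketch(int init_fails, int r_on_error_path)
{
	static struct vm the_vm = { 1 };

	if (init_fails)
		return ERR_PTR(r_on_error_path);
	return &the_vm;
}

static enum outcome create_and_use(int init_fails, int r_on_error_path)
{
	struct vm *vm = create_vm_sketch(init_fails, r_on_error_path);

	if (IS_ERR(vm)) {
		assert(PTR_ERR(vm) < 0);
		return ERROR_REPORTED;
	}
	if (vm == NULL)
		return NULL_SLIPPED_THROUGH;	/* a real caller would dereference here */
	return VM_USED;
}
```

      IS_ERR() only recognizes pointers in the top MAX_ERRNO addresses, so an
      error path that yields NULL looks like success to a caller that checks
      nothing else.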