1. 16 Sep, 2019 40 commits
    • resource: Include resource end in walk_*() interfaces · 9a80dfcc
      Bjorn Helgaas authored
      [ Upstream commit a98959fd ]
      
      find_next_iomem_res() finds an iomem resource that covers part of a range
      described by "start, end".  All callers expect that range to be inclusive,
      i.e., both start and end are included, but find_next_iomem_res() doesn't
      handle the end address correctly.
      
      If it finds an iomem resource that contains exactly the end address, it
      skips it, e.g., if "start, end" is [0x0-0x10000] and there happens to be an
      iomem resource [mem 0x10000-0x10000] (the single byte at 0x10000), we skip
      it:
      
        find_next_iomem_res(...)
        {
          start = 0x0;
          end = 0x10000;
          for (p = next_resource(...)) {
            # p->start = 0x10000;
            # p->end = 0x10000;
            # we *should* return this resource, but this condition is false:
            if ((p->end >= start) && (p->start < end))
              break;
      
      Adjust find_next_iomem_res() so it allows a resource that includes the
      single byte at the end of the range.  This is a corner case that we
      probably don't see in practice.
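The difference between the two range checks can be sketched in isolation. This is a minimal illustration, not the kernel code: `struct range` and both helper names are hypothetical stand-ins, and the inclusive comparison is an assumption about what the adjusted condition looks like.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the start/end fields of a resource. */
struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Condition quoted above: "end" is effectively treated as exclusive. */
static bool overlaps_exclusive_end(const struct range *p,
				   uint64_t start, uint64_t end)
{
	return p->end >= start && p->start < end;
}

/* Assumed adjustment: both start and end belong to the searched range. */
static bool overlaps_inclusive_end(const struct range *p,
				   uint64_t start, uint64_t end)
{
	return p->end >= start && p->start <= end;
}
```

With the single-byte resource [0x10000-0x10000] and the search range [0x0-0x10000], only the inclusive variant reports an overlap.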
      
      Fixes: 58c1b5b0 ("[PATCH] memory hotadd fixes: find_next_system_ram catch range fix")
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Brijesh Singh <brijesh.singh@amd.com>
      CC: Dan Williams <dan.j.williams@intel.com>
      CC: H. Peter Anvin <hpa@zytor.com>
      CC: Lianbo Jiang <lijiang@redhat.com>
      CC: Takashi Iwai <tiwai@suse.de>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Tom Lendacky <thomas.lendacky@amd.com>
      CC: Vivek Goyal <vgoyal@redhat.com>
      CC: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
      CC: bhe@redhat.com
      CC: dan.j.williams@intel.com
      CC: dyoung@redhat.com
      CC: kexec@lists.infradead.org
      CC: mingo@redhat.com
      CC: x86-ml <x86@kernel.org>
      Link: http://lkml.kernel.org/r/153805812254.1157.16736368485811773752.stgit@bhelgaas-glaptop.roam.corp.google.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: correctly validate compression type · 1c13c9c4
      Johannes Thumshirn authored
      [ Upstream commit aa53e3bf ]
      
      Nikolay reported the following KASAN splat when running btrfs/048:
      
      [ 1843.470920] ==================================================================
      [ 1843.471971] BUG: KASAN: slab-out-of-bounds in strncmp+0x66/0xb0
      [ 1843.472775] Read of size 1 at addr ffff888111e369e2 by task btrfs/3979
      
      [ 1843.473904] CPU: 3 PID: 3979 Comm: btrfs Not tainted 5.2.0-rc3-default #536
      [ 1843.475009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [ 1843.476322] Call Trace:
      [ 1843.476674]  dump_stack+0x7c/0xbb
      [ 1843.477132]  ? strncmp+0x66/0xb0
      [ 1843.477587]  print_address_description+0x114/0x320
      [ 1843.478256]  ? strncmp+0x66/0xb0
      [ 1843.478740]  ? strncmp+0x66/0xb0
      [ 1843.479185]  __kasan_report+0x14e/0x192
      [ 1843.479759]  ? strncmp+0x66/0xb0
      [ 1843.480209]  kasan_report+0xe/0x20
      [ 1843.480679]  strncmp+0x66/0xb0
      [ 1843.481105]  prop_compression_validate+0x24/0x70
      [ 1843.481798]  btrfs_xattr_handler_set_prop+0x65/0x160
      [ 1843.482509]  __vfs_setxattr+0x71/0x90
      [ 1843.483012]  __vfs_setxattr_noperm+0x84/0x130
      [ 1843.483606]  vfs_setxattr+0xac/0xb0
      [ 1843.484085]  setxattr+0x18c/0x230
      [ 1843.484546]  ? vfs_setxattr+0xb0/0xb0
      [ 1843.485048]  ? __mod_node_page_state+0x1f/0xa0
      [ 1843.485672]  ? _raw_spin_unlock+0x24/0x40
      [ 1843.486233]  ? __handle_mm_fault+0x988/0x1290
      [ 1843.486823]  ? lock_acquire+0xb4/0x1e0
      [ 1843.487330]  ? lock_acquire+0xb4/0x1e0
      [ 1843.487842]  ? mnt_want_write_file+0x3c/0x80
      [ 1843.488442]  ? debug_lockdep_rcu_enabled+0x22/0x40
      [ 1843.489089]  ? rcu_sync_lockdep_assert+0xe/0x70
      [ 1843.489707]  ? __sb_start_write+0x158/0x200
      [ 1843.490278]  ? mnt_want_write_file+0x3c/0x80
      [ 1843.490855]  ? __mnt_want_write+0x98/0xe0
      [ 1843.491397]  __x64_sys_fsetxattr+0xba/0xe0
      [ 1843.492201]  ? trace_hardirqs_off_thunk+0x1a/0x1c
      [ 1843.493201]  do_syscall_64+0x6c/0x230
      [ 1843.493988]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 1843.495041] RIP: 0033:0x7fa7a8a7707a
      [ 1843.495819] Code: 48 8b 0d 21 de 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 be 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ee dd 2b 00 f7 d8 64 89 01 48
      [ 1843.499203] RSP: 002b:00007ffcb73bca38 EFLAGS: 00000202 ORIG_RAX: 00000000000000be
      [ 1843.500210] RAX: ffffffffffffffda RBX: 00007ffcb73bda9d RCX: 00007fa7a8a7707a
      [ 1843.501170] RDX: 00007ffcb73bda9d RSI: 00000000006dc050 RDI: 0000000000000003
      [ 1843.502152] RBP: 00000000006dc050 R08: 0000000000000000 R09: 0000000000000000
      [ 1843.503109] R10: 0000000000000002 R11: 0000000000000202 R12: 00007ffcb73bda91
      [ 1843.504055] R13: 0000000000000003 R14: 00007ffcb73bda82 R15: ffffffffffffffff
      
      [ 1843.505268] Allocated by task 3979:
      [ 1843.505771]  save_stack+0x19/0x80
      [ 1843.506211]  __kasan_kmalloc.constprop.5+0xa0/0xd0
      [ 1843.506836]  setxattr+0xeb/0x230
      [ 1843.507264]  __x64_sys_fsetxattr+0xba/0xe0
      [ 1843.507886]  do_syscall_64+0x6c/0x230
      [ 1843.508429]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [ 1843.509558] Freed by task 0:
      [ 1843.510188] (stack is not available)
      
      [ 1843.511309] The buggy address belongs to the object at ffff888111e369e0
                      which belongs to the cache kmalloc-8 of size 8
      [ 1843.514095] The buggy address is located 2 bytes inside of
                      8-byte region [ffff888111e369e0, ffff888111e369e8)
      [ 1843.516524] The buggy address belongs to the page:
      [ 1843.517561] page:ffff88813f478d80 refcount:1 mapcount:0 mapping:ffff88811940c300 index:0xffff888111e373b8 compound_mapcount: 0
      [ 1843.519993] flags: 0x4404000010200(slab|head)
      [ 1843.520951] raw: 0004404000010200 ffff88813f48b008 ffff888119403d50 ffff88811940c300
      [ 1843.522616] raw: ffff888111e373b8 000000000016000f 00000001ffffffff 0000000000000000
      [ 1843.524281] page dumped because: kasan: bad access detected
      
      [ 1843.525936] Memory state around the buggy address:
      [ 1843.526975]  ffff888111e36880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1843.528479]  ffff888111e36900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1843.530138] >ffff888111e36980: fc fc fc fc fc fc fc fc fc fc fc fc 02 fc fc fc
      [ 1843.531877]                                                        ^
      [ 1843.533287]  ffff888111e36a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1843.534874]  ffff888111e36a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1843.536468] ==================================================================
      
      This is caused by the test case supplying a compression value that is
      too short ('lz') and comparing it to 'lzo' with strncmp() and a length of 3.
      strncmp() read past the 'lz' when looking for the 'o' and thus caused an
      out-of-bounds read.
      
      Introduce a new check 'btrfs_compress_is_valid_type()' which not only
      checks the user-supplied value against known compression types, but also
      employs checks for too short values.
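A length-aware comparison of this kind might look like the sketch below. It is only an illustration under stated assumptions: the function name `compress_type_is_valid()` and the `known_types` table are hypothetical and are not copied from the btrfs source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative list of known type names; not taken from the btrfs source. */
static const char *const known_types[] = { "zlib", "lzo", "zstd" };

/*
 * Return true if the first bytes of str (len bytes, not necessarily
 * NUL-terminated) match a known type name. Values shorter than the
 * candidate name are skipped, so strncmp() never reads past len.
 */
static bool compress_type_is_valid(const char *str, size_t len)
{
	for (size_t i = 0; i < sizeof(known_types) / sizeof(known_types[0]); i++) {
		size_t type_len = strlen(known_types[i]);

		if (len < type_len)
			continue;
		if (strncmp(known_types[i], str, type_len) == 0)
			return true;
	}
	return false;
}
```

The key point is the `len < type_len` check before the comparison: the two-byte value 'lz' is rejected without ever being compared against the three-byte name 'lzo'.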

      Reported-by: Nikolay Borisov <nborisov@suse.com>
      Fixes: 272e5326 ("btrfs: prop: fix vanished compression property after failed set")
      CC: stable@vger.kernel.org # 5.1+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • RDMA/srp: Accept again source addresses that do not have a port number · 0ca2688b
      Bart Van Assche authored
      [ Upstream commit bcef5b72 ]
      
      The function srp_parse_in() is used both for parsing source address
      specifications and for target address specifications. Target addresses
      must have a port number. Having to specify a port number for source
      addresses is inconvenient. Make sure that srp_parse_in() once again
      accepts addresses that have no port number.
      
      Cc: <stable@vger.kernel.org>
      Fixes: c62adb7d ("IB/srp: Fix IPv6 address parsing")
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • RDMA/srp: Document srp_parse_in() arguments · 95416047
      Bart Van Assche authored
      [ Upstream commit e37df2d5 ]
      
      This patch avoids a warning that is reported when building with W=1.
      
      Cc: Sergey Gorenko <sergeygo@mellanox.com>
      Cc: Max Gurtovoy <maxg@mellanox.com>
      Cc: Laurence Oberman <loberman@redhat.com>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: dts: gemini: Set DIR-685 SPI CS as active low · bab0ff2d
      Linus Walleij authored
      [ Upstream commit f90b8fda ]
      
      The SPI chip select for the display on the DIR-685 is active low. We
      were only saved by the SPI library enforcing active low on everything
      before, so set it as active low explicitly to avoid ambiguity.
      
      Link: https://lore.kernel.org/r/20190715202101.16060-1-linus.walleij@linaro.org
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: PPC: Book3S HV: Fix CR0 setting in TM emulation · 3a1b79ad
      Michael Neuling authored
      [ Upstream commit 3fefd1cd ]
      
      When emulating tsr, treclaim and trechkpt, we incorrectly set CR0. The
      code currently sets:
          CR0 <- 00 || MSR[TS]
      but according to the ISA it should be:
          CR0 <-  0 || MSR[TS] || 0
      
      This fixes the bit shift to put the bits in the correct location.
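The two encodings can be compared with a small sketch. The helper names are hypothetical, and the code only models the 4-bit CR0 field (the most significant nibble of the 32-bit CR) and the 2-bit MSR[TS] value; it is not the KVM emulation code.

```c
#include <stdint.h>

/* ts is the 2-bit MSR[TS] value (0..3). */

/* Old behaviour: CR0 <- 00 || MSR[TS], i.e. TS lands in the low two bits. */
static uint32_t cr0_field_old(uint32_t ts)
{
	return ts & 0x3;
}

/* ISA behaviour: CR0 <- 0 || MSR[TS] || 0, i.e. TS is shifted up by one. */
static uint32_t cr0_field_isa(uint32_t ts)
{
	return (ts & 0x3) << 1;
}

/* Place a 4-bit CR0 field into the top nibble of a 32-bit CR value. */
static uint32_t cr_with_cr0(uint32_t cr, uint32_t cr0_field)
{
	return (cr & 0x0fffffffu) | ((cr0_field & 0xf) << 28);
}
```

For MSR[TS] = 0b10, the old encoding yields CR0 = 0b0010 while the ISA encoding yields CR0 = 0b0100, so the two differ by exactly one bit position.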
      
      This is a data integrity issue as CR0 is corrupted.
      
      Fixes: 4bb3c7a0 ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9")
      Cc: stable@vger.kernel.org # v4.17+
      Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: PPC: Use ccr field in pt_regs struct embedded in vcpu struct · 3ac71806
      Paul Mackerras authored
      [ Upstream commit fd0944ba ]
      
      When the 'regs' field was added to struct kvm_vcpu_arch, the code
      was changed to use several of the fields inside regs (e.g., gpr, lr,
      etc.) but not the ccr field, because the ccr field in struct pt_regs
      is 64 bits on 64-bit platforms, but the cr field in kvm_vcpu_arch is
      only 32 bits.  This changes the code to use the regs.ccr field
      instead of cr, and changes the assembly code on 64-bit platforms to
      use 64-bit loads and stores instead of 32-bit ones.

      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: VMX: check CPUID before allowing read/write of IA32_XSS · beeeead9
      Wanpeng Li authored
      [ Upstream commit 4d763b16 ]
      
      Raise #GP when the guest reads or writes IA32_XSS while the CPUID bits
      say that the MSR shouldn't exist.
      
      Fixes: 20300099 (kvm: vmx: add MSR logic for XSAVES)
      Reported-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
      Reported-by: Tao Xu <tao3.xu@intel.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: VMX: Fix handling of #MC that occurs during VM-Entry · 891011ca
      Sean Christopherson authored
      [ Upstream commit beb8d93b ]
      
      A previous fix to prevent KVM from consuming stale VMCS state after a
      failed VM-Entry inadvertently blocked KVM's handling of machine checks
      that occur during VM-Entry.
      
      Per Intel's SDM, a #MC during VM-Entry is handled in one of three ways,
      depending on when the #MC is recognized.  As it pertains to this bug
      fix, the third case explicitly states EXIT_REASON_MCE_DURING_VMENTRY
      is handled like any other VM-Exit during VM-Entry, i.e. sets bit 31 to
      indicate the VM-Entry failed.
      
      If a machine-check event occurs during a VM entry, one of the following occurs:
       - The machine-check event is handled as if it occurred before the VM entry:
              ...
       - The machine-check event is handled after VM entry completes:
              ...
       - A VM-entry failure occurs as described in Section 26.7. The basic
         exit reason is 41, for "VM-entry failure due to machine-check event".
      
      Explicitly handle EXIT_REASON_MCE_DURING_VMENTRY as a one-off case in
      vmx_vcpu_run() instead of binning it into vmx_complete_atomic_exit().
      Doing so allows vmx_vcpu_run() to handle VMX_EXIT_REASONS_FAILED_VMENTRY
      in a sane fashion and also simplifies vmx_complete_atomic_exit() since
      VMCS.VM_EXIT_INTR_INFO is guaranteed to be fresh.
      
      Fixes: b060ca3b ("kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly")
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: VMX: Always signal #GP on WRMSR to MSR_IA32_CR_PAT with bad value · 74ce1333
      Sean Christopherson authored
      [ Upstream commit d28f4290 ]
      
      The behavior of WRMSR is in no way dependent on whether or not KVM
      consumes the value.
      
      Fixes: 4566654b ("KVM: vmx: Inject #GP on invalid PAT CR")
      Cc: stable@vger.kernel.org
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: x86: optimize check for valid PAT value · 74fd8aae
      Paolo Bonzini authored
      [ Upstream commit 674ea351 ]
      
      This check will soon be done on every nested vmentry and vmexit, so
      "parallelize" it using bitwise operations.
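One way such a "parallel" check can work is sketched below, next to a straightforward per-entry loop for comparison. The function names are hypothetical. The sketch assumes the usual PAT encoding, in which each of the eight one-byte entries must be one of 0, 1, 4, 5, 6 or 7 (so 2, 3 and anything above 7 are invalid).

```c
#include <stdbool.h>
#include <stdint.h>

/* Reference version: inspect each of the eight PAT entries in turn. */
static bool pat_valid_loop(uint64_t data)
{
	for (int i = 0; i < 8; i++) {
		uint8_t type = (data >> (i * 8)) & 0xff;

		if (type > 7 || type == 2 || type == 3)
			return false;
	}
	return true;
}

/* Bitwise version: check all eight entries at once. */
static bool pat_valid_bitwise(uint64_t data)
{
	/* Any bit above the low three in any byte means a value > 7. */
	if (data & 0xF8F8F8F8F8F8F8F8ull)
		return false;
	/* Values 2 and 3 have bit 1 set but bit 2 clear; reject them. */
	return (data | ((data & 0x0202020202020202ull) << 1)) == data;
}
```

The second step copies bit 1 of every byte into bit 2; if that changes the value, some entry had bit 1 set without bit 2, which is exactly the 2-or-3 case.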

      Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ceph: use ceph_evict_inode to cleanup inode's resource · 81281039
      Yan, Zheng authored
      [ Upstream commit 87bc5b89 ]
      
      remove_session_caps() relies on __wait_on_freeing_inode() to wait for
      a freeing inode to remove its caps, but the VFS wakes freeing inode
      waiters before calling destroy_inode().
      
      Cc: stable@vger.kernel.org
      Link: https://tracker.ceph.com/issues/40102
      Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ALSA: hda - Don't resume forcibly i915 HDMI/DP codec · 42fa0e35
      Takashi Iwai authored
      [ Upstream commit 4914da2f ]
      
      We apply the codec resume forcibly at system resume callback for
      updating and syncing the jack detection state that may have changed
      during sleeping.  This is, however, superfluous for codecs such as
      Intel HDMI/DP, where the jack detection is managed via the audio
      component notification; i.e. the jack state change will be reported
      sooner or later from the graphics side at mode change.
      
      This patch changes the codec resume callback to avoid the forcible
      resume conditionally with a new flag, codec->relaxed_resume, for
      reducing the resume time.  The flag is set in the codec probe.
      
      Although this doesn't fix the entire bug mentioned in the bugzilla
      entry below, it's still a good optimization and some improvements are
      seen.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201901
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • cifs: Properly handle auto disabling of serverino option · 987564c2
      Paulo Alcantara (SUSE) authored
      [ Upstream commit 29fbeb7a ]
      
      Fix the mount options comparison when the serverino option is turned
      off later in cifs_autodisable_serverino(), thus avoiding a mismatch of
      new cifs mounts.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paulo Alcantara (SUSE) <paulo@paulo.ac>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Pavel Shilovsky <pshilove@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: zfcp: fix request object use-after-free in send path causing wrong traces · d85e830d
      Benjamin Block authored
      [ Upstream commit 106d45f3 ]
      
      When tracing instances where we open and close WKA ports, we also pass the
      request-ID of the respective FSF command.
      
      But after successfully sending the FSF command we must not use the
      request-object anymore, as this might result in a use-after-free (see
      "zfcp: fix request object use-after-free in send path causing seqno
      errors").
      
      To fix this add a new variable that caches the request-ID before sending
      the request. This won't change during the hand-off to the FCP channel,
      and so it's safe to trace this cached request-ID later, instead of using
      the request object.
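The pattern described here, copying the ID out of the object before ownership is handed off, can be sketched generically. Everything in this example is hypothetical (`struct request`, `send_request()`, `trace_request_id()`, `open_port()`); it only models the idea and does not reflect the zfcp data structures.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical request object carrying an ID. */
struct request {
	uint64_t req_id;
};

/* Last ID seen by the (hypothetical) trace facility. */
static uint64_t last_traced_id;

static void trace_request_id(uint64_t id)
{
	last_traced_id = id;
}

/*
 * Hypothetical send path: on success the request is owned by someone
 * else and may already be freed when this function returns.
 */
static int send_request(struct request *req)
{
	free(req);
	return 0;
}

static int open_port(uint64_t id)
{
	struct request *req = malloc(sizeof(*req));
	uint64_t cached_id;
	int ret;

	if (!req)
		return -1;
	req->req_id = id;

	/* Cache the ID while the request is still ours. */
	cached_id = req->req_id;

	ret = send_request(req);
	if (ret) {
		/* Assumed convention: on failure the caller still owns req. */
		free(req);
		return ret;
	}

	/* Trace the cached copy; req must not be dereferenced here. */
	trace_request_id(cached_id);
	return 0;
}
```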

      Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
      Fixes: d27a7cb9 ("zfcp: trace on request for open and close of WKA port")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: Steffen Maier <maier@linux.ibm.com>
      Reviewed-by: Jens Remus <jremus@linux.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • staging: wilc1000: fix error path cleanup in wilc_wlan_initialize() · ba8701d2
      Ajay Singh authored
      [ Upstream commit 6419f818 ]
      
      For the error path in wilc_wlan_initialize(), the resources are not
      cleaned up in the correct order. Revert the previous changes and free
      the resources in the correct order on error.
      
      Fixes: b46d6882 ("staging: wilc1000: remove COMPLEMENT_BOOT")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Ajay Singh <ajay.kathat@microchip.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: target/iblock: Fix overrun in WRITE SAME emulation · 60b856dc
      Roman Bolshakov authored
      [ Upstream commit 5676234f ]
      
      WRITE SAME corrupts data on the block device behind iblock if the command
      is emulated. The emulation code issues (M - 1) * N times more bios than
      requested, where M is the number of 512-byte blocks per real block size and N is
      the NUMBER OF LOGICAL BLOCKS specified in WRITE SAME command. So, for a
      device with 4k blocks, 7 * N more LBAs get written after the requested
      range.
      
      The issue happens because the number of 512 byte sectors to be written is
      decreased one by one while the real bios are typically from 1 to 8 512 byte
      sectors per bio.
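The arithmetic can be reproduced with a small counting model. This is a simplified sketch, not the iblock code: it assumes every bio carries exactly one logical block, and the function name and its `fixed` flag are hypothetical.

```c
#include <stdint.h>

/*
 * Count how many logical blocks the emulation loop writes.
 * nolb:       NUMBER OF LOGICAL BLOCKS from the command
 * block_size: logical block size in bytes (multiple of 512)
 * fixed:      0 models decrementing by one sector per bio,
 *             1 models decrementing by the sectors actually written
 */
static uint64_t blocks_written(uint64_t nolb, uint32_t block_size, int fixed)
{
	uint32_t sectors_per_block = block_size / 512;
	uint64_t sectors = nolb * sectors_per_block;
	uint64_t written = 0;

	while (sectors) {
		written++;	/* one bio carrying one logical block */
		sectors -= fixed ? sectors_per_block : 1;
	}
	return written;
}
```

For N = 4 blocks of 4096 bytes (M = 8), the one-by-one decrement writes 32 blocks instead of 4, i.e. (M - 1) * N = 28 blocks past the requested range. With 512-byte blocks (M = 1) both variants agree.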
      
      Fixes: c66ac9db ("[SCSI] target: Add LIO target core v4.0.0-rc6")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: target/core: Use the SECTOR_SHIFT constant · ba52842d
      Bart Van Assche authored
      [ Upstream commit 80b045b3 ]
      
      Instead of duplicating the SECTOR_SHIFT definition from <linux/blkdev.h>,
      use it. This patch does not change any functionality.

      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Nicholas Bellinger <nab@linux-iscsi.org>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • apparmor: reset pos on failure to unpack for various functions · 17111037
      Mike Salvatore authored
      [ Upstream commit 156e4299 ]
      
      Each function that manipulates the aa_ext struct should reset its "pos"
      member on failure. This ensures that, on failure, no changes are made to
      the state of the aa_ext struct.
      
      There are paths where elements are optional and the error path is
      used to indicate the optional element is not present. This means
      instead of just aborting on error, the unpack stream can become
      unsynchronized on optional elements, if using one of the affected
      functions.
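The save-and-restore pattern can be illustrated with a toy unpacker. The names `struct ext` and `unpack_tagged_u8()` and the tag/value layout are hypothetical; only the idea of restoring "pos" on every failure path is taken from the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy cursor over a byte stream, loosely modelled on a "pos"/"end" pair. */
struct ext {
	const uint8_t *pos;
	const uint8_t *end;
};

/*
 * Unpack a one-byte tag followed by a one-byte value. On any failure,
 * pos is restored so the caller can try a different (optional) element
 * from the same position.
 */
static bool unpack_tagged_u8(struct ext *e, uint8_t tag, uint8_t *out)
{
	const uint8_t *saved = e->pos;

	if (e->pos == e->end || *e->pos != tag)
		goto fail;
	e->pos++;
	if (e->pos == e->end)
		goto fail;	/* tag consumed, but the value is missing */
	*out = *e->pos++;
	return true;

fail:
	e->pos = saved;
	return false;
}
```

Without the restore in the `fail` path, the truncated-value case would leave pos one byte past the tag, and a later attempt to read an optional element would start from the wrong offset.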
      
      Cc: stable@vger.kernel.org
      Fixes: 736ec752 ("AppArmor: policy routines for loading and unpacking policy")
      Signed-off-by: Mike Salvatore <mike.salvatore@canonical.com>
      Signed-off-by: John Johansen <john.johansen@canonical.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • IB/hfi1: Avoid hardlockup with flushlist_lock · 90ca4912
      Mike Marciniszyn authored
      [ Upstream commit cf131a81 ]
      
      Heavy contention of the sde flushlist_lock can cause hard lockups at
      extreme scale when the flushing logic is under stress.
      
      Mitigate by replacing the item-at-a-time copy to the local list with
      an O(1) list_splice_init() and using the high priority work queue to
      do the flushes.
      
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • clk: tegra210: Fix default rates for HDA clocks · fa717fc4
      Jon Hunter authored
      [ Upstream commit 9caec662 ]
      
      Currently the default clock rates for the HDA and HDA2CODEC_2X clocks
      are both 19.2MHz. However, the default rates for these clocks should
      actually be 51MHz and 48MHz, respectively. The current clock settings
      result in a distorted output during audio playback. Correct the default
      clock rates for these clocks by specifying them in the clock init table
      for Tegra210.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
      Acked-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • clk: tegra: Fix maximum audio sync clock for Tegra124/210 · 350503c8
      Jon Hunter authored
      [ Upstream commit 845d782d ]
      
      The maximum frequency supported for I2S on Tegra124 and Tegra210 is
      24.576MHz (as stated in the Tegra TK1 data sheet for Tegra124 and the
      Jetson TX1 module data sheet for Tegra210). However, the maximum I2S
      frequency is limited to 24MHz because that is the maximum frequency of
      the audio sync clock. Increase the maximum audio sync clock frequency
      to 24.576MHz for Tegra124 and Tegra210 in order to support 24.576MHz
      for I2S.
      
      Update the tegra_clk_register_sync_source() function so that it does
      not set the initial rate for the sync clocks and use the clock init
      tables to set the initial rate instead.

      Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
      Acked-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • cifs: add spinlock for the openFileList to cifsInodeInfo · acc07941
      Ronnie Sahlberg authored
      [ Upstream commit 487317c9 ]
      
      We cannot depend on the tcon->open_file_lock here since in multiuser mode
      we may have the same file/inode open via multiple different tcons.

      The current code is race prone and will crash if one user deletes a file
      at the same time a different user opens/creates the file.
      
      To avoid this we need to have a spinlock attached to the inode and not the tcon.
      
      RHBZ:  1580165
      
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Btrfs: fix race between block group removal and block group allocation · 1d064876
      Filipe Manana authored
      [ Upstream commit 8eaf40c0 ]
      
      If a task is removing the block group that currently has the highest start
      offset amongst all existing block groups, there is a short time window
      where it races with a concurrent block group allocation, resulting in a
      transaction abort with an error code of EEXIST.
      
      The following diagram explains the race in detail:
      
            Task A                                                        Task B
      
       btrfs_remove_block_group(bg offset X)
      
         remove_extent_mapping(em offset X)
           -> removes extent map X from the
              tree of extent maps
              (fs_info->mapping_tree), so the
              next call to find_next_chunk()
              will return offset X
      
                                                         btrfs_alloc_chunk()
                                                           find_next_chunk()
                                                             --> returns offset X
      
                                                           __btrfs_alloc_chunk(offset X)
                                                             btrfs_make_block_group()
                                                               btrfs_create_block_group_cache()
                                                                 --> creates btrfs_block_group_cache
                                                                     object with a key corresponding
                                                                     to the block group item in the
                                                                     extent, the key is:
                                                                     (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)
      
                                                               --> adds the btrfs_block_group_cache object
                                                                   to the list new_bgs of the transaction
                                                                   handle
      
                                                         btrfs_end_transaction(trans handle)
                                                           __btrfs_end_transaction()
                                                             btrfs_create_pending_block_groups()
                                                               --> sees the new btrfs_block_group_cache
                                                                   in the new_bgs list of the transaction
                                                                   handle
                                                               --> its call to btrfs_insert_item() fails
                                                                   with -EEXIST when attempting to insert
                                                                   the block group item key
                                                                   (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)
                                                                   because task A has not removed that key yet
                                                               --> aborts the running transaction with
                                                                   error -EEXIST
      
         btrfs_del_item()
           -> removes the block group's key from
              the extent tree, key is
              (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)
      
      A sample transaction abort trace:
      
        [78912.403537] ------------[ cut here ]------------
        [78912.403811] BTRFS: Transaction aborted (error -17)
        [78912.404082] WARNING: CPU: 2 PID: 20465 at fs/btrfs/extent-tree.c:10551 btrfs_create_pending_block_groups+0x196/0x250 [btrfs]
        (...)
        [78912.405642] CPU: 2 PID: 20465 Comm: btrfs Tainted: G        W         5.0.0-btrfs-next-46 #1
        [78912.405941] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
        [78912.406586] RIP: 0010:btrfs_create_pending_block_groups+0x196/0x250 [btrfs]
        (...)
        [78912.407636] RSP: 0018:ffff9d3d4b7e3b08 EFLAGS: 00010282
        [78912.407997] RAX: 0000000000000000 RBX: ffff90959a3796f0 RCX: 0000000000000006
        [78912.408369] RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff909636b16860
        [78912.408746] RBP: ffff909626758a58 R08: 0000000000000000 R09: 0000000000000000
        [78912.409144] R10: ffff9095ff462400 R11: 0000000000000000 R12: ffff90959a379588
        [78912.409521] R13: ffff909626758ab0 R14: ffff9095036c0000 R15: ffff9095299e1158
        [78912.409899] FS:  00007f387f16f700(0000) GS:ffff909636b00000(0000) knlGS:0000000000000000
        [78912.410285] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [78912.410673] CR2: 00007f429fc87cbc CR3: 000000014440a004 CR4: 00000000003606e0
        [78912.411095] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [78912.411496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [78912.411898] Call Trace:
        [78912.412318]  __btrfs_end_transaction+0x5b/0x1c0 [btrfs]
        [78912.412746]  btrfs_inc_block_group_ro+0xcf/0x160 [btrfs]
        [78912.413179]  scrub_enumerate_chunks+0x188/0x5b0 [btrfs]
        [78912.413622]  ? __mutex_unlock_slowpath+0x100/0x2a0
        [78912.414078]  btrfs_scrub_dev+0x2ef/0x720 [btrfs]
        [78912.414535]  ? __sb_start_write+0xd4/0x1c0
        [78912.414963]  ? mnt_want_write_file+0x24/0x50
        [78912.415403]  btrfs_ioctl+0x17fb/0x3120 [btrfs]
        [78912.415832]  ? lock_acquire+0xa6/0x190
        [78912.416256]  ? do_vfs_ioctl+0xa2/0x6f0
        [78912.416685]  ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
        [78912.417116]  do_vfs_ioctl+0xa2/0x6f0
        [78912.417534]  ? __fget+0x113/0x200
        [78912.417954]  ksys_ioctl+0x70/0x80
        [78912.418369]  __x64_sys_ioctl+0x16/0x20
        [78912.418812]  do_syscall_64+0x60/0x1b0
        [78912.419231]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [78912.419644] RIP: 0033:0x7f3880252dd7
        (...)
        [78912.420957] RSP: 002b:00007f387f16ed68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
        [78912.421426] RAX: ffffffffffffffda RBX: 000055f5becc1df0 RCX: 00007f3880252dd7
        [78912.421889] RDX: 000055f5becc1df0 RSI: 00000000c400941b RDI: 0000000000000003
        [78912.422354] RBP: 0000000000000000 R08: 00007f387f16f700 R09: 0000000000000000
        [78912.422790] R10: 00007f387f16f700 R11: 0000000000000246 R12: 0000000000000000
        [78912.423202] R13: 00007ffda49c266f R14: 0000000000000000 R15: 00007f388145e040
        [78912.425505] ---[ end trace eb9bfe7c426fc4d3 ]---
      
      Fix this by calling remove_extent_mapping(), at btrfs_remove_block_group(),
      only at the very end, after removing the block group item key from the
      extent tree (and removing the free space tree entry if we are using the
      free space tree feature).
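The ordering argument can be illustrated with a toy model. This is only a sketch under stated assumptions: `struct fs_model` and the `model_*` functions are hypothetical stand-ins rather than btrfs code, and the racing task is simulated by a plain function call placed between the two removal steps.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy state for one logical offset X (hypothetical, not btrfs structures). */
struct fs_model {
	bool extent_mapping;  /* stands in for the chunk's extent mapping */
	bool block_group_key; /* stands in for the block group item key */
};

/*
 * Stands in for a concurrent task allocating a block group. Offset X can
 * be picked only once its extent mapping is gone. Returns 0 if X is still
 * mapped (the allocator simply leaves X alone) or if the insert succeeds,
 * and -EEXIST if X was picked while the old key is still present.
 */
int model_alloc_block_group(struct fs_model *fs)
{
	if (fs->extent_mapping)
		return 0;
	if (fs->block_group_key)
		return -EEXIST;
	fs->extent_mapping = true;
	fs->block_group_key = true;
	return 0;
}

/* Old order: drop the mapping first, then the key. */
int model_remove_buggy(struct fs_model *fs)
{
	int ret;

	fs->extent_mapping = false;
	ret = model_alloc_block_group(fs); /* racing task runs here */
	fs->block_group_key = false;
	return ret;
}

/* Fixed order: drop the key first, the mapping only at the very end. */
int model_remove_fixed(struct fs_model *fs)
{
	int ret;

	fs->block_group_key = false;
	ret = model_alloc_block_group(fs); /* racing task runs here */
	fs->extent_mapping = false;
	return ret;
}
```

In this model, `model_remove_buggy()` reports `-EEXIST` from the racing allocation, while `model_remove_fixed()` does not, because offset X stays unavailable until its key is gone.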
      
      Fixes: 04216820 ("Btrfs: fix race between fs trimming and block group remove/allocation")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1d064876
    • Shirish S's avatar
      drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc · f276beb3
      Shirish S authored
      [ Upstream commit 517b91f4 ]
      
      [What]
      Reading the read pointer always returns zero, most likely because
      these blocks are either power or clock gated.
      
      [How]
      Fetch rptr after amdgpu_ring_alloc(), which informs the power
      management code that the block is about to be used, so the gating
      is turned off.
      Signed-off-by: Louis Li <Ching-shih.Li@amd.com>
      Signed-off-by: Shirish S <shirish.s@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f276beb3
    • Louis Li's avatar
      drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2) · 7abeffff
      Louis Li authored
      [ Upstream commit ce0e22f5 ]
      
      [What]
      The vce ring test fails consistently during resume in an s3 cycle, due
      to mismatched read and write pointers.
      On debug/analysis it was found that the rptr to be compared is not being
      correctly updated/read, which leads to this failure.
      Below is the failure signature:
      	[drm:amdgpu_vce_ring_test_ring] *ERROR* amdgpu: ring 12 test failed
      	[drm:amdgpu_device_ip_resume_phase2] *ERROR* resume of IP block <vce_v3_0> failed -110
      	[drm:amdgpu_device_resume] *ERROR* amdgpu_device_ip_resume failed (-110).
      
      [How]
      Fetch rptr appropriately, meaning move its read location further down
      in the code flow.
      With this patch applied the s3 failure is no longer seen over more than
      5k s3 cycles; without it the failure is quite consistent.
      
      V2: remove redundant fetch of rptr
      Signed-off-by: Louis Li <Ching-shih.Li@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7abeffff
    • Peter Xu's avatar
      kvm: Check irqchip mode before assign irqfd · d5f65393
      Peter Xu authored
      [ Upstream commit 654f1f13 ]
      
      When assigning a kvm irqfd we didn't check the irqchip mode, so
      KVM_IRQFD succeeded with all the irqchip modes.  However it does not
      make much sense to create an irqfd without the kernel chips.  Let's
      provide an arch-dependent helper to check whether a specific irqfd is
      allowed by the arch.  At least for x86, it should make sense to check:
      
      - when irqchip mode is NONE, all irqfds should be disallowed, and,
      
      - when irqchip mode is SPLIT, irqfds that are with resamplefd should
        be disallowed.
      
      In either case, we previously silently ignored the irq or the irq
      ack event if the irqchip mode was incorrect.  However that can
      cause mysterious guest behaviors and it can be hard to triage.  Let's
      fail KVM_IRQFD even earlier to detect these incorrect configurations.
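The two x86 rules listed above can be sketched as a small predicate. This is a minimal model, not the helper added by the patch: the enum, `irqfd_allowed()`, and its parameters are hypothetical names, and the third mode (a full in-kernel irqchip, where everything is allowed) is an assumption added to make the switch complete.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the irqchip modes named in the message. */
enum irqchip_mode {
	IRQCHIP_NONE,   /* no kernel irqchip at all */
	IRQCHIP_SPLIT,  /* split irqchip */
	IRQCHIP_KERNEL, /* assumed: full in-kernel irqchip */
};

/* Returns true if an irqfd with the given properties may be assigned. */
bool irqfd_allowed(enum irqchip_mode mode, bool has_resamplefd)
{
	switch (mode) {
	case IRQCHIP_NONE:
		return false;           /* all irqfds are disallowed */
	case IRQCHIP_SPLIT:
		return !has_resamplefd; /* only resample irqfds are refused */
	case IRQCHIP_KERNEL:
		return true;
	}
	return false;
}

/* Usage: the cases spelled out in the commit message. */
void irqfd_model_selfcheck(void)
{
	assert(!irqfd_allowed(IRQCHIP_NONE, false));
	assert(!irqfd_allowed(IRQCHIP_SPLIT, true));
	assert(irqfd_allowed(IRQCHIP_SPLIT, false));
}
```

Failing at assignment time, as this predicate would, surfaces the misconfiguration to the caller instead of dropping events later.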
      
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Radim Krčmář <rkrcmar@redhat.com>
      CC: Alex Williamson <alex.williamson@redhat.com>
      CC: Eduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d5f65393
    • Kent Russell's avatar
      drm/amdkfd: Add missing Polaris10 ID · 90772cf5
      Kent Russell authored
      [ Upstream commit 0a5a9c27 ]
      
      This was added to amdgpu but was missed in amdkfd
      Signed-off-by: Kent Russell <kent.russell@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.rg
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      90772cf5
    • Eugeniy Paltsev's avatar
      ARC: mm: SIGSEGV userspace trying to access kernel virtual memory · cacbc853
      Eugeniy Paltsev authored
      [ Upstream commit a8c715b4 ]
      
      As of today, if a userspace process tries to access a kernel virtual
      address (0x7000_0000 to 0x7fff_ffff) such that a legit kernel mapping
      already exists, that process hangs instead of being killed with SIGSEGV.
      
      Fix that by ensuring that do_page_fault() handles a kernel vaddr only if
      in kernel mode.
      
      And given this, we can also simplify the code a bit. Now a vmalloc fault
      implies kernel mode so its failure (for some reason) can reuse the
      @no_context label and we can remove @bad_area_nosemaphore.
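The routing decision described above can be sketched as follows. This is a simplified model, not the ARC fault handler: the enum, macros, and `model_classify_fault()` are hypothetical, and the address range is taken from the commit message. The regular path is where a user access without a valid user mapping would be expected to end in SIGSEGV.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Kernel virtual address range quoted in the commit message. */
#define MODEL_KVADDR_START 0x70000000u
#define MODEL_KVADDR_END   0x7fffffffu

enum fault_path {
	PATH_VMALLOC_FAULT, /* kernel-only fixup path */
	PATH_REGULAR,       /* normal handling, may end in SIGSEGV */
};

/* Take the kernel fixup path only for kernel-mode faults on kernel vaddrs. */
enum fault_path model_classify_fault(uint32_t address, bool user_mode)
{
	bool kernel_vaddr = address >= MODEL_KVADDR_START &&
			    address <= MODEL_KVADDR_END;

	if (kernel_vaddr && !user_mode)
		return PATH_VMALLOC_FAULT;
	return PATH_REGULAR;
}

/* Usage: the reproducer's access is a user-mode fault at 0x70000000. */
void model_fault_selfcheck(void)
{
	assert(model_classify_fault(0x70000000u, true) == PATH_REGULAR);
	assert(model_classify_fault(0x70000000u, false) == PATH_VMALLOC_FAULT);
}
```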
      
      User-space test that reproduces the original problem:
      
      ------------------------>8-----------------
       #include <stdlib.h>
       #include <stdint.h>
      
       int main(int argc, char *argv[])
       {
       	volatile uint32_t temp;
      
       	temp = *(uint32_t *)(0x70000000);
       }
      ------------------------>8-----------------
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      cacbc853
    • Eugeniy Paltsev's avatar
      ARC: mm: fix uninitialised signal code in do_page_fault · 7edfa9c9
      Eugeniy Paltsev authored
      [ Upstream commit 121e38e5 ]
      
      Commit 15773ae9 ("signal/arc: Use force_sig_fault where
      appropriate") introduced undefined behaviour by leaving si_code
      uninitialized and leaking random kernel values to user space.
      
      Fixes: 15773ae9 ("signal/arc: Use force_sig_fault where appropriate")
      Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7edfa9c9
    • Eric W. Biederman's avatar
      0828438e
    • Milan Broz's avatar
      dm crypt: move detailed message into debug level · fcb2f1e2
      Milan Broz authored
      [ Upstream commit 7a1cd723 ]
      
      The information about tag size should not be printed without debug info
      set. Also print device major:minor in the error message to identify the
      device instance.
      
      Also use rate limiting and debug level for info about the crypto API
      implementation in use.  This is important because during online
      reencryption the existing message saturates syslog (because we are
      moving the hotzone across the whole device).
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Milan Broz <gmazyland@gmail.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      fcb2f1e2
    • Long Li's avatar
      cifs: smbd: take an array of reqeusts when sending upper layer data · 96b44c20
      Long Li authored
      [ Upstream commit 4739f232 ]
      
      To support compounding, __smb_send_rqst() now sends an array of requests to
      the transport layer.
      Change smbd_send() to take an array of requests, and send them in as few
      packets as possible.
      Signed-off-by: Long Li <longli@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      96b44c20
    • Jisheng Zhang's avatar
      PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code · 3f27a14b
      Jisheng Zhang authored
      [ Upstream commit e6fdd3bf ]
      
      Use devm_pci_alloc_host_bridge() to simplify the error code path.  This
      also fixes a leak in the dw_pcie_host_init() error path.
      Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
      CC: stable@vger.kernel.org	# v4.13+
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3f27a14b
    • Adrian Hunter's avatar
      mmc: sdhci-pci: Add support for Intel CML · 842da8fa
      Adrian Hunter authored
      [ Upstream commit 765c5967 ]
      
      Add PCI Ids for Intel CML.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      842da8fa
    • Ming Lei's avatar
      blk-mq: free hw queue's resource in hctx's release handler · e238e6dc
      Ming Lei authored
      [ Upstream commit c7e2d94b ]
      
      Once blk_cleanup_queue() returns, tags shouldn't be used any more,
      because blk_mq_free_tag_set() may be called. Commit 45a9c9d9
      ("blk-mq: Fix a use-after-free") fixes this issue exactly.
      
      However, that commit introduces another issue. Before 45a9c9d9,
      we were allowed to run the queue during queue cleanup if the queue's
      kobj refcount was held. After that commit, the queue can't be run during
      queue cleanup, otherwise an oops can be triggered easily because some
      fields of hctx are freed by blk_mq_free_queue() in blk_cleanup_queue().
      
      We have invented ways for addressing this kind of issue before, such as:
      
      	8dc765d4 ("SCSI: fix queue cleanup race before queue initialization is done")
      	c2856ae2 ("blk-mq: quiesce queue before freeing queue")
      
      But these still can't cover all cases; James recently reported another
      issue of this kind:
      
      	https://marc.info/?l=linux-scsi&m=155389088124782&w=2
      
      This issue can be quite hard to address in the previous way, given
      scsi_run_queue() may run requeues for other LUNs.
      
      Fix the above issue by freeing hctx's resources in its release handler;
      this is safe because tags aren't needed for freeing such hctx resources.
      
      This approach follows the typical design pattern for a kobject's release
      handler.
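The release-handler pattern referred to here can be sketched in plain C. This is a generic refcount model under stated assumptions: `struct hctx_model` and the `*_model_*` functions are hypothetical names, not blk-mq or kobject APIs. The point is only that cleanup drops a reference, and the resources are freed when the last reference goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hardware queue context. */
struct hctx_model {
	int refcount;
	int *ctxs;     /* stands in for per-hctx resources */
	bool released;
};

/* Release handler: runs only when the last reference is dropped. */
static void hctx_model_release(struct hctx_model *h)
{
	free(h->ctxs);
	h->ctxs = NULL;
	h->released = true;
}

int hctx_model_init(struct hctx_model *h)
{
	h->refcount = 1;
	h->released = false;
	h->ctxs = calloc(4, sizeof(*h->ctxs));
	return h->ctxs ? 0 : -1;
}

void hctx_model_get(struct hctx_model *h)
{
	h->refcount++;
}

void hctx_model_put(struct hctx_model *h)
{
	assert(h->refcount > 0);
	if (--h->refcount == 0)
		hctx_model_release(h);
}

/* Queue cleanup only drops its own reference; it frees nothing directly. */
void queue_model_cleanup(struct hctx_model *h)
{
	hctx_model_put(h);
}
```

If another user still holds a reference when `queue_model_cleanup()` runs, `ctxs` stays valid until that user calls `hctx_model_put()`, which is the property a late queue run depends on.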
      
      Cc: Dongli Zhang <dongli.zhang@oracle.com>
      Cc: James Smart <james.smart@broadcom.com>
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: linux-scsi@vger.kernel.org,
      Cc: Martin K . Petersen <martin.petersen@oracle.com>,
      Cc: Christoph Hellwig <hch@lst.de>,
      Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com>,
      Reported-by: James Smart <james.smart@broadcom.com>
      Fixes: 45a9c9d9 ("blk-mq: Fix a use-after-free")
      Cc: stable@vger.kernel.org
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Tested-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e238e6dc
    • Yufen Yu's avatar
      dm mpath: fix missing call of path selector type->end_io · 69409854
      Yufen Yu authored
      [ Upstream commit 5de719e3 ]
      
      After commit 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via
      blk_insert_cloned_request feedback"), map_request() will requeue the tio
      when the issued clone request returns BLK_STS_RESOURCE or
      BLK_STS_DEV_RESOURCE.
      
      Thus, if the device driver status is an error, a tio may be requeued
      multiple times until the return value is not DM_MAPIO_REQUEUE.  That
      means type->start_io may be called multiple times, while type->end_io is
      only called when the IO completes.
      
      In fact, even without commit 396eaf21, setup_clone() failure can
      also cause tio requeue and associated missed call to type->end_io.
      
      The service-time path selector selects path based on in_flight_size,
      which is increased by st_start_io() and decreased by st_end_io().
      Missed calls to st_end_io() can lead to in_flight_size count error and
      will cause the selector to make the wrong choice.  In addition, the
      queue-length path selector will also be affected.
      
      To fix the problem, call type->end_io in ->release_clone_rq before tio
      requeue.  map_info is passed to ->release_clone_rq() for map_request()
      error paths that result in requeue.
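The accounting problem can be shown with a small counter model. This is a sketch under stated assumptions: `struct path_model` and the `model_*` functions are hypothetical, and a single request is assumed to be requeued a fixed number of times before it is finally dispatched and completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a path tracked by a path selector. */
struct path_model {
	size_t in_flight_size;
};

void model_start_io(struct path_model *p, size_t nr_bytes)
{
	p->in_flight_size += nr_bytes;
}

void model_end_io(struct path_model *p, size_t nr_bytes)
{
	assert(p->in_flight_size >= nr_bytes);
	p->in_flight_size -= nr_bytes;
}

/*
 * Map one request that is requeued `requeues` times before it completes.
 * With end_io_on_requeue == false this models the unbalanced behaviour;
 * with true, every start_io is paired with an end_io before the requeue.
 * Returns the in-flight size left over once the request has completed.
 */
size_t model_map_request(struct path_model *p, size_t nr_bytes,
			 unsigned int requeues, bool end_io_on_requeue)
{
	for (unsigned int i = 0; i < requeues; i++) {
		model_start_io(p, nr_bytes);       /* path picked, clone fails */
		if (end_io_on_requeue)
			model_end_io(p, nr_bytes); /* balanced before requeue */
	}
	model_start_io(p, nr_bytes); /* final, successful attempt */
	model_end_io(p, nr_bytes);   /* IO completes */
	return p->in_flight_size;
}
```

With three requeues of a 4096-byte request, the unbalanced variant leaves 12288 bytes counted as in flight on an idle path, while the balanced variant returns to zero.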
      
      Fixes: 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
      Cc: stable@vger.kernl.org
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      69409854
    • Lyude Paul's avatar
      PCI: Reset Lenovo ThinkPad P50 nvgpu at boot if necessary · 0fe09701
      Lyude Paul authored
      [ Upstream commit e0547c81 ]
      
      On ThinkPad P50 SKUs with an Nvidia Quadro M1000M instead of the M2000M
      variant, the BIOS does not always reset the secondary Nvidia GPU during
      reboot if the laptop is configured in Hybrid Graphics mode.  The reason is
      unknown, but the following steps and possibly a good bit of patience will
      reproduce the issue:
      
        1. Boot up the laptop normally in Hybrid Graphics mode
        2. Make sure nouveau is loaded and that the GPU is awake
        3. Allow the Nvidia GPU to runtime suspend itself after being idle
        4. Reboot the machine, the more sudden the better (e.g. sysrq-b may help)
        5. If nouveau loads up properly, reboot the machine again and go back to
           step 2 until you reproduce the issue
      
      This results in some very strange behavior: the GPU will be left in exactly
      the same state it was in when the previously booted kernel started the
      reboot.  This has all sorts of bad side effects: for starters, this
      completely breaks nouveau starting with a mysterious EVO channel failure
      that happens well before we've actually used the EVO channel for anything:
      
        nouveau 0000:01:00.0: disp: chid 0 mthd 0000 data 00000400 00001000 00000002
      
      This causes a timeout trying to bring up the GR ctx:
      
        nouveau 0000:01:00.0: timeout
        WARNING: CPU: 0 PID: 12 at drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c:1547 gf100_grctx_generate+0x7b2/0x850 [nouveau]
        Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET82W (1.55 ) 12/18/2018
        Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
        ...
        nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0, busy: 1)
        nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0, busy: 1)
        nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000008000 engine 00 [GR] client 15 [HUB/SCC_NB] reason c4 [] on channel -1 [0000000000 unknown]
      
      The GPU never manages to recover.  Booting without loading nouveau causes
      issues as well, since the GPU starts sending spurious interrupts that cause
      other devices' IRQs to get disabled by the kernel:
      
        irq 16: nobody cared (try booting with the "irqpoll" option)
        ...
        handlers:
        [<000000007faa9e99>] i801_isr [i2c_i801]
        Disabling IRQ #16
        ...
        serio: RMI4 PS/2 pass-through port at rmi4-00.fn03
        i801_smbus 0000:00:1f.4: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.4: Transaction timeout
        rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to F03 TX register (-110).
        i801_smbus 0000:00:1f.4: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.4: Transaction timeout
        rmi4_physical rmi4-00: rmi_driver_set_irq_bits: Failed to change enabled interrupts!
      
      This causes the touchpad and sometimes other things to get disabled.
      
      Since this happens without nouveau, we can't fix this problem from nouveau
      itself.
      
      Add a PCI quirk for the specific P50 variant of this GPU.  Make sure the
      GPU is advertising NoReset- so we don't reset the GPU when the machine is
      in Dedicated graphics mode (where the GPU being initialized by the BIOS is
      normal and expected).  Map the GPU MMIO space and read the magic 0x2240c
      register, which will have bit 1 set if the device was POSTed during a
      previous boot.  Once we've confirmed all of this, reset the GPU and
      re-disable it - bringing it back to a healthy state.
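The decision described in this paragraph can be sketched as a pure function. This is a simplified model, not the quirk itself: `model_needs_reset()` and its parameters are hypothetical, the MMIO read is replaced by a value passed in, and "bit 1" is assumed to be counted from zero (mask 0x2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: "bit 1" of the 0x2240c register is zero-indexed. */
#define MODEL_POSTED_BIT (UINT32_C(1) << 1)

/*
 * no_soft_reset: true if the device advertises NoReset+, false for NoReset-.
 * reg_2240c:     value read from the register at offset 0x2240c.
 * Returns true if the GPU should be reset and re-disabled.
 */
bool model_needs_reset(bool no_soft_reset, uint32_t reg_2240c)
{
	if (no_soft_reset)
		return false; /* NoReset+: leave the GPU alone */
	return (reg_2240c & MODEL_POSTED_BIT) != 0;
}

/* Usage: reset only when NoReset- and the device was already POSTed. */
void model_reset_selfcheck(void)
{
	assert(model_needs_reset(false, 0x2));
	assert(!model_needs_reset(false, 0x0));
	assert(!model_needs_reset(true, 0x2));
}
```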
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=203003
      Link: https://lore.kernel.org/lkml/20190212220230.1568-1-lyude@redhat.com
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: nouveau@lists.freedesktop.org
      Cc: dri-devel@lists.freedesktop.org
      Cc: Karol Herbst <kherbst@redhat.com>
      Cc: Ben Skeggs <skeggsb@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0fe09701
    • Logan Gunthorpe's avatar
      PCI: Add macro for Switchtec quirk declarations · 5659dfca
      Logan Gunthorpe authored
      [ Upstream commit 01d5d7fa ]
      
      Add SWITCHTEC_QUIRK() to reduce redundancy in declaring devices that use
      quirk_switchtec_ntb_dma_alias().
      
      By itself, this is no functional change, but a subsequent patch updates
      SWITCHTEC_QUIRK() to fix ad281ecf ("PCI: Add DMA alias quirk for
      Microsemi Switchtec NTB").
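The redundancy a wrapper macro removes can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel's fixup machinery: the `MODEL_*` macros, the table, and the hook are hypothetical, the vendor ID value is assumed, and the device IDs are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_VENDOR_MICROSEMI 0x11f8 /* assumed vendor ID value */

struct fixup_model {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(uint16_t device);
};

/* Records the last device the hook ran for, so callers can observe it. */
uint16_t model_last_aliased;

static void model_ntb_dma_alias(uint16_t device)
{
	model_last_aliased = device;
}

/* One argument per entry; vendor and hook are filled in by the macro. */
#define MODEL_SWITCHTEC_QUIRK(dev) \
	{ MODEL_VENDOR_MICROSEMI, (dev), model_ntb_dma_alias }

static const struct fixup_model model_fixups[] = {
	MODEL_SWITCHTEC_QUIRK(0x8531), /* placeholder device IDs */
	MODEL_SWITCHTEC_QUIRK(0x8532),
};

/* Runs every matching hook and returns how many ran. */
unsigned int model_run_fixups(uint16_t vendor, uint16_t device)
{
	unsigned int ran = 0;
	size_t n = sizeof(model_fixups) / sizeof(model_fixups[0]);

	for (size_t i = 0; i < n; i++) {
		if (model_fixups[i].vendor != vendor ||
		    model_fixups[i].device != device)
			continue;
		assert(model_fixups[i].hook != NULL);
		model_fixups[i].hook(device);
		ran++;
	}
	return ran;
}
```

Because every entry goes through one macro, a later change to what each entry declares needs to touch only the macro body, which is the benefit the message describes.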
      
      Fixes: ad281ecf ("PCI: Add DMA alias quirk for Microsemi Switchtec NTB")
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      [bhelgaas: split to separate patch]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5659dfca
    • Christoph Muellner's avatar
      dt-bindings: mmc: Add disable-cqe-dcmd property. · e4ba1578
      Christoph Muellner authored
      [ Upstream commit 28f22fb7 ]
      
      Add disable-cqe-dcmd as an optional property for MMC hosts.
      This property allows disabling, or not enabling, the direct command
      features of the command queue engine.
      Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
      Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
      Fixes: 84362d79 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1")
      Cc: stable@vger.kernel.org
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e4ba1578