1. 16 Sep, 2019 40 commits
    • Filipe Manana's avatar
      Btrfs: fix race between block group removal and block group allocation · 1d064876
      Filipe Manana authored
      [ Upstream commit 8eaf40c0 ]
      
      If a task is removing the block group that currently has the highest start
      offset amongst all existing block groups, there is a short time window
      where it races with a concurrent block group allocation, resulting in a
      transaction abort with an error code of EEXIST.
      
      The following diagram explains the race in detail:
      
            Task A                                                        Task B
      
       btrfs_remove_block_group(bg offset X)
      
         remove_extent_mapping(em offset X)
           -> removes extent map X from the
              tree of extent maps
              (fs_info->mapping_tree), so the
              next call to find_next_chunk()
              will return offset X
      
                                                         btrfs_alloc_chunk()
                                                           find_next_chunk()
                                                             --> returns offset X
      
                                                           __btrfs_alloc_chunk(offset X)
                                                             btrfs_make_block_group()
                                                               btrfs_create_block_group_cache()
                                                                 --> creates btrfs_block_group_cache
                                                                     object with a key corresponding
                                                                     to the block group item in the
                                                                     extent, the key is:
                                                                     (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)
      
                                                               --> adds the btrfs_block_group_cache object
                                                                   to the list new_bgs of the transaction
                                                                   handle
      
                                                         btrfs_end_transaction(trans handle)
                                                           __btrfs_end_transaction()
                                                             btrfs_create_pending_block_groups()
                                                               --> sees the new btrfs_block_group_cache
                                                                   in the new_bgs list of the transaction
                                                                   handle
                                                               --> its call to btrfs_insert_item() fails
                                                                   with -EEXIST when attempting to insert
                                                                   the block group item key
                                                                   (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)
                                                                   because task A has not removed that key yet
                                                               --> aborts the running transaction with
                                                                   error -EEXIST
      
         btrfs_del_item()
           -> removes the block group's key from
              the extent tree, key is
              (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)
      
      A sample transaction abort trace:
      
        [78912.403537] ------------[ cut here ]------------
        [78912.403811] BTRFS: Transaction aborted (error -17)
        [78912.404082] WARNING: CPU: 2 PID: 20465 at fs/btrfs/extent-tree.c:10551 btrfs_create_pending_block_groups+0x196/0x250 [btrfs]
        (...)
        [78912.405642] CPU: 2 PID: 20465 Comm: btrfs Tainted: G        W         5.0.0-btrfs-next-46 #1
        [78912.405941] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
        [78912.406586] RIP: 0010:btrfs_create_pending_block_groups+0x196/0x250 [btrfs]
        (...)
        [78912.407636] RSP: 0018:ffff9d3d4b7e3b08 EFLAGS: 00010282
        [78912.407997] RAX: 0000000000000000 RBX: ffff90959a3796f0 RCX: 0000000000000006
        [78912.408369] RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff909636b16860
        [78912.408746] RBP: ffff909626758a58 R08: 0000000000000000 R09: 0000000000000000
        [78912.409144] R10: ffff9095ff462400 R11: 0000000000000000 R12: ffff90959a379588
        [78912.409521] R13: ffff909626758ab0 R14: ffff9095036c0000 R15: ffff9095299e1158
        [78912.409899] FS:  00007f387f16f700(0000) GS:ffff909636b00000(0000) knlGS:0000000000000000
        [78912.410285] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [78912.410673] CR2: 00007f429fc87cbc CR3: 000000014440a004 CR4: 00000000003606e0
        [78912.411095] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [78912.411496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [78912.411898] Call Trace:
        [78912.412318]  __btrfs_end_transaction+0x5b/0x1c0 [btrfs]
        [78912.412746]  btrfs_inc_block_group_ro+0xcf/0x160 [btrfs]
        [78912.413179]  scrub_enumerate_chunks+0x188/0x5b0 [btrfs]
        [78912.413622]  ? __mutex_unlock_slowpath+0x100/0x2a0
        [78912.414078]  btrfs_scrub_dev+0x2ef/0x720 [btrfs]
        [78912.414535]  ? __sb_start_write+0xd4/0x1c0
        [78912.414963]  ? mnt_want_write_file+0x24/0x50
        [78912.415403]  btrfs_ioctl+0x17fb/0x3120 [btrfs]
        [78912.415832]  ? lock_acquire+0xa6/0x190
        [78912.416256]  ? do_vfs_ioctl+0xa2/0x6f0
        [78912.416685]  ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
        [78912.417116]  do_vfs_ioctl+0xa2/0x6f0
        [78912.417534]  ? __fget+0x113/0x200
        [78912.417954]  ksys_ioctl+0x70/0x80
        [78912.418369]  __x64_sys_ioctl+0x16/0x20
        [78912.418812]  do_syscall_64+0x60/0x1b0
        [78912.419231]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [78912.419644] RIP: 0033:0x7f3880252dd7
        (...)
        [78912.420957] RSP: 002b:00007f387f16ed68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
        [78912.421426] RAX: ffffffffffffffda RBX: 000055f5becc1df0 RCX: 00007f3880252dd7
        [78912.421889] RDX: 000055f5becc1df0 RSI: 00000000c400941b RDI: 0000000000000003
        [78912.422354] RBP: 0000000000000000 R08: 00007f387f16f700 R09: 0000000000000000
        [78912.422790] R10: 00007f387f16f700 R11: 0000000000000246 R12: 0000000000000000
        [78912.423202] R13: 00007ffda49c266f R14: 0000000000000000 R15: 00007f388145e040
        [78912.425505] ---[ end trace eb9bfe7c426fc4d3 ]---
      
      Fix this by calling remove_extent_mapping(), at btrfs_remove_block_group(),
      only at the very end, after removing the block group item key from the
      extent tree (and removing the free space tree entry if we are using the
      free space tree feature).
      
      Fixes: 04216820 ("Btrfs: fix race between fs trimming and block group remove/allocation")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1d064876
    • Shirish S's avatar
      drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc · f276beb3
      Shirish S authored
      [ Upstream commit 517b91f4 ]
      
      [What]
      readptr read always returns zero, since most likely
      these blocks are either power or clock gated.
      
      [How]
      fetch rptr after amdgpu_ring_alloc() which informs
      the power management code that the block is about to be
      used and hence the gating is turned off.
      Signed-off-by: default avatarLouis Li <Ching-shih.Li@amd.com>
      Signed-off-by: default avatarShirish S <shirish.s@amd.com>
      Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f276beb3
    • Louis Li's avatar
      drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2) · 7abeffff
      Louis Li authored
      [ Upstream commit ce0e22f5 ]
      
      [What]
      vce ring test fails consistently during resume in s3 cycle, due to
      mismatch read & write pointers.
      On debug/analysis its found that rptr to be compared is not being
      correctly updated/read, which leads to this failure.
      Below is the failure signature:
      	[drm:amdgpu_vce_ring_test_ring] *ERROR* amdgpu: ring 12 test failed
      	[drm:amdgpu_device_ip_resume_phase2] *ERROR* resume of IP block <vce_v3_0> failed -110
      	[drm:amdgpu_device_resume] *ERROR* amdgpu_device_ip_resume failed (-110).
      
      [How]
      fetch rptr appropriately, meaning move its read location further down
      in the code flow.
      With this patch applied the s3 failure is no more seen for >5k s3 cycles,
      which otherwise is pretty consistent.
      
      V2: remove reduntant fetch of rptr
      Signed-off-by: default avatarLouis Li <Ching-shih.Li@amd.com>
      Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7abeffff
    • Peter Xu's avatar
      kvm: Check irqchip mode before assign irqfd · d5f65393
      Peter Xu authored
      [ Upstream commit 654f1f13 ]
      
      When assigning kvm irqfd we didn't check the irqchip mode but we allow
      KVM_IRQFD to succeed with all the irqchip modes.  However it does not
      make much sense to create irqfd even without the kernel chips.  Let's
      provide a arch-dependent helper to check whether a specific irqfd is
      allowed by the arch.  At least for x86, it should make sense to check:
      
      - when irqchip mode is NONE, all irqfds should be disallowed, and,
      
      - when irqchip mode is SPLIT, irqfds that are with resamplefd should
        be disallowed.
      
      For either of the case, previously we'll silently ignore the irq or
      the irq ack event if the irqchip mode is incorrect.  However that can
      cause misterious guest behaviors and it can be hard to triage.  Let's
      fail KVM_IRQFD even earlier to detect these incorrect configurations.
      
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Radim Krčmář <rkrcmar@redhat.com>
      CC: Alex Williamson <alex.williamson@redhat.com>
      CC: Eduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: default avatarPeter Xu <peterx@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d5f65393
    • Kent Russell's avatar
      drm/amdkfd: Add missing Polaris10 ID · 90772cf5
      Kent Russell authored
      [ Upstream commit 0a5a9c27 ]
      
      This was added to amdgpu but was missed in amdkfd
      Signed-off-by: default avatarKent Russell <kent.russell@amd.com>
      Reviewed-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.rg
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      90772cf5
    • Eugeniy Paltsev's avatar
      ARC: mm: SIGSEGV userspace trying to access kernel virtual memory · cacbc853
      Eugeniy Paltsev authored
      [ Upstream commit a8c715b4 ]
      
      As of today if userspace process tries to access a kernel virtual addres
      (0x7000_0000 to 0x7ffff_ffff) such that a legit kernel mapping already
      exists, that process hangs instead of being killed with SIGSEGV
      
      Fix that by ensuring that do_page_fault() handles kenrel vaddr only if
      in kernel mode.
      
      And given this, we can also simplify the code a bit. Now a vmalloc fault
      implies kernel mode so its failure (for some reason) can reuse the
      @no_context label and we can remove @bad_area_nosemaphore.
      
      Reproduce user test for original problem:
      
      ------------------------>8-----------------
       #include <stdlib.h>
       #include <stdint.h>
      
       int main(int argc, char *argv[])
       {
       	volatile uint32_t temp;
      
       	temp = *(uint32_t *)(0x70000000);
       }
      ------------------------>8-----------------
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarEugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      cacbc853
    • Eugeniy Paltsev's avatar
      ARC: mm: fix uninitialised signal code in do_page_fault · 7edfa9c9
      Eugeniy Paltsev authored
      [ Upstream commit 121e38e5 ]
      
      Commit 15773ae9 ("signal/arc: Use force_sig_fault where
      appropriate") introduced undefined behaviour by leaving si_code
      unitiailized and leaking random kernel values to user space.
      
      Fixes: 15773ae9 ("signal/arc: Use force_sig_fault where appropriate")
      Signed-off-by: default avatarEugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7edfa9c9
    • Eric W. Biederman's avatar
      0828438e
    • Milan Broz's avatar
      dm crypt: move detailed message into debug level · fcb2f1e2
      Milan Broz authored
      [ Upstream commit 7a1cd723 ]
      
      The information about tag size should not be printed without debug info
      set. Also print device major:minor in the error message to identify the
      device instance.
      
      Also use rate limiting and debug level for info about used crypto API
      implementaton.  This is important because during online reencryption
      the existing message saturates syslog (because we are moving hotzone
      across the whole device).
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMilan Broz <gmazyland@gmail.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fcb2f1e2
    • Long Li's avatar
      cifs: smbd: take an array of reqeusts when sending upper layer data · 96b44c20
      Long Li authored
      [ Upstream commit 4739f232 ]
      
      To support compounding, __smb_send_rqst() now sends an array of requests to
      the transport layer.
      Change smbd_send() to take an array of requests, and send them in as few
      packets as possible.
      Signed-off-by: default avatarLong Li <longli@microsoft.com>
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      96b44c20
    • Jisheng Zhang's avatar
      PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code · 3f27a14b
      Jisheng Zhang authored
      [ Upstream commit e6fdd3bf ]
      
      Use devm_pci_alloc_host_bridge() to simplify the error code path.  This
      also fixes a leak in the dw_pcie_host_init() error path.
      Signed-off-by: default avatarJisheng Zhang <Jisheng.Zhang@synaptics.com>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarGustavo Pimentel <gustavo.pimentel@synopsys.com>
      CC: stable@vger.kernel.org	# v4.13+
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3f27a14b
    • Adrian Hunter's avatar
      mmc: sdhci-pci: Add support for Intel CML · 842da8fa
      Adrian Hunter authored
      [ Upstream commit 765c5967 ]
      
      Add PCI Ids for Intel CML.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      842da8fa
    • Ming Lei's avatar
      blk-mq: free hw queue's resource in hctx's release handler · e238e6dc
      Ming Lei authored
      [ Upstream commit c7e2d94b ]
      
      Once blk_cleanup_queue() returns, tags shouldn't be used any more,
      because blk_mq_free_tag_set() may be called. Commit 45a9c9d9
      ("blk-mq: Fix a use-after-free") fixes this issue exactly.
      
      However, that commit introduces another issue. Before 45a9c9d9,
      we are allowed to run queue during cleaning up queue if the queue's
      kobj refcount is held. After that commit, queue can't be run during
      queue cleaning up, otherwise oops can be triggered easily because
      some fields of hctx are freed by blk_mq_free_queue() in blk_cleanup_queue().
      
      We have invented ways for addressing this kind of issue before, such as:
      
      	8dc765d4 ("SCSI: fix queue cleanup race before queue initialization is done")
      	c2856ae2 ("blk-mq: quiesce queue before freeing queue")
      
      But still can't cover all cases, recently James reports another such
      kind of issue:
      
      	https://marc.info/?l=linux-scsi&m=155389088124782&w=2
      
      This issue can be quite hard to address by previous way, given
      scsi_run_queue() may run requeues for other LUNs.
      
      Fixes the above issue by freeing hctx's resources in its release handler, and this
      way is safe becasue tags isn't needed for freeing such hctx resource.
      
      This approach follows typical design pattern wrt. kobject's release handler.
      
      Cc: Dongli Zhang <dongli.zhang@oracle.com>
      Cc: James Smart <james.smart@broadcom.com>
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: linux-scsi@vger.kernel.org,
      Cc: Martin K . Petersen <martin.petersen@oracle.com>,
      Cc: Christoph Hellwig <hch@lst.de>,
      Cc: James E . J . Bottomley <jejb@linux.vnet.ibm.com>,
      Reported-by: default avatarJames Smart <james.smart@broadcom.com>
      Fixes: 45a9c9d9 ("blk-mq: Fix a use-after-free")
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarHannes Reinecke <hare@suse.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Tested-by: default avatarJames Smart <james.smart@broadcom.com>
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e238e6dc
    • Yufen Yu's avatar
      dm mpath: fix missing call of path selector type->end_io · 69409854
      Yufen Yu authored
      [ Upstream commit 5de719e3 ]
      
      After commit 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via
      blk_insert_cloned_request feedback"), map_request() will requeue the tio
      when issued clone request return BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE.
      
      Thus, if device driver status is error, a tio may be requeued multiple
      times until the return value is not DM_MAPIO_REQUEUE.  That means
      type->start_io may be called multiple times, while type->end_io is only
      called when IO complete.
      
      In fact, even without commit 396eaf21, setup_clone() failure can
      also cause tio requeue and associated missed call to type->end_io.
      
      The service-time path selector selects path based on in_flight_size,
      which is increased by st_start_io() and decreased by st_end_io().
      Missed calls to st_end_io() can lead to in_flight_size count error and
      will cause the selector to make the wrong choice.  In addition,
      queue-length path selector will also be affected.
      
      To fix the problem, call type->end_io in ->release_clone_rq before tio
      requeue.  map_info is passed to ->release_clone_rq() for map_request()
      error path that result in requeue.
      
      Fixes: 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
      Cc: stable@vger.kernl.org
      Signed-off-by: default avatarYufen Yu <yuyufen@huawei.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      69409854
    • Lyude Paul's avatar
      PCI: Reset Lenovo ThinkPad P50 nvgpu at boot if necessary · 0fe09701
      Lyude Paul authored
      [ Upstream commit e0547c81 ]
      
      On ThinkPad P50 SKUs with an Nvidia Quadro M1000M instead of the M2000M
      variant, the BIOS does not always reset the secondary Nvidia GPU during
      reboot if the laptop is configured in Hybrid Graphics mode.  The reason is
      unknown, but the following steps and possibly a good bit of patience will
      reproduce the issue:
      
        1. Boot up the laptop normally in Hybrid Graphics mode
        2. Make sure nouveau is loaded and that the GPU is awake
        3. Allow the Nvidia GPU to runtime suspend itself after being idle
        4. Reboot the machine, the more sudden the better (e.g. sysrq-b may help)
        5. If nouveau loads up properly, reboot the machine again and go back to
           step 2 until you reproduce the issue
      
      This results in some very strange behavior: the GPU will be left in exactly
      the same state it was in when the previously booted kernel started the
      reboot.  This has all sorts of bad side effects: for starters, this
      completely breaks nouveau starting with a mysterious EVO channel failure
      that happens well before we've actually used the EVO channel for anything:
      
        nouveau 0000:01:00.0: disp: chid 0 mthd 0000 data 00000400 00001000 00000002
      
      This causes a timeout trying to bring up the GR ctx:
      
        nouveau 0000:01:00.0: timeout
        WARNING: CPU: 0 PID: 12 at drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c:1547 gf100_grctx_generate+0x7b2/0x850 [nouveau]
        Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET82W (1.55 ) 12/18/2018
        Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
        ...
        nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0, busy: 1)
        nouveau 0000:01:00.0: gr: wait for idle timeout (en: 1, ctxsw: 0, busy: 1)
        nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000008000 engine 00 [GR] client 15 [HUB/SCC_NB] reason c4 [] on channel -1 [0000000000 unknown]
      
      The GPU never manages to recover.  Booting without loading nouveau causes
      issues as well, since the GPU starts sending spurious interrupts that cause
      other device's IRQs to get disabled by the kernel:
      
        irq 16: nobody cared (try booting with the "irqpoll" option)
        ...
        handlers:
        [<000000007faa9e99>] i801_isr [i2c_i801]
        Disabling IRQ #16
        ...
        serio: RMI4 PS/2 pass-through port at rmi4-00.fn03
        i801_smbus 0000:00:1f.4: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.4: Transaction timeout
        rmi4_f03 rmi4-00.fn03: rmi_f03_pt_write: Failed to write to F03 TX register (-110).
        i801_smbus 0000:00:1f.4: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.4: Transaction timeout
        rmi4_physical rmi4-00: rmi_driver_set_irq_bits: Failed to change enabled interrupts!
      
      This causes the touchpad and sometimes other things to get disabled.
      
      Since this happens without nouveau, we can't fix this problem from nouveau
      itself.
      
      Add a PCI quirk for the specific P50 variant of this GPU.  Make sure the
      GPU is advertising NoReset- so we don't reset the GPU when the machine is
      in Dedicated graphics mode (where the GPU being initialized by the BIOS is
      normal and expected).  Map the GPU MMIO space and read the magic 0x2240c
      register, which will have bit 1 set if the device was POSTed during a
      previous boot.  Once we've confirmed all of this, reset the GPU and
      re-disable it - bringing it back to a healthy state.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=203003
      Link: https://lore.kernel.org/lkml/20190212220230.1568-1-lyude@redhat.comSigned-off-by: default avatarLyude Paul <lyude@redhat.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Cc: nouveau@lists.freedesktop.org
      Cc: dri-devel@lists.freedesktop.org
      Cc: Karol Herbst <kherbst@redhat.com>
      Cc: Ben Skeggs <skeggsb@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0fe09701
    • Logan Gunthorpe's avatar
      PCI: Add macro for Switchtec quirk declarations · 5659dfca
      Logan Gunthorpe authored
      [ Upstream commit 01d5d7fa ]
      
      Add SWITCHTEC_QUIRK() to reduce redundancy in declaring devices that use
      quirk_switchtec_ntb_dma_alias().
      
      By itself, this is no functional change, but a subsequent patch updates
      SWITCHTEC_QUIRK() to fix ad281ecf ("PCI: Add DMA alias quirk for
      Microsemi Switchtec NTB").
      
      Fixes: ad281ecf ("PCI: Add DMA alias quirk for Microsemi Switchtec NTB")
      Signed-off-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      [bhelgaas: split to separate patch]
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5659dfca
    • Christoph Muellner's avatar
      dt-bindings: mmc: Add disable-cqe-dcmd property. · e4ba1578
      Christoph Muellner authored
      [ Upstream commit 28f22fb7 ]
      
      Add disable-cqe-dcmd as optional property for MMC hosts.
      This property allows to disable or not enable the direct command
      features of the command queue engine.
      Signed-off-by: default avatarChristoph Muellner <christoph.muellner@theobroma-systems.com>
      Signed-off-by: default avatarPhilipp Tomsich <philipp.tomsich@theobroma-systems.com>
      Fixes: 84362d79 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e4ba1578
    • Sowjanya Komatineni's avatar
      dt-bindings: mmc: Add supports-cqe property · eb83f9fa
      Sowjanya Komatineni authored
      [ Upstream commit c7fddbd5 ]
      
      Add supports-cqe optional property for MMC hosts.
      
      This property is used to identify the specific host controller
      supporting command queue.
      Signed-off-by: default avatarSowjanya Komatineni <skomatineni@nvidia.com>
      Reviewed-by: default avatarThierry Reding <treding@nvidia.com>
      Reviewed-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      eb83f9fa
    • Christian Lamparter's avatar
      ARM: dts: qcom: ipq4019: enlarge PCIe BAR range · 0a0176f9
      Christian Lamparter authored
      [ Upstream commit f3e35357 ]
      
      David Bauer reported that the VDSL modem (attached via PCIe)
      on his AVM Fritz!Box 7530 was complaining about not having
      enough space in the BAR. A closer inspection of the old
      qcom-ipq40xx.dtsi pulled from the GL-iNet repository listed:
      
      | qcom,pcie@80000 {
      |	compatible = "qcom,msm_pcie";
      |	reg = <0x80000 0x2000>,
      |	      <0x99000 0x800>,
      |	      <0x40000000 0xf1d>,
      |	      <0x40000f20 0xa8>,
      |	      <0x40100000 0x1000>,
      |	      <0x40200000 0x100000>,
      |	      <0x40300000 0xd00000>;
      |	reg-names = "parf", "phy", "dm_core", "elbi",
      |			"conf", "io", "bars";
      
      Matching the reg-names with the listed reg leads to
      <0xd00000> as the size for the "bars".
      
      Cc: stable@vger.kernel.org
      BugLink: https://www.mail-archive.com/openwrt-devel@lists.openwrt.org/msg45212.htmlReported-by: default avatarDavid Bauer <mail@david-bauer.net>
      Signed-off-by: default avatarChristian Lamparter <chunkeey@gmail.com>
      Signed-off-by: default avatarAndy Gross <agross@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0a0176f9
    • Niklas Cassel's avatar
      ARM: dts: qcom: ipq4019: Fix MSI IRQ type · 445a78ea
      Niklas Cassel authored
      [ Upstream commit 97131f85 ]
      
      The databook clearly states that the MSI IRQ (msi_ctrl_int) is a level
      triggered interrupt.
      
      The msi_ctrl_int will be high for as long as any MSI status bit is set,
      thus the IRQ type should be set to IRQ_TYPE_LEVEL_HIGH, causing the
      IRQ handler to keep getting called, as long as any MSI status bit is set.
      
      A git grep shows that ipq4019 is the only SoC using snps,dw-pcie that has
      configured this IRQ incorrectly.
      
      Not having the correct IRQ type defined will cause us to lose interrupts,
      which in turn causes timeouts in the PCIe endpoint drivers.
      Signed-off-by: default avatarNiklas Cassel <niklas.cassel@linaro.org>
      Reviewed-by: default avatarBjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: default avatarAndy Gross <andy.gross@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      445a78ea
    • Mathias Kresin's avatar
      ARM: dts: qcom: ipq4019: fix PCI range · df1216d8
      Mathias Kresin authored
      [ Upstream commit da89f500 ]
      
      The PCI range is invalid and PCI attached devices doen't work.
      Signed-off-by: default avatarMathias Kresin <dev@kresin.me>
      Signed-off-by: default avatarJohn Crispin <john@phrozen.org>
      Signed-off-by: default avatarAndy Gross <andy.gross@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      df1216d8
    • Theodore Ts'o's avatar
      ext4: protect journal inode's blocks using block_validity · 2fd4629d
      Theodore Ts'o authored
      [ Upstream commit 345c0dbf ]
      
      Add the blocks which belong to the journal inode to block_validity's
      system zone so attempts to deallocate or overwrite the journal due a
      corrupted file system where the journal blocks are also claimed by
      another inode.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202879Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2fd4629d
    • Koen Vandeputte's avatar
      media: i2c: tda1997x: select V4L2_FWNODE · f10a9230
      Koen Vandeputte authored
      [ Upstream commit 5f2efda7 ]
      
      Building tda1997x fails now unless V4L2_FWNODE is selected:
      
      drivers/media/i2c/tda1997x.o: in function `tda1997x_parse_dt'
      undefined reference to `v4l2_fwnode_endpoint_parse'
      
      While at it, also sort the selections alphabetically
      
      Fixes: 9ac0038d ("media: i2c: Add TDA1997x HDMI receiver driver")
      Signed-off-by: default avatarKoen Vandeputte <koen.vandeputte@ncentric.com>
      Cc: stable@vger.kernel.org # v4.17+
      Acked-by: default avatarSakari Ailus <sakari.ailus@linux.intel.com>
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f10a9230
    • ZhangXiaoxu's avatar
      cifs: Fix lease buffer length error · 4061e662
      ZhangXiaoxu authored
      [ Upstream commit b57a55e2 ]
      
      There is a KASAN slab-out-of-bounds:
      BUG: KASAN: slab-out-of-bounds in _copy_from_iter_full+0x783/0xaa0
      Read of size 80 at addr ffff88810c35e180 by task mount.cifs/539
      
      CPU: 1 PID: 539 Comm: mount.cifs Not tainted 4.19 #10
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
                  rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
      Call Trace:
       dump_stack+0xdd/0x12a
       print_address_description+0xa7/0x540
       kasan_report+0x1ff/0x550
       check_memory_region+0x2f1/0x310
       memcpy+0x2f/0x80
       _copy_from_iter_full+0x783/0xaa0
       tcp_sendmsg_locked+0x1840/0x4140
       tcp_sendmsg+0x37/0x60
       inet_sendmsg+0x18c/0x490
       sock_sendmsg+0xae/0x130
       smb_send_kvec+0x29c/0x520
       __smb_send_rqst+0x3ef/0xc60
       smb_send_rqst+0x25a/0x2e0
       compound_send_recv+0x9e8/0x2af0
       cifs_send_recv+0x24/0x30
       SMB2_open+0x35e/0x1620
       open_shroot+0x27b/0x490
       smb2_open_op_close+0x4e1/0x590
       smb2_query_path_info+0x2ac/0x650
       cifs_get_inode_info+0x1058/0x28f0
       cifs_root_iget+0x3bb/0xf80
       cifs_smb3_do_mount+0xe00/0x14c0
       cifs_do_mount+0x15/0x20
       mount_fs+0x5e/0x290
       vfs_kern_mount+0x88/0x460
       do_mount+0x398/0x31e0
       ksys_mount+0xc6/0x150
       __x64_sys_mount+0xea/0x190
       do_syscall_64+0x122/0x590
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      It can be reproduced by the following step:
        1. samba configured with: server max protocol = SMB2_10
        2. mount -o vers=default
      
      When parse the mount version parameter, the 'ops' and 'vals'
      was setted to smb30,  if negotiate result is smb21, just
      update the 'ops' to smb21, but the 'vals' is still smb30.
      When add lease context, the iov_base is allocated with smb21
      ops, but the iov_len is initiallited with the smb30. Because
      the iov_len is longer than iov_base, when send the message,
      copy array out of bounds.
      
      we need to keep the 'ops' and 'vals' consistent.
      
      Fixes: 9764c02f ("SMB3: Add support for multidialect negotiate (SMB2.1 and later)")
      Fixes: d5c7076b ("smb3: add smb3.1.1 to default dialect list")
      Signed-off-by: default avatarZhangXiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Reviewed-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4061e662
    • Sean Christopherson's avatar
      KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels · df5d4ea2
      Sean Christopherson authored
      [ Upstream commit b68f3cc7 ]
      
      Invoking the 64-bit variation on a 32-bit kenrel will crash the guest,
      trigger a WARN, and/or lead to a buffer overrun in the host, e.g.
      rsm_load_state_64() writes r8-r15 unconditionally, but enum kvm_reg and
      thus x86_emulate_ctxt._regs only define r8-r15 for CONFIG_X86_64.
      
      KVM allows userspace to report long mode support via CPUID, even though
      the guest is all but guaranteed to crash if it actually tries to enable
      long mode.  But, a pure 32-bit guest that is ignorant of long mode will
      happily plod along.
      
      SMM complicates things as 64-bit CPUs use a different SMRAM save state
      area.  KVM handles this correctly for 64-bit kernels, e.g. uses the
      legacy save state map if userspace has hid long mode from the guest,
      but doesn't fare well when userspace reports long mode support on a
      32-bit host kernel (32-bit KVM doesn't support 64-bit guests).
      
      Since the alternative is to crash the guest, e.g. by not loading state
      or explicitly requesting shutdown, unconditionally use the legacy SMRAM
      save state map for 32-bit KVM.  If a guest has managed to get far enough
      to handle SMIs when running under a weird/buggy userspace hypervisor,
      then don't deliberately crash the guest since there are no downsides
      (from KVM's perspective) to allow it to continue running.
      
      Fixes: 660a5d51 ("KVM: x86: save/load state on SMM switch")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      df5d4ea2
    • WANG Chao's avatar
      x86/kvm: move kvm_load/put_guest_xcr0 into atomic context · 7a74d806
      WANG Chao authored
      [ Upstream commit 1811d979 ]
      
      guest xcr0 could leak into host when MCE happens in guest mode. Because
      do_machine_check() could schedule out at a few places.
      
      For example:
      
      kvm_load_guest_xcr0
      ...
      kvm_x86_ops->run(vcpu) {
        vmx_vcpu_run
          vmx_complete_atomic_exit
            kvm_machine_check
              do_machine_check
                do_memory_failure
                  memory_failure
                    lock_page
      
      In this case, host_xcr0 is 0x2ff, guest vcpu xcr0 is 0xff. After schedule
      out, host cpu has guest xcr0 loaded (0xff).
      
      In __switch_to {
           switch_fpu_finish
             copy_kernel_to_fpregs
               XRSTORS
      
      If any bit i in XSTATE_BV[i] == 1 and xcr0[i] == 0, XRSTORS will
      generate #GP (In this case, bit 9). Then ex_handler_fprestore kicks in
      and tries to reinitialize fpu by restoring init fpu state. Same story as
      last #GP, except we get DOUBLE FAULT this time.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWANG Chao <chao.wang@ucloud.cn>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7a74d806
    • Ben Gardon's avatar
      kvm: mmu: Fix overflow on kvm mmu page limit calculation · 163b24b1
      Ben Gardon authored
      [ Upstream commit bc8a3d89 ]
      
      KVM bases its memory usage limits on the total number of guest pages
      across all memslots. However, those limits, and the calculations to
      produce them, use 32 bit unsigned integers. This can result in overflow
      if a VM has more guest pages that can be represented by a u32. As a
      result of this overflow, KVM can use a low limit on the number of MMU
      pages it will allocate. This makes KVM unable to map all of guest memory
      at once, prompting spurious faults.
      
      Tested: Ran all kvm-unit-tests on an Intel Haswell machine. This patch
      	introduced no new failures.
      Signed-off-by: default avatarBen Gardon <bgardon@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      163b24b1
    • Moni Shoua's avatar
      IB/mlx5: Reset access mask when looping inside page fault handler · feced628
      Moni Shoua authored
      [ Upstream commit 1abe186e ]
      
      If page-fault handler spans multiple MRs then the access mask needs to
      be reset before each MR handling or otherwise write access will be
      granted to mapped pages instead of read-only.
      
      Cc: <stable@vger.kernel.org> # 3.19
      Fixes: 7bdf65d4 ("IB/mlx5: Handle page faults")
      Reported-by: default avatarJerome Glisse <jglisse@redhat.com>
      Signed-off-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      feced628
    • Dinh Nguyen's avatar
      arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's · 37222eaf
      Dinh Nguyen authored
      [ Upstream commit 8efd6365 ]
      
      The gmac ethernet driver uses the "altr,sysmgr-syscon" property to
      configure phy settings for the gmac controller.
      
      Add the "altr,sysmgr-syscon" property to all gmac nodes.
      
      This patch fixes:
      
      [    0.917530] socfpga-dwmac ff800000.ethernet: No sysmgr-syscon node found
      [    0.924209] socfpga-dwmac ff800000.ethernet: Unable to parse OF data
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarLey Foon Tan <ley.foon.tan@intel.com>
      Signed-off-by: default avatarDinh Nguyen <dinguyen@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      37222eaf
    • Hans de Goede's avatar
      usb: typec: tcpm: Try PD-2.0 if sink does not respond to 3.0 source-caps · 3cfce8b7
      Hans de Goede authored
      [ Upstream commit 976daf9d ]
      
      PD 2.0 sinks are supposed to accept src-capabilities with a 3.0 header and
      simply ignore any src PDOs which the sink does not understand such as PPS
      but some 2.0 sinks instead ignore the entire PD_DATA_SOURCE_CAP message,
      causing contract negotiation to fail.
      
      This commit fixes such sinks not working by re-trying the contract
      negotiation with PD-2.0 source-caps messages if we don't have a contract
      after PD_N_HARD_RESET_COUNT hard-reset attempts.
      
      The problem fixed by this commit was noticed with a Type-C to VGA dongle.
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3cfce8b7
    • Chris Wilson's avatar
      drm/i915: Sanity check mmap length against object size · fba4f7c1
      Chris Wilson authored
      [ Upstream commit 000c4f90 ]
      
      We assumed that vm_mmap() would reject an attempt to mmap past the end of
      the filp (our object), but we were wrong.
      
      Applications that tried to use the mmap beyond the end of the object
      would be greeted by a SIGBUS. After this patch, those applications will
      be told about the error on creating the mmap, rather than at a random
      moment on later access.
      Reported-by: default avatarAntonio Argenziano <antonio.argenziano@intel.com>
      Testcase: igt/gem_mmap/bad-size
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Antonio Argenziano <antonio.argenziano@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190314075829.16838-1-chris@chris-wilson.co.uk
      (cherry picked from commit 794a11cb)
      Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fba4f7c1
    • Joonas Lahtinen's avatar
      drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set · 6423a2ad
      Joonas Lahtinen authored
      [ Upstream commit ebfb6977 ]
      
      Add err goto label and use it when VMA can't be established or changes
      underneath.
      
      v2:
      - Dropping Fixes: as it's indeed impossible to race an object to the
        error address. (Chris)
      v3:
      - Use IS_ERR_VALUE (Chris)
      Reported-by: default avatarAdam Zabrocki <adamza@microsoft.com>
      Signed-off-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Cc: Adam Zabrocki <adamza@microsoft.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v2
      Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190207085454.10598-2-joonas.lahtinen@linux.intel.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      6423a2ad
    • Pavel Shilovsky's avatar
      CIFS: Fix leaking locked VFS cache pages in writeback retry · 778d626c
      Pavel Shilovsky authored
      [ Upstream commit 165df9a0 ]
      
      If we don't find a writable file handle when retrying writepages
      we break of the loop and do not unlock and put pages neither from
      wdata2 nor from the original wdata. Fix this by walking through
      all the remaining pages and cleanup them properly.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      778d626c
    • Pavel Shilovsky's avatar
      CIFS: Fix error paths in writeback code · fb2dabea
      Pavel Shilovsky authored
      [ Upstream commit 9a66396f ]
      
      This patch aims to address writeback code problems related to error
      paths. In particular it respects EINTR and related error codes and
      stores and returns the first error occurred during writeback.
      Signed-off-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Acked-by: default avatarJeff Layton <jlayton@kernel.org>
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fb2dabea
    • Ben Dooks's avatar
      drm: add __user attribute to ptr_to_compat() · e407b58c
      Ben Dooks authored
      [ Upstream commit e552f085 ]
      
      The ptr_to_compat() call takes a "void __user *", so cast
      the compat drm calls that use it to avoid the following
      warnings from sparse:
      
      drivers/gpu/drm/drm_ioc32.c:188:39: warning: incorrect type in argument 1 (different address spaces)
      drivers/gpu/drm/drm_ioc32.c:188:39:    expected void [noderef] <asn:1>*uptr
      drivers/gpu/drm/drm_ioc32.c:188:39:    got void *[addressable] [assigned] handle
      drivers/gpu/drm/drm_ioc32.c:529:41: warning: incorrect type in argument 1 (different address spaces)
      drivers/gpu/drm/drm_ioc32.c:529:41:    expected void [noderef] <asn:1>*uptr
      drivers/gpu/drm/drm_ioc32.c:529:41:    got void *[addressable] [assigned] handle
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarBen Dooks <ben.dooks@codethink.co.uk>
      Signed-off-by: default avatarSean Paul <seanpaul@chromium.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301120046.26961-1-ben.dooks@codethink.co.ukSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      e407b58c
    • Bjorn Andersson's avatar
      PCI: qcom: Don't deassert reset GPIO during probe · e1a12c3b
      Bjorn Andersson authored
      [ Upstream commit 02b485e3 ]
      
      Acquiring the reset GPIO low means that reset is being deasserted, this
      is followed almost immediately with qcom_pcie_host_init() asserting it,
      initializing it and then finally deasserting it again, for the link to
      come up.
      
      Some PCIe devices requires a minimum time between the initial deassert
      and subsequent reset cycles. In a platform that boots with the reset
      GPIO asserted this requirement is being violated by this deassert/assert
      pulse.
      
      Acquire the reset GPIO high to prevent this situation by matching the
      state to the subsequent asserted state.
      
      Fixes: 82a82383 ("PCI: qcom: Add Qualcomm PCIe controller driver")
      Signed-off-by: default avatarBjorn Andersson <bjorn.andersson@linaro.org>
      [lorenzo.pieralisi@arm.com: updated commit log]
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: default avatarStanimir Varbanov <svarbanov@mm-sol.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e1a12c3b
    • Bjorn Andersson's avatar
      PCI: qcom: Fix error handling in runtime PM support · be905d0f
      Bjorn Andersson authored
      [ Upstream commit 6e5da6f7 ]
      
      The driver does not cope with the fact that probe can fail in a number
      of cases after enabling runtime PM on the device; this results in
      warnings about "Unbalanced pm_runtime_enable". Furthermore if probe
      fails after invoking qcom_pcie_host_init() the power-domain will be left
      referenced.
      
      As it is not possible for the error handling in qcom_pcie_host_init() to
      handle errors happening after returning from that function the
      pm_runtime_get_sync() is moved to qcom_pcie_probe() as well.
      
      Fixes: 854b69ef ("PCI: qcom: add runtime pm support to pcie_port")
      Signed-off-by: default avatarBjorn Andersson <bjorn.andersson@linaro.org>
      [lorenzo.pieralisi@arm.com: updated commit log]
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: default avatarStanimir Varbanov <svarbanov@mm-sol.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      be905d0f
    • Dan Robertson's avatar
      btrfs: init csum_list before possible free · 476ecc14
      Dan Robertson authored
      [ Upstream commit e49be14b ]
      
      The scrub_ctx csum_list member must be initialized before scrub_free_ctx
      is called. If the csum_list is not initialized beforehand, the
      list_empty call in scrub_free_csums will result in a null deref if the
      allocation fails in the for loop.
      
      Fixes: a2de733c ("btrfs: scrub")
      CC: stable@vger.kernel.org # 3.0+
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Signed-off-by: default avatarDan Robertson <dan@dlrobertson.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      476ecc14
    • Anand Jain's avatar
      btrfs: scrub: fix circular locking dependency warning · 936690bd
      Anand Jain authored
      [ Upstream commit 1cec3f27 ]
      
      This fixes a longstanding lockdep warning triggered by
      fstests/btrfs/011.
      
      Circular locking dependency check reports warning[1], that's because the
      btrfs_scrub_dev() calls the stack #0 below with, the fs_info::scrub_lock
      held. The test case leading to this warning:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /btrfs
        $ btrfs scrub start -B /btrfs
      
      In fact we have fs_info::scrub_workers_refcnt to track if the init and destroy
      of the scrub workers are needed. So once we have incremented and decremented
      the fs_info::scrub_workers_refcnt value in the thread, its ok to drop the
      scrub_lock, and then actually do the btrfs_destroy_workqueue() part. So this
      patch drops the scrub_lock before calling btrfs_destroy_workqueue().
      
        [359.258534] ======================================================
        [359.260305] WARNING: possible circular locking dependency detected
        [359.261938] 5.0.0-rc6-default #461 Not tainted
        [359.263135] ------------------------------------------------------
        [359.264672] btrfs/20975 is trying to acquire lock:
        [359.265927] 00000000d4d32bea ((wq_completion)"%s-%s""btrfs", name){+.+.}, at: flush_workqueue+0x87/0x540
        [359.268416]
        [359.268416] but task is already holding lock:
        [359.270061] 0000000053ea26a6 (&fs_info->scrub_lock){+.+.}, at: btrfs_scrub_dev+0x322/0x590 [btrfs]
        [359.272418]
        [359.272418] which lock already depends on the new lock.
        [359.272418]
        [359.274692]
        [359.274692] the existing dependency chain (in reverse order) is:
        [359.276671]
        [359.276671] -> #3 (&fs_info->scrub_lock){+.+.}:
        [359.278187]        __mutex_lock+0x86/0x9c0
        [359.279086]        btrfs_scrub_pause+0x31/0x100 [btrfs]
        [359.280421]        btrfs_commit_transaction+0x1e4/0x9e0 [btrfs]
        [359.281931]        close_ctree+0x30b/0x350 [btrfs]
        [359.283208]        generic_shutdown_super+0x64/0x100
        [359.284516]        kill_anon_super+0x14/0x30
        [359.285658]        btrfs_kill_super+0x12/0xa0 [btrfs]
        [359.286964]        deactivate_locked_super+0x29/0x60
        [359.288242]        cleanup_mnt+0x3b/0x70
        [359.289310]        task_work_run+0x98/0xc0
        [359.290428]        exit_to_usermode_loop+0x83/0x90
        [359.291445]        do_syscall_64+0x15b/0x180
        [359.292598]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [359.294011]
        [359.294011] -> #2 (sb_internal#2){.+.+}:
        [359.295432]        __sb_start_write+0x113/0x1d0
        [359.296394]        start_transaction+0x369/0x500 [btrfs]
        [359.297471]        btrfs_finish_ordered_io+0x2aa/0x7c0 [btrfs]
        [359.298629]        normal_work_helper+0xcd/0x530 [btrfs]
        [359.299698]        process_one_work+0x246/0x610
        [359.300898]        worker_thread+0x3c/0x390
        [359.302020]        kthread+0x116/0x130
        [359.303053]        ret_from_fork+0x24/0x30
        [359.304152]
        [359.304152] -> #1 ((work_completion)(&work->normal_work)){+.+.}:
        [359.306100]        process_one_work+0x21f/0x610
        [359.307302]        worker_thread+0x3c/0x390
        [359.308465]        kthread+0x116/0x130
        [359.309357]        ret_from_fork+0x24/0x30
        [359.310229]
        [359.310229] -> #0 ((wq_completion)"%s-%s""btrfs", name){+.+.}:
        [359.311812]        lock_acquire+0x90/0x180
        [359.312929]        flush_workqueue+0xaa/0x540
        [359.313845]        drain_workqueue+0xa1/0x180
        [359.314761]        destroy_workqueue+0x17/0x240
        [359.315754]        btrfs_destroy_workqueue+0x57/0x200 [btrfs]
        [359.317245]        scrub_workers_put+0x2c/0x60 [btrfs]
        [359.318585]        btrfs_scrub_dev+0x336/0x590 [btrfs]
        [359.319944]        btrfs_dev_replace_by_ioctl.cold.19+0x179/0x1bb [btrfs]
        [359.321622]        btrfs_ioctl+0x28a4/0x2e40 [btrfs]
        [359.322908]        do_vfs_ioctl+0xa2/0x6d0
        [359.324021]        ksys_ioctl+0x3a/0x70
        [359.325066]        __x64_sys_ioctl+0x16/0x20
        [359.326236]        do_syscall_64+0x54/0x180
        [359.327379]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [359.328772]
        [359.328772] other info that might help us debug this:
        [359.328772]
        [359.330990] Chain exists of:
        [359.330990]   (wq_completion)"%s-%s""btrfs", name --> sb_internal#2 --> &fs_info->scrub_lock
        [359.330990]
        [359.334376]  Possible unsafe locking scenario:
        [359.334376]
        [359.336020]        CPU0                    CPU1
        [359.337070]        ----                    ----
        [359.337821]   lock(&fs_info->scrub_lock);
        [359.338506]                                lock(sb_internal#2);
        [359.339506]                                lock(&fs_info->scrub_lock);
        [359.341461]   lock((wq_completion)"%s-%s""btrfs", name);
        [359.342437]
        [359.342437]  *** DEADLOCK ***
        [359.342437]
        [359.343745] 1 lock held by btrfs/20975:
        [359.344788]  #0: 0000000053ea26a6 (&fs_info->scrub_lock){+.+.}, at: btrfs_scrub_dev+0x322/0x590 [btrfs]
        [359.346778]
        [359.346778] stack backtrace:
        [359.347897] CPU: 0 PID: 20975 Comm: btrfs Not tainted 5.0.0-rc6-default #461
        [359.348983] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014
        [359.350501] Call Trace:
        [359.350931]  dump_stack+0x67/0x90
        [359.351676]  print_circular_bug.isra.37.cold.56+0x15c/0x195
        [359.353569]  check_prev_add.constprop.44+0x4f9/0x750
        [359.354849]  ? check_prev_add.constprop.44+0x286/0x750
        [359.356505]  __lock_acquire+0xb84/0xf10
        [359.357505]  lock_acquire+0x90/0x180
        [359.358271]  ? flush_workqueue+0x87/0x540
        [359.359098]  flush_workqueue+0xaa/0x540
        [359.359912]  ? flush_workqueue+0x87/0x540
        [359.360740]  ? drain_workqueue+0x1e/0x180
        [359.361565]  ? drain_workqueue+0xa1/0x180
        [359.362391]  drain_workqueue+0xa1/0x180
        [359.363193]  destroy_workqueue+0x17/0x240
        [359.364539]  btrfs_destroy_workqueue+0x57/0x200 [btrfs]
        [359.365673]  scrub_workers_put+0x2c/0x60 [btrfs]
        [359.366618]  btrfs_scrub_dev+0x336/0x590 [btrfs]
        [359.367594]  ? start_transaction+0xa1/0x500 [btrfs]
        [359.368679]  btrfs_dev_replace_by_ioctl.cold.19+0x179/0x1bb [btrfs]
        [359.369545]  btrfs_ioctl+0x28a4/0x2e40 [btrfs]
        [359.370186]  ? __lock_acquire+0x263/0xf10
        [359.370777]  ? kvm_clock_read+0x14/0x30
        [359.371392]  ? kvm_sched_clock_read+0x5/0x10
        [359.372248]  ? sched_clock+0x5/0x10
        [359.372786]  ? sched_clock_cpu+0xc/0xc0
        [359.373662]  ? do_vfs_ioctl+0xa2/0x6d0
        [359.374552]  do_vfs_ioctl+0xa2/0x6d0
        [359.375378]  ? do_sigaction+0xff/0x250
        [359.376233]  ksys_ioctl+0x3a/0x70
        [359.376954]  __x64_sys_ioctl+0x16/0x20
        [359.377772]  do_syscall_64+0x54/0x180
        [359.378841]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [359.380422] RIP: 0033:0x7f5429296a97
      
      Backporting to older kernels: scrub_nocow_workers must be freed the same
      way as the others.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarAnand Jain <anand.jain@oracle.com>
      [ update changelog ]
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      936690bd
    • David Sterba's avatar
      btrfs: scrub: move scrub_setup_ctx allocation out of device_list_mutex · ff55333f
      David Sterba authored
      [ Upstream commit 0e94c4f4 ]
      
      The scrub context is allocated with GFP_KERNEL and called from
      btrfs_scrub_dev under the fs_info::device_list_mutex. This is not safe
      regarding reclaim that could try to flush filesystem data in order to
      get the memory. And the device_list_mutex is held during superblock
      commit, so this would cause a lockup.
      
      Move the alocation and initialization before any changes that require
      the mutex.
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ff55333f