1. 30 Sep, 2020 2 commits
  2. 29 Sep, 2020 11 commits
  3. 25 Sep, 2020 1 commit
  4. 24 Sep, 2020 13 commits
  5. 22 Sep, 2020 9 commits
  6. 18 Sep, 2020 4 commits
    • RDMA/ucma: Rework ucma_migrate_id() to avoid races with destroy · f5449e74
      Jason Gunthorpe authored
      ucma_destroy_id() assumes that all things accessing the ctx will do so via
      the xarray. This assumption is violated only when the FD is being closed,
      in which case the ctx is reached via the ctx_list. Normally this is OK
      since ucma_destroy_id() cannot run concurrently with release(), however
      when ucma_migrate_id() is involved this can be violated as the close of
      the 2nd FD can run concurrently with destroy on the first:
      
                      CPU0                      CPU1
              ucma_destroy_id(fda)
                                        ucma_migrate_id(fda -> fdb)
                                             ucma_get_ctx()
              xa_lock()
               _ucma_find_context()
               xa_erase()
              xa_unlock()
                                             xa_lock()
                                              ctx->file = new_file
                                              list_move()
                                             xa_unlock()
                                            ucma_put_ctx()
      
                                         ucma_close(fdb)
                                            _destroy_id()
                                            kfree(ctx)
      
              _destroy_id()
                wait_for_completion()
                // boom, ctx was freed
      
      The ctx->file must be modified under the handler and xa_lock, and prior to
      modification the ID must be rechecked to confirm it is still reachable from
      cur_file, ie there is no parallel destroy or migrate (as sketched below).
      
      To make this work remove the double locking and streamline the control
      flow. The double locking was obsoleted by the handler lock now directly
      preventing new uevents from being created, and the ctx_list cannot be read
      while holding fgets on both files. Removing the double locking also
      removes the need to check for the same file.
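
      A minimal sketch of the recheck-under-xa_lock pattern described above; the
      demo_* structures and helpers are simplified stand-ins, not the actual ucma
      code:

        #include <linux/errno.h>
        #include <linux/list.h>
        #include <linux/xarray.h>

        struct demo_file {
                struct list_head ctx_list;      /* ctxs owned by this FD */
        };

        struct demo_ctx {
                struct demo_file *file;         /* FD that currently owns the ctx */
                struct list_head list;          /* entry on file->ctx_list */
        };

        /*
         * Move @ctx from @cur_file to @new_file. The recheck under xa_lock is
         * the point: if a concurrent destroy already did xa_erase(), or a
         * concurrent migrate already moved the ctx, then ctx->file no longer
         * equals cur_file and the migrate must bail out instead of touching a
         * ctx that may be about to be freed. (The real ucma code also holds
         * the handler lock across this, per the text above.)
         */
        static int demo_migrate(struct xarray *xa, struct demo_ctx *ctx,
                                struct demo_file *cur_file,
                                struct demo_file *new_file)
        {
                int ret = 0;

                xa_lock(xa);
                if (ctx->file != cur_file) {
                        ret = -ENOENT;
                        goto unlock;
                }
                ctx->file = new_file;
                list_move_tail(&ctx->list, &new_file->ctx_list);
        unlock:
                xa_unlock(xa);
                return ret;
        }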
      
      Fixes: 88314e4d ("RDMA/cma: add support for rdma_migrate_id()")
      Link: https://lore.kernel.org/r/0-v1-05c5a4090305+3a872-ucma_syz_migrate_jgg@nvidia.com
      Reported-and-tested-by: syzbot+cc6fc752b3819e082d0c@syzkaller.appspotmail.com
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • RDMA/mlx5: Clarify what the UMR is for when creating MRs · 8383da3e
      Jason Gunthorpe authored
      Once a mkey is created it can be modified using UMR. This is desirable for
      performance reasons. However, different hardware has restrictions on what
      modifications are possible using UMR. Make sense of these checks:
      
      - mlx5_ib_can_reconfig_with_umr() returns true if the access flags can be
        altered. Most cases create MRs using 0 access flags (now made clear by
        consistent use of set_mkc_access_pd_addr_fields()), but the old logic
        here was tormented. Make it clear that this is checking if the current
        access_flags can be modified using UMR to different access_flags. It is
        always OK to use UMR to change flags that all HW supports.
      
      - mlx5_ib_can_load_pas_with_umr() returns true if UMR can be used to
        enable and update the PAS/XLT. Enabling requires updating the entity
        size, so UMR ends up completely disabled on this old hardware. Make it
        clear why it is disabled. FRWR, ODP and cache always require
        mlx5_ib_can_load_pas_with_umr().
      
      - mlx5_ib_pas_fits_in_mr() is used to tell if an existing MR can be
        resized to hold a new PAS list. This only works for cached MRs because
        we don't store the PAS list size in other cases.
      
      To be very clear, arrange things so any pre-created MRs in the cache
      check the newly requested access_flags before allowing the MR to leave the
      cache (sketched below). If UMR cannot set the required access_flags the
      cache fails to create the MR.
      
      This in turn means relaxed ordering and atomic are now correctly blocked
      early for implicit ODP on older HW.
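
      A hedged sketch of the cache gating described above; the demo_* names are
      simplified stand-ins for the real helpers (the commit's actual predicate is
      mlx5_ib_can_reconfig_with_umr()):

        #include <linux/err.h>
        #include <linux/errno.h>
        #include <linux/types.h>

        struct demo_dev;
        struct demo_mr;

        /* Stand-ins for the predicates/cache helpers named in the commit. */
        bool demo_can_reconfig_with_umr(struct demo_dev *dev,
                                        unsigned int current_access_flags,
                                        unsigned int target_access_flags);
        struct demo_mr *demo_cache_pop(struct demo_dev *dev,
                                       unsigned int access_flags);

        /*
         * Cached mkeys are created with 0 access flags, so a pre-created MR
         * may only leave the cache if UMR can reconfigure it to the
         * access_flags the caller asked for; otherwise the cache refuses to
         * produce the MR.
         */
        static struct demo_mr *demo_mr_cache_get(struct demo_dev *dev,
                                                 unsigned int access_flags)
        {
                if (!demo_can_reconfig_with_umr(dev, 0, access_flags))
                        return ERR_PTR(-EOPNOTSUPP);

                return demo_cache_pop(dev, access_flags);
        }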
      
      Link: https://lore.kernel.org/r/20200914112653.345244-6-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • RDMA/mlx5: Disable IB_DEVICE_MEM_MGT_EXTENSIONS if IB_WR_REG_MR can't work · 0ec52f01
      Jason Gunthorpe authored
      set_reg_wr() always fails when umr_modify_entity_size_disabled is set,
      because mlx5_ib_can_use_umr() always fails. Without set_reg_wr() IB_WR_REG_MR
      doesn't work and that means the device should not advertise
      IB_DEVICE_MEM_MGT_EXTENSIONS.
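
      A short sketch of the capability gating this implies; the umr_can_load_pas
      field is a made-up stand-in for whatever the driver actually checks:

        #include <linux/types.h>
        #include <rdma/ib_verbs.h>

        struct demo_dev {
                bool umr_can_load_pas;  /* stand-in for the real UMR capability check */
        };

        static void demo_set_device_caps(struct demo_dev *dev,
                                         struct ib_device_attr *attr)
        {
                /*
                 * IB_WR_REG_MR needs set_reg_wr(), which needs UMR to be able
                 * to load the PAS/XLT. If it cannot, do not advertise FRWR
                 * support at all.
                 */
                if (dev->umr_can_load_pas)
                        attr->device_cap_flags |= IB_DEVICE_MEM_MGT_EXTENSIONS;
        }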
      
      Fixes: 841b07f9 ("IB/mlx5: Block MR WR if UMR is not possible")
      Link: https://lore.kernel.org/r/20200914112653.345244-5-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • RDMA/mlx5: Make mkeys always owned by the kernel's PD when not enabled · 5eb29f0d
      Jason Gunthorpe authored
      Any mkey that is not enabled and assigned to userspace should have the PD
      set to a kernel owned PD.
      
      When cache entries are created for the first time the PDN is set to 0,
      which is probably a kernel PD, but be explicit.
      
      When an MR is registered using the hybrid reg_create with UMR xlt & enable,
      the disabled mkey points at the user PD; keep it pointing at the kernel PD
      until a UMR enables it and sets the user PD.
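
      A minimal sketch of that rule, assuming a made-up demo_dev holding a
      kernel-owned PDN; only MLX5_SET() and the mkc 'pd' field are real:

        #include <linux/mlx5/device.h>
        #include <linux/mlx5/mlx5_ifc.h>

        struct demo_dev {
                u32 kernel_pdn;         /* stand-in for the driver's kernel-owned PD */
        };

        /*
         * Until UMR enables the mkey (and sets the user PD), the disabled mkey
         * context references a kernel-owned PD, never the user's PD.
         */
        static void demo_init_disabled_mkey(struct demo_dev *dev, void *mkc)
        {
                MLX5_SET(mkc, mkc, pd, dev->kernel_pdn);
        }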
      
      Fixes: 9ec4483a ("IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache")
      Link: https://lore.kernel.org/r/20200914112653.345244-4-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>