1. 22 Sep, 2017 11 commits
    • i40iw: Prevent multiple netdev event notifier registrations · 47fb3c16
      Shiraz Saleem authored
      Netdev event notifier registration/de-registration is not
      synchronized with a lock and there is a possibility of a
      duplicate registration of notifier before the unregister
      completes.
      
      Register netdev event notifiers during module init and
      de-register them at module exit.
      
      This avoids the need to tie the registration to first netdev
      client interface open and de-registration to last client
      interface close and the synchronization to achieve it.
      
      This also fixes a crash due to duplicate registration.
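      A minimal userspace sketch of why a duplicate registration is
      harmful, assuming a simplified singly linked chain. All names here
      (struct notifier, chain_register_unchecked, model_module_init) are
      hypothetical; this is not the i40iw or kernel notifier code.
      Registering the same node twice leaves it pointing at itself, so a
      single unregister cannot take it off the chain.

```c
#include <stddef.h>

/* Hypothetical, simplified notifier chain; not the kernel implementation. */
struct notifier {
    struct notifier *next;
};

struct notifier *chain_head;
struct notifier my_notifier;

/* Append to the tail without checking for duplicates. */
void chain_register_unchecked(struct notifier *n)
{
    struct notifier **pp = &chain_head;

    while (*pp != NULL)
        pp = &(*pp)->next;
    n->next = NULL;
    *pp = n;    /* on a second call for the same node, pp == &n->next */
}

/* Unlink the first occurrence of n, if any. */
void chain_unregister(struct notifier *n)
{
    struct notifier **pp = &chain_head;

    while (*pp != NULL) {
        if (*pp == n) {
            *pp = n->next;
            return;
        }
        pp = &(*pp)->next;
    }
}

/* Modeled fix: exactly one register at init, one unregister at exit. */
void model_module_init(void) { chain_register_unchecked(&my_notifier); }
void model_module_exit(void) { chain_unregister(&my_notifier); }
```

      In this model, after two registrations and one unregistration the
      chain head still refers to the node, which is the kind of stale
      entry that can be walked after the module is gone.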
      
      BUG: unable to handle kernel paging request at ffffffffa0d60388
      IP: [<ffffffff8160f75d>] notifier_call_chain+0x3d/0x70
      PGD 190d067 PUD 190e063 PMD 76c840067 PTE 0
      Oops: 0000 [#1] SMP
      Modules linked in: i40e(OF-) fuse btrfs zlib_deflate raid6_pq xor vfat msdos
      [..]
      e1000e vxlan ip_tunnel ptp pps_core i2c_core video [last unloaded: i40iw]
      CPU: 1 PID: 27101 Comm: modprobe Tainted: GF       W  O--------------   3.10.0-229.el7.x86_64 #1
      Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H, BIOS F7 01/17/2014
      task: ffff88076e8a96c0 ti: ffff8806959c8000 task.ti: ffff8806959c8000
      RIP: 0010:[<ffffffff8160f75d>]  [<ffffffff8160f75d>] notifier_call_chain+0x3d/0x70
      RSP: 0018:ffff8806959cbb38  EFLAGS: 00010282
      RAX: ffffffffa0d60380 RBX: 00000000fffffffd RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff88081227a000 RDI: 0000000000000002
      RBP: ffff8806959cbb60 R08: 0000000000000246 R09: 000000000000700c
      R10: ffff88080e16ea40 R11: 00000000000ae8df R12: ffffffffa0d60380
      R13: 0000000000000002 R14: ffff88076e738800 R15: 0000000000000000
      FS:  00007f604ef4a740(0000) GS:ffff88083e240000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffa0d60388 CR3: 0000000753cd2000 CR4: 00000000001407e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Stack:
      ffffffff819e73a0 0000000000000000 0000000000000002 ffff88076e738800
      00000000ffffffff ffff8806959cbba0 ffffffff8109d61d 0000000000000000
      0000000000000000 ffff88076e738800 0000000000000000 ffff88076e738800
      Call Trace:
      [<ffffffff8109d61d>] __blocking_notifier_call_chain+0x4d/0x70
      [<ffffffff8109d656>] blocking_notifier_call_chain+0x16/0x20
      [<ffffffff8156b9e4>] __inet_del_ifa+0x154/0x2b0
      [<ffffffff8156d102>] inetdev_event+0x182/0x530
      [<ffffffff8160f76c>] notifier_call_chain+0x4c/0x70
      [<ffffffff8109d446>] raw_notifier_call_chain+0x16/0x20
      [<ffffffff814f71fd>] call_netdevice_notifiers+0x2d/0x60
      [<ffffffff814f8845>] rollback_registered_many+0x105/0x220
      [<ffffffff814f89a0>] rollback_registered+0x40/0x70
      [<ffffffff814f9c88>] unregister_netdevice_queue+0x48/0x80
      [<ffffffff814f9cdc>] unregister_netdev+0x1c/0x30
      [<ffffffffa0067139>] i40e_vsi_release+0x2a9/0x2b0 [i40e]
      [<ffffffffa00674e8>] i40e_remove+0x128/0x2b0 [i40e]
      [<ffffffff813092db>] pci_device_remove+0x3b/0xb0
      [<ffffffff813d26ef>] __device_release_driver+0x7f/0xf0
      [<ffffffff813d3068>] driver_detach+0xb8/0xc0
      [<ffffffff813d22db>] bus_remove_driver+0x9b/0x120
      [<ffffffff813d36dc>] driver_unregister+0x2c/0x50
      [<ffffffff81307d4c>] pci_unregister_driver+0x2c/0x90
      [<ffffffffa008f9d0>] i40e_exit_module+0x10/0x23 [i40e]
      [<ffffffff810dad0b>] SyS_delete_module+0x16b/0x2d0
      [<ffffffff81013b0c>] ? do_notify_resume+0x9c/0xb0
      [<ffffffff81613da9>] system_call_fastpath+0x16/0x1b
      Code: e5 41 57 4d 89 c7 41 56 49 89 d6 41 55 49 89 f5 41 54 53 89 cb
      75 14 eb 3d 0f 1f 44 00 00 83 eb 01 74 25 4d 85 e4 74 20 4c 89 e0 <4c>
      8b 60 08 4c 89 f2 4c 89 ee 48 89 c7 ff 10 4d 85 ff 74 04 41
      RIP  [<ffffffff8160f75d>] notifier_call_chain+0x3d/0x70
      Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • i40iw: Fail open if there are no available MSI-X vectors · cd9100ca
      Shiraz Saleem authored
      Check number of available MSI-X vectors for i40iw.
      If there are no available vectors, fail the open.
      Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • RDMA/vmw_pvrdma: Fix reporting correct opcodes for completion · 01df7f5a
      Adit Ranadive authored
      Since the IB_WC_BIND_MW opcode has been dropped, set the correct
      IB WC opcode explicitly.
      
      Fixes: 29c8d9eb ("IB: Add vmw_pvrdma driver")
      Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
      Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
      Signed-off-by: Adit Ranadive <aditr@vmware.com>
      Signed-off-by: Bryan Tan <bryantan@vmware.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/bnxt_re: Fix frame stack compilation warning · e13547bc
      Leon Romanovsky authored
      Reduce stack size by dynamically allocating memory instead
      of declaring large struct on the stack:
      
      drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function ‘bnxt_re_query_qp’:
      drivers/infiniband/hw/bnxt_re/ib_verbs.c:1600:1: warning: the frame size of 1216 bytes is larger than 1024 bytes [-Wframe-larger-than=]
       }
       ^
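      The idea can be sketched in userspace C as follows. This is only an
      illustration: struct big_query, do_query() and query_state() are
      hypothetical stand-ins, and the kernel code would use its own
      allocator (for example kzalloc()/kfree()) rather than calloc()/free().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a structure too large for the stack budget. */
struct big_query {
    unsigned char raw[1200];
    int state;
};

/* Hypothetical query routine that fills the structure. */
void do_query(struct big_query *q)
{
    memset(q->raw, 0, sizeof(q->raw));
    q->state = 3;
}

/* Returns 0 and stores the state on success, -1 if allocation fails. */
int query_state(int *state_out)
{
    struct big_query *q = calloc(1, sizeof(*q));  /* heap, not stack */

    if (q == NULL)
        return -1;

    do_query(q);
    *state_out = q->state;

    free(q);   /* every exit path after allocation must free */
    return 0;
}
```

      The trade-off is an extra failure path (allocation can fail) in
      exchange for a small, bounded stack frame.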
      
      Cc: Selvin Xavier <selvin.xavier@broadcom.com>
      Fixes: 1ac5a404 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
      Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: fix debugfs cleanup · cbafad87
      Sudip Mukherjee authored
      If delay_drop_debugfs_init() fails in any of the operations to create
      debugfs, it is calling delay_drop_debugfs_cleanup() as part of its
      cleanup. But delay_drop_debugfs_cleanup() checks for 'dbg' and since
      we have not yet pointed 'dbg' to the debugfs we need to clean up, the
      cleanup fails and we are left with stray debugfs elements and also a
      memory leak.
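      A minimal model of the ordering issue, assuming a cleanup routine
      that only acts on a pointer stored in the device. The types and
      functions (struct device_model, dbg_init, dbg_cleanup) are
      hypothetical, not the mlx5 code: the pointer is published before the
      steps that can fail, so the error path can release it.

```c
#include <stdlib.h>

struct dbg_state { int entries_created; };
struct device_model { struct dbg_state *dbg; };

int live_allocations;   /* outstanding allocations in this model */

void dbg_cleanup(struct device_model *dev)
{
    if (dev->dbg == NULL)   /* nothing recorded, nothing to release */
        return;
    free(dev->dbg);
    dev->dbg = NULL;
    live_allocations--;
}

/* fail_step simulates a failure while creating one of the entries. */
int dbg_init(struct device_model *dev, int fail_step)
{
    struct dbg_state *dbg = calloc(1, sizeof(*dbg));

    if (dbg == NULL)
        return -1;
    live_allocations++;

    dev->dbg = dbg;   /* publish first so the error path can find it */

    if (fail_step) {
        dbg_cleanup(dev);
        return -1;
    }
    dbg->entries_created = 1;
    return 0;
}
```

      If the assignment to dev->dbg were moved after the failing step,
      dbg_cleanup() would return early and the allocation would leak.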
      
      Fixes: 4a5fd5d2 ("IB/mlx5: Add necessary delay drop assignment")
      Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Acked-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/ocrdma: fix incorrect fall-through on switch statement · 06564f60
      Colin Ian King authored
      In the case where mbox_status is OCRDMA_MBX_STATUS_FAILED and
      add_status is OCRDMA_MBX_STATUS_FAILED, err_num is assigned -EAGAIN.
      However, the case OCRDMA_MBX_STATUS_FAILED is missing a break and
      falls through to the default case, which then re-assigns err_num
      to -EFAULT. Fix this so that err_num is assigned -EAGAIN
      for the add_status OCRDMA_MBX_STATUS_FAILED case and -EFAULT
      otherwise.
      
      Detected by CoverityScan CID#703125 ("Missing break in switch")
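      The corrected control flow can be sketched like this. The enum
      values and status_to_errno() are hypothetical placeholders for
      illustration; the real OCRDMA_MBX_* constants and the surrounding
      driver function are not reproduced here.

```c
#include <errno.h>

/* Hypothetical status codes used only in this sketch. */
enum { MBX_OK, MBX_FAILED, MBX_OTHER };
enum { ADD_RETRYABLE, ADD_OTHER };

int status_to_errno(int mbox_status, int add_status)
{
    int err_num;

    switch (mbox_status) {
    case MBX_OK:
        err_num = 0;
        break;
    case MBX_FAILED:
        if (add_status == ADD_RETRYABLE)
            err_num = -EAGAIN;
        else
            err_num = -EFAULT;
        break;   /* without this break, default would overwrite err_num */
    default:
        err_num = -EFAULT;
        break;
    }
    return err_num;
}
```

      Removing the marked break reproduces the bug in this model: the
      retryable case would also end up returning -EFAULT.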
      
      Fixes: fe2caefc ("RDMA/ocrdma: Add driver for Emulex OneConnect IBoE RDMA adapter")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/ipoib: Suppress the retry related completion errors · af3c79be
      Santosh Shilimkar authored
      IPoIB doesn't support transport/rnr retry schemes as per
      RFC so those errors are expected. No need to flood the
      log files with them.
      Tested-by: Michael Nowak <michael.nowak@oracle.com>
      Tested-by: Rafael Alejandro Peralez <rafael.peralez@oracle.com>
      Tested-by: Liwen Huang <liwen.huang@oracle.com>
      Tested-by: Hong Liu <hong.x.liu@oracle.com>
      Reviewed-by: Mukesh Kacker <mukesh.kacker@oracle.com>
      Reported-by: Rajiv Raja <rajiv.raja@oracle.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • iw_cxgb4: remove the stid on listen create failure · 8b1bbf36
      Steve Wise authored
      If a listen create fails, then the server tid (stid) is incorrectly left
      in the stid idr table, which can cause a touch-after-free if the stid
      is looked up and the already freed endpoint is touched.  So make sure
      to remove it in the error path.
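      A minimal sketch of the error-path rule, assuming a plain array in
      place of the stid idr. The names (stid_table, create_listen,
      struct listen_ep) are hypothetical and not taken from iw_cxgb4: the
      entry is unpublished before the endpoint is freed, so a later lookup
      cannot reach freed memory.

```c
#include <stdlib.h>

#define MAX_STID 8

struct listen_ep { int stid; };

struct listen_ep *stid_table[MAX_STID];   /* stands in for the stid idr */

/* start_ok simulates whether the hardware listen request succeeds. */
struct listen_ep *create_listen(int stid, int start_ok)
{
    struct listen_ep *ep;

    if (stid < 0 || stid >= MAX_STID || stid_table[stid] != NULL)
        return NULL;

    ep = malloc(sizeof(*ep));
    if (ep == NULL)
        return NULL;
    ep->stid = stid;
    stid_table[stid] = ep;          /* publish for lookups */

    if (!start_ok) {
        stid_table[stid] = NULL;    /* error path: unpublish first */
        free(ep);                   /* then free */
        return NULL;
    }
    return ep;
}
```

      Skipping the unpublish step would leave stid_table[stid] pointing
      at freed memory in this model.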
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • iw_cxgb4: drop listen destroy replies if no ep found · 3c8415cc
      Steve Wise authored
      If the thread waiting for a CLOSE_LISTSRV_RPL times out and bails,
      then we need to handle a subsequent CPL if it arrives and the stid has
      been released.  In this case silently drop it.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • iw_cxgb4: put ep reference in pass_accept_req() · 3d318605
      Steve Wise authored
      The listening endpoint should always be dereferenced at the end of
      pass_accept_req().
      
      Fixes: f86fac79 ("RDMA/iw_cxgb4: atomic find and reference for listening endpoints")
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Fix for core panic · e6f9bc34
      Alex Estrin authored
      Build with the latest patches resulted in panic:
      [11384.486289] BUG: unable to handle kernel NULL pointer dereference at
               (null)
      [11384.486293] IP:           (null)
      [11384.486295] PGD 0
      [11384.486295] P4D 0
      [11384.486296]
      [11384.486299] Oops: 0010 [#1] SMP
      ......... snip ......
      [11384.486401] CPU: 0 PID: 968 Comm: kworker/0:1H Tainted: G        W  O
          4.13.0-a-stream-20170825 #1
      [11384.486402] Hardware name: Intel Corporation S2600WT2R/S2600WT2R,
      BIOS SE5C610.86B.01.01.0014.121820151719 12/18/2015
      [11384.486418] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
      [11384.486419] task: ffff880850579680 task.stack: ffffc90007fec000
      [11384.486420] RIP: 0010:          (null)
      [11384.486420] RSP: 0018:ffffc90007fef970 EFLAGS: 00010206
      [11384.486421] RAX: ffff88084cfe8000 RBX: ffff88084dce4000 RCX:
      ffffc90007fef978
      [11384.486422] RDX: 0000000000000000 RSI: 0000000000000001 RDI:
      ffff88084cfe8000
      [11384.486422] RBP: ffffc90007fefab0 R08: 0000000000000000 R09:
      ffff88084dce4080
      [11384.486423] R10: ffffffffa02d7f60 R11: 0000000000000000 R12:
      ffff88105af65a00
      [11384.486423] R13: ffff88084dce4000 R14: 000000000000c000 R15:
      000000000000c000
      [11384.486424] FS:  0000000000000000(0000) GS:ffff88085f400000(0000)
      knlGS:0000000000000000
      [11384.486425] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [11384.486425] CR2: 0000000000000000 CR3: 0000000001c09000 CR4:
      00000000001406f0
      [11384.486426] Call Trace:
      [11384.486431]  ? is_valid_mcast_lid.isra.21+0xfb/0x110 [ib_core]
      [11384.486436]  ib_attach_mcast+0x6f/0xa0 [ib_core]
      [11384.486441]  ipoib_mcast_attach+0x81/0x190 [ib_ipoib]
      [11384.486443]  ipoib_mcast_join_complete+0x354/0xb40 [ib_ipoib]
      [11384.486448]  mcast_work_handler+0x330/0x6c0 [ib_core]
      [11384.486452]  join_handler+0x101/0x220 [ib_core]
      [11384.486455]  ib_sa_mcmember_rec_callback+0x54/0x80 [ib_core]
      [11384.486459]  recv_handler+0x3a/0x60 [ib_core]
      [11384.486462]  ib_mad_recv_done+0x423/0x9b0 [ib_core]
      [11384.486466]  __ib_process_cq+0x5d/0xb0 [ib_core]
      [11384.486469]  ib_cq_poll_work+0x20/0x60 [ib_core]
      [11384.486472]  process_one_work+0x149/0x360
      [11384.486474]  worker_thread+0x4d/0x3c0
      [11384.486487]  kthread+0x109/0x140
      [11384.486488]  ? rescuer_thread+0x380/0x380
      [11384.486489]  ? kthread_park+0x60/0x60
      [11384.486490]  ? kthread_park+0x60/0x60
      [11384.486493]  ret_from_fork+0x25/0x30
      [11384.486493] Code:  Bad RIP value.
      [11384.486493] Code:  Bad RIP value.
      [11384.486496] RIP:           (null) RSP: ffffc90007fef970
      [11384.486497] CR2: 0000000000000000
      [11384.486531] ---[ end trace b1acec6fb4ff6e75 ]---
      [11384.532133] Kernel panic - not syncing: Fatal exception
      [11384.536541] Kernel Offset: disabled
      [11384.969491] ---[ end Kernel panic - not syncing: Fatal exception
      [11384.976875] sched: Unexpected reschedule of offline CPU#1!
      [11384.983646] ------------[ cut here ]------------
      
      An RDMA device driver may not have implemented (*get_link_layer)(),
      so it cannot be called directly. Use the appropriate helper function
      instead.
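      The pattern can be sketched in userspace C as below. The types and
      names (struct rdma_dev_model, port_link_layer, eth_get_link_layer)
      are hypothetical, and the InfiniBand fallback is an assumption made
      for this model. In the kernel, rdma_port_get_link_layer() is the
      helper that plays this role.

```c
#include <stddef.h>

enum link_layer { LL_UNSPECIFIED, LL_INFINIBAND, LL_ETHERNET };

struct rdma_dev_model {
    /* Optional callback: a driver may leave this NULL. */
    enum link_layer (*get_link_layer)(struct rdma_dev_model *dev, int port);
};

/* Example driver callback for a RoCE-like device. */
enum link_layer eth_get_link_layer(struct rdma_dev_model *dev, int port)
{
    (void)dev;
    (void)port;
    return LL_ETHERNET;
}

/* Helper: never call the optional callback without checking it. */
enum link_layer port_link_layer(struct rdma_dev_model *dev, int port)
{
    if (dev->get_link_layer != NULL)
        return dev->get_link_layer(dev, port);
    return LL_INFINIBAND;   /* assumed default when no callback is set */
}
```

      Calling dev->get_link_layer() directly on a device that leaves it
      NULL is a jump to address zero, which is consistent with the
      "IP: (null)" in the trace above.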
      Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
      Fixes: 52363335 ("IB/core: Fix the validations of a multicast LID in attach or detach operations")
      Cc: stable@kernel.org # 4.13
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Alex Estrin <alex.estrin@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  2. 31 Aug, 2017 15 commits
    • IB/core: Expose ioctl interface through experimental Kconfig · 8eb19e8e
      Matan Barak authored
      Add CONFIG_INFINIBAND_EXP_USER_ACCESS that enables the ioctl
      interface. This interface is experimental and is subject to change.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Assign root to all drivers · 52427112
      Matan Barak authored
      In order to use the parsing tree, we need to assign the root
      to all drivers. Currently, we just assign the default parsing
      tree via ib_uverbs_add_one. The driver could override this by
      assigning a parsing tree prior to registering the device.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add completion queue (cq) object actions · 9ee79fce
      Matan Barak authored
      Adding CQ ioctl actions:
      1. create_cq
      2. destroy_cq
      
      This requires adding the following:
      1. A specification describing the method
      	a. Handler
      	b. Attributes specification
      		Each attribute is one of the following:
      		a. PTR_IN - input data
      			    Note: This could be encoded inlined for
      				  data < 64bit
      		b. PTR_OUT - response data
      		c. IDR - idr based object
      		d. FD - fd based object
      		Blob attributes (clauses a and b) contain their type,
      		while object specifications (clauses c and d)
      		contain the expected object type (for example, the
      		given id should be UVERBS_TYPE_PD) and the required
      		access (READ, WRITE, NEW or DESTROY). If NEW is
      		required, the new object's id will be assigned to this
      		attribute. All attributes can carry UA_FLAGS.
      		Currently we support stating that an
      		attribute is mandatory or that the specification size
      		corresponds to a lower bound (and that this attribute
      		could be extended).
      		We currently add both default attributes and the two
      		generic UHW_IN and UHW_OUT driver specific attributes.
      2. Handler
         A handler gets a uverbs_attr_bundle. The handler developer uses
         uverbs_attr_get to fetch an attribute of a given id.
         Each of these attribute groups corresponds to the specification
         group defined in the action (clauses 1.b and 1.c respectively).
         The indices of these arrays correspond to the attribute ids
         declared in the specifications (clause 2).
      
         The handler is quite simple. It assumes the infrastructure fetched
         all objects and locked, created or destroyed them as required by
         the specification. Pointer (or blob) attributes were validated to
         match their required sizes. After the handler finished, the
         infrastructure commits or rolls back the objects.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add legacy driver's user-data · d70724f1
      Matan Barak authored
      In this phase, we don't want to change all the drivers to use
      flexible driver's specific attributes. Therefore, we add two default
      attributes: UHW_IN and UHW_OUT. These attributes are optional in some
      methods and they encode the driver specific command data. We add
      a function that extracts this data and creates the legacy udata over
      it.
      
      Driver's data should start from UVERBS_UDATA_DRIVER_DATA_FLAG. This
      turns on the first bit of the namespace, indicating this attribute
      belongs to the driver's namespace.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Export ioctl enum types to user-space · 64b19e13
      Matan Barak authored
      Add a new ib_user_ioctl_verbs.h which exports all required ABI
      enums and structs to the user-space.
      Export the default types to user-space through this file.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Explicitly destroy an object while keeping uobject · 4da70da2
      Matan Barak authored
      When some objects are destroyed, we need to extract their status at
      destruction. After object's destruction, this status
      (e.g. events_reported) resides in the uobject. In order to have the
      latest and correct status, the underlying object should be destroyed,
      but we should keep the uobject alive and read this information off the
      uobject. We introduce a rdma_explicit_destroy function. This function
      destroys the class type object (for example, the IDR class type which
      destroys the underlying object as well) and then converts the uobject
      to be of a null class type. This uobject will then be destroyed as any
      other uobject once uverbs_finalize_object[s] is called.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add macros for declaring methods and attributes · 35410306
      Matan Barak authored
      This patch adds macros for declaring objects, methods and
      attributes. These definitions are later used by downstream patches
      to declare some of the default types.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add uverbs merge trees functionality · 118620d3
      Matan Barak authored
      Different drivers support different features and even subsets of the
      common uverbs implementation. Currently, this is handled as a bitmask
      in every driver that represents which kind of methods it supports, but
      doesn't go down to attributes granularity. Moreover, drivers might
      want to add their specific types, methods and attributes to let
      their user-space counter-parts be exposed to some more efficient
      abstractions. It means that existence of different features is
      validated syntactically via the parsing infrastructure rather than
      using a complex in-handler logic.
      
      In order to do that, we allow defining features and abstractions
      as parsing trees. These per-feature parsing trees can be merged
      into an efficient (perfect-hash based) parsing tree, which is later
      used by the parsing infrastructure.
      
      To sum it up, this makes a parse tree unique for a device and
      represents only the features this particular device supports.
      This is done by having a root specification tree per feature.
      Before a device registers itself as an IB device, it merges
      all these trees into one parsing tree. This parsing tree
      is used to parse all user-space commands.
      
      A future user-space application could read this parse tree. This
      tree represents which objects, methods and attributes are
      supported by this device.
      
      This is based on the idea of
      Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add DEVICE object and root tree structure · 09e3ebf8
      Matan Barak authored
      This adds the DEVICE object. This object supports creating the context
      that all objects are created from. Moreover, it supports executing
      methods which are related to the device itself, such as QUERY_DEVICE.
      This is a singleton object (per file instance).
      
      All standard objects are put in the root structure. This root will later
      on be used in drivers as the source for their whole parsing tree.
      Later on, when new features are added, these drivers could mix this root
      with other customized objects.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Declare an object instead of declaring only type attributes · 5009010f
      Matan Barak authored
      Switch all uverbs_type_attrs_xxxx with DECLARE_UVERBS_OBJECT
      macros. This will be later used in order to embed the object
      specific methods in the objects as well.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add new ioctl interface · fac9658c
      Matan Barak authored
      In this ioctl interface, processing the command starts from
      properties of the command and fetching the appropriate user objects
      before calling the handler.
      
      Parsing and validation is done according to a specifier declared by
      the driver's code. In the driver, all supported objects are declared.
      These objects are separated into different object namespaces. Dividing
      objects into namespaces is done at initialization by using the higher
      bits of the object ids. This initialization can mix objects declared
      in different places into one parsing tree used in this ioctl interface.
      
      For each object we list all supported methods. Similarly to objects,
      methods are separated to method namespaces too. Namespacing is done
      similarly to the objects case. This could be used in order to add
      methods to an existing object.
      
      Each method has a specific handler, which could be either a default
      handler or a driver specific handler.
      Along with the handler, a bunch of attributes are specified as well.
      Similarly to objects and methods, attributes are namespaced and hashed
      by their ids at initialization too. All supported attributes are
      subject to automatic fetching and validation. These attributes include
      the command, response and the method's related objects' ids.
      
      When these entities (objects, methods and attributes) are used, the
      high bits of the entities ids are used in order to calculate the hash
      bucket index. Then, these high bits are masked out in order to have a
      zero based index. Since we use these high bits for both bucketing and
      namespacing, we get a compact representation and O(1) array access.
      This is mandatory for efficient dispatching.
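      The lookup described above can be sketched as follows. The 4-bit
      namespace / 12-bit index split, and every name in the block
      (NS_SHIFT, NS_MASK, struct bucket, find_spec, the example tables),
      are assumptions made for illustration rather than the kernel's
      actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed id layout for this sketch: top 4 bits namespace, low 12 index. */
#define NS_SHIFT 12
#define NS_MASK  0xF000u

struct attr_spec { int type; };

struct bucket {
    const struct attr_spec *specs;
    size_t count;
};

/* O(1) lookup: high bits choose the bucket, masked low bits index it. */
const struct attr_spec *find_spec(const struct bucket *buckets,
                                  size_t num_buckets, uint16_t id)
{
    size_t ns = (id & NS_MASK) >> NS_SHIFT;
    size_t idx = id & ~NS_MASK;     /* zero-based index inside the bucket */

    if (ns >= num_buckets || idx >= buckets[ns].count)
        return NULL;
    return &buckets[ns].specs[idx];
}

/* Example tables: bucket 0 is "core", bucket 1 is "driver". */
const struct attr_spec core_specs[] = { {10}, {11}, {12} };
const struct attr_spec driver_specs[] = { {20}, {21} };
const struct bucket example_buckets[] = {
    { core_specs, 3 },
    { driver_specs, 2 },
};
```

      With these tables, id 0x1001 resolves to the second driver entry
      and id 0x0001 to the second core entry; no search is involved.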
      
      Each attribute has a type (PTR_IN, PTR_OUT, IDR and FD) and a length.
      Attributes could be validated through some attributes, like:
      (*) Minimum size / Exact size
      (*) Fops for FD
      (*) Object type for IDR
      
      If an IDR/fd attribute is specified, the kernel also states the object
      type and the required access (NEW, WRITE, READ or DESTROY).
      All uobject/fd management is done automatically by the infrastructure.
      That is, the infrastructure fails concurrent commands when at least one
      of them requires exclusive access (WRITE/DESTROY), synchronizes actions
      with device removals (dissociate context events) and takes care of
      reference counting (increase/decrease) for concurrent action
      invocations. The reference counts on the actual kernel objects
      must be handled by the handlers.
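      One way to model the shared/exclusive rule is a single counter, as
      sketched here with C11 atomics. This is an illustration under
      assumptions: struct uobj_model, uobj_try_lock() and uobj_unlock()
      are hypothetical names, and READ is treated as shared while
      WRITE/DESTROY are treated as exclusive.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* usecnt > 0: that many shared users; 0: idle; -1: one exclusive user. */
struct uobj_model { atomic_int usecnt; };

/* Never blocks: a conflicting command fails with -EBUSY instead. */
int uobj_try_lock(struct uobj_model *o, bool exclusive)
{
    int cur;

    if (exclusive) {
        cur = 0;   /* exclusive access needs an idle object */
        return atomic_compare_exchange_strong(&o->usecnt, &cur, -1)
                   ? 0 : -EBUSY;
    }

    cur = atomic_load(&o->usecnt);
    do {
        if (cur == -1)          /* held exclusively by another command */
            return -EBUSY;
    } while (!atomic_compare_exchange_weak(&o->usecnt, &cur, cur + 1));
    return 0;
}

void uobj_unlock(struct uobj_model *o, bool exclusive)
{
    if (exclusive)
        atomic_store(&o->usecnt, 0);
    else
        atomic_fetch_sub(&o->usecnt, 1);
}
```

      Failing instead of waiting keeps the dispatcher free of lock
      ordering concerns; the caller simply sees the command rejected.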
      
       objects
      +--------+
      |        |
      |        |   methods                                                                +--------+
      |        |   ns         method      method_spec                           +-----+   |len     |
      +--------+  +------+[d]+-------+   +----------------+[d]+------------+    |attr1+-> |type    |
      | object +> |method+-> | spec  +-> +  attr_buckets  +-> |default_chain+--> +-----+   |idr_type|
      +--------+  +------+   |handler|   |                |   +------------+    |attr2|   |access  |
      |        |  |      |   +-------+   +----------------+   |driver chain|    +-----+   +--------+
      |        |  |      |                                    +------------+
      |        |  +------+
      |        |
      |        |
      |        |
      |        |
      |        |
      |        |
      |        |
      |        |
      |        |
      |        |
      +--------+
      
      [d] = Hash ids to groups using the high order bits
      
      The right types table is also chosen by using the high bits from
      the ids. Currently we have either default or driver specific groups.
      
      Once validation and object fetching (or creation) are completed, we call
      the handler:
      int (*handler)(struct ib_device *ib_dev, struct ib_uverbs_file *ufile,
                     struct uverbs_attr_bundle *ctx);
      
      ctx bundles attributes of different namespaces. Each element there
      is an array of attributes which corresponds to one namespace of
      attributes. For example, in the common case:
      
       ctx                               core
      +----------------------------+     +------------+
      | core:                      +---> | valid      |
      +----------------------------+     | cmd_attr   |
      | driver:                    |     +------------+
      |----------------------------+--+  | valid      |
                                      |  | cmd_attr   |
                                      |  +------------+
                                      |  | valid      |
                                      |  | obj_attr   |
                                      |  +------------+
                                      |
                                      |  drivers
                                      |  +------------+
                                      +> | valid      |
                                         | cmd_attr   |
                                         +------------+
                                         | valid      |
                                         | cmd_attr   |
                                         +------------+
                                         | valid      |
                                         | obj_attr   |
                                         +------------+
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • RDMA/vmw_pvrdma: Fix a signedness · 14d6c3a8
      Adit Ranadive authored
      Fixes: 29c8d9eb ("IB: Add vmw_pvrdma driver")
      Signed-off-by: Adit Ranadive <aditr@vmware.com>
      Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • RDMA/vmw_pvrdma: Report network header type in WC · 72f9b089
      Aditya Sarwade authored
      We should report the network header type in the work completion so that
      the kernel can infer the right RoCE type headers.
      Reviewed-by: Bryan Tan <bryantan@vmware.com>
      Signed-off-by: Aditya Sarwade <asarwade@vmware.com>
      Signed-off-by: Adit Ranadive <aditr@vmware.com>
      Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add might_sleep() annotation to ib_init_ah_from_wc() · 79364227
      Roland Dreier authored
      For RoCE, ib_init_ah_from_wc() can follow the path
      
          ib_init_ah_from_wc() ->
            rdma_addr_find_l2_eth_by_grh() ->
              rdma_resolve_ip()
      
      and rdma_resolve_ip() will sleep in kzalloc() and wait_for_completion().
      
      However, developers will not see any warnings if they use ib_init_ah_from_wc()
      in an atomic context and test only on IB, because the function doesn't
      sleep in that case.
      
      Add a might_sleep() so that lockdep will catch bugs no matter what hardware is
      used to test.
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      79364227
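The idea behind the annotation above can be illustrated with a small userspace model. This is only a sketch: `atomic_depth`, `might_sleep_model()` and `init_ah_model()` are hypothetical stand-ins invented for illustration, not the kernel's definitions. It shows why an unconditional check at function entry catches misuse even when the non-sleeping (IB) path is the one being exercised.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of this is kernel code. */
static int atomic_depth;      /* > 0 while "in atomic context" */
static int sleep_violations;  /* number of times the check fired */

static void enter_atomic(void) { atomic_depth++; }
static void leave_atomic(void) { assert(atomic_depth > 0); atomic_depth--; }

/* Model of the annotation: complain whenever reached in atomic context. */
static void might_sleep_model(void)
{
    if (atomic_depth > 0)
        sleep_violations++;
}

/*
 * Model of a function that only sleeps on one transport. The check runs
 * unconditionally at entry, so the IB-only caller is flagged as well.
 */
static bool init_ah_model(bool is_roce)
{
    might_sleep_model();
    return is_roce; /* true means the sleeping path would have been taken */
}
```

Calling `init_ah_model(false)` between `enter_atomic()` and `leave_atomic()` increments `sleep_violations` even though the modelled IB path never sleeps, which is the behaviour the commit message asks for.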
    • Roland Dreier's avatar
      IB/cm: Fix sleeping in atomic when RoCE is used · c7616118
      Roland Dreier authored
      A couple of places in the CM do
      
          spin_lock_irq(&cm_id_priv->lock);
          ...
          if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
      
      However when the underlying transport is RoCE, this leads to a sleeping function
      being called with the lock held - the callchain is
      
          cm_alloc_response_msg() ->
            ib_create_ah_from_wc() ->
              ib_init_ah_from_wc() ->
                rdma_addr_find_l2_eth_by_grh() ->
                  rdma_resolve_ip()
      
      and rdma_resolve_ip() starts out by doing
      
          req = kzalloc(sizeof *req, GFP_KERNEL);
      
      not to mention rdma_addr_find_l2_eth_by_grh() doing
      
          wait_for_completion(&ctx.comp);
      
      to wait for the task that rdma_resolve_ip() queues up.
      
      Fix this by moving the AH creation out of the lock.
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Reviewed-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      c7616118
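The reordering described above, doing the potentially sleeping work before taking the lock, can be sketched in plain userspace C. Everything here is an assumption made for illustration: `struct conn_model`, `alloc_msg_model()` and the two `reply_*` functions are hypothetical, and a boolean flag stands in for the spinlock. It is not the actual CM patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model: a flag stands in for cm_id_priv->lock. */
struct conn_model { bool locked; int state; };

static int sleeps_under_lock; /* counts blocking calls made with the lock held */

static void lock_model(struct conn_model *c)   { assert(!c->locked); c->locked = true; }
static void unlock_model(struct conn_model *c) { assert(c->locked);  c->locked = false; }

/* Stand-in for a message allocation that may block (the AH creation path). */
static void *alloc_msg_model(const struct conn_model *c)
{
    if (c->locked)
        sleeps_under_lock++;
    return malloc(16);
}

/* Problematic ordering: the blocking call happens inside the locked region. */
static void reply_buggy(struct conn_model *c)
{
    lock_model(c);
    void *msg = alloc_msg_model(c);
    c->state = 1;
    unlock_model(c);
    free(msg);
}

/* Reordered: allocate first, then take the lock only for the state change. */
static int reply_fixed(struct conn_model *c)
{
    void *msg = alloc_msg_model(c);
    if (!msg)
        return -1;
    lock_model(c);
    c->state = 1;
    unlock_model(c);
    free(msg);
    return 0;
}
```

One consequence of this ordering is that the allocation may turn out to be unnecessary once the state is inspected under the lock, so a real implementation has to free the message on that path.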
  3. 30 Aug, 2017 2 commits
    • Matan Barak's avatar
      IB/core: Add support to finalize objects in one transaction · f43dbebf
      Matan Barak authored
      The new ioctl-based infrastructure either commits or rolls back
      all objects of the method as one transaction. In order to do
      that, we introduce a notion of dealing with a collection of
      objects that are related to a specific method.
      
      This also requires adding a notion of a method and attribute.
      A method contains a hash of attributes, where each bucket
      contains several attributes. The attributes are hashed according
      to their namespace which resides in the four upper bits of the id.
      
      For example, an object could be a CQ, which has an action of CREATE_CQ.
      This action has multiple attributes, such as the CQ's new handle
      and the comp_channel. Each layer in this hierarchy (objects, methods
      and attributes) is split into namespaces. The basic example for that is
      one namespace representing the default entities and another one
      representing the driver-specific entities.
      
      When declaring these methods and attributes, we actually declare
      their specifications. When a method is executed, we allocate
      some space to hold auxiliary information. This auxiliary
      information contains meta-data about the required objects, such
      as pointers to their type information, pointers to the uobjects
      themselves (if they exist), etc.
      The specification, along with the auxiliary information we allocated
      and filled in, is given to the finalize_objects function.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      f43dbebf
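The namespace-in-the-upper-bits scheme mentioned in this commit message can be sketched as follows. The message only says the namespace lives in the four upper bits of the id; the 16-bit id width and all names below (`attr_namespace()`, `attr_index()`, `attr_make_id()`) are assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define ID_BITS  16u                  /* assumption: ids are 16 bits wide */
#define NS_BITS  4u                   /* from the text: four upper bits */
#define NS_SHIFT (ID_BITS - NS_BITS)
#define NS_COUNT (1u << NS_BITS)

/* Bucket selector: the namespace held in the top four bits of the id. */
static unsigned attr_namespace(uint16_t id)
{
    return (unsigned)id >> NS_SHIFT;
}

/* Position of the attribute inside its namespace bucket. */
static unsigned attr_index(uint16_t id)
{
    return (unsigned)id & ((1u << NS_SHIFT) - 1u);
}

/* Compose an id from a namespace and an index (hypothetical helper). */
static uint16_t attr_make_id(unsigned ns, unsigned idx)
{
    assert(ns < NS_COUNT && idx < (1u << NS_SHIFT));
    return (uint16_t)((ns << NS_SHIFT) | idx);
}
```

Under these assumptions a default entity would sit in one namespace (say 0) and a driver-specific one in another (say 1), so the same index can be reused by both without colliding.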
    • Matan Barak's avatar
      IB/core: Add a generic way to execute an operation on a uobject · a0aa309c
      Matan Barak authored
      The ioctl infrastructure treats all user-objects in the same manner.
      It gets object ids from user-space and, using the object type
      and type attributes mentioned in the object specification, executes
      the required method. Passing an object id from user-space as
      an attribute is carried out in three stages. The first is carried out
      before the actual handler and the last is carried out afterwards.
      
      The different supported operations are read, write, destroy and create.
      In the first stage, the first three actions just fetch the object
      from the repository (by using its id) and lock it. The last action
      allocates a new uobject. Afterwards, the second stage is carried out
      when the handler itself carries out the required modification of the
      object. The last stage is carried out after the handler finishes and
      commits the result. The first two operations just unlock the object.
      Destroy calls the "free object" operation, taking into account the
      object's type, and releases the uobject as well. Creation just adds the
      new uobject to the repository, making the object visible to the
      application.
      
      In order to abstract these details from the ioctl infrastructure
      layer, we add the uverbs_get_uobject_from_context and
      uverbs_finalize_object functions, which correspond to the first
      and last stages respectively.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      a0aa309c
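The first and last stages described in this commit message can be modelled in a few lines of userspace C. This is a rough sketch under stated assumptions: `struct uobj_model`, `get_uobj_model()` and `finalize_uobj_model()` are hypothetical names, a single boolean replaces whatever locking the real code uses, and the `commit` flag (for undoing a create or destroy) is borrowed from the transaction idea rather than from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum access_model { ACC_READ, ACC_WRITE, ACC_DESTROY, ACC_CREATE };

/* Hypothetical one-slot "repository" entry. */
struct uobj_model { bool in_repo; bool locked; bool freed; };

/* Stage 1: read/write/destroy fetch and lock; create hands out a new object. */
static struct uobj_model *get_uobj_model(struct uobj_model *slot,
                                         enum access_model acc)
{
    if (acc == ACC_CREATE) {
        if (slot->in_repo)
            return NULL;          /* slot already occupied */
        slot->locked = false;
        slot->freed = false;
        return slot;              /* not yet visible in the repository */
    }
    if (!slot->in_repo || slot->locked)
        return NULL;              /* missing or already in use */
    slot->locked = true;
    return slot;
}

/* Stage 3: runs after the handler (stage 2) has finished. */
static void finalize_uobj_model(struct uobj_model *obj,
                                enum access_model acc, bool commit)
{
    switch (acc) {
    case ACC_READ:
    case ACC_WRITE:
        obj->locked = false;      /* just unlock */
        break;
    case ACC_DESTROY:
        obj->locked = false;
        if (commit) {             /* "free object" and drop from repository */
            obj->in_repo = false;
            obj->freed = true;
        }
        break;
    case ACC_CREATE:
        obj->in_repo = commit;    /* publish only on success */
        break;
    }
}
```

A handler would sit between the two calls: obtain the object with `get_uobj_model()`, modify it, then pass the same access type to `finalize_uobj_model()` so the matching cleanup is applied.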
  4. 29 Aug, 2017 11 commits
  5. 28 Aug, 2017 1 commit