1. 26 May, 2016 28 commits
  2. 25 May, 2016 5 commits
    • IB/IPoIB: Allow setting the device address · 492a7e67
      Mark Bloch authored
      In IB networks, and specifically in IPoIB/rdmacm traffic, the device
      address of an IPoIB interface is used as a means to exchange information
      between nodes needed for communication.
      
      Currently an IPoIB interface will always be created with a device
      address based on its node GUID without a way to change that.
      
      This change adds the ability to set the device address of an IPoIB
      interface by value. We use the set mac address ndo to do that.
      
      The flow breaks down into two cases:
      1) The GID value is already in the GID table.
         In this case the interface will be able to set carrier up.

      2) The GID value is not yet in the GID table.
         In this case the interface won't try to join the multicast group
         and will wait (listening for the GID_CHANGE event) until the GID is inserted.
      
      In order to track those changes, we add a new flag:
      * IPOIB_FLAG_DEV_ADDR_SET.
      
      When set, it means the dev_addr is based on a value in the GID
      table. This bit is cleared upon a dev_addr change triggered
      by the user and set again after validation.
      
      Per the IB spec, the port GUID can't change while the module is loaded.
      The port GUID is the basis for the GID at index 0, which is the basis for
      the default device address of an IPoIB interface.

      The issue is that some devices don't follow the spec and change the
      port GUID while the HCA is powered on. In order not to break userspace
      applications, we need to check whether the user wanted to control the
      device address. We assume that if the user sets the device address back
      to one based on GID index 0, they no longer wish to control it.
      
      In order to track this, we add an additional flag:
      * IPOIB_FLAG_DEV_ADDR_CTRL
      
      When setting the device address, there is no validation of the upper
      twelve bytes of the device address (flags, qpn, subnet prefix) as those
      bytes are not under the control of the user.
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
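The set-address flow described in this commit can be sketched in plain userspace C. This is a minimal model, not the driver code: the struct names, the toy GID table, and the two flag constants are hypothetical stand-ins, and the 20-byte layout (twelve upper bytes for flags, QPN, and subnet prefix, followed by an 8-byte GUID) is assumed from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADDR_LEN 20 /* assumed: 12 upper bytes + 8-byte GUID */
#define GUID_OFF 12 /* upper twelve bytes are not user-controlled */
#define GUID_LEN 8

/* Hypothetical stand-ins for the two flags named in the commit message. */
enum {
    DEV_ADDR_SET  = 1u << 0, /* address is based on a GID table entry */
    DEV_ADDR_CTRL = 1u << 1, /* user wants to control the address */
};

struct iface {
    uint8_t  dev_addr[ADDR_LEN];
    unsigned flags;
};

/* Toy GID table: each entry holds only the 8-byte GUID part. */
struct gid_table {
    const uint8_t (*guid)[GUID_LEN];
    size_t len;
};

static bool gid_table_has(const struct gid_table *t, const uint8_t *guid)
{
    for (size_t i = 0; i < t->len; i++)
        if (memcmp(t->guid[i], guid, GUID_LEN) == 0)
            return true;
    return false;
}

/*
 * Apply a user-requested address. Only the low 8 bytes are taken from the
 * request; the upper twelve bytes are left untouched and not validated.
 * Returns true when the GID is already present (case 1), false when the
 * interface has to wait for the GID to appear (case 2).
 */
static bool set_dev_addr(struct iface *ifc, const struct gid_table *t,
                         const uint8_t *req)
{
    assert(ifc && t && req);
    const uint8_t *guid = req + GUID_OFF;

    ifc->flags &= ~DEV_ADDR_SET; /* cleared on every user-triggered change */
    memcpy(ifc->dev_addr + GUID_OFF, guid, GUID_LEN);

    /* Returning to the GID at index 0 means the user gives up control. */
    if (t->len > 0 && memcmp(t->guid[0], guid, GUID_LEN) == 0)
        ifc->flags &= ~DEV_ADDR_CTRL;
    else
        ifc->flags |= DEV_ADDR_CTRL;

    if (!gid_table_has(t, guid))
        return false;

    ifc->flags |= DEV_ADDR_SET; /* set only after validation */
    return true;
}
```

In this model a `false` return corresponds to the interface staying off the multicast group until a later table change makes the lookup succeed.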
    • IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join · 3b561130
      Erez Shitrit authored
      Check (via an SA query) whether the SM supports the new option for
      SendOnly multicast joins.
      If the SM supports that option, the driver uses the new join state to
      create such a multicast group.
      If SendOnlyFullMember is supported, we no longer use a faked FullMember
      join for a SendOnly MCG; the correct state is used instead.

      This check is performed at every invocation of the mcast_restart task, to
      be sure that the driver stays in sync with the current state of the SM.
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
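The selection logic described in this commit can be sketched as follows. This is a simplified userspace model under stated assumptions: `struct mcast_priv`, `mcast_restart()`, and `sendonly_join_state()` are hypothetical names, and the capability bit is passed in by the caller rather than taken from any real header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative join states; the values are not the kernel's. */
enum join_state {
    JOIN_FULL_MEMBER,          /* the "faked" fallback for send-only groups */
    JOIN_SENDONLY_FULL_MEMBER, /* used when the SM advertises support */
};

struct mcast_priv {
    bool sm_sendonly_fullmember; /* cached result of the last SA query */
};

/*
 * Re-evaluate SM support. Meant to run on every restart so the cached
 * value follows the current SM. sm_cap_mask2 stands for the capability
 * word returned by the SA query; cap_bit is the (assumed) support bit.
 */
static void mcast_restart(struct mcast_priv *priv, uint32_t sm_cap_mask2,
                          uint32_t cap_bit)
{
    assert(priv);
    priv->sm_sendonly_fullmember = (sm_cap_mask2 & cap_bit) != 0;
}

/* Pick the join state for a send-only multicast group. */
static enum join_state sendonly_join_state(const struct mcast_priv *priv)
{
    assert(priv);
    return priv->sm_sendonly_fullmember ? JOIN_SENDONLY_FULL_MEMBER
                                        : JOIN_FULL_MEMBER;
}
```

Refreshing the cached value on each restart, instead of once at start-up, is what keeps the choice correct if the SM is replaced by one with different capabilities.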
    • IB/core: Support new type of join-state for multicast · cd6e9b7e
      Erez Shitrit authored
      There are four join-state types for an MCG: FullMember, NonMember,
      SendOnlyNonMember, and the newly added SendOnlyFullMember.
      Add support for the new SendOnlyFullMember join state.

      The new type allows a host to send a join request as send-only. This
      causes the group to be created, but packets from this multicast group
      are not delivered back to the host.
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
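The four join states can be modelled as one bit each in a small mask. The sketch below is an illustration only: the enum names and the bit order are assumptions, and the two predicates encode a reading of the commit message (full-member style joins can create the group, send-only joins get no traffic back) rather than the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, one bit per join state. */
enum join_state_bit {
    JS_FULL_MEMBER          = 0,
    JS_NON_MEMBER           = 1,
    JS_SENDONLY_NON_MEMBER  = 2,
    JS_SENDONLY_FULL_MEMBER = 3, /* the newly added state */
    JS_NUM_STATES
};

static uint8_t join_state_mask(enum join_state_bit bit)
{
    assert(bit < JS_NUM_STATES);
    return (uint8_t)(1u << bit);
}

/* Full-member style joins may bring the group into existence. */
static bool join_can_create_group(uint8_t mask)
{
    return (mask & (join_state_mask(JS_FULL_MEMBER) |
                    join_state_mask(JS_SENDONLY_FULL_MEMBER))) != 0;
}

/* Send-only joins do not get the group's packets delivered back. */
static bool join_receives_packets(uint8_t mask)
{
    return (mask & (join_state_mask(JS_FULL_MEMBER) |
                    join_state_mask(JS_NON_MEMBER))) != 0;
}
```

Under these assumptions SendOnlyFullMember is the only state that creates the group without also subscribing the host to its traffic.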
    • IB/SA Agent: Add support for SA agent get ClassPortInfo · 628e6f75
      Erez Shitrit authored
      New SA query function to return the ClassPortInfo struct from the SA.
      If the SM supports SendOnlyFullMember mode for MCGs, it sets a
      capability bit in the capability_mask2 field of the response.
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Introduce capabilitymask2 field in ClassPortInfo mad · 507f6afa
      Erez Shitrit authored
      Change struct ib_class_port_info to conform to IB Spec 1.3, in order
      to get the specific capability mask from the ClassPortInfo MAD.

      From the IB Spec, ClassPortInfo section:
              "CapabilityMask2 Bits 0-26: Additional class-specific capabilities...
               RespTimeValue the rest 5 bits"
      
      The new struct now has one field for capabilitymask2 (previously the
      reserved field) and the resp_time field.

      This change also fixes up qib and srpt, which used the field that is
      now repurposed as capabilitymask2:
      IB/qib: Change pma_get_classportinfo
      IB/srpt: Adjust the use of ib_class_port_info
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Hal Rosenstock <hal@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
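The 27-bit/5-bit split quoted from the spec can be illustrated with a few helpers. This is a sketch under assumptions: the helper names are made up, the word is taken in host byte order (on the wire it would be big-endian), and RespTimeValue is assumed to sit in the low five bits with CapabilityMask2 in the remaining 27.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RESP_TIME_BITS 5u
#define RESP_TIME_MASK 0x1Fu       /* low 5 bits */
#define CAP_MASK2_MAX  0x7FFFFFFu  /* 27 bits */

/* Combine the two values into one 32-bit host-order word. */
static uint32_t cpi_pack(uint32_t cap_mask2, uint8_t resp_time)
{
    assert(cap_mask2 <= CAP_MASK2_MAX);
    assert(resp_time <= RESP_TIME_MASK);
    return (cap_mask2 << RESP_TIME_BITS) | resp_time;
}

/* Extract the 27-bit capability mask. */
static uint32_t cpi_cap_mask2(uint32_t word)
{
    return word >> RESP_TIME_BITS;
}

/* Extract the 5-bit response-time value. */
static uint8_t cpi_resp_time(uint32_t word)
{
    return (uint8_t)(word & RESP_TIME_MASK);
}

/* Test a single capability bit; which bit means what is up to the caller. */
static bool cpi_has_cap(uint32_t word, uint32_t cap_bit)
{
    return (cpi_cap_mask2(word) & cap_bit) != 0;
}
```

Callers that previously read the whole word as a response time would need to go through an accessor like `cpi_resp_time()`, which is the kind of adjustment the qib and srpt changes listed above suggest.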
  3. 24 May, 2016 7 commits
    • IB/core: Add IP to GID netlink offload · ae43f828
      Mark Bloch authored
      There is an assumption that rdmacm is used only between nodes
      in the same IB subnet, which is why ARP resolution can be used to
      turn an IP address into a GID in rdmacm.
      
      When dealing with IB communication between subnets this assumption
      is no longer valid. ARP resolution will get us the next hop device
      address and not the peer node's device address.
      
      To solve this issue, we ask user space whether it can provide the
      GID of the peer node, and fail if it cannot.
      
      We add a sequence number to identify each request and fill in the GID
      upon answer from userspace.
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
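Matching replies to requests by sequence number, as this commit describes, can be sketched with a small pending-request table. Everything here is hypothetical (the `resolver` struct, slot count, and function names); it only shows the bookkeeping, not the netlink transport.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN     16
#define MAX_PENDING 8

struct pending {
    uint32_t seq;     /* sequence number sent with the request */
    uint8_t *gid_out; /* where to store the answer; NULL means free slot */
};

struct resolver {
    uint32_t       next_seq;
    struct pending slots[MAX_PENDING];
};

/* Register a request. Returns its sequence number, or 0 if the table is full. */
static uint32_t submit(struct resolver *r, uint8_t *gid_out)
{
    assert(r && gid_out);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending *p = &r->slots[i];
        if (p->gid_out == NULL) {
            if (++r->next_seq == 0) /* 0 is reserved for "no slot" */
                ++r->next_seq;
            p->seq = r->next_seq;
            p->gid_out = gid_out;
            return p->seq;
        }
    }
    return 0;
}

/*
 * Handle an answer from user space. Fills in the GID of the matching
 * request and releases its slot. Returns 0 on success, -1 if no pending
 * request carries that sequence number.
 */
static int complete(struct resolver *r, uint32_t seq, const uint8_t *gid)
{
    assert(r && gid);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending *p = &r->slots[i];
        if (p->gid_out != NULL && p->seq == seq) {
            memcpy(p->gid_out, gid, GID_LEN);
            p->gid_out = NULL;
            return 0;
        }
    }
    return -1;
}
```

A reply with an unknown or already-consumed sequence number is rejected, so a late or duplicated answer cannot overwrite the GID of a different request.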
    • IB/core: Register SA ibnl client during ib_core initialization · 735c631a
      Mark Bloch authored
      Move SA ibnl client registration to ib_core module init.
      This will allow us to register a single client to handle
      all RDMA_NL_LS operations and make it SA independent.
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/netlink: Add a new local service operation · c34d3761
      Mark Bloch authored
      This commit adds a new RDMA local service operation:
      - IP to GID resolution.
      
      The client request includes the ifindex of the outgoing interface
      and places the destination IP in an attribute (LS_NLA_TYPE_IPV4 or
      LS_NLA_TYPE_IPV6).
      
      The local service would answer with a message that has the attribute:
      - LS_NLA_TYPE_DGID - The destination GID.
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
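The request and response shapes described in this commit can be illustrated with a length/type/value encoding similar in spirit to netlink attributes. This is a sketch only: the `ATTR_*` constants are placeholders and not the real LS_NLA_TYPE_* values, the header struct and helper names are made up, and everything is kept in host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder attribute types; not the real LS_NLA_TYPE_* values. */
enum { ATTR_DGID = 1, ATTR_IPV4 = 2, ATTR_IPV6 = 3 };

#define ALIGN4(n) (((n) + 3u) & ~3u)

struct attr_hdr {
    uint16_t len;  /* header plus payload, before padding */
    uint16_t type;
};

/* Append one attribute; returns the padded size written, or 0 if it won't fit. */
static size_t put_attr(uint8_t *buf, size_t cap, uint16_t type,
                       const void *data, uint16_t dlen)
{
    assert(buf && data);
    struct attr_hdr h = { (uint16_t)(sizeof h + dlen), type };
    size_t total = ALIGN4(h.len);

    if (total > cap)
        return 0;
    memset(buf, 0, total);
    memcpy(buf, &h, sizeof h);
    memcpy(buf + sizeof h, data, dlen);
    return total;
}

/* Request: ifindex of the outgoing interface, then the destination IP. */
static size_t build_request(uint8_t *buf, size_t cap, uint32_t ifindex,
                            const uint8_t *ip, bool is_v6)
{
    if (cap < sizeof ifindex)
        return 0;
    memcpy(buf, &ifindex, sizeof ifindex);
    size_t n = put_attr(buf + sizeof ifindex, cap - sizeof ifindex,
                        is_v6 ? ATTR_IPV6 : ATTR_IPV4, ip, is_v6 ? 16 : 4);
    return n ? sizeof ifindex + n : 0;
}

/* Response: a single attribute carrying the 16-byte destination GID. */
static bool parse_dgid(const uint8_t *buf, size_t len, uint8_t gid[16])
{
    struct attr_hdr h;

    if (len < sizeof h)
        return false;
    memcpy(&h, buf, sizeof h);
    if (h.type != ATTR_DGID || h.len != sizeof h + 16 || h.len > len)
        return false;
    memcpy(gid, buf + sizeof h, 16);
    return true;
}
```

The parser rejects anything that is not exactly one GID-sized attribute of the expected type, which mirrors the idea that the local service answers with the destination GID and nothing else is trusted.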
    • IB/SA: Integrate ib_sa module into ib_core module · c2e49c92
      Mark Bloch authored
      Consolidate ib_sa into ib_core. This commit eliminates
      ib_sa.ko and makes it part of ib_core.ko.
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/MAD: Integrate ib_mad module into ib_core module · 4c2cb422
      Mark Bloch authored
      Consolidate ib_mad into ib_core. This commit eliminates
      ib_mad.ko and makes it part of ib_core.ko.
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Integrate IB address resolution module into core · e3f20f02
      Leon Romanovsky authored
      IB address resolution is declared as a module (ib_addr.ko) which loads
      itself before the IB core module (ib_core.ko).

      This leads to a scenario where IB netlink, which is initialized by IB
      core, can't be used by ib_addr.ko.

      To solve this, we make ib_addr.ko part of the IB core module.
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • RDMA/cxgb3: device driver frees DMA memory with different size · 0de4cbb3
      Honggang Li authored
      [  598.852037] ------------[ cut here ]------------
      [  598.856698] WARNING: at lib/dma-debug.c:887 check_unmap+0xf8/0x920()
      [  598.863079] cxgb3 0000:01:00.0: DMA-API: device driver frees DMA memory with different size [device address=0x0000000003310000] [map size=17 bytes] [unmap size=16 bytes]
      [  598.878265] Modules linked in: xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad kvm_amd kvm ipmi_devintf ipmi_ssif dcdbas pcspkr ipmi_si sg ipmi_msghandler acpi_power_meter amd64_edac_mod shpchp edac_core sp5100_tco k10temp edac_mce_amd i2c_piix4 acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ata_generic iw_cxgb3 pata_acpi ib_core ib_addr mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm pata_atiixp drm ahci libahci serio_raw i2c_core cxgb3 libata bnx2 mdio dm_mirror dm_region_hash dm_log dm_mod
      [  598.946822] CPU: 3 PID: 11820 Comm: cmtime Not tainted 3.10.0-327.el7.x86_64.debug #1
      [  598.954681] Hardware name: Dell Inc. PowerEdge R415/0GXH08, BIOS 2.0.2 10/22/2012
      [  598.962193]  ffff8808077479a8 000000000381a432 ffff880807747960 ffffffff81700918
      [  598.969663]  ffff880807747998 ffffffff8108b6c0 ffff880807747a80 ffff8808063f55c0
      [  598.977132]  ffffffff833ca850 0000000000000282 ffff88080b1bb800 ffff880807747a00
      [  598.984602] Call Trace:
      [  598.987062]  [<ffffffff81700918>] dump_stack+0x19/0x1b
      [  598.992224]  [<ffffffff8108b6c0>] warn_slowpath_common+0x70/0xb0
      [  598.998254]  [<ffffffff8108b75c>] warn_slowpath_fmt+0x5c/0x80
      [  599.004033]  [<ffffffff813903b8>] check_unmap+0xf8/0x920
      [  599.009369]  [<ffffffff81025959>] ? sched_clock+0x9/0x10
      [  599.014702]  [<ffffffff81390cee>] debug_dma_free_coherent+0x7e/0xa0
      [  599.021008]  [<ffffffffa01ece2c>] cxio_destroy_cq+0xcc/0x160 [iw_cxgb3]
      [  599.027654]  [<ffffffffa01e8da0>] iwch_destroy_cq+0xf0/0x140 [iw_cxgb3]
      [  599.034307]  [<ffffffffa01c4bfe>] ib_destroy_cq+0x1e/0x30 [ib_core]
      [  599.040601]  [<ffffffffa04ff2d2>] ib_uverbs_close+0x302/0x4d0 [ib_uverbs]
      [  599.047417]  [<ffffffff812335a2>] __fput+0x102/0x310
      [  599.052401]  [<ffffffff8123388e>] ____fput+0xe/0x10
      [  599.057297]  [<ffffffff810bbde4>] task_work_run+0xb4/0xe0
      [  599.062719]  [<ffffffff81092a84>] do_exit+0x304/0xc60
      [  599.067789]  [<ffffffff81025905>] ? native_sched_clock+0x35/0x80
      [  599.073820]  [<ffffffff81025959>] ? sched_clock+0x9/0x10
      [  599.079153]  [<ffffffff8170a49c>] ? _raw_spin_unlock_irq+0x2c/0x50
      [  599.085358]  [<ffffffff8109346c>] do_group_exit+0x4c/0xc0
      [  599.090779]  [<ffffffff810a8661>] get_signal_to_deliver+0x2e1/0x960
      [  599.097071]  [<ffffffff8101c497>] do_signal+0x57/0x6e0
      [  599.102229]  [<ffffffff81714bd1>] ? sysret_signal+0x5/0x4e
      [  599.107738]  [<ffffffff8101cb7f>] do_notify_resume+0x5f/0xb0
      [  599.113418]  [<ffffffff81714e7d>] int_signal+0x12/0x17
      [  599.118576] ---[ end trace 1e4653102e7e7019 ]---
      [  599.123211] Mapped at:
      [  599.125577]  [<ffffffff8138ed8b>] debug_dma_alloc_coherent+0x2b/0x80
      [  599.131968]  [<ffffffffa01ec862>] cxio_create_cq+0xf2/0x1f0 [iw_cxgb3]
      [  599.139920]  [<ffffffffa01e9c05>] iwch_create_cq+0x105/0x4e0 [iw_cxgb3]
      [  599.147895]  [<ffffffffa0500584>] create_cq.constprop.14+0x184/0x2e0 [ib_uverbs]
      [  599.156649]  [<ffffffffa05027fb>] ib_uverbs_create_cq+0x10b/0x140 [ib_uverbs]
      
      Fixes: b955150e ('RDMA/cxgb3: When a user QP is marked in error, also mark the CQs in error')
      Signed-off-by: Honggang Li <honli@redhat.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>