1. 16 Sep, 2016 1 commit
  2. 02 Sep, 2016 17 commits
  3. 26 Aug, 2016 3 commits
  4. 25 Aug, 2016 2 commits
  5. 24 Aug, 2016 7 commits
  6. 23 Aug, 2016 6 commits
  7. 22 Aug, 2016 4 commits
    • Steve Wise's avatar
      iw_cxgb4: use the MPA initiator's IRD if < our ORD · 30b03b15
      Steve Wise authored
      The i40iw initiator sends an MPA-request with ird=16 and ord=16. The cxgb4
      responder sends an MPA-reply with ord = 32 causing i40iw to terminate
      due to insufficient resources.
      
      The logic to reduce the ORD to <= peer's IRD was wrong.
      Reported-by: default avatarShiraz Saleem <shiraz.saleem@intel.com>
      Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      30b03b15
    • Steve Wise's avatar
      iw_cxgb4: limit IRD/ORD advertised to ULP by device max. · 7f446abf
      Steve Wise authored
      The i40iw initiator sends an MPA-request with ird = 63, ord = 63. The
      cxgb4 responder sends a RST.  Since the inbound ord=63 and it exceeds
      the max_ird/c4iw_max_read_depth (=32 default), chelsio decides to abort.
      
      Instead, cxgb4 should adjust the ord/ird down before presenting it to
      the ULP.
      Reported-by: default avatarShiraz Saleem <shiraz.saleem@intel.com>
      Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      7f446abf
    • Ira Weiny's avatar
      IB/hfi1: Fix mm_struct use after free · e0cf75de
      Ira Weiny authored
      Testing with CONFIG_SLUB_DEBUG_ON=y resulted in the kernel panic below.
      
      This is the result of the mm_struct sometimes being free'd prior to
      hfi1_file_close being called.
      
      This was due to the combination of 2 reasons:
      
      1) hfi1_file_close is deferred in process exit and it therefore may not
         be called synchronously with process exit.
      2) exit_mm is called prior to exit_files in do_exit.  Normally this is ok
         however, our kernel bypass code requires us to have access to the
         mm_struct for house keeping both at "normal" close time as well as at
         process exit.
      
      Therefore, the fix is to simply keep a reference to the mm_struct until
      we are done with it.
      
      [ 3006.340150] general protection fault: 0000 [#1] SMP
      [ 3006.346469] Modules linked in: hfi1 rdmavt rpcrdma ib_isert iscsi_target_mod
      ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod
       ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm
       ib_cm iw_cm dm_mirror dm_region_hash dm_log dm_mod snd_hda_code
       c_realtek iTCO_wdt snd_hda_codec_generic iTCO_vendor_support sb_edac edac_core
       x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass c
       rct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw snd_hda_intel
       gf128mul snd_hda_codec glue_helper snd_hda_core ablk_helper sn
       d_hwdep cryptd snd_seq snd_seq_device snd_pcm snd_timer snd soundcore pcspkr
       shpchp mei_me sg lpc_ich mei i2c_i801 mfd_core ioatdma ipmi_devi
       ntf wmi ipmi_si ipmi_msghandler acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd
       grace sunrpc ip_tables ext4 jbd2 mbcache mlx4_en ib_core sr_mod s
       d_mod cdrom crc32c_intel mgag200 drm_kms_helper syscopyarea sysfillrect igb
       sysimgblt fb_sys_fops ptp mlx4_core ttm isci pps_core ahci drm li
       bsas libahci dca firewire_ohci i2c_algo_bit scsi_transport_sas firewire_core
       crc_itu_t i2c_core libata [last unloaded: mlx4_ib]
       [ 3006.461759] CPU: 16 PID: 11624 Comm: mpi_stress Not tainted 4.7.0-rc5+ #1
       [ 3006.469915] Hardware name: Intel Corporation W2600CR ........../W2600CR, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
       [ 3006.483027] task: ffff8804102f0040 ti: ffff8804102f8000 task.ti: ffff8804102f8000
       [ 3006.491971] RIP: 0010:[<ffffffff810f0383>]  [<ffffffff810f0383>] __lock_acquire+0xb3/0x19e0
       [ 3006.501905] RSP: 0018:ffff8804102fb908  EFLAGS: 00010002
       [ 3006.508447] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000001 RCX: 0000000000000000
       [ 3006.517012] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880410b56a40
       [ 3006.525569] RBP: ffff8804102fb9b0 R08: 0000000000000001 R09: 0000000000000000
       [ 3006.534119] R10: ffff8804102f0040 R11: 0000000000000000 R12: 0000000000000000
       [ 3006.542664] R13: ffff880410b56a40 R14: 0000000000000000 R15: 0000000000000000
       [ 3006.551203] FS:  00007ff478c08700(0000) GS:ffff88042e200000(0000) knlGS:0000000000000000
       [ 3006.560814] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [ 3006.567806] CR2: 00007f667f5109e0 CR3: 0000000001c06000 CR4: 00000000000406e0
       [ 3006.576352] Stack:
       [ 3006.579157]  ffffffff8124b819 ffffffffffffffff 0000000000000000 ffff8804102fb940
       [ 3006.588072]  0000000000000002 0000000000000000 ffff8804102f0040 0000000000000007
       [ 3006.596971]  0000000000000006 ffff8803cad6f000 0000000000000000 ffff8804102f0040
       [ 3006.605878] Call Trace:
       [ 3006.609220]  [<ffffffff8124b819>] ? uncharge_batch+0x109/0x250
       [ 3006.616382]  [<ffffffff810f2313>] lock_acquire+0xd3/0x220
       [ 3006.623056]  [<ffffffffa0a30bfc>] ? hfi1_release_user_pages+0x7c/0xa0 [hfi1]
       [ 3006.631593]  [<ffffffff81775579>] down_write+0x49/0x80
       [ 3006.638022]  [<ffffffffa0a30bfc>] ? hfi1_release_user_pages+0x7c/0xa0 [hfi1]
       [ 3006.646569]  [<ffffffffa0a30bfc>] hfi1_release_user_pages+0x7c/0xa0 [hfi1]
       [ 3006.654898]  [<ffffffffa0a2efb6>] cacheless_tid_rb_remove+0x106/0x330 [hfi1]
       [ 3006.663417]  [<ffffffff810efd36>] ? mark_held_locks+0x66/0x90
       [ 3006.670498]  [<ffffffff817771f6>] ? _raw_spin_unlock_irqrestore+0x36/0x60
       [ 3006.678741]  [<ffffffffa0a2f1ee>] tid_rb_remove+0xe/0x10 [hfi1]
       [ 3006.686010]  [<ffffffffa0a0c5d5>] hfi1_mmu_rb_unregister+0xc5/0x100 [hfi1]
       [ 3006.694387]  [<ffffffffa0a2fcb9>] hfi1_user_exp_rcv_free+0x39/0x120 [hfi1]
       [ 3006.702732]  [<ffffffffa09fc6ea>] hfi1_file_close+0x17a/0x330 [hfi1]
       [ 3006.710489]  [<ffffffff81263e9a>] __fput+0xfa/0x230
       [ 3006.716595]  [<ffffffff8126400e>] ____fput+0xe/0x10
       [ 3006.722696]  [<ffffffff810b95c6>] task_work_run+0x86/0xc0
       [ 3006.729379]  [<ffffffff81099933>] do_exit+0x323/0xc40
       [ 3006.735672]  [<ffffffff8109a2dc>] do_group_exit+0x4c/0xc0
       [ 3006.742371]  [<ffffffff810a7f55>] get_signal+0x345/0x940
       [ 3006.748958]  [<ffffffff810340c7>] do_signal+0x37/0x700
       [ 3006.755328]  [<ffffffff8127872a>] ? poll_select_set_timeout+0x5a/0x90
       [ 3006.763146]  [<ffffffff811609cb>] ? __audit_syscall_exit+0x1db/0x260
       [ 3006.770853]  [<ffffffff8110f3e3>] ? rcu_read_lock_sched_held+0x93/0xa0
       [ 3006.778765]  [<ffffffff812347a4>] ? kfree+0x1e4/0x2a0
       [ 3006.784986]  [<ffffffff8108e75a>] ? exit_to_usermode_loop+0x33/0xac
       [ 3006.792551]  [<ffffffff8108e785>] exit_to_usermode_loop+0x5e/0xac
       [ 3006.799907]  [<ffffffff81003dca>] do_syscall_64+0x12a/0x190
       [ 3006.806664]  [<ffffffff81777a7f>] entry_SYSCALL64_slow_path+0x25/0x25
       [ 3006.814396] Code: 24 08 44 89 44 24 10 89 4c 24 18 e8 a8 d8 ff ff 48 85 c0
       8b 4c 24 18 44 8b 44 24 10 44 8b 4c 24 08 4c 8b 14 24 0f 84 30
       08 00 00 <f0> ff 80 98 01 00 00 8b 3d 48 ad be 01 45 8b a2 90 0b 00 00 85
       [ 3006.837158] RIP  [<ffffffff810f0383>] __lock_acquire+0xb3/0x19e0
       [ 3006.844401]  RSP <ffff8804102fb908>
       [ 3006.851170] ---[ end trace b7b9f21cf06c27df ]---
       [ 3006.927420] Kernel panic - not syncing: Fatal exception
       [ 3006.933954] Kernel Offset: disabled
       [ 3006.940961] ---[ end Kernel panic - not syncing: Fatal exception
       [ 3006.948249] ------------[ cut here ]------------
      
      Fixes: 3faa3d9a ("IB/hfi1: Make use of mm consistent")
      Reviewed-by: default avatarDean Luick <dean.luick@intel.com>
      Signed-off-by: default avatarIra Weiny <ira.weiny@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      e0cf75de
    • Mike Marciniszyn's avatar
      IB/rdmvat: Fix double vfree() in rvt_create_qp() error path · 56c8ca51
      Mike Marciniszyn authored
      The unwind logic for creating a user QP has a double vfree
      of the non-shared receive queue when handling a "too many qps"
      failure.
      
      The code unwinds the mmmap info by decrementing a reference
      count which will call rvt_release_mmap_info() which in turn
      does the vfree() of the r_rq.wq.  The unwind code then does
      the same free.
      
      Fix by guarding the vfree() with the same test that is done
      in close and only do the vfree() if qp->ip is NULL.
      Reviewed-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      56c8ca51