1. 25 Aug, 2015 1 commit
  2. 02 Jun, 2015 1 commit
    • rds: re-entry of rds_ib_xmit/rds_iw_xmit · d655a9fb
      Wengang Wang authored
      
      The BUG_ON at line 452/453 is triggered in function rds_send_xmit.
      
       441                         while (ret) {
       442                                 tmp = min_t(int, ret, sg->length -
       443                                                       conn->c_xmit_data_off);
       444                                 conn->c_xmit_data_off += tmp;
       445                                 ret -= tmp;
       446                                 if (conn->c_xmit_data_off == sg->length) {
       447                                         conn->c_xmit_data_off = 0;
       448                                         sg++;
       449                                         conn->c_xmit_sg++;
       450                                         if (ret != 0 && conn->c_xmit_sg == rm->data.op_nents)
       451                                                 printk(KERN_ERR "conn %p rm %p sg %p ret %d\n", conn, rm, sg, ret);
       452                                         BUG_ON(ret != 0 &&
       453                                                conn->c_xmit_sg == rm->data.op_nents);
       454                                 }
       455                         }
      
      It complains that the total length sent is larger than the length we
      wanted to send.
      
      On its second entry for the same rds_message, rds_ib_xmit() returns a
      wrong value.
      
      The sg and off that rds_send_xmit passes to rds_ib_xmit are based on
      scatterlist.offset/length, but rds_ib_xmit works on
      scatterlist.dma_address/dma_length. When dma_length is larger than length,
      this is a problem: for the second and later entries of rds_ib_xmit for the
      same rds_message, at least one of the following two is wrong:
      
      1) the scatterlist to start with; the chosen one can be far beyond the
         correct one.
      2) the offset to start with within the scatterlist.
      
      Fix:
      Add op_dmasg and op_dmaoff to the rm_data_op structure, indicating the
      scatterlist, and the offset within it, that rds_ib_xmit should start with.
      op_dmasg and op_dmaoff are initialized to zero when the DMA mapping is done
      the first time the message is seen, and are updated when filling send slots.
      
      The same applies to rds_iw_xmit.
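
The mismatch described in this message can be illustrated with a small user-space sketch. It is not the kernel code: `seg_cursor` and `cursor_advance` are hypothetical names, and the lengths assume a case in which the DMA mapping merged two 4096-byte CPU-side entries into one 8192-byte DMA segment. The point is only that the same byte count lands on a different (entry, offset) pair depending on which set of lengths is walked, which is why a separate DMA-side cursor is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical position inside a segment list: entry index plus byte offset. */
struct seg_cursor {
    size_t sg;
    size_t off;
};

/* Advance the cursor by `bytes` over a list of `nents` segment lengths. */
static struct seg_cursor cursor_advance(struct seg_cursor c, const size_t *len,
                                        size_t nents, size_t bytes)
{
    /* The cursor must start inside a segment or at the end of the list. */
    assert(c.sg >= nents || c.off < len[c.sg]);

    while (bytes && c.sg < nents) {
        size_t room = len[c.sg] - c.off;
        size_t step = bytes < room ? bytes : room;

        c.off += step;
        bytes -= step;
        if (c.off == len[c.sg]) {   /* segment consumed, move to the next one */
            c.sg++;
            c.off = 0;
        }
    }
    return c;
}
```

Under these assumed lengths, advancing 4096 bytes over the CPU-side list {4096, 4096} ends at entry 1, offset 0, while the same advance over the DMA-side list {8192} ends at entry 0, offset 4096. Using the CPU-side position to index the DMA-side list would point past its only entry.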
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      d655a9fb
  3. 18 May, 2015 1 commit
  4. 08 Feb, 2015 1 commit
  5. 19 May, 2014 1 commit
  6. 03 Dec, 2013 1 commit
    • rds: prevent BUG_ON triggered on congestion update to loopback · 18fc25c9
      Venkat Venkatsubra authored
      After a congestion update on a local connection, when rds_ib_xmit returns
      fewer bytes than the message contains, rds_send_xmit calls rds_ib_xmit
      again with an offset that causes BUG_ON(off & RDS_FRAG_SIZE) to trigger.
      
      With a 4 KB PAGE_SIZE, rds_ib_xmit returns min(8240,4096)=4096 although
      the message actually contains 8240 bytes. rds_send_xmit thinks there is
      more to send and calls rds_ib_xmit again with a data offset "off" of
      4096-48 (rds header) = 4048 bytes, thus hitting the
      BUG_ON(off & RDS_FRAG_SIZE) [RDS_FRAG_SIZE=4k].
      
      Commit 6094628b ("rds: prevent BUG_ON triggering on congestion map
      updates") introduced this regression. That change addressed a different
      BUG_ON triggering in rds_send_xmit() on the PowerPC architecture with a
      64 KB PAGE_SIZE:
          BUG_ON(ret != 0 &&
                 conn->c_xmit_sg == rm->data.op_nents);
      This was the sequence it was going through:
      (rds_ib_xmit)
      /* Do not send cong updates to IB loopback */
      if (conn->c_loopback
          && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
              rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
              return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
      }
      rds_ib_xmit returns 8240
      rds_send_xmit:
        c_xmit_data_off = 0 + 8240 - 48 (rds header accounted only the first time)
         		 = 8192
        c_xmit_data_off < 65536 (sg->length), so calls rds_ib_xmit again
      rds_ib_xmit returns 8240
      rds_send_xmit:
        c_xmit_data_off = 8192 + 8240 = 16432, calls rds_ib_xmit again
        and so on (c_xmit_data_off 24672,32912,41152,49392,57632)
      rds_ib_xmit returns 8240
      On this iteration this sequence causes the BUG_ON in rds_send_xmit:
          while (ret) {
          	tmp = min_t(int, ret, sg->length - conn->c_xmit_data_off);
          	[tmp = 65536 - 57632 = 7904]
          	conn->c_xmit_data_off += tmp;
          	[c_xmit_data_off = 57632 + 7904 = 65536]
          	ret -= tmp;
          	[ret = 8240 - 7904 = 336]
          	if (conn->c_xmit_data_off == sg->length) {
          		conn->c_xmit_data_off = 0;
          		sg++;
          		conn->c_xmit_sg++;
          		BUG_ON(ret != 0 &&
          			conn->c_xmit_sg == rm->data.op_nents);
          		[c_xmit_sg = 1, rm->data.op_nents = 1]
      
      What the current fix does:
      Since the congestion update over loopback is not actually transmitted
      as a message, all that rds_ib_xmit needs to do is let the caller think
      the full message has been transmitted and not return partial bytes.
      It returns 8240 (RDS_CONG_MAP_BYTES+48) when PAGE_SIZE is 4 KB,
      and 64 KB + 48 when PAGE_SIZE is 64 KB.
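
The accounting walked through in this message can be replayed with a small user-space model. This is a sketch only: `first_bug_call` is a hypothetical helper, every scatterlist entry is assumed to have the same length, the 48-byte header is subtracted on the first call only (as in the trace), and the function models nothing but the `while (ret)` loop of rds_send_xmit.

```c
#include <assert.h>

#define HDR_BYTES 48u   /* rds header size used in the trace above */

/*
 * Replays the rds_send_xmit accounting loop when the transmit path reports
 * `per_call_ret` bytes on every call. Returns the 1-based call on which the
 * BUG_ON condition (ret != 0 && c_xmit_sg == op_nents) would hold, or 0 if
 * the message is consumed cleanly.
 */
static int first_bug_call(unsigned per_call_ret, unsigned sg_length,
                          unsigned op_nents)
{
    unsigned data_off = 0, xmit_sg = 0;

    assert(per_call_ret > HDR_BYTES && sg_length > 0 && op_nents > 0);

    for (int call = 1; call <= 64; call++) {
        unsigned ret = per_call_ret;

        if (call == 1)
            ret -= HDR_BYTES;   /* header accounted only the first time */

        while (ret) {
            unsigned room = sg_length - data_off;
            unsigned tmp = ret < room ? ret : room;

            data_off += tmp;
            ret -= tmp;
            if (data_off == sg_length) {
                data_off = 0;
                xmit_sg++;
                if (ret != 0 && xmit_sg == op_nents)
                    return call;    /* BUG_ON would fire here */
            }
        }
        if (xmit_sg == op_nents)
            return 0;               /* whole message accounted for */
    }
    return 0;
}
```

With one 65536-byte entry and 8240 bytes reported per call, the model reaches the BUG_ON condition on the eighth call, matching the offsets 8192, 16432, ..., 57632 listed in the trace. Reporting 65536 + 48 bytes instead consumes the entry in a single call, and on a 4 KB page size (two 4096-byte entries) reporting 8240 bytes also completes cleanly.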
      Reported-by: Josh Hunt <joshhunt00@gmail.com>
      Tested-by: Honggang Li <honli@redhat.com>
      Acked-by: Bang Nguyen <bang.nguyen@oracle.com>
      Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      18fc25c9
  7. 17 Jun, 2011 1 commit
  8. 31 Mar, 2011 1 commit
  9. 08 Mar, 2011 1 commit
    • rds: prevent BUG_ON triggering on congestion map updates · 6094628b
      Neil Horman authored
      
      I recently had this BUG halt reported to me:
      
      kernel BUG at net/rds/send.c:329!
      Oops: Exception in kernel mode, sig: 5 [#1]
      SMP NR_CPUS=1024 NUMA pSeries
      Modules linked in: rds sunrpc ipv6 dm_mirror dm_region_hash dm_log ibmveth sg
      ext4 jbd2 mbcache sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt
      dm_mod [last unloaded: scsi_wait_scan]
      NIP: d000000003ca68f4 LR: d000000003ca67fc CTR: d000000003ca8770
      REGS: c000000175cab980 TRAP: 0700   Not tainted  (2.6.32-118.el6.ppc64)
      MSR: 8000000000029032 <EE,ME,CE,IR,DR>  CR: 44000022  XER: 00000000
      TASK = c00000017586ec90[1896] 'krdsd' THREAD: c000000175ca8000 CPU: 0
      GPR00: 0000000000000150 c000000175cabc00 d000000003cb7340 0000000000002030
      GPR04: ffffffffffffffff 0000000000000030 0000000000000000 0000000000000030
      GPR08: 0000000000000001 0000000000000001 c0000001756b1e30 0000000000010000
      GPR12: d000000003caac90 c000000000fa2500 c0000001742b2858 c0000001742b2a00
      GPR16: c0000001742b2a08 c0000001742b2820 0000000000000001 0000000000000001
      GPR20: 0000000000000040 c0000001742b2814 c000000175cabc70 0800000000000000
      GPR24: 0000000000000004 0200000000000000 0000000000000000 c0000001742b2860
      GPR28: 0000000000000000 c0000001756b1c80 d000000003cb68e8 c0000001742b27b8
      NIP [d000000003ca68f4] .rds_send_xmit+0x4c4/0x8a0 [rds]
      LR [d000000003ca67fc] .rds_send_xmit+0x3cc/0x8a0 [rds]
      Call Trace:
      [c000000175cabc00] [d000000003ca67fc] .rds_send_xmit+0x3cc/0x8a0 [rds]
      (unreliable)
      [c000000175cabd30] [d000000003ca7e64] .rds_send_worker+0x54/0x100 [rds]
      [c000000175cabdb0] [c0000000000b475c] .worker_thread+0x1dc/0x3c0
      [c000000175cabed0] [c0000000000baa9c] .kthread+0xbc/0xd0
      [c000000175cabf90] [c000000000032114] .kernel_thread+0x54/0x70
      Instruction dump:
      4bfffd50 60000000 60000000 39080001 935f004c f91f0040 41820024 813d017c
      7d094a78 7d290074 7929d182 394a0020 <0b090000> 40e2ff68 4bffffa4 39200000
      Kernel panic - not syncing: Fatal exception
      Call Trace:
      [c000000175cab560] [c000000000012e04] .show_stack+0x74/0x1c0 (unreliable)
      [c000000175cab610] [c0000000005a365c] .panic+0x80/0x1b4
      [c000000175cab6a0] [c00000000002fbcc] .die+0x21c/0x2a0
      [c000000175cab750] [c000000000030000] ._exception+0x110/0x220
      [c000000175cab910] [c000000000004b9c] program_check_common+0x11c/0x180
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6094628b
  10. 09 Sep, 2010 23 commits
    • RDS: Implement masked atomic operations · 20c72bd5
      Andy Grover authored
      
      Add two CMSGs for masked versions of cswp and fadd. The args
      struct is modified to use a union for the arguments of the different
      atomic op types. Change IB to do masked atomic ops. The atomic op type
      in rds_message is similarly turned into a union.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      20c72bd5
    • RDS/IB: print string constants in more places · 59f740a6
      Zach Brown authored
      
      This prints the constant identifier for work completion status and rdma
      cm event types, like we already do for IB event types.
      
      A core string array helper is added that each string type uses.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
      59f740a6
    • RDS/IB: track signaled sends · f046011c
      Zach Brown authored
      
      We're seeing bugs today where IB connection shutdown clears the send
      ring while the tasklet is processing completed sends.  Implementation
      details cause this to dereference a null pointer.  Shutdown needs to
      wait for send completion to stop before tearing down the connection.  We
      can't simply wait for the ring to empty because it may contain
      unsignaled sends that will never be processed.
      
      This patch tracks the number of signaled sends that we've posted and
      waits for them to complete.  It also makes sure that the tasklet has
      finished executing.
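
The idea of counting only signaled sends can be sketched in user space with C11 atomics. This is an illustration under stated assumptions, not the patch itself: `send_tracker` and its helpers are hypothetical names, and a real shutdown path would sleep until the count reaches zero rather than poll a predicate.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-connection counter of posted, not yet completed,
 * signaled sends. Unsignaled sends are deliberately not counted because
 * they never produce a completion of their own. */
struct send_tracker {
    atomic_int signaled;
};

/* Called when a send is posted to the ring. */
static void tracker_post(struct send_tracker *t, bool is_signaled)
{
    if (is_signaled)
        atomic_fetch_add(&t->signaled, 1);
}

/* Called from completion handling for each signaled send. */
static void tracker_complete(struct send_tracker *t)
{
    int prev = atomic_fetch_sub(&t->signaled, 1);

    assert(prev > 0);   /* a completion without a matching post is a bug */
}

/* Shutdown may clear the send ring only once this returns true. */
static bool tracker_idle(struct send_tracker *t)
{
    return atomic_load(&t->signaled) == 0;
}
```

Waiting on this counter instead of on ring emptiness avoids the problem named in the message: a ring that still holds unsignaled sends never drains by itself, but the signaled count does reach zero.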
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
      f046011c
    • rds: fix rds_send_xmit() serialization · 0f4b1c7e
      Zach Brown authored
      
      rds_send_xmit() was changed to hold an interrupt masking spinlock instead of a
      mutex so that it could be called from the IB receive tasklet path.  This broke
      the TCP transport because its xmit method can block and masks and unmasks
      interrupts.
      
      This patch serializes callers to rds_send_xmit() with a simple bit instead of
      the current spinlock or previous mutex.  This enables rds_send_xmit() to be
      called from any context and to call functions which block.  Getting rid of the
      c_send_lock exposes the bare c_lock acquisitions which are changed to block
      interrupts.
      
      A waitqueue is added so that rds_conn_shutdown() can wait for callers to leave
      rds_send_xmit() before tearing down partial send state.  This lets us get rid
      of c_senders.
      
      rds_send_xmit() is changed to check the conn state after acquiring the
      RDS_IN_XMIT bit to resolve races with the shutdown path.  Previously both
      worked with the conn state and then the lock in the same order, allowing them
      to race and execute the paths concurrently.
      
      rds_send_reset() isn't racing with rds_send_xmit() now that rds_conn_shutdown()
      properly ensures that rds_send_xmit() can't start once the conn state has been
      changed.  We can remove its previous use of the spinlock.
      
      Finally, c_send_generation is redundant.  Callers can race to test the c_flags
      bit by simply retrying instead of racing to test the c_send_generation atomic.
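
Serializing callers with a single flag bit, as this message describes, can be sketched with C11 atomics. The names below are hypothetical stand-ins (`IN_XMIT_BIT` for RDS_IN_XMIT, `conn_flags` for the connection's c_flags), and the sketch leaves out the waitqueue and the conn state check that the patch also adds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define IN_XMIT_BIT 0x1UL   /* hypothetical stand-in for RDS_IN_XMIT */

struct conn_flags {
    atomic_ulong flags;
};

/* Try to become the only sender. Returns true if this caller set the bit.
 * A caller that loses simply backs off; it never sleeps or spins here, so
 * this is usable from any context. */
static bool acquire_in_xmit(struct conn_flags *c)
{
    unsigned long old = atomic_fetch_or(&c->flags, IN_XMIT_BIT);

    return (old & IN_XMIT_BIT) == 0;
}

/* Leave the transmit path and let another caller in. */
static void release_in_xmit(struct conn_flags *c)
{
    /* Releasing without holding the bit would be a caller bug. */
    assert(atomic_load(&c->flags) & IN_XMIT_BIT);
    atomic_fetch_and(&c->flags, ~IN_XMIT_BIT);
}
```

Because the holder takes no lock, the code it runs between acquire and release is free to block, which is the property the TCP transport needs.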
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
      0f4b1c7e
    • RDS/IB: get the xmit max_sge from the RDS IB device on the connection · 89bf9d41
      Zach Brown authored
      
      rds_ib_xmit_rdma() was calling ib_get_client_data() to get at the rds_ibdevice
      just to get the max_sge for the transmit.  This patch instead has it get it
      directly off the rds_ibdev which is stored on the connection.
      
      The current code won't free the rds_ibdev until all the IB connections that use
      it are freed.  So it's safe to reference the rds_ibdev this way.  In the future
      it also makes it easier to support proper reference counting of the rds_ibdev
      struct.
      
      As an additional bonus, this gets rid of the performance hit of calling in to
      the IB stack to look up the rds_ibdev.  The current implementation in the IB
      stack acquires an interrupt blocking spinlock to protect the registration of
      client callback data.
      Signed-off-by: Zach Brown <zach.brown@oracle.com>
      89bf9d41
    • rds: Fix reference counting on the for xmit_atomic and xmit_rdma · 1cc2228c
      Chris Mason authored
      
      This makes sure we have the proper number of references in
      rds_ib_xmit_atomic and rds_ib_xmit_rdma.  We also consistently
      drop references the same way for all message types as the IOs end.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
      1cc2228c
    • rds: Fix RDMA message reference counting · c9e65383
      Chris Mason authored
      
      The RDS send_xmit code was trying to get fancy with message
      counting and was dropping the final reference on the RDMA messages
      too early.  This resulted in memory corruption and oopsen.
      
      The fix here is to always add a ref as the parts of the message pass
      through rds_send_xmit, and always drop a ref as the parts of the message
      go through completion handling.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
      c9e65383
    • RDS: Perform unmapping ops in stages · ff3d7d36
      Andy Grover authored
      
      Previously, RDS would wait until the final send WR had completed
      and then handle cleanup. With silent ops, we do not know
      if an atomic, rdma, or data op will be last. This patch
      handles any of these cases by keeping a pointer to the last
      op in the message in m_last_op.
      
      When the TX completion event fires, rds dispatches to per-op-type
      cleanup functions, and then does whole-message cleanup, if the
      last op equalled m_last_op.
      
      This patch also moves towards having op-specific functions take
      the op struct, instead of the overall rm struct.
      
      rds_ib_connection has a pointer to keep track of a partially-completed
      data send operation. This patch changes it from an
      rds_message pointer to the narrower rm_data_op pointer, and
      modifies places that use this pointer as needed.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      ff3d7d36
    • RDS: Rename data op members prefix from m_ to op_ · 6c7cc6e4
      Andy Grover authored
      
      For consistency.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      6c7cc6e4
    • RDS: Remove struct rds_rdma_op · f8b3aaf2
      Andy Grover authored
      
      A big changeset, but it's all pretty dumb.
      
      struct rds_rdma_op was already embedded in struct rm_rdma_op.
      Remove rds_rdma_op and put its members in rm_rdma_op. Rename
      members with "op_" prefix instead of "r_", for consistency.
      
      Of course this breaks a lot, so fixup the code accordingly.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      f8b3aaf2
    • RDS: Implement silent atomics · 241eef3e
      Andy Grover authored
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      241eef3e
    • RDS/IB: Make all flow control code conditional on i_flowctl · c8de3f10
      Andy Grover authored
      
      Maybe things worked fine with the flow control code running
      even in the non-flow-control case, but making it explicitly
      conditional makes the non-fc case easier to read.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      c8de3f10
    • RDS: Remove unsignaled_bytes sysctl · 1d34f175
      Andy Grover authored
      
      Removed unsignaled_bytes sysctl and code to signal
      based on it. I believe unsignaled_wrs is more than
      sufficient for our purposes.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      1d34f175
    • RDS: rewrite rds_ib_xmit · da5a06ce
      Andy Grover authored
      
      Now that the header always goes first, it is possible to
      simplify rds_ib_xmit. Instead of having a path to handle 0-byte
      dgrams and another path to handle >0, these can both be handled
      in one path. This lets us eliminate xmit_populate_wr().
      
      Rename sent to bytes_sent, to differentiate it better from the other
      variable named "send".
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      da5a06ce
    • RDS/IB: Remove ib_[header/data]_sge() functions · 919ced4c
      Andy Grover authored
      
      These functions were to cope with differently ordered
      sg entries depending on RDS 3.0 or 3.1+. Now that
      we've dropped 3.0 compatibility we no longer need them.
      
      Also, modify usage sites for these to refer to sge[0] or [1]
      directly. Reorder code to initialize header sgs first.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      919ced4c
    • RDS/IB: Remove dead code · 6f3d05db
      Andy Grover authored
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      6f3d05db
    • RDS/IB: eliminate duplicate code · 9c030391
      Andy Grover authored
      
      Both atomics and rdmas need to convert ib-specific completion codes
      into RDS status codes. Rename rds_ib_rdma_send_complete to
      rds_ib_send_complete, and have it take a pointer to the function to
      call with the new error code.
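
The shape of such a shared helper can be sketched as follows. Everything here is illustrative: the enum values, `send_complete`, and `record_status` are hypothetical names, and the particular status mapping is an assumption rather than the mapping the patch uses. What the sketch shows is the pattern of translating a transport-specific completion code once and handing the result to a caller-supplied function.

```c
#include <assert.h>

/* Hypothetical transport completion codes and generic status codes. */
enum wc_code { WC_OK, WC_REMOTE_ACCESS_ERR, WC_FLUSHED, WC_UNKNOWN };
enum op_status { OP_SUCCESS, OP_REMOTE_ERROR, OP_CANCELED, OP_OTHER_ERROR };

/* Per-op-type completion routine supplied by the caller. */
typedef void (*complete_fn)(void *msg, int status);

/* Translate the completion code once, then dispatch to the op-specific
 * routine, so atomics and rdmas can share the translation. */
static void send_complete(void *msg, enum wc_code wc, complete_fn complete)
{
    int status;

    assert(complete != 0);

    switch (wc) {
    case WC_OK:                status = OP_SUCCESS;      break;
    case WC_REMOTE_ACCESS_ERR: status = OP_REMOTE_ERROR; break;
    case WC_FLUSHED:           status = OP_CANCELED;     break;
    default:                   status = OP_OTHER_ERROR;  break;
    }
    complete(msg, status);
}

/* Example completion routine: stores the status in an int owned by the
 * caller. */
static void record_status(void *msg, int status)
{
    *(int *)msg = status;
}
```

An atomic op and an rdma op would each pass their own routine in place of `record_status`, which removes the need for two copies of the switch.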
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      9c030391
    • RDS: Implement atomic operations · 15133f6e
      Andy Grover authored
      
      Implement a CMSG-based interface to do FADD and CSWP ops.
      
      Alter send routines to handle atomic ops.
      
      Add atomic counters to stats.
      
      Add xmit_atomic() to struct rds_transport.
      
      Inline rds_ib_send_unmap_rdma into unmap_rm.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      15133f6e
    • RDS: make m_rdma_op a member of rds_message · ff87e97a
      Andy Grover authored
      
      This eliminates a separate memory alloc, although
      it is now necessary to add an "r_active" flag, since
      it is no longer possible to use the m_rdma_op pointer as an
      indicator of whether an rdma op is present.
      
      rdma SGs are allocated from the rm sg pool.
      
      rds_rm_size also gets bigger. It's a little inefficient to
      run through CMSGs twice, but it makes later steps a lot smoother.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      ff87e97a
    • RDS: fold rdma.h into rds.h · 21f79afa
      Andy Grover authored
      
      RDMA is now an intrinsic part of RDS, so it's easier to just have
      a single header.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      21f79afa
    • RDS: break out rdma and data ops into nested structs in rds_message · e779137a
      Andy Grover authored
      
      Clearly separate rdma-related variables in rm from data-related ones.
      This is in anticipation of adding atomic support.
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      e779137a
    • RDS: cleanup: remove "== NULL"s and "!= NULL"s in ptr comparisons · 8690bfa1
      Andy Grover authored
      
      Favor "if (foo)" style over "if (foo != NULL)".
      Signed-off-by: Andy Grover <andy.grover@oracle.com>
      8690bfa1
  11. 17 Mar, 2010 3 commits
  12. 30 Nov, 2009 1 commit
  13. 10 Apr, 2009 2 commits
  14. 27 Feb, 2009 1 commit