staging: lustre: o2iblnd: Put back work queue check previously removed

The previous patch, http://review.whamcloud.com/21304/, removed
a check needed until LU-5718 is properly addressed.  With
the check, LU-5718 results in an error message and a lost
RDMA operation.  Without it, we have memory corruption and
a crash (much harder to debug).

Put the check back in case LU-5718 is not fixed soon.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7650
Reviewed-on: http://review.whamcloud.com/22281
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Doug Oucharek, 9 years ago
commit d566b9aec9
1 changed file with 10 additions and 0 deletions

drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c (+10 -0)

@@ -1093,6 +1093,16 @@ kiblnd_init_rdma(struct kib_conn *conn, struct kib_tx *tx, int type,
 			break;
 		}
 
+		if (tx->tx_nwrq >= IBLND_MAX_RDMA_FRAGS) {
+			CERROR("RDMA has too many fragments for peer %s (%d), src idx/frags: %d/%d dst idx/frags: %d/%d\n",
+			       libcfs_nid2str(conn->ibc_peer->ibp_nid),
+			       IBLND_MAX_RDMA_FRAGS,
+			       srcidx, srcrd->rd_nfrags,
+			       dstidx, dstrd->rd_nfrags);
+			rc = -EMSGSIZE;
+			break;
+		}
+
 		wrknob = min(min(kiblnd_rd_frag_size(srcrd, srcidx),
 				 kiblnd_rd_frag_size(dstrd, dstidx)),
 			     (__u32)resid);
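
The trade-off described in the commit message (a reported error and a lost RDMA operation versus silent memory corruption) can be illustrated outside the kernel. The following is a minimal userspace sketch, not the o2iblnd code: `MAX_WRQ`, `struct wrq`, `struct tx`, and `build_wrqs()` are hypothetical stand-ins, with `MAX_WRQ` playing the role of `IBLND_MAX_RDMA_FRAGS` and `nwrq` the role of `tx->tx_nwrq`. It assumes a fixed-size work request array that is filled one entry per loop iteration, and shows the bound being tested before each write.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_WRQ 4	/* hypothetical stand-in for IBLND_MAX_RDMA_FRAGS */

/* Hypothetical work request: one contiguous piece of the transfer. */
struct wrq {
	size_t offset;
	size_t len;
};

/* Hypothetical transmit descriptor with a fixed-size request array. */
struct tx {
	struct wrq wrqs[MAX_WRQ];
	int nwrq;
};

/*
 * Split `resid` bytes into work requests of at most `frag_size` bytes.
 * Returns 0 on success, or -EMSGSIZE if more than MAX_WRQ requests
 * would be needed. On failure nothing is written past the array.
 */
static int build_wrqs(struct tx *tx, size_t resid, size_t frag_size)
{
	size_t offset = 0;

	assert(tx != NULL && frag_size > 0);
	tx->nwrq = 0;

	while (resid > 0) {
		size_t len = resid < frag_size ? resid : frag_size;

		/*
		 * Check before writing: without this, the store below
		 * would land at wrqs[MAX_WRQ], one past the array end.
		 */
		if (tx->nwrq >= MAX_WRQ)
			return -EMSGSIZE;

		tx->wrqs[tx->nwrq].offset = offset;
		tx->wrqs[tx->nwrq].len = len;
		tx->nwrq++;

		offset += len;
		resid -= len;
	}

	return 0;
}
```

In this sketch a 16-byte transfer with 4-byte fragments fits exactly, while a 17-byte transfer fails with -EMSGSIZE instead of overrunning the array. The caller loses the operation but gets an error it can log and act on, which is the behaviour the restored check aims for.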