Send data from sendbuff to rank peer.
Rank peer needs to call ncclRecv with the same datatype and the same count as this rank.
This operation is blocking for the GPU. If multiple ncclSend() and ncclRecv() operations need to progress concurrently to complete, they must be fused within an ncclGroupStart()/ncclGroupEnd() section.
Type | Intent | Optional | Attributes | Name | Description
---|---|---|---|---|---
real(kind=c_float) | | | | sendbuff | Buffer to send data from
integer(kind=c_size_t) | | | value | count | Number of elements to send
type(ncclDataType) | | | value | datatype | Datatype to send
integer(kind=c_int) | | | value | peer | Target GPU
type(ncclComm) | | | value | comm | Communicator
type(dtfft_stream_t) | | | value | stream | CUDA Stream
Return value: completion status.