HcclAlltoAllV
Description
Sends data to every rank in the communicator and receives data from every rank. The amount of data exchanged with each rank can be set individually.
Prototype
HcclResult HcclAlltoAllV(const void *sendBuf, const void *sendCounts, const void *sdispls, HcclDataType sendType, const void *recvBuf, const void *recvCounts, const void *rdispls, HcclDataType recvType, HcclComm comm, aclrtStream stream)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| sendBuf | Input | Address of the buffer that holds the source data to send. |
| sendCounts | Input | Amount of data to send, a uint64 array. sendCounts[i] = n indicates that the current rank sends n elements to rank i. For example, if sendType is float32 and sendCounts[i] is n, the current rank sends n float32 elements to rank i. |
| sdispls | Input | Send offsets, a uint64 array. sdispls[i] = n indicates that the data sent from the current rank to rank i starts at offset n relative to sendBuf. The offset is counted in units of sendType. |
| sendType | Input | Data type of the data to send, of type HcclDataType. |
| recvBuf | Output | Address of the buffer that receives the collective communication result. |
| recvCounts | Input | Amount of data to receive, a uint64 array. recvCounts[i] = n indicates that the current rank receives n elements from rank i. For example, if recvType is float32 and recvCounts[i] is n, the current rank receives n float32 elements from rank i. |
| rdispls | Input | Receive offsets, a uint64 array. rdispls[i] = n indicates that the data received by the current rank from rank i is stored starting at offset n relative to recvBuf. The offset is counted in units of recvType. |
| recvType | Input | Data type of the data to receive, of type HcclDataType. |
| comm | Input | Communicator on which the operation is performed. |
| stream | Input | Stream of the rank. |
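
The count and offset arrays are easier to follow with a small host-side model. The following is a minimal sketch in plain C, not the HCCL implementation: `packed_displs` and `alltoallv_reference` are hypothetical helpers introduced for illustration, the data type is fixed to float, and the sketch assumes (as in MPI-style AlltoAllV semantics) that the amount rank A sends to rank B equals the amount rank B expects from rank A. Offsets are computed as an exclusive prefix sum of the counts, which packs the per-rank segments back to back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NRANKS 3 /* number of ranks in this host-side model (illustrative) */

/* Exclusive prefix sum: displs[i] = counts[0] + ... + counts[i-1].
 * Returns the total element count, i.e. the minimum buffer length
 * in elements when the segments are packed back to back. */
static uint64_t packed_displs(const uint64_t *counts, uint64_t *displs,
                              size_t nranks)
{
    uint64_t total = 0;
    for (size_t i = 0; i < nranks; ++i) {
        displs[i] = total;
        total += counts[i];
    }
    return total;
}

/* Host-side reference of the exchange for float data.
 * Row r of each 2D array holds the arguments that rank r would pass:
 * send_counts[r][i] / sdispls[r][i] describe what rank r sends to rank i,
 * recv_counts[r][i] / rdispls[r][i] describe what rank r receives from rank i.
 * All counts and offsets are in elements, not bytes. */
static void alltoallv_reference(float *send_bufs[NRANKS],
                                uint64_t send_counts[NRANKS][NRANKS],
                                uint64_t sdispls[NRANKS][NRANKS],
                                float *recv_bufs[NRANKS],
                                uint64_t recv_counts[NRANKS][NRANKS],
                                uint64_t rdispls[NRANKS][NRANKS])
{
    for (size_t src = 0; src < NRANKS; ++src) {
        for (size_t dst = 0; dst < NRANKS; ++dst) {
            /* Assumption: what src sends to dst matches what dst expects. */
            assert(send_counts[src][dst] == recv_counts[dst][src]);
            memcpy(recv_bufs[dst] + rdispls[dst][src],
                   send_bufs[src] + sdispls[src][dst],
                   send_counts[src][dst] * sizeof(float));
        }
    }
}
```

In this model, the value returned by `packed_displs` multiplied by the element size gives the minimum buffer size in bytes for a packed layout. Offsets do not have to be packed; any layout works as long as the segments fit in the buffer and do not overlap.
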
Returns
HcclResult: HCCL_SUCCESS on success; any other value indicates failure.
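
Because any value other than HCCL_SUCCESS indicates failure, callers typically check the result of every call. The sketch below shows one way to do this. `check_hccl` is a hypothetical helper, not part of HCCL, and the `HcclResult` enum here is a stand-in (including the made-up `DEMO_FAILURE` code) so that the example compiles without the HCCL headers. In real code, take `HcclResult` and `HCCL_SUCCESS` from the HCCL headers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in definitions so this sketch compiles on its own.
 * Assumption for illustration only: real code uses the definitions
 * from the HCCL headers, and DEMO_FAILURE is not a real HCCL code. */
typedef enum { HCCL_SUCCESS = 0, DEMO_FAILURE = 1 } HcclResult;

/* Hypothetical helper (not part of HCCL): logs any non-success code.
 * Returns 0 on success and -1 on failure. */
static int check_hccl(HcclResult ret, const char *call)
{
    assert(call != NULL);
    if (ret != HCCL_SUCCESS) {
        fprintf(stderr, "%s failed with HcclResult %d\n", call, (int)ret);
        return -1;
    }
    return 0;
}
```

A caller would wrap the call, for example `check_hccl(HcclAlltoAllV(...), "HcclAlltoAllV")`, and stop or clean up when the helper returns -1.
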
Constraints
- AlltoAllV performance depends on the size of the buffer used to store data shared between NPUs. When the communication data size exceeds the buffer size, performance degrades significantly. If your service exchanges large amounts of data through AlltoAllV, consider increasing the buffer size by setting the HCCL_BUFFSIZE environment variable.
- For the Atlas Training Series Product, AlltoAllV communicators must meet the following requirement: in a cluster network, a 1p or 2p communicator within a single server must be in the same cluster (devices 0–3 and devices 4–7 each form a separate cluster). For 4p and 8p communicators, whether in a single server or across multiple servers, the ranks must be based on the clusters, and the clusters selected in each server must be consistent.
- This API cannot be used in non-cluster scenarios.
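
The single-server rule for 1p and 2p communicators on the Atlas Training Series Product can be checked before the communicator is created. The sketch below is a minimal illustration, assuming device IDs 0 to 7 and the cluster split stated above (devices 0–3 and devices 4–7). `same_cluster` and `DEVICES_PER_CLUSTER` are hypothetical names, not part of HCCL, and the sketch does not cover the 4p, 8p, or multi-server rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Devices 0-3 form one cluster and devices 4-7 the other,
 * as stated in the constraint above. */
#define DEVICES_PER_CLUSTER 4

/* Hypothetical helper (not part of HCCL): returns true if all device
 * IDs of a single-server communicator fall in the same cluster. */
static bool same_cluster(const int *device_ids, size_t n)
{
    assert(device_ids != NULL && n > 0);
    int cluster = device_ids[0] / DEVICES_PER_CLUSTER;
    for (size_t i = 1; i < n; ++i) {
        if (device_ids[i] / DEVICES_PER_CLUSTER != cluster) {
            return false;
        }
    }
    return true;
}
```

For example, a 2p communicator on devices 1 and 3 passes this check, while one on devices 3 and 4 does not.
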