HcclAlltoAll
Description
Sends data of the same size to every rank in the communicator and receives data of the same size from every rank.
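The exchange pattern can be modeled on the host without calling HCCL. The sketch below assumes the conventional alltoall layout, in which block d of a rank's send buffer goes to rank d and block s of its receive buffer comes from rank s. This page does not state that block ordering, so treat it as an assumption. The function `alltoall_model` and its flat-array layout are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Host-side model of the alltoall data movement (no HCCL calls).
 *
 * send_all holds every rank's send buffer back to back: rank r's buffer
 * starts at element r * nranks * count and contains nranks blocks of
 * `count` ints. recv_all uses the same layout for the receive buffers.
 *
 * Assumed ordering: block dst of rank src's send buffer is delivered to
 * block src of rank dst's receive buffer.
 */
void alltoall_model(size_t nranks, size_t count,
                    const int *send_all, int *recv_all)
{
    assert(nranks > 0 && count > 0);
    for (size_t src = 0; src < nranks; ++src) {
        for (size_t dst = 0; dst < nranks; ++dst) {
            const int *from = send_all + (src * nranks + dst) * count;
            int *to = recv_all + (dst * nranks + src) * count;
            memcpy(to, from, count * sizeof(int));
        }
    }
}
```

Viewed as a matrix whose row index is the rank and whose column index is the block, the model amounts to a block-wise transpose.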
Prototype
HcclResult HcclAlltoAll(const void *sendBuf, uint64_t sendCount, HcclDataType sendType, const void *recvBuf, uint64_t recvCount, HcclDataType recvType, HcclComm comm, aclrtStream stream)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| sendBuf | Input | Address of the buffer that holds the source data to send. |
| sendCount | Input | Volume of data sent to each rank. |
| sendType | Input | Data type of the data to send, given as an HcclDataType value. |
| recvBuf | Output | Address of the buffer that receives the collective communication result. |
| recvCount | Input | Volume of data received from each rank. |
| recvType | Input | Data type of the data to receive, given as an HcclDataType value. |
| comm | Input | Communicator in which the operation is performed. |
| stream | Input | Stream of the rank. |
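Because sendCount and recvCount describe the volume exchanged with each rank, a buffer has to hold one such block per rank in the communicator. The helper below is a minimal sketch of that arithmetic. It assumes that the counts are numbers of elements of the given data type (not bytes) and that the caller already knows the element size and the number of ranks. The name `alltoall_buffer_bytes` is hypothetical and is not part of HCCL.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes needed for a send or receive buffer: one block of `count`
 * elements for each of the `rank_size` ranks.
 * Assumption: `count` is an element count, not a byte count.
 * Returns 0 for a zero argument or if the product overflows uint64_t.
 */
uint64_t alltoall_buffer_bytes(uint64_t count, uint64_t elem_size,
                               uint32_t rank_size)
{
    if (count == 0 || elem_size == 0 || rank_size == 0) {
        return 0;
    }
    if (count > UINT64_MAX / elem_size) {
        return 0; /* count * elem_size would overflow */
    }
    uint64_t per_rank = count * elem_size;
    if (per_rank > UINT64_MAX / rank_size) {
        return 0; /* per_rank * rank_size would overflow */
    }
    return per_rank * rank_size;
}
```

For example, with 1024 four-byte elements per rank and 8 ranks, the helper returns 32768 bytes.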
Returns
HcclResult: HCCL_SUCCESS on success; any other value indicates failure.
Constraints
- All ranks must use the same sendCount, sendType, recvCount, and recvType.
- The performance of the alltoall operation depends on the size of the buffer used to share data between NPUs. When the communication data size exceeds the buffer size, performance deteriorates significantly. If the service exchanges large volumes of data through alltoall, you are advised to increase the buffer size by setting the environment variable HCCL_BUFFSIZE to improve communication performance.
- For the Atlas Training Series Product, alltoall communicators must meet the following requirements: 1p and 2p communicators in a single server must lie within one cluster (devices 0–3 and devices 4–7 each form a separate cluster). For 4p and 8p communicators, whether in a single server or across multiple servers, ranks must be selected by cluster, and the clusters selected must be consistent across servers.
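The single-server cluster rule for 1p and 2p communicators can be checked before a communicator is created. The sketch below assumes an 8-device server with device IDs 0 to 7 and encodes only the grouping stated above (0–3 and 4–7). The functions `device_cluster` and `same_cluster` are hypothetical helpers, not HCCL APIs, and they do not cover the 4p, 8p, or multi-server rules.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Cluster index of a device in an 8-device server:
 * devices 0-3 map to cluster 0, devices 4-7 map to cluster 1.
 * Returns -1 for a device ID outside 0-7.
 */
int device_cluster(int device_id)
{
    if (device_id < 0 || device_id > 7) {
        return -1;
    }
    return device_id / 4;
}

/*
 * Returns true if every listed device ID is valid and all of them
 * belong to the same cluster. An empty list yields false.
 */
bool same_cluster(const int *device_ids, size_t n)
{
    if (n == 0) {
        return false;
    }
    int first = device_cluster(device_ids[0]);
    if (first < 0) {
        return false;
    }
    for (size_t i = 1; i < n; ++i) {
        if (device_cluster(device_ids[i]) != first) {
            return false;
        }
    }
    return true;
}
```

Under this check, a 2p communicator on devices 0 and 3 passes, while one on devices 3 and 4 fails because it spans both clusters.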