HcclReduceScatter
Description
Performs a reduction (sum or another supported operation) across the inputs of all ranks, then distributes the reduced result evenly to the output buffers of the ranks according to their rank IDs. Each rank receives 1/rank size of the reduced data, that is, the block of recvCount elements that corresponds to its rank ID.
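The semantics described above can be illustrated with a host-side reference model. This is a minimal sketch, not the HCCL implementation: the function name `reduce_scatter_sum_ref`, the flat array layout, and the use of `int32_t` with a sum reduction are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of ReduceScatter with a sum reduction.
 * send: rank_size inputs stored back to back, each holding
 *       rank_size * recv_count elements.
 * recv: rank_size outputs stored back to back, each holding
 *       recv_count elements. */
void reduce_scatter_sum_ref(const int32_t *send, int32_t *recv,
                            size_t recv_count, size_t rank_size)
{
    assert(send != NULL && recv != NULL);
    size_t send_count = recv_count * rank_size; /* elements per input */

    for (size_t dst = 0; dst < rank_size; ++dst) {
        for (size_t i = 0; i < recv_count; ++i) {
            int32_t acc = 0;
            /* Reduce block number dst of every rank's input. */
            for (size_t src = 0; src < rank_size; ++src) {
                acc += send[src * send_count + dst * recv_count + i];
            }
            /* Rank dst keeps only its own block of the reduced data. */
            recv[dst * recv_count + i] = acc;
        }
    }
}
```

With two ranks and a receive count of 2, inputs `{1, 2, 3, 4}` on rank 0 and `{10, 20, 30, 40}` on rank 1 give `{11, 22}` on rank 0 and `{33, 44}` on rank 1.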
Prototype
HcclResult HcclReduceScatter(void *sendBuf, void *recvBuf, uint64_t recvCount, HcclDataType dataType, HcclReduceOp op, HcclComm comm, aclrtStream stream)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| sendBuf | Input | Address of the send buffer that holds the source data. |
| recvBuf | Output | Address of the receive buffer that holds the result of the collective communication. |
| recvCount | Input | Number of elements in recvBuf involved in the ReduceScatter operation. The number of elements in sendBuf is calculated as: recvCount × rank size. |
| dataType | Input | Data type of the elements in the ReduceScatter operation, of type HcclDataType. |
| op | Input | Reduction operation type. Currently, the following operation types are supported: sum, prod, max, and min. NOTE: |
| comm | Input | Communicator on which the operation is performed. |
| stream | Input | Stream of the rank. |
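The sizing rule for recvCount can be turned into a small helper that computes the byte sizes of both buffers before allocation. This is a sketch under stated assumptions: `reduce_scatter_buffer_bytes` and its `elem_size` parameter (the size in bytes of one element of the chosen dataType) are hypothetical and not part of the HCCL API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes buffer sizes in bytes for ReduceScatter.
 * recv_bytes = recv_count * elem_size
 * send_bytes = recv_count * rank_size * elem_size
 * Returns false on invalid input or if a size would overflow uint64_t. */
bool reduce_scatter_buffer_bytes(uint64_t recv_count, uint32_t rank_size,
                                 uint64_t elem_size,
                                 uint64_t *send_bytes, uint64_t *recv_bytes)
{
    assert(send_bytes != NULL && recv_bytes != NULL);
    if (rank_size == 0 || elem_size == 0) {
        return false;
    }
    if (recv_count > UINT64_MAX / elem_size) {
        return false; /* recv_count * elem_size would overflow */
    }
    uint64_t rb = recv_count * elem_size;
    if (rb > UINT64_MAX / rank_size) {
        return false; /* rb * rank_size would overflow */
    }
    *recv_bytes = rb;
    *send_bytes = rb * rank_size;
    return true;
}
```

For example, with recvCount = 1024, 8 ranks, and 4-byte elements, recvBuf needs 4096 bytes and sendBuf needs 32768 bytes on every rank.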
Returns
HcclResult: HCCL_SUCCESS on success; any other value indicates failure.
Constraints
All ranks must pass the same recvCount, dataType, and op.