allreduce
Applicability
| Product | Supported |
|---|---|
| | √ |
| | √ |
| | ☓ |
| | √ |
| | √ |
Description
Reduces the input data of all ranks in a group and writes the result to the output buffer of every rank. The reduction parameter specifies the reduction operation type. This API invokes the AllReduce collective communication operator.
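The semantics described above can be illustrated without NPU hardware. The following is a minimal pure-Python sketch, not the HCCL implementation: `simulate_allreduce` is a hypothetical helper that takes one input list per rank, reduces element-wise across ranks, and returns one identical output per rank.

```python
from functools import reduce
import operator

# Reduction names mirror the values accepted by the reduction parameter.
_REDUCERS = {
    "sum": operator.add,
    "prod": operator.mul,
    "max": max,
    "min": min,
}


def simulate_allreduce(rank_inputs, reduction):
    """Hypothetical reference model of AllReduce semantics (not the HCCL code).

    rank_inputs: list with one equally sized list of numbers per rank.
    Returns a list with one output list per rank; all outputs are equal.
    """
    if reduction not in _REDUCERS:
        raise ValueError(f"unsupported reduction: {reduction!r}")
    if not rank_inputs:
        raise ValueError("at least one rank is required")
    length = len(rank_inputs[0])
    if any(len(data) != length for data in rank_inputs):
        raise ValueError("all ranks must contribute inputs of the same size")

    op = _REDUCERS[reduction]
    # Reduce element i across all ranks.
    reduced = [reduce(op, column) for column in zip(*rank_inputs)]
    # Every rank receives its own copy of the reduced result.
    return [list(reduced) for _ in rank_inputs]
```

For example, with three ranks contributing [1, 2, 3], [4, 5, 6], and [7, 8, 9], `simulate_allreduce(inputs, "sum")` gives every rank [12, 15, 18].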

Prototype
    def allreduce(tensor, reduction, fusion=1, fusion_id=-1, group="hccl_world_group")
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| tensor | Input | TensorFlow tensor type. For the Atlas 300I Duo inference card, the supported data types are int8, int16, int32, float16, and float32. |
| reduction | Input | String type. Reduction operation types, which can include max, min, prod, and sum. NOTE: For the Atlas 300I Duo inference card, the prod, max, and min operations do not support the int16 data type in the current version. |
| fusion | Input | Int type. AllReduce operator fusion flag. The value can be one of the following: |
| fusion_id | Input | Int type. AllReduce operator fusion ID. When fusion is set to 2, AllReduce operators with the same fusion_id are fused during network compilation. |
| group | Input | A string containing a maximum of 128 bytes, including the end character. Group name, which can be a user-defined value or hccl_world_group. |
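The parameter constraints in the table can be checked before the call is made. The sketch below is one possible pre-flight check, not part of the HCCL API: `check_allreduce_params` is a hypothetical helper, data types are passed as plain string names, the data type list is the one given for the Atlas 300I Duo inference card, and the group name is assumed to be measured as UTF-8 bytes.

```python
VALID_REDUCTIONS = {"max", "min", "prod", "sum"}
# Data types listed for the Atlas 300I Duo inference card.
SUPPORTED_DTYPES = {"int8", "int16", "int32", "float16", "float32"}
MAX_GROUP_NAME_BYTES = 128  # limit includes the end character


def check_allreduce_params(dtype, reduction, group="hccl_world_group"):
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems = []
    if dtype not in SUPPORTED_DTYPES:
        problems.append(f"unsupported data type: {dtype}")
    if reduction not in VALID_REDUCTIONS:
        problems.append(f"unsupported reduction: {reduction}")
    elif dtype == "int16" and reduction != "sum":
        # prod, max, and min do not support int16 in the current version.
        problems.append(f"{reduction} does not support int16")
    # Assumption: length is counted in UTF-8 bytes, plus one end character.
    if len(group.encode("utf-8")) + 1 > MAX_GROUP_NAME_BYTES:
        problems.append("group name exceeds 128 bytes")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once.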
Returns
A tensor holding the reduction result.
Restrictions
- The caller rank must be within the range defined by the group argument passed to this API call. Otherwise, the API call fails.
- Each rank can have only one input.
- The upstream node of allreduce must not be a variable.
- The input tensor size must be less than or equal to 8 GB.
- For AllReduce operator fusion, only the sum reduction type is supported.
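The size and fusion restrictions above lend themselves to a simple static check. The following sketch is illustrative only: `tensor_nbytes` and `check_restrictions` are hypothetical helpers, "8 GB" is assumed to mean 8 × 1024³ bytes, and `fusion_enabled` is a hypothetical boolean standing in for whatever fusion flag value enables fusion. The rank-membership and upstream-node restrictions depend on the runtime graph and are not covered here.

```python
from math import prod

MAX_TENSOR_BYTES = 8 * 1024**3  # assumption: "8 GB" read as 8 GiB
DTYPE_BYTES = {"int8": 1, "int16": 2, "int32": 4, "float16": 2, "float32": 4}


def tensor_nbytes(shape, dtype):
    """Size in bytes of a dense tensor with the given shape and dtype name."""
    return prod(shape) * DTYPE_BYTES[dtype]


def check_restrictions(shape, dtype, reduction, fusion_enabled):
    """Hypothetical check of the size and fusion restrictions listed above."""
    if tensor_nbytes(shape, dtype) > MAX_TENSOR_BYTES:
        raise ValueError("input tensor exceeds 8 GB")
    if fusion_enabled and reduction != "sum":
        raise ValueError("operator fusion supports only the sum reduction")
```

A (1, 3) float32 tensor, for instance, occupies 12 bytes and passes the size check.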
Example
    import tensorflow as tf
    from npu_bridge.hccl import hccl_ops

    # Sum a random 1x3 float32 tensor across all ranks in the default group.
    tensor = tf.random_uniform((1, 3), minval=1, maxval=10, dtype=tf.float32)
    result = hccl_ops.allreduce(tensor, "sum")