reduce

Description

Performs a reduction (sum, max, min, or prod) on the data of all ranks in the group and delivers the result to the root rank.
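
The semantics can be pictured with a small pure-Python model. This is only a sketch for illustration, not the HCCL implementation: simulate_reduce is a hypothetical helper, each rank's tensor is represented as a flat list, and returning None for non-root ranks is a modeling choice rather than documented behavior.

```python
from functools import reduce as fold
import operator

# Elementwise binary operators for the four documented reduction types.
_OPS = {
    "sum": operator.add,
    "prod": operator.mul,
    "max": max,
    "min": min,
}

def simulate_reduce(rank_tensors, reduction, root_rank):
    """Model the reduce semantics in plain Python (hypothetical helper).

    rank_tensors: one flat list of numbers per rank, all of equal length.
    Returns a dict mapping each rank ID to the reduced list on the root
    rank and to None on every other rank (a modeling assumption).
    """
    if reduction not in _OPS:
        raise ValueError("unsupported reduction: %r" % reduction)
    if not 0 <= root_rank < len(rank_tensors):
        raise ValueError("root_rank must be a rank ID in the group")
    op = _OPS[reduction]
    # Combine position i of every rank's tensor into position i of the result.
    reduced = [fold(op, column) for column in zip(*rank_tensors)]
    return {rank: (reduced if rank == root_rank else None)
            for rank in range(len(rank_tensors))}

# Three ranks, each contributing a tensor of three elements.
data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
out = simulate_reduce(data, "sum", root_rank=0)
print(out[0])  # [12.0, 15.0, 18.0]
```

Only the entry for rank 0 carries the reduced values; the other ranks contribute input but do not receive the result in this model.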

Prototype

def reduce(tensor, reduction, root_rank, fusion=0, fusion_id=-1, group="hccl_world_group")

Parameters

Parameter

Input/Output

Description

tensor

Input

A TensorFlow tensor.

Atlas Training Series Product: The supported data types are int8, int32, int64, float16, and float32.

reduction

Input

A string.

Reduction operation type. The value can be max, min, prod, or sum.

NOTE:

root_rank

Input

An int.

Rank ID of the root rank. Must be a rank ID in the group.

fusion

Input

An int.

Reduce operator fusion flag. The values are as follows:

  • 0: disabled. The Reduce operator is not fused with other Reduce operators.
  • 2: enabled. Operators with the same fusion_id are fused.

fusion_id

Input

An int.

Reduce operator fusion ID.

If fusion is set to 2, Reduce operators with the same fusion_id are fused during network compilation.

group

Input

A string containing a maximum of 128 bytes, including the end character.

Group name, which can be a user-defined value or hccl_world_group.
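
The grouping rule described for fusion and fusion_id can be sketched as follows. The real fusion is performed during network compilation; plan_fusion is a hypothetical helper that only illustrates which calls would end up together, and the ordering of the resulting groups is an assumption made for readability.

```python
from collections import defaultdict

def plan_fusion(ops):
    """Group reduce calls according to the fusion rule (hypothetical helper).

    ops: list of (name, fusion, fusion_id) tuples.
    Returns a list of groups; each group is a list of op names.
    """
    fused = defaultdict(list)  # fusion_id -> names of ops fused together
    groups = []
    for name, fusion, fusion_id in ops:
        if fusion == 2:
            if fusion_id not in fused:
                # First op with this fusion_id: register its group once.
                groups.append(fused[fusion_id])
            fused[fusion_id].append(name)
        elif fusion == 0:
            groups.append([name])  # fusion disabled: the op stands alone
        else:
            raise ValueError("fusion must be 0 or 2")
    return groups

calls = [("r0", 2, 1), ("r1", 0, -1), ("r2", 2, 1), ("r3", 2, 7)]
print(plan_fusion(calls))  # [['r0', 'r2'], ['r1'], ['r3']]
```

Here r0 and r2 share fusion_id 1 and are grouped, r1 has fusion disabled, and r3 forms its own group because no other call uses fusion_id 7.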

Returns

The result tensor of the reduce operation.

Constraints

  • The caller rank must be within the range defined by the group argument passed to this API call. Otherwise, the API call fails.
  • The input tensor size must be less than or equal to 8 GB.
  • The Reduce operator can be fused only when the reduction is set to sum.
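
The documented limits can be checked before the call is issued. The following is a minimal sketch under stated assumptions: check_reduce_args is a hypothetical helper (not part of HCCL), the data type is passed as a plain string, 8 GB is read as 8 * 1024**3 bytes, and the end character of the group name is assumed to take one byte.

```python
SUPPORTED_DTYPES = {"int8", "int32", "int64", "float16", "float32"}
REDUCTIONS = {"max", "min", "prod", "sum"}
MAX_TENSOR_BYTES = 8 * 1024 ** 3  # 8 GB, read here as 8 GiB (assumption)
MAX_GROUP_BYTES = 128             # limit includes the end character

def check_reduce_args(dtype, nbytes, reduction, fusion=0,
                      group="hccl_world_group"):
    """Return the documented constraints that the arguments violate."""
    problems = []
    if dtype not in SUPPORTED_DTYPES:
        problems.append("unsupported data type")
    if nbytes > MAX_TENSOR_BYTES:
        problems.append("tensor larger than 8 GB")
    if reduction not in REDUCTIONS:
        problems.append("unknown reduction")
    if fusion not in (0, 2):
        problems.append("fusion must be 0 or 2")
    if fusion == 2 and reduction != "sum":
        problems.append("fusion requires reduction 'sum'")
    # Reserve one byte for the end character (assumed single byte).
    if len(group.encode("utf-8")) + 1 > MAX_GROUP_BYTES:
        problems.append("group name too long")
    return problems

print(check_reduce_args("float32", 12, "max", fusion=2))
```

An empty list means none of the checked constraints is violated. Membership of the calling rank in the group is not checked here, because that depends on the runtime group configuration.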

Applicability

Atlas Training Series Product

Example

The following is only a code snippet and cannot be executed. For details about how to call the HCCL Python APIs to perform collective communication, see Sample Code.

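import tensorflow as tf
from npu_bridge.npu_init import *  # makes hccl_ops available
# Each rank contributes a 1 x 3 float32 tensor.
tensor = tf.random_uniform((1, 3), minval=1, maxval=10, dtype=tf.float32)
# Sum across all ranks of the default group; rank 0 is the root rank.
result = hccl_ops.reduce(tensor, "sum", 0)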