allgather

Applicability

Product	Supported
Atlas A3 training products/Atlas A3 inference products	√
Atlas A2 training products/Atlas A2 inference products	√
Atlas 200I/500 A2 inference products	☓
Atlas inference products	√
Atlas training products	√

For the Atlas inference products, only the Atlas 300I Duo inference card is supported.

Description

Gathers the inputs of all ranks in the communicator, combines them in ascending order of rank ID, and sends the combined result to the output of every rank.

Each rank therefore receives the same set of data, ordered by rank ID. That is, the AllGather outputs of all ranks are identical.
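The behavior described above can be illustrated with a minimal, device-free sketch. It uses plain Python lists instead of tensors, and simulate_allgather is a hypothetical name introduced only for illustration; it is not part of npu_bridge or HCCL.

```python
def simulate_allgather(inputs_by_rank):
    """Simulate AllGather on plain Python lists.

    inputs_by_rank maps a rank ID to that rank's input (a list).
    Returns a dict that maps each rank ID to that rank's output.
    """
    # Combine the inputs in ascending rank-ID order.
    gathered = []
    for rank_id in sorted(inputs_by_rank):
        gathered.extend(inputs_by_rank[rank_id])
    # Every rank receives its own copy of the same combined result.
    return {rank_id: list(gathered) for rank_id in inputs_by_rank}


# The order in which ranks are listed does not matter; the rank ID decides.
outputs = simulate_allgather({1: [30, 40], 0: [10, 20], 2: [50, 60]})
print(outputs[0])  # [10, 20, 30, 40, 50, 60]
```

In this sketch, outputs[0], outputs[1], and outputs[2] are equal, which mirrors the statement that all ranks end up with the same result.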

Prototype

def allgather(tensor, rank_size, group="hccl_world_group", fusion=0, fusion_id=-1)

Parameters

Parameter	Input/Output	Description
tensor	Input	TensorFlow tensor type. Supported data types by product: Atlas A3 training products/Atlas A3 inference products and Atlas A2 training products/Atlas A2 inference products: int8, uint8, int16, uint16, int32, uint32, int64, uint64, float16, float32, float64, and bfp16. Atlas training products and the Atlas 300I Duo inference card: int8, uint8, int16, uint16, int32, uint32, int64, uint64, float16, float32, and float64.
rank_size	Input	Int type. Number of devices in a group. The maximum value is 32768.
group	Input	String type. Group name, which can be a user-defined value or hccl_world_group. The string can contain a maximum of 128 bytes, including the end character.
fusion	Input	Int type. AllGather operator fusion flag. Valid values: 0: the AllGather operator is not fused with other AllGather operators during network compilation; 2: AllGather operators with the same fusion_id are fused during network compilation.
fusion_id	Input	Int type. AllGather operator fusion ID. When fusion is set to 2, AllGather operators with the same fusion_id are fused during network compilation.
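The limits in the table above can be checked before the operator is built. The following is a minimal sketch of such a check. check_allgather_args is a hypothetical helper, not an npu_bridge API. The lower bound of 1 for rank_size and the use of UTF-8 to count bytes are assumptions that the table does not state.

```python
MAX_RANK_SIZE = 32768        # documented maximum for rank_size
MAX_GROUP_BYTES = 128        # documented limit, including the end character
VALID_FUSION_FLAGS = (0, 2)  # documented fusion values


def check_allgather_args(rank_size, group="hccl_world_group", fusion=0, fusion_id=-1):
    """Hypothetical helper: raise ValueError if an argument breaks a documented limit."""
    # Assumption: a group contains at least one device.
    if not isinstance(rank_size, int) or not 1 <= rank_size <= MAX_RANK_SIZE:
        raise ValueError("rank_size must be an int in [1, %d]" % MAX_RANK_SIZE)
    # Assumption: bytes are counted in UTF-8; one byte is reserved for the end character.
    if len(group.encode("utf-8")) + 1 > MAX_GROUP_BYTES:
        raise ValueError("group must fit in %d bytes, end character included" % MAX_GROUP_BYTES)
    if fusion not in VALID_FUSION_FLAGS:
        raise ValueError("fusion must be 0 or 2")
    if not isinstance(fusion_id, int):
        raise ValueError("fusion_id must be an int")


# Defaults from the prototype pass the check.
check_allgather_args(2)
# Two operators meant to be fused would share the same fusion_id.
check_allgather_args(2, fusion=2, fusion_id=1)
```

A check like this only catches values that contradict the table; it cannot confirm that the group exists or that rank_size matches the actual number of devices in it.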

Returns

A TensorFlow tensor that contains the AllGather result.
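To reason about the size of the result, the sketch below computes an expected output shape. It assumes that the inputs of all ranks have the same shape and are concatenated along the first dimension, which this page does not state explicitly. allgather_output_shape is a hypothetical helper, not part of npu_bridge.

```python
def allgather_output_shape(input_shape, rank_size):
    """Hypothetical helper: expected AllGather output shape for a static input shape."""
    if len(input_shape) < 1:
        raise ValueError("input must have at least one dimension")
    # Assumption: inputs are concatenated along dimension 0, one block per rank.
    return (input_shape[0] * rank_size,) + tuple(input_shape[1:])


# A (1, 3) input gathered across 2 ranks.
print(allgather_output_shape((1, 3), 2))  # (2, 3)
```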

Restrictions

The calling rank must belong to the group specified by the group argument passed to this API call. Otherwise, the API call fails.

Example

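import tensorflow as tf
from npu_bridge.hccl import hccl_ops
# Input contributed by this rank: a (1, 3) float32 tensor.
tensor = tf.random_uniform((1, 3), minval=1, maxval=10, dtype=tf.float32)
# Number of devices in the group (the default group, hccl_world_group, is used here).
rank_size = 2
result = hccl_ops.allgather(tensor, rank_size)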
