npu.distribute.broadcast
Description
Broadcasts variables from the root rank to all other workers, synchronizing them in distributed NPU training.
Prototype
npu.distribute.broadcast(values, root_rank, fusion=2, fusion_id=0, group="hccl_world_group")
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| values | Input | A TensorFlow variable or a set of variables. |
| root_rank | Input | An int. Rank ID of the root node, that is, the rank ID within the group. |
| fusion | Input | An int. Broadcast operator fusion flag. When set to 2 (the default), broadcast operators with the same fusion_id are fused during network compilation. |
| fusion_id | Input | An int. Broadcast operator fusion ID. If fusion is set to 2, broadcast operators with the same fusion_id are fused during network compilation. |
| group | Input | A string of up to 128 bytes, including the terminating character. Group name, which can be a user-defined value or the built-in hccl_world_group. |
Returns
None
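To illustrate the semantics (not the HCCL implementation), the following is a minimal pure-Python sketch of what a broadcast does, assuming a hypothetical list `values_per_rank` standing in for the variables held by each device:

```python
# Sketch of broadcast semantics: every rank's value is overwritten
# with the root rank's value (hypothetical data model, no NPU required).
def broadcast_sketch(values_per_rank, root_rank):
    root_value = values_per_rank[root_rank]
    # After the collective, all ranks hold the root's value.
    for rank in range(len(values_per_rank)):
        values_per_rank[rank] = root_value
    return values_per_rank

ranks = [0.1, 0.5, -0.3, 0.9]   # each rank starts with a different value
broadcast_sketch(ranks, root_rank=0)
print(ranks)  # every rank now holds rank 0's value, 0.1
```

In the real API, this copy happens over the HCCL communication group named by `group`, and the variables are updated in place on each device.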
Example
To broadcast the variables on device 0 to the remaining devices:

```python
# rank_id = 0
# rank_size = 8
import tensorflow as tf
import npu_device as npu

x = tf.Variable(tf.random.normal(shape=()))
print("before broadcast", x)
npu.distribute.broadcast(x, root_rank=0)
print("after broadcast", x)
```
Before the broadcast, each device prints its own randomly initialized value of x; after the broadcast, every device prints the value held by device 0.