tft_set_optimizer_replica
Function
Sets the replica relationship of the optimizer state corresponding to a rank.
Format
mindio_ttp.framework_ttp.tft_set_optimizer_replica(rank: int, replica_info: list)
Parameters
| Parameter | Mandatory/Optional | Description | Value |
|---|---|---|---|
| rank | Mandatory | Rank ID of the NPU on which the training job is running. | int, in the range [0, 100000). |
| replica_info | Mandatory | List of replica relationships. Each element is a dictionary, and the dictionaries are arranged in the order ATTENTION (0), MOE (1). | list of dictionaries, each with the keys rank_list, replica_cnt, and replica_shift. |

Each dictionary in replica_info has the following keys:

| Key | Type | Description |
|---|---|---|
| rank_list | list | Rank list of one group of replica relationships. In the PyTorch scenario, this is the DP group rank list. In the MindSpore scenario, this is the list of all replica NPUs corresponding to one NPU. |
| replica_cnt | int | Number of replicas. In the PyTorch scenario, this is the number of replicas. In the MindSpore scenario, this is the length of rank_list. |
| replica_shift | int | Valid in the PyTorch scenario. |
Return Value
No return value. If an error occurs, an error log is recorded and an exception is thrown.
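Example
The following is a minimal sketch of how the arguments might be assembled. It is not taken from the library's own samples. The helper names (make_group, build_replica_info, check_rank, set_replica), the rank lists, and the replica_cnt and replica_shift values are hypothetical placeholders chosen for illustration. Only the function path, the three dictionary keys, the ATTENTION (0) / MOE (1) ordering, and the rank range come from the description above. The sketch also assumes that any initialization the library requires has already been done before set_replica is called.

```python
ATTENTION, MOE = 0, 1  # positions inside replica_info, per the parameter description


def make_group(rank_list, replica_cnt, replica_shift=0):
    """Return one replica-relationship dictionary with the three documented keys."""
    return {
        "rank_list": list(rank_list),
        "replica_cnt": int(replica_cnt),
        # Placeholder value: replica_shift is only meaningful in the PyTorch scenario.
        "replica_shift": int(replica_shift),
    }


def build_replica_info(attention_group, moe_group):
    """Arrange the dictionaries in the documented order: ATTENTION first, then MOE."""
    info = [None, None]
    info[ATTENTION] = attention_group
    info[MOE] = moe_group
    return info


def check_rank(rank):
    """Validate the documented range for rank: an int in [0, 100000)."""
    if not isinstance(rank, int) or not 0 <= rank < 100000:
        raise ValueError(f"rank must be an int in [0, 100000), got {rank!r}")
    return rank


def set_replica(rank, replica_info):
    """Pass the assembled arguments to the library (requires mindio_ttp to be installed)."""
    # Imported lazily so the rest of the sketch runs without the library.
    from mindio_ttp.framework_ttp import tft_set_optimizer_replica

    tft_set_optimizer_replica(check_rank(rank), replica_info)


# Hypothetical layout: a four-rank group for ATTENTION and a two-rank group for MOE.
replica_info = build_replica_info(
    make_group([0, 1, 2, 3], replica_cnt=2),
    make_group([0, 2], replica_cnt=2),
)

# On a machine with mindio_ttp installed, the call would be:
# set_replica(0, replica_info)
```

Because the function throws an exception on error, a caller may want to wrap set_replica in a try/except block and consult the error log for the cause.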