set_split_strategy_by_size
Description
Sets a backward gradient splitting strategy in a collective communication group based on the proportion of gradient data to implement AllReduce fusion and optimize the collective communication performance.
Prototype
```python
def set_split_strategy_by_size(dataSizeList, group="hccl_world_group")
```
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| dataSizeList | Input | A list of gradient data size percentages. |
| group | Input | A string containing a maximum of 128 bytes, including the terminating character. Group name, which can be a user-defined value or hccl_world_group. Defaults to hccl_world_group. |
Returns
None
Constraints
- The caller rank must be within the range defined by the group argument passed to this API call. Otherwise, the API call fails.
- If backward gradient splitting strategies are set both by gradient data size percentage and by gradient index ID, the strategy set by data size percentage takes precedence.
- If you do not call the gradient splitting API to set the splitting strategy, the default backward gradient splitting strategy is used.
Default splitting strategy: the optimal split for ResNet-50, which divides the gradients into two segments by data size. The first segment holds 96.54% of the gradient data and the second holds 3.46%.
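To illustrate how a percentage-based strategy partitions the gradient buffer into fusion segments, the following standalone sketch (not part of the HCCL API; the helper name and total buffer size are hypothetical) converts a percentage list into cumulative byte boundaries:

```python
def split_boundaries(size_percentages, total_bytes):
    """Hypothetical helper: map gradient data size percentages to the
    cumulative byte offsets at which each AllReduce fusion segment ends."""
    if abs(sum(size_percentages) - 100) > 1e-6:
        raise ValueError("percentages must sum to 100")
    boundaries, running = [], 0.0
    for pct in size_percentages:
        running += pct
        boundaries.append(round(total_bytes * running / 100))
    return boundaries

# The default ResNet-50 strategy [96.54, 3.46] over a 100 MB gradient
# buffer ends its two segments at 96.54 MB and 100 MB.
print(split_boundaries([96.54, 3.46], 100_000_000))
```

A strategy such as `[60, 20, 20]` would likewise fuse gradients into three segments covering 60%, 20%, and 20% of the total gradient data.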
Applicability
Example
The following is only a code snippet and cannot be executed on its own. For details about how to call the HCCL Python APIs to perform collective communication, see Sample Code.
```python
from npu_bridge.npu_init import *
set_split_strategy_by_size([60, 20, 20], "group")
```