Automatic AOE Tuning
The AOE tool iteratively searches for the optimal tiling policy through a closed-loop feedback mechanism: it generates candidate policies, compiles them, and verifies them in the operating environment until the best one is found. This helps fully utilize hardware resources and improve network performance. During model training, the AOE tool can be enabled to tune subgraphs, operators, and gradients. After tuning is complete, the optimal tiling policy is added to the repository. When the model is trained again, the repository can be used directly for efficient execution, without enabling the tuning function again.
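The closed loop described above — generate candidate policies, measure each one, keep the best, and cache it in a repository for later runs — can be sketched in plain Python. This is a toy illustration with an invented cost model, not the real AOE implementation; the function names and the tiling candidates are assumptions for demonstration only.

```python
def measure_cost(tile_size, workload):
    """Toy cost model (assumption): each full tile costs one unit, a
    leftover partial tile adds a fixed overhead, and larger tiles add
    a small per-size cost."""
    full, rem = divmod(workload, tile_size)
    return full + (3 if rem else 0) + tile_size * 0.01

def tune(workload, candidates, repository):
    """Closed-loop search: try every candidate policy, keep the best,
    and persist it so a later run can skip the search entirely."""
    if workload in repository:          # reuse a previously tuned policy
        return repository[workload]
    best = min(candidates, key=lambda t: measure_cost(t, workload))
    repository[workload] = best         # save the optimal policy
    return best

repo = {}
print(tune(1024, [32, 64, 100, 128], repo))  # → 128 (evenly divides 1024)
print(tune(1024, [32, 64, 100, 128], repo))  # → 128 (served from the cache)
```

The second call returns immediately from the repository, which mirrors how a retrained model reuses the tuned tiling policies without re-enabling tuning.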
You are advised to use the AOE tool to perform tuning in the following sequence:

- Set the environment variable.

  ```shell
  # 1: subgraph tuning; 2: operator tuning; 4: gradient tuning
  export AOE_MODE=2
  ```
- In the training script, set the AOE `aoe_mode` option before initializing the NPU to specify the tuning mode.
  ```python
  import npu_device as npu

  npu.global_options().aoe_config.aoe_mode = "2"
  npu.open().as_default()
  ```
For details about the restrictions and functions of the AOE tool, see "Tuning in TensorFlow-based Training Scenarios" in the *AOE Instructions*.