Tuning a Model with the AOE Tool
This section describes how to use the aoe command to tune subgraphs and operators in the inference scenario. Before tuning, you need to set up the environment and understand the restrictions and the usage of basic parameters by referring to AOE Instructions.
Tuning a model with the AOE tool generates the following files: an offline model (.om) file adapted to Ascend AI Processors, a model tuning repository based on Ascend AI Processors, and a tuning result file. You can check the tuning result file for the performance optimization result of each operator in the model.

Because a model file (.om) is generated, the AOE tool also provides the model conversion function. Compared with the ATC tool, the AOE tool therefore has the extra --job_type parameter as well as the subgraph tuning and operator tuning functions.

You are advised to perform subgraph tuning first and then operator tuning. Subgraph tuning determines the graph partition mode; after it completes, the operators are partitioned into their final shapes, and operator tuning can then be performed based on those shapes. If operator tuning is performed first, the shapes of the tuned operators are not the final shapes after operator partitioning, which may compromise the tuning effect.
Procedure
- Run AOE commands to tune subgraphs and operators.
- A subgraph tuning command example is as follows:
aoe --model=${HOME}/module/resnet50_pytorch_1.4.onnx --framework=5 --job_type=1
- An operator tuning command example is as follows:
aoe --model=${HOME}/module/resnet50_pytorch_1.4.onnx --framework=5 --job_type=2
- View the tuning result.
If the following information is displayed, the tuning is complete and the performance is improved. Then the custom repository, model file (.om), and tuning result file are generated.
<xxxx> process finished. Performance improved by xx% //xxxx indicates the tuning task name and xx% indicates the percentage of performance improvement.
The generated files are as follows:
- Model tuning repository based on Ascend AI Processors
- If a subgraph is tuned, the file is generated in ${HOME}/Ascend/latest/data/aoe/custom/graph/${soc_version} by default.
- If an operator is tuned, the file is generated in ${HOME}/Ascend/latest/data/aoe/custom/op/${soc_version} by default.
- OM file adapted to Ascend AI Processors
By default, the tuned .om file is stored in the current directory where the AOE commands are executed, that is, ${model_name}_${timestamp}/tunespace/result/${model_name}_${timestamp}_tune.om or ${model_name}_${timestamp}_tune_${os}_${arch}.om.
- Tuning result file
A file named aoe_result_opat_{timestamp}_{pidxxx}.json is generated in the current directory where AOE commands are executed. This file records the information about the tuned operators.
The following is an example of a content segment in the .json file:
{
  "op_name": "Conv_125",
  "op_type": "Conv2D",
  "tune_performance": {
    "Schedule": {
      "performance_after_tune(us)": 72.046,
      "performance_before_tune(us)": 72.055,
      "performance_improvement": "0.01%",
      "update_mode": "add"
    }
  }
}
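As a quick sanity check, the improvement figures can be pulled out of the result file with standard tools. The snippet below is only a sketch: it writes a sample file that mirrors the segment shown above (the file name is illustrative; real files follow the aoe_result_opat_{timestamp}_{pidxxx}.json pattern) and greps the fields of interest.

```shell
# Write a sample result segment mirroring the one shown above
# (illustrative file name; real files follow the aoe_result_opat_*.json pattern).
cat > aoe_result_sample.json <<'EOF'
{
  "op_name": "Conv_125",
  "op_type": "Conv2D",
  "tune_performance": {
    "Schedule": {
      "performance_after_tune(us)": 72.046,
      "performance_before_tune(us)": 72.055,
      "performance_improvement": "0.01%",
      "update_mode": "add"
    }
  }
}
EOF
# List each tuned operator together with its improvement percentage.
grep -oE '"op_name": *"[^"]*"|"performance_improvement": *"[^"]*"' aoe_result_sample.json
```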
- Specify the tuning repository and perform model inference.
- Set the TUNE_BANK_PATH environment variable to the path for storing the custom repository generated after tuning. In that path, the graph directory stores the subgraph tuning repository, and the op directory stores the operator tuning repository. Example:
export TUNE_BANK_PATH=/home/HwHiAiUser/custom
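For illustration, the layout expected under TUNE_BANK_PATH can be sketched as follows. The /tmp path is hypothetical (in practice the directories are created by the tuning run); the point is the graph and op subdirectories described above.

```shell
# Hypothetical repository layout: graph/ holds the subgraph tuning
# repository and op/ holds the operator tuning repository.
mkdir -p /tmp/aoe_custom/graph /tmp/aoe_custom/op
export TUNE_BANK_PATH=/tmp/aoe_custom
ls "$TUNE_BANK_PATH"
```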
- Perform model inference.
Use the .om file generated in step 1 to perform model inference.
Alternatively, you can use the msame tool to perform quick inference and view the time consumption data.
One-time AOE tuning followed by multiple ATC model conversions is supported. After tuning a model with the AOE tool, if the model needs to be reconverted for other service requirements, set the TUNE_BANK_PATH environment variable to the path of the AOE tuning repository and then reconvert the model with the ATC tool. The model is then compiled and converted based on the tiling policy in the repository, so the reconverted model retains the tuning result.
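As a hedged sketch of that reconversion flow: the repository path mirrors the earlier example, the --model and --framework values mirror the tuning commands above, and the --output name and --soc_version value are placeholders to replace for your environment. This recipe requires an Ascend toolchain and is not runnable elsewhere.

```shell
# Point ATC at the repository produced by AOE tuning (path is illustrative).
export TUNE_BANK_PATH=/home/HwHiAiUser/custom
# Reconvert the model; ATC compiles it based on the tiling policy in the
# repository. Replace the placeholders with values for your environment.
atc --model=${HOME}/module/resnet50_pytorch_1.4.onnx \
    --framework=5 \
    --output=resnet50_tuned \
    --soc_version=<soc_version>
```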