Overview
Model conversion and tuning refers to the process of converting models trained in frameworks such as Caffe and TensorFlow into offline models supported by the Ascend AI Processors, using the ATC or AOE tool. For details about the architecture, see Figure 1 and Figure 2.
- ATC performs operator scheduling optimization, weight data rearrangement, and memory usage optimization, as well as deep learning model tuning, so that the converted model executes with higher performance and efficiency on the Ascend AI Processors.
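As a sketch of how a conversion is typically invoked, the following command converts a Caffe model to an Ascend offline model (`.om` file). The model file names and the `--soc_version` value are placeholders; substitute the ones for your model and hardware.

```shell
# Convert a Caffe model (--framework=0) to an offline model.
# --model/--weight: network definition and trained weights (placeholder names)
# --output: output .om file name (without extension)
# --soc_version: target Ascend AI Processor version (placeholder)
atc --framework=0 \
    --model=resnet50.prototxt \
    --weight=resnet50.caffemodel \
    --output=resnet50 \
    --soc_version=Ascend310
```

For a TensorFlow frozen graph, `--framework=3` is used instead and the `--weight` option is omitted, since the weights are embedded in the graph file.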
- AOE iterates tiling policies through a closed-loop feedback mechanism of policy generation, compilation, and verification in the operating environment until it obtains the optimal policy. This fully utilizes hardware resources and improves network performance. The two tuning modes are as follows:
- Subgraph tuning: Use SubGraph Auto Tune (SGAT) to tune the subgraph segmentation policy, verify the performance in the operating environment, and solidify the optimal tiling policy into the model repository to obtain the tuned model.
- Operator tuning: Use Operator Auto Tune (OPAT) to tune operators, verify the performance in the operating environment, and solidify the optimal operator tiling policy into the operator repository.
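The two tuning modes above are selected through the AOE command line. The following invocations are a minimal sketch; the model file name and output directory are placeholders, and `--job_type` selects the tuning mode (1 for subgraph tuning, 2 for operator tuning).

```shell
# Subgraph tuning (SGAT): --job_type=1
# --framework=3 indicates a TensorFlow model; the file name is a placeholder
aoe --framework=3 --model=resnet50.pb --job_type=1 --output=./tuned

# Operator tuning (OPAT): --job_type=2
aoe --framework=3 --model=resnet50.pb --job_type=2 --output=./tuned
```

After tuning completes, the optimal policies found are written to the corresponding repository (model repository for subgraph tuning, operator repository for operator tuning), so subsequent compilations of the same model benefit without re-tuning.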

