Model Inference Performance Tuning Strategies

Model inference performance may be compromised by issues such as operator adaptation and data read/write problems on the Ascend AI Processor. This section describes strategies for tuning model inference performance. Before tuning the inference performance of your model, first debug its inference functionality.

The key tools required for tuning are the Ascend Optimization Engine (AOE) and the Ascend Model Compression Toolkit (AMCT). Tuning also involves operations such as model conversion, time consumption recording, and performance bottleneck analysis, which require tools such as the Ascend Tensor Compiler (ATC) model conversion tool, the performance data collection tool, and the accuracy comparison tool.
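
As a rough illustration of time consumption recording, the following Python sketch times repeated calls to an inference function and summarizes the latency. It is a minimal example under stated assumptions, not part of the Ascend toolchain: measure_latency and fake_inference are hypothetical names introduced here, and fake_inference stands in for whatever call actually executes your model. For operator-level bottleneck analysis, the performance data collection tool remains the appropriate choice.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(run_inference: Callable[[], object],
                    warmup: int = 5,
                    iterations: int = 50) -> Dict[str, float]:
    """Time repeated calls to run_inference and summarize latency in milliseconds."""
    # Warm-up runs are excluded so one-time initialization costs do not skew results.
    for _ in range(warmup):
        run_inference()

    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_inference()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Nearest-rank style index for the 99th percentile, clamped to the last sample.
    p99_index = min(len(samples_ms) - 1, int(0.99 * len(samples_ms)))
    return {
        "mean_ms": statistics.mean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "p99_ms": samples_ms[p99_index],
        "max_ms": samples_ms[-1],
    }


def fake_inference() -> None:
    # Hypothetical stand-in for a real model execution call; replace with your own.
    time.sleep(0.001)


if __name__ == "__main__":
    stats = measure_latency(fake_inference, warmup=2, iterations=20)
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

Recording the same statistics before and after each tuning step (for example, before and after an AOE or AMCT pass) gives a like-for-like comparison of end-to-end latency.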
