Analyzing Performance Data Files
- Import the collected performance data files into the MindStudio Insight tool for analysis.
- Analyze the Free ratio. On the Atlas 200I A2 Inference Acceleration Module, the Free ratio is generally less than 10%. In Figure 1, the Free ratio exceeds 30%, which is well above this typical value. In this case, check whether the OS of the Atlas 200I A2 Inference Acceleration Module is running other services that occupy resources and cause waiting.
- Analyze the operators running on the AI CPU. As shown in Figure 2, GridSampler2D runs on the AI CPU. Identify such operators and contact the operator owner to determine whether they can be optimized to run on the AI Core; otherwise, analyze them further.
- Analyze the time-consuming operators running on the AI Core. As shown in Figure 3, the Conv2D operator takes most of the time. Identify the most time-consuming operators and contact the operator owner to check whether they can be optimized.
- Analyze the op_summary file.
Sort the operators by task duration in descending order and focus on the ones that take the longest. These operators are the performance bottlenecks. If neither vec_ratio nor mac_ratio exceeds 0.8, the operator can be further optimized. If an mtex_ratio value (x = 1, 2, or 3) is high, data movement takes a long time. In this case, you can combine the operators before and after the movement to reduce data movement. Table 1 describes the parameters.
Table 1 Parameters

| Parameter | Description |
| --- | --- |
| aic_mte1_time(us) | Time taken to execute MTE1 instructions (L1-to-L0A/L0B movement), excluding the movement wait time. |
| aic_mte1_ratio | Ratio of cycles taken to execute MTE1 instructions (L1-to-L0A/L0B movement) to the total cycles. |
| aic_mte2_time(us) | Time taken to execute MTE2 instructions (GM-to-AI Core movement). |
| aic_mte2_ratio | Ratio of cycles taken to execute MTE2 instructions (GM-to-AI Core movement) to the total cycles. |
| aic_mte3_time(us) | Time taken to execute MTE3 instructions (AI Core-to-GM movement). |
| aic_mte3_ratio | Ratio of cycles taken to execute MTE3 instructions (AI Core-to-GM movement) to the total cycles. |
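
The op_summary triage described above (sort by task duration, then check the compute and data movement ratios) can be sketched in Python as follows. This is a minimal illustration, not part of the MindStudio tooling. The column names (Op Name, Task Duration(us), aic_vec_ratio, aic_mac_ratio), the 0.5 cutoff used for a "high" movement ratio, and the sample rows are all assumptions made for the example; adjust them to match the header and data of your own op_summary file.

```python
import csv
import io

# ASSUMPTION: these column names are illustrative. Check the header row of
# your own op_summary CSV and adjust them if they differ.
NAME_COL = "Op Name"
DURATION_COL = "Task Duration(us)"
COMPUTE_COLS = ("aic_vec_ratio", "aic_mac_ratio")
MOVE_COLS = ("aic_mte1_ratio", "aic_mte2_ratio", "aic_mte3_ratio")


def to_float(value):
    """Parse a CSV cell as a float; return None for blanks or text such as 'N/A'."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def triage(csv_file, top_n=10, compute_threshold=0.8, move_threshold=0.5):
    """Return the longest-running operators with two simple flags.

    compute_threshold follows the 0.8 guideline in the text.
    move_threshold is a HYPOTHETICAL cutoff for a "high" movement ratio.
    """
    rows = list(csv.DictReader(csv_file))
    # Longest task duration first; rows without a numeric duration sort last.
    rows.sort(key=lambda r: to_float(r.get(DURATION_COL)) or 0.0, reverse=True)

    report = []
    for row in rows[:top_n]:
        compute = [to_float(row.get(c)) for c in COMPUTE_COLS]
        move = [to_float(row.get(c)) for c in MOVE_COLS]
        if None in compute or None in move:
            continue  # no AI Core metrics recorded for this row
        report.append({
            "op": row.get(NAME_COL, ""),
            "duration_us": to_float(row.get(DURATION_COL)),
            # Neither vec_ratio nor mac_ratio exceeds the threshold.
            "compute_headroom": max(compute) <= compute_threshold,
            # At least one MTE ratio reaches the assumed cutoff.
            "movement_heavy": max(move) >= move_threshold,
        })
    return report


# Made-up sample rows, for illustration only (not real profiling results).
SAMPLE = """Op Name,Task Duration(us),aic_vec_ratio,aic_mac_ratio,aic_mte1_ratio,aic_mte2_ratio,aic_mte3_ratio
Add,120.0,0.30,0.00,0.00,0.70,0.40
Conv2D,950.0,0.10,0.85,0.20,0.30,0.05
GridSampler2D,400.0,N/A,N/A,N/A,N/A,N/A
"""

if __name__ == "__main__":
    for entry in triage(io.StringIO(SAMPLE)):
        print(entry)
```

To run the sketch on a real file, pass an open file handle instead of the in-memory sample, for example `triage(open("op_summary.csv", newline=""))`, where the file name is a placeholder. Rows without numeric AI Core ratios are skipped here rather than flagged, which is a design choice of this sketch.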


