Performance Tuning Process
If the training performance of a network ported to the Ascend AI Processor is not satisfactory, perform the following steps to tune it:
Figure 1 Performance tuning process
1. Perform the following common operations to improve the performance:
   - Enable the automatic mixed precision mode.
   - Check whether the original interfaces have been replaced with the affinity interfaces.
   - Enable iteration offload.
   - Use the AOE tool to tune subgraphs, operators, and gradient segmentation policies.
   For details, see Basic Tuning.
2. Perform model training again and evaluate whether the training performance is satisfactory.
   - If the performance is satisfactory, the tuning is complete.
   - If the performance is not satisfactory, go to 3.
3. Use the Profiling tool to collect and analyze profile data.
   Refer to Profile Data Collection and Analysis to collect, parse, export, and analyze profile data.
4. Refer to Advanced Tuning to further improve the performance based on the identified performance bottleneck.
5. Perform model training again, conduct a regression test, and evaluate whether the training performance is satisfactory.
   - If the performance is satisfactory, the tuning is complete.
   - If the performance is still not satisfactory on any of the following products, perform the operations in Automatic AOE Tuning again:
     - Atlas A3 training products/Atlas A3 inference products
     - Atlas A2 training products/Atlas A2 inference products
     - Atlas training products
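The process above is essentially a measure, tune, and re-measure loop. The following Python sketch illustrates that control flow only. It is framework-agnostic and does not call any Ascend, AOE, or Profiling API: the function names (measure_throughput, run_tuning_workflow), the callables passed in, and the use of samples per second as the performance metric are all assumptions made for illustration.

```python
import time
from statistics import median
from typing import Callable, Tuple


def measure_throughput(train_step: Callable[[], None], batch_size: int,
                       steps: int = 50, warmup: int = 5) -> float:
    """Return samples/second, using the median duration of the timed steps.

    Warm-up steps are run but not timed, so that one-off costs such as
    graph compilation do not distort the measurement.
    """
    for _ in range(warmup):
        train_step()
    durations = []
    for _ in range(steps):
        start = time.perf_counter()
        train_step()
        durations.append(time.perf_counter() - start)
    return batch_size / median(durations)


def run_tuning_workflow(measure: Callable[[], float],
                        target: float,
                        basic_tuning: Callable[[], None],
                        profile: Callable[[], str],
                        advanced_tuning: Callable[[str], None],
                        aoe_tuning: Callable[[], None]) -> Tuple[str, float]:
    """Walk through the tuning process once (all callables are hypothetical).

    Returns the name of the last stage that was applied and the
    performance measured after it. Higher values mean better performance.
    """
    # Basic tuning: mixed precision, affinity interfaces,
    # iteration offload, AOE tuning.
    basic_tuning()

    # Train again and evaluate.
    perf = measure()
    if perf >= target:
        return "basic", perf

    # Collect and analyze profile data to identify the bottleneck.
    bottleneck = profile()

    # Advanced tuning targeted at the identified bottleneck.
    advanced_tuning(bottleneck)

    # Train again (a regression test would also run here) and evaluate.
    perf = measure()
    if perf >= target:
        return "advanced", perf

    # Still below target: run AOE tuning again and report the result.
    aoe_tuning()
    return "aoe", measure()
```

In practice, measure would wrap measure_throughput around a real training step, and each tuning callable would apply the corresponding configuration change described in Basic Tuning, Advanced Tuning, or Automatic AOE Tuning. Taking the median rather than the mean keeps occasional slow steps from skewing the comparison between runs.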