Performance Tuning Process
If the training performance of a network ported to the Ascend AI Processor is not satisfactory, perform the following steps to tune it:
Figure 1 Performance tuning process of a TensorFlow network
1. Perform the following common operations to improve the performance:
   - Enable the automatic mixed precision mode.
   - Replace the GELU activation function.
   - Use the AOE tool to tune subgraphs, operators, and gradient splitting strategies.

   For details, see Basic Tuning.
2. Perform model training again and evaluate whether the training performance is satisfactory.
   - If the performance is satisfactory, the tuning is complete.
   - If the performance is not satisfactory, go to step 3.
3. Use the Profiling tool to collect and analyze profile data.
   Refer to Profile Data Collection and Analysis to collect, parse, export, and analyze profile data.
4. Refer to Advanced Tuning to further improve the performance based on the identified performance bottlenecks.
5. Perform model training again, conduct a regression test, and evaluate whether the training performance is satisfactory.
   - If the performance is satisfactory, the tuning is complete.
   - If the performance is not satisfactory, perform Automatic AOE Tuning again.
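
The control flow of this process can be sketched in plain Python. This is only an illustration of the decision logic, not an Ascend or TensorFlow API: `run_tuning_process` and every callable passed to it are hypothetical placeholders for the real actions (enabling mixed precision, running AOE, profiling, and so on). The sketch also assumes that performance is a single throughput number where higher is better, and it adds a `max_aoe_rounds` limit that the process above does not specify, so that the final loop always terminates.

```python
from typing import Callable, List


def run_tuning_process(
    train_and_measure: Callable[[], float],
    basic_tunings: List[Callable[[], None]],
    profile_and_advanced_tune: Callable[[], None],
    rerun_aoe: Callable[[], None],
    target: float,
    max_aoe_rounds: int = 3,
) -> List[str]:
    """Run the tuning stages in order and return a log of what was executed.

    All callables are hypothetical hooks supplied by the caller.
    `target` is the throughput (assumed: higher is better) that counts
    as satisfactory. `max_aoe_rounds` is an assumed safety limit.
    """
    log: List[str] = []

    # Basic tuning: apply each common operation once.
    for apply_tuning in basic_tunings:
        apply_tuning()
        log.append(f"basic:{apply_tuning.__name__}")

    # Retrain and evaluate; stop early if the target is met.
    if train_and_measure() >= target:
        log.append("done")
        return log

    # Collect and analyze profile data, then apply advanced tuning.
    profile_and_advanced_tune()
    log.append("profile+advanced")

    # Retrain and evaluate; rerun AOE tuning while the target is missed.
    aoe_rounds = 0
    while True:
        if train_and_measure() >= target:
            log.append("done")
            return log
        if aoe_rounds == max_aoe_rounds:
            log.append("budget_exhausted")
            return log
        rerun_aoe()
        aoe_rounds += 1
        log.append("aoe")
```

In practice, `train_and_measure` would launch a training run and return a measured figure such as samples per second, and the returned log gives a record of which stages were needed before the target was reached.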