Performance Tuning Process

If the training performance of a network ported to the Ascend AI Processor is not satisfactory, perform the following steps to tune it:

Figure 1 Performance tuning process
  1. Perform the following common operations first to improve the performance:
    1. Enable the automatic mixed precision mode.
    2. Check whether the original interfaces have been replaced with the affinity interfaces.
    3. Enable iteration offload.
    4. Use the AOE tool to tune subgraphs, operators, and gradient segmentation policies.

    For details, see Basic Tuning.

  2. Perform model training again and evaluate whether the training performance is satisfactory.
    • If the performance is satisfactory, the tuning is complete.
    • If the performance is not satisfactory, go to step 3.
  3. Use the Profiling tool to collect and analyze profile data.

    For details about how to collect, parse, export, and analyze profile data, see Profile Data Collection and Analysis.

  4. Further improve the performance based on the performance bottleneck identified from the profile data. For details, see Advanced Tuning.
  5. Perform model training again, conduct a regression test, and evaluate whether the training performance is satisfactory.
    • If the performance is satisfactory, the tuning is complete.
    • If the performance is not satisfactory, perform Automatic AOE Tuning again.
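
The decision flow above can be sketched as a small driver function. This is only an illustrative outline, not part of any Ascend tool: the function name run_tuning_process and all of its callback parameters are hypothetical placeholders for the manual operations described in the steps, and "satisfactory" is assumed to mean that a measured throughput reaches a target value. The cap on repeated AOE tuning rounds is also an assumption added so that the sketch terminates.

```python
from typing import Callable, List


def run_tuning_process(
    train_and_measure: Callable[[], float],  # hypothetical: retrain, return throughput
    target_throughput: float,                # assumed definition of "satisfactory"
    basic_tuning: Callable[[], None],        # step 1 operations
    collect_profile: Callable[[], str],      # step 3, returns the bottleneck found
    advanced_tuning: Callable[[str], None],  # step 4, acts on the bottleneck
    aoe_tuning: Callable[[], None],          # fallback in step 5
    max_aoe_rounds: int = 2,                 # assumption: bound the retries
) -> List[str]:
    """Walk through the tuning steps and return the actions taken, in order."""
    log: List[str] = []

    # Step 1: apply the common tuning operations.
    basic_tuning()
    log.append("basic_tuning")

    # Step 2: train again and evaluate.
    if train_and_measure() >= target_throughput:
        log.append("done")
        return log

    # Step 3: collect and analyze profile data.
    bottleneck = collect_profile()
    log.append("profiling")

    # Step 4: advanced tuning based on the identified bottleneck.
    advanced_tuning(bottleneck)
    log.append("advanced_tuning")

    # Step 5: train again and evaluate; fall back to AOE tuning if needed.
    for _ in range(max_aoe_rounds):
        if train_and_measure() >= target_throughput:
            log.append("done")
            return log
        aoe_tuning()
        log.append("aoe_tuning")

    # Final evaluation after the last AOE round.
    satisfied = train_and_measure() >= target_throughput
    log.append("done" if satisfied else "unsatisfied")
    return log
```

In practice each callback stands for a manual action (editing the training script, running the Profiling tool, running AOE), so the sketch is mainly useful as a checklist of the order in which the steps are applied and re-evaluated.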