Mixed Precision Training on the NPU

Objective

Before accuracy tuning begins, the model has been successfully ported to the NPU, and distributed training (if involved) has been enabled.

In particular, ensure that mixed precision training is enabled during model porting.

Procedure

For a better trade-off between performance and accuracy, you can enable mixed precision training on the NPU using one of the following methods:

  1. If your benchmark model script implements manual mixed precision training on the GPU, all operator data types are defined explicitly in the model. In this case, use the same method when porting the script to the NPU and keep precision_mode at its default value for the current version. (A manual-casting sketch is shown after this list.)

    For the Atlas Training Series Product, the default value is allow_fp32_to_fp16.

  2. If your benchmark model script implements automatic mixed precision training on the GPU, operator data types are defined through TensorFlow or third-party APIs (such as Apex). In this case, use the same method when porting the script to the NPU and keep precision_mode at its default value for the current version. (An automatic mixed precision sketch is shown after this list.)

    For the Atlas Training Series Product, the default value is allow_fp32_to_fp16.

  3. If your benchmark model uses float32 (high-precision mode), enable automatic mixed precision training on the NPU by setting precision_mode to allow_mix_precision:

    # Configure NPU options before the model is built.
    import npu_device as npu
    # Let the framework automatically replace eligible float32 operators with float16.
    npu.global_options().precision_mode = 'allow_mix_precision'
    # Apply the options and make the NPU the default device.
    npu.open().as_default()
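
The following is a minimal sketch of the manual approach in method 1, assuming a TensorFlow 2.x script; the function and tensor names are illustrative only, not part of any NPU API:

    import tensorflow as tf

    def dense_fp16(x, kernel, bias):
        # Cast the matmul inputs to float16 so it runs in half precision.
        y = tf.matmul(tf.cast(x, tf.float16), tf.cast(kernel, tf.float16))
        # Cast back to float32 for precision-sensitive operations such as the bias add.
        return tf.cast(y, tf.float32) + bias

    x = tf.random.normal([32, 64])
    kernel = tf.random.normal([64, 10])
    bias = tf.zeros([10])
    logits = dense_fp16(x, kernel, bias)  # float32 output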
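
For method 2, a minimal sketch of automatic mixed precision through the Keras mixed precision API (available in TensorFlow 2.4 and later); an Apex-based script would call amp.initialize on the model and optimizer instead:

    import tensorflow as tf

    # Compute in float16 while keeping variables in float32.
    tf.keras.mixed_precision.set_global_policy('mixed_float16')

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation='relu'),
        # Keep the output layer in float32 for a numerically stable loss.
        tf.keras.layers.Dense(10, dtype='float32'),
    ])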
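
For method 3, if your script drives training through TensorFlow 1.x sessions instead of the npu_device module shown above, precision_mode is typically set through the NpuOptimizer entry in the session configuration. A minimal sketch, assuming the npu_bridge adapter is installed:

    import tensorflow as tf
    from npu_bridge.npu_init import *  # registers NpuOptimizer

    config = tf.ConfigProto()
    custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
    custom_op.name = "NpuOptimizer"
    # Enable automatic mixed precision on the NPU.
    custom_op.parameter_map["precision_mode"].s = tf.compat.as_bytes("allow_mix_precision")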

Enable only one of the preceding methods to avoid unexpected problems caused by frequent graph modification.