Overview
Mixed precision is a widely used technique for improving training performance. By lowering the precision of selected computations, it increases computational throughput and parallelism. In deep neural network training, mixed precision combines the float16 and float32 data types, which reduces memory usage and memory traffic. This makes it a good choice for training large networks without compromising the accuracy achieved with float32.
You can enable mixed precision by configuring precision_mode or precision_mode_v2 in the training script. For details about precision_mode and precision_mode_v2 and their restrictions, see Accuracy Tuning.
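As an illustration only, the sketch below assumes a TensorFlow 1.x training script running through the Ascend npu_bridge adapter; the exact option names and session-config pattern are assumptions, so check the adapter documentation for your framework and version before copying it.

```python
import tensorflow as tf

# Sketch (assumed API): route the graph through the NPU custom optimizer
# and request automatic mixed precision via precision_mode.
config = tf.ConfigProto()
custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
custom_op.name = "NpuOptimizer"
custom_op.parameter_map["precision_mode"].s = tf.compat.as_bytes(
    "allow_mix_precision")  # assumed value; see Accuracy Tuning for options

# The config is then passed to the session used for training:
# with tf.Session(config=config) as sess: ...
```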
If automatic mixed precision is enabled, you are advised to use Loss Scaling to compensate for the accuracy loss caused by the reduced precision. Based on an analysis of the profiling data, you may also need to manually adjust the precision mode of specific operators. See Modifying the Blocklist, Trustlist, and Graylist for Mixed Precision to specify which operators should have their precision reduced or preserved.
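To see why loss scaling helps, note that gradients smaller than float16's smallest subnormal value (about 6e-8) underflow to zero when cast down. A minimal NumPy sketch (the scale value 1024 is illustrative; real loss-scaling implementations often adjust it dynamically):

```python
import numpy as np

# A small float32 gradient, below float16's smallest subnormal (~5.96e-8).
grad = 1e-8
LOSS_SCALE = 1024.0  # power of two, so scaling and unscaling are exact

naive = np.float16(grad)                # underflows to 0.0 in float16
scaled = np.float16(grad * LOSS_SCALE)  # scaled value survives the cast
recovered = np.float32(scaled) / LOSS_SCALE  # unscale in float32

print(naive)      # 0.0 — gradient information lost
print(recovered)  # close to the original 1e-8
```

Scaling the loss before backpropagation multiplies every gradient by the same factor, shifting small values back into float16's representable range; the optimizer then divides by the same factor in float32 before applying the update.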