NPULossScaleOptimizer Constructor

Applicability

Product	Supported
Atlas A3 training products/Atlas A3 inference products	√
Atlas A2 training products/Atlas A2 inference products	√
Atlas 200I/500 A2 inference products	☓
Atlas inference products	☓
Atlas training products	√

Description

Constructs an NPULossScaleOptimizer object, which enables loss scaling during mixed precision training when the overflow/underflow mode of floating-point computation is saturation mode. Loss scaling mitigates the underflow caused by the narrow representable range of float16. The NPULossScaleOptimizer class inherits from the LossScaleOptimizer class, so the native APIs of the base class can be called on it.
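The underflow that loss scaling addresses can be illustrated without any NPU or TensorFlow dependency. The following is a minimal sketch that assumes only the Python standard library; the helper to_float16, the gradient value, and the loss scale of 2**16 are illustrative choices, not part of npu_bridge.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest IEEE float16 value (hypothetical helper)."""
    return struct.unpack("e", struct.pack("e", x))[0]


# Illustrative gradient, smaller than the smallest positive float16 value (about 5.96e-8).
grad = 1e-8

# Stored directly in float16, the gradient underflows to zero.
unscaled = to_float16(grad)

# Multiplying by a loss scale first moves the value into the float16 range.
loss_scale = 2.0 ** 16
scaled = to_float16(grad * loss_scale)

# Dividing by the same scale in higher precision recovers the gradient approximately.
recovered = scaled / loss_scale

print(unscaled, scaled, recovered)
```

In a real training job the scaling is applied to the loss and the division to the gradients by the optimizer wrapper; the sketch only shows why the scale factor preserves small values.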

For the Atlas A3 training products/Atlas A3 inference products, the overflow/underflow mode of floating-point computation can be saturation or Inf/NaN. Keep the default Inf/NaN mode. The saturation mode is provided only for compatibility with earlier versions and will not be developed further. In addition, the computing accuracy in saturation mode may be unreliable.
For the Atlas A2 training products/Atlas A2 inference products, the overflow/underflow mode of floating-point computation can be saturation or Inf/NaN. Keep the default Inf/NaN mode. The saturation mode is provided only for compatibility with earlier versions and will not be developed further. In addition, the computing accuracy in saturation mode may be unreliable.
For the Atlas training products, the default overflow/underflow mode of floating-point computation is saturation, and only the saturation mode is supported. This means when an overflow/underflow occurs during computation, the computation result is saturated to a floating-point extreme value (±MAX).
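The difference between the two overflow/underflow modes can be sketched with a toy model. This is an assumption-laden illustration, not the hardware behavior: the function overflow_result and the mode strings "saturation" and "inf_nan" are hypothetical names, and float16 rounding near the maximum is ignored.

```python
import math

# Largest finite float16 value.
FP16_MAX = 65504.0


def overflow_result(x, mode):
    """Toy model of what a float16 result becomes when it exceeds the range.

    mode is "saturation" or "inf_nan" (hypothetical labels for the two modes).
    """
    if mode not in ("saturation", "inf_nan"):
        raise ValueError("unknown mode: %s" % mode)
    if abs(x) <= FP16_MAX:
        return x  # in range, nothing to do in this simplified model
    if mode == "saturation":
        return math.copysign(FP16_MAX, x)  # clamp to +/-MAX
    return math.copysign(math.inf, x)  # propagate +/-Inf


print(overflow_result(1e6, "saturation"))  # finite extreme value
print(overflow_result(1e6, "inf_nan"))     # infinity
```

In this toy model a saturated value is still finite, so a plain finiteness check on the result would not reveal that an overflow happened.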

Prototype

class NPULossScaleOptimizer(lso.LossScaleOptimizer):
    def __init__(self, opt, loss_scale_manager, is_distributed=False)

Parameters

Parameter	Input/Output	Description
opt	Input	Single-server training optimizer for gradient calculation and weight update.
loss_scale_manager	Input	Loss scale manager that determines how the loss scale is updated: statically or dynamically. For a static loss scale, instantiate a FixedLossScaleManager object before creating NPULossScaleOptimizer; see FixedLossScaleManager Constructor. For a dynamically updated loss scale, instantiate an ExponentialUpdateLossScaleManager object before creating NPULossScaleOptimizer; see ExponentialUpdateLossScaleManager Constructor.
is_distributed	Input	Whether to enable loss scaling support for distributed training. True: Set this parameter to True for distributed training. False: Default value.
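To make the idea of a dynamic update concrete, here is a minimal pure-Python sketch of an exponential update rule. It is an illustration under stated assumptions, not the npu_bridge implementation: the class ToyDynamicLossScale is hypothetical, the parameter names init_loss_scale, incr_every_n_steps, decr_every_n_nan_or_inf, and decr_ratio are borrowed from the example in this topic, and incr_ratio, the counter-reset behavior, and the lower bound of 1.0 are assumptions.

```python
class ToyDynamicLossScale:
    """Hypothetical stand-in for a dynamic loss scale manager (not the real class)."""

    def __init__(self, init_loss_scale, incr_every_n_steps,
                 decr_every_n_nan_or_inf=2, incr_ratio=2.0, decr_ratio=0.5):
        self.loss_scale = float(init_loss_scale)
        self.incr_every_n_steps = incr_every_n_steps
        self.decr_every_n_nan_or_inf = decr_every_n_nan_or_inf
        self.incr_ratio = incr_ratio  # assumed growth factor
        self.decr_ratio = decr_ratio
        self.good_steps = 0  # consecutive steps with finite gradients
        self.bad_steps = 0   # consecutive steps with overflowed gradients

    def update(self, grads_are_finite):
        """Advance one training step and return the loss scale for the next step."""
        if grads_are_finite:
            self.good_steps += 1
            self.bad_steps = 0
            if self.good_steps >= self.incr_every_n_steps:
                self.loss_scale *= self.incr_ratio
                self.good_steps = 0
        else:
            self.bad_steps += 1
            self.good_steps = 0
            if self.bad_steps >= self.decr_every_n_nan_or_inf:
                # Assumed floor of 1.0 so the scale never shrinks the loss.
                self.loss_scale = max(1.0, self.loss_scale * self.decr_ratio)
                self.bad_steps = 0
        return self.loss_scale


manager = ToyDynamicLossScale(init_loss_scale=2**32, incr_every_n_steps=3)
# Three clean steps, two overflow steps, then one clean step.
history = [manager.update(ok) for ok in [True, True, True, False, False, True]]
print(history)
```

A static update, by contrast, corresponds to returning the same loss scale on every step regardless of overflow.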

Returns

An object of the NPULossScaleOptimizer class

Example

from npu_bridge.npu_init import *

# FLAGS and opt are defined elsewhere in the training script.
# Wrap the optimizer only when float16 training and loss scaling are enabled.
if FLAGS.use_fp16 and (FLAGS.npu_bert_loss_scale not in [None, -1]):
  opt_tmp = opt
  if FLAGS.npu_bert_loss_scale == 0:
    # 0 selects dynamic loss scaling.
    loss_scale_manager = ExponentialUpdateLossScaleManager(init_loss_scale=2**32, incr_every_n_steps=1000, decr_every_n_nan_or_inf=2, decr_ratio=0.5)
  elif FLAGS.npu_bert_loss_scale >= 1:
    # A value of 1 or greater is used as a static loss scale.
    loss_scale_manager = FixedLossScaleManager(loss_scale=FLAGS.npu_bert_loss_scale)
  else:
    raise ValueError("Invalid loss scale: %d" % FLAGS.npu_bert_loss_scale)
  if ops_adapter.size() > 1:
    # More than one device: enable distributed loss scaling.
    opt = NPULossScaleOptimizer(opt_tmp, loss_scale_manager, is_distributed=True)
  else:
    opt = NPULossScaleOptimizer(opt_tmp, loss_scale_manager)
