ExperimentalConfig Constructor

Description

Constructs an object of the ExperimentalConfig class. This constructor is an extended option intended for debugging and may change in later versions. Do not use it in commercial products.

Prototype

```
def __init__(self,
             accelerate_train_mode="fast|step|0.9",
             ......
             )
```

Options

Option: accelerate_train_mode

Input/Output: Input

Description:

If training takes more than one hour, you can trigger training acceleration to improve training performance by configuring this option.

Based on the configured acceleration type, trigger mode, and low-precision proportion, the software compiles and runs the corresponding portion of the training process at reduced precision. The remainder of the training process is compiled and run at the original precision.

The value of this option is a string consisting of three fields separated by vertical bars (|), for example, fast|step|0.9.

The first field indicates the acceleration type, which can be fast or fast1.
- fast: compilation uses the float16 data type during precision reduction.
- fast1: compilation uses the bf16 data type during precision reduction.

The second field indicates the trigger mode, which can be step or loss. It specifies whether the training process is divided into low-precision training and high-precision training based on the step value or the loss value.

The third field indicates the proportion of the low-precision training process relative to the total step or loss values.
- When the second field is step, the third field ranges from 0.2 to 0.9 and defaults to 0.9.
- When the second field is loss, the third field ranges from 1.01 to 1.5 and defaults to 1.05.

Example:

Acceleration triggered by step:
```
accelerate_train_mode="fast|step|0.9"
```
Acceleration triggered by loss:
```
accelerate_train_mode="fast|loss|1.05"
```
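
The field rules above can be checked before the string is passed to ExperimentalConfig. The following is a minimal sketch, not part of npu_bridge: the helper name parse_accelerate_train_mode and the AccelerateTrainMode type are hypothetical, and the sketch assumes that the documented ranges are inclusive and that all three fields are always written out.

```python
from typing import NamedTuple


class AccelerateTrainMode(NamedTuple):
    """Parsed fields of an accelerate_train_mode string (hypothetical type)."""
    accel_type: str  # "fast" (float16) or "fast1" (bf16)
    trigger: str     # "step" or "loss"
    ratio: float     # third field: proportion of low-precision training


# Documented range of the third field for each trigger mode.
# Assumption: both bounds are inclusive.
_RATIO_RANGES = {
    "step": (0.2, 0.9),
    "loss": (1.01, 1.5),
}


def parse_accelerate_train_mode(value: str) -> AccelerateTrainMode:
    """Split a 'type|trigger|ratio' string and validate each field."""
    fields = value.split("|")
    if len(fields) != 3:
        raise ValueError("expected three fields separated by '|': %r" % value)
    accel_type, trigger, ratio_text = fields
    if accel_type not in ("fast", "fast1"):
        raise ValueError("first field must be 'fast' or 'fast1'")
    if trigger not in _RATIO_RANGES:
        raise ValueError("second field must be 'step' or 'loss'")
    ratio = float(ratio_text)  # raises ValueError if not a number
    low, high = _RATIO_RANGES[trigger]
    if not low <= ratio <= high:
        raise ValueError(
            "third field must be in [%s, %s] when the trigger is %r"
            % (low, high, trigger))
    return AccelerateTrainMode(accel_type, trigger, ratio)
```

For example, parse_accelerate_train_mode("fast|step|0.9") returns the three parsed fields, while "fast|step|0.95" is rejected because 0.95 lies outside the documented step range.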

Notes:

- If you trigger training acceleration with this option, ensure that the network script converges properly.
- If network script training takes only a short time, enabling this option may not improve end-to-end performance.
- Whether this option can be enabled depends on the precision mode configured in the network script:
  - When precision_mode is used to configure the precision mode, this option can be enabled only when precision_mode is set to allow_fp32_to_fp16, must_keep_origin_dtype, or none.
  - When precision_mode_v2 is used to configure the precision mode, this option can be enabled only when precision_mode_v2 is set to origin or none.
- This option also interacts with the number of iterations per loop. When iterations per loop is enabled, the training process may not be split exactly at the specified step or loss value, which may affect the final loss and precision.
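
The precision-mode restriction in the notes above can be expressed as a small pre-check. This is an illustrative sketch only: the function acceleration_allowed is hypothetical and not part of npu_bridge, and it assumes that "none" in the notes means the corresponding parameter is left unset, which is represented here by Python's None.

```python
from typing import Optional

# Values under which accelerate_train_mode may be enabled, per the notes above.
_ALLOWED_PRECISION_MODE = {"allow_fp32_to_fp16", "must_keep_origin_dtype"}
_ALLOWED_PRECISION_MODE_V2 = {"origin"}


def acceleration_allowed(precision_mode: Optional[str] = None,
                         precision_mode_v2: Optional[str] = None) -> bool:
    """Return True if the configured precision mode permits acceleration.

    Assumption: None stands for "not configured" (the "none" case in the notes).
    """
    if (precision_mode is not None
            and precision_mode not in _ALLOWED_PRECISION_MODE):
        return False
    if (precision_mode_v2 is not None
            and precision_mode_v2 not in _ALLOWED_PRECISION_MODE_V2):
        return False
    return True
```

A training script could call such a check before constructing ExperimentalConfig and fall back to running without acceleration when it returns False.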

When this option is enabled, you need to modify the network script and use the TellMeStepOrLossHook constructor to notify the underlying software of either the current step number and the total number of steps, or the current loss and the target loss.

Example:

```
from npu_bridge.npu_init import *
from npu_bridge.estimator.npu.npu_config import ExperimentalConfig
from npu_bridge.estimator.npu.npu_hook import TellMeStepOrLossHook

# Enable the fast acceleration mode and split training by step:
# low-precision training runs on 90% of the total steps, and
# high-precision training runs on the remaining steps.
experimental_config = ExperimentalConfig(accelerate_train_mode="fast|step|0.9")
config = NPURunConfig(experimental_config=experimental_config)
est = NPUEstimator(
    model_fn=model_fn,
    config=config,
    params=params)

hooks = []
max_steps = 10000

# Step splitting mode: notify the underlying software of the current step
# number and the total number of steps. The value global_step:0 is only an
# example. Set it to the actual tensor name of the current step.
my_hook = TellMeStepOrLossHook(step='global_step:0', total_step=max_steps)

# Loss splitting mode: notify the underlying software of the current loss
# and the target loss. The value loss:0 is only an example. Set it to the
# actual tensor name of the current loss.
# my_hook = TellMeStepOrLossHook(loss='loss:0', final_loss=7.1)
hooks.append(my_hook)

# Start training.
est.train(
    input_fn=imagenet_train.input_fn,
    max_steps=max_steps,
    hooks=hooks)
```
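
To make the 90% split in the example above concrete, the arithmetic can be sketched as follows. The helper split_steps is hypothetical and only illustrates the proportion. The actual switch point is decided by the underlying software and, as noted earlier, may differ when iterations per loop is enabled.

```python
from typing import Tuple


def split_steps(total_steps: int, low_precision_ratio: float) -> Tuple[int, int]:
    """Return (low_precision_steps, high_precision_steps) for step mode.

    Assumption: the boundary is the ratio times the total steps, rounded to
    the nearest step. The real software may place the boundary differently.
    """
    low = round(total_steps * low_precision_ratio)
    return low, total_steps - low


low_steps, high_steps = split_steps(10000, 0.9)
print(low_steps, high_steps)  # 9000 low-precision steps, 1000 high-precision steps
```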

Returns

An object of the ExperimentalConfig class, which is passed as the experimental_config argument of the NPURunConfig call.

Restrictions

None

Examples

```
import tensorflow as tf
from npu_bridge.npu_init import *
from npu_bridge.estimator.npu.npu_config import ExperimentalConfig
...
experimental_config = ExperimentalConfig(accelerate_train_mode="fast|step|0.9")
session_config = tf.ConfigProto(allow_soft_placement=True)
config = NPURunConfig(experimental_config=experimental_config, session_config=session_config)
```
