Enabling Iteration Offload in Estimator Mode
Automated porting
- Search for npu_run_config_init in the ported script and locate the run configuration it receives (run_config in the example). Pass session_config to the run configuration function (tf.estimator.RunConfig in the example), and set iterations_per_loop in session_config.
    session_config = tf.ConfigProto(allow_soft_placement=True)
    custom_op = session_config.graph_options.rewrite_options.custom_optimizers.add()
    custom_op.name = 'NpuOptimizer'
    custom_op.parameter_map["enable_data_pre_proc"].b = True  # The GetNext operator offload is a prerequisite for iteration offload.
    custom_op.parameter_map["iterations_per_loop"].i = 10
    run_config = tf.estimator.RunConfig(
        train_distribute=distribution_strategy,
        session_config=session_config,  # Add the session_config configuration to the run configuration parameter.
        save_checkpoints_secs=60*60*24)
    classifier = tf.estimator.Estimator(
        model_fn=model_function,
        model_dir=flags_obj.model_dir,
        config=npu_run_config_init(run_config=run_config))
- Add SetIterationsVarHook.
    train_hooks = hooks_helper.get_train_hooks(
        flags_obj.hooks,
        model_dir=flags_obj.model_dir,
        batch_size=flags_obj.batch_size)
    train_hooks.append(SetIterationsVarHook(10))
- Add IterationOp to train_op.
    train_op = opt.apply_gradients(grad_var_list, global_step=global_step)
    train_op = tf.group(train_op, name="IterationOp")  # Name the gradient-update operator "IterationOp".
Manual porting
In Estimator mode, configure iterations_per_loop in NPURunConfig as follows.
from npu_bridge.npu_init import *

session_config = tf.ConfigProto(allow_soft_placement=True)
config = NPURunConfig(session_config=session_config, iterations_per_loop=10)
Iteration offload also requires GetNext operator offload. In Estimator mode, GetNext operator offload is enabled by default (enable_data_pre_proc defaults to True), so retain the default setting.
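To make the effect of iterations_per_loop concrete, the sketch below counts how many times the host would launch a device-side loop for a given number of training steps. It assumes that iterations_per_loop is the number of training iterations the device runs per host call. host_loops is a hypothetical helper, not part of npu_bridge, and the check that the total step count is a multiple of iterations_per_loop is an assumption of this sketch, not a rule stated above.

```python
def host_loops(total_steps: int, iterations_per_loop: int) -> int:
    """Return how many device-side loops the host launches.

    Hypothetical helper for illustration only; not part of npu_bridge.
    """
    if iterations_per_loop < 1:
        raise ValueError("iterations_per_loop must be at least 1")
    if total_steps % iterations_per_loop != 0:
        # Assumption of this sketch: the step count should split evenly
        # into device-side loops.
        raise ValueError(
            f"total_steps={total_steps} is not a multiple of "
            f"iterations_per_loop={iterations_per_loop}"
        )
    return total_steps // iterations_per_loop


if __name__ == "__main__":
    # With iterations_per_loop=10 as in the example above, 1000 training
    # steps need 100 host-side launches.
    print(host_loops(1000, 10))
```

Under these assumptions, a larger iterations_per_loop means fewer host-device interactions for the same number of training steps.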
Checking Whether iterations_per_loop Takes Effect
After enabling iteration offload, check the INFO log on the host for the keyword "Insert op success" to confirm that iterations_per_loop has taken effect.
Run the following command to set the log level on the host to INFO. By default, INFO logs are written to $HOME/ascend/log/run/plog/.
export ASCEND_GLOBAL_LOG_LEVEL=1
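The log check can also be scripted. The following is a minimal sketch that searches every file under the default log directory for the keyword. find_keyword is a hypothetical helper, and scanning all files recursively is an assumption made here because the log file naming is not described above.

```python
import os
from pathlib import Path

KEYWORD = "Insert op success"  # keyword named in the text above


def file_contains(path, keyword):
    """Return True if any line of the file contains the keyword."""
    try:
        with open(path, errors="ignore") as handle:
            return any(keyword in line for line in handle)
    except OSError:
        return False  # unreadable file: treat as no match


def find_keyword(log_dir, keyword=KEYWORD):
    """Return the sorted paths of files under log_dir that contain keyword."""
    root = Path(log_dir)
    if not root.is_dir():
        return []
    return [
        path
        for path in sorted(root.rglob("*"))
        if path.is_file() and file_contains(path, keyword)
    ]


if __name__ == "__main__":
    # Default INFO log location given above: $HOME/ascend/log/run/plog/
    default_dir = Path(os.path.expanduser("~")) / "ascend" / "log" / "run" / "plog"
    matches = find_keyword(default_dir)
    if matches:
        for match in matches:
            print(match)
    else:
        print("Keyword not found under", default_dir)
```

Set the log level to INFO before starting the training run, then run the script after training has begun. A non-empty result suggests that iterations_per_loop has taken effect.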