Run Configuration
Set the run configuration in the resnet_main() function.
| Function | Description | Location |
|---|---|---|
| resnet_main() | Main function for run configuration, training, and validation. | official/r1/resnet/resnet_run_loop.py |
- Add the following imports to the official/r1/resnet/resnet_run_loop.py file:
    from npu_bridge.estimator.npu.npu_config import NPURunConfig
    from npu_bridge.estimator.npu.npu_estimator import NPUEstimator
- Replace RunConfig with NPURunConfig to configure run parameters.
Tweak: resnet_main() in official/r1/resnet/resnet_run_loop.py
    ############## NPU modify begin #############
    # Replace RunConfig with NPURunConfig to adapt to the Ascend AI Processor.
    # Save the checkpoint every 115200 steps and summary every 10000 times,
    # Preprocess data and enable the mixed precision mode to improve the training speed.
    run_config = NPURunConfig(
        model_dir=flags_obj.model_dir,
        session_config=session_config,
        save_checkpoints_steps=115200,
        enable_data_pre_proc=True,
        iterations_per_loop=100,
        # enable_auto_mix_precision=True,
        # Set precision_mode to allow_mix_precision.
        precision_mode='allow_mix_precision',
        hcom_parallel=True
    )
    ############## npu modify end ###############

    # The run configuration in the code is as follows.
    # run_config = tf.estimator.RunConfig(
    #     train_distribute=distribution_strategy,
    #     session_config=session_config,
    #     save_checkpoints_secs=60 * 60 * 24,
    #     save_checkpoints_steps=None)
For details about how to set the mixed precision mode (precision_mode='allow_mix_precision'), see Setting the Mixed Precision Mode.
- Create an NPUEstimator to replace tf.estimator.Estimator.
Tweak: resnet_main() in official/r1/resnet/resnet_run_loop.py
    # Replace tf.estimator.Estimator with NPUEstimator.
    classifier = NPUEstimator(
        model_fn=model_function, model_dir=flags_obj.model_dir, config=run_config,
        params={
            'resnet_size': int(flags_obj.resnet_size),
            'data_format': flags_obj.data_format,
            'batch_size': flags_obj.batch_size,
            'resnet_version': int(flags_obj.resnet_version),
            'loss_scale': flags_core.get_loss_scale(flags_obj,
                                                    default_for_fp16=128),
            'dtype': flags_core.get_tf_dtype(flags_obj),
            'fine_tune': flags_obj.fine_tune,
            'num_workers': num_workers,
            'num_gpus': flags_core.get_num_gpus(flags_obj),
        })

    # The creation of Estimator in the code is as follows.
    # classifier = tf.estimator.Estimator(
    #     model_fn=model_function, model_dir=flags_obj.model_dir, config=run_config,
    #     warm_start_from=warm_start_settings, params={
    #         'resnet_size': int(flags_obj.resnet_size),
    #         'data_format': flags_obj.data_format,
    #         'batch_size': flags_obj.batch_size,
    #         'resnet_version': int(flags_obj.resnet_version),
    #         'loss_scale': flags_core.get_loss_scale(flags_obj,
    #                                                 default_for_fp16=128),
    #         'dtype': flags_core.get_tf_dtype(flags_obj),
    #         'fine_tune': flags_obj.fine_tune,
    #         'num_workers': num_workers,
    #     })
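The NPURunConfig arguments used in this section can be collected and sanity-checked before the object is created. The following is a minimal sketch, not part of the ResNet sample: the helper names build_npu_run_config_kwargs, loops_per_checkpoint, and make_run_config are hypothetical, and the check that save_checkpoints_steps is a whole multiple of iterations_per_loop is an assumption made here so that checkpoints fall on a loop boundary. NPURunConfig is imported only if npu_bridge is installed, so the sketch also runs on a machine without the Ascend software stack.

```python
import importlib.util
from types import SimpleNamespace


def build_npu_run_config_kwargs(flags_obj, session_config=None,
                                save_checkpoints_steps=115200,
                                iterations_per_loop=100):
    """Collect the NPURunConfig keyword arguments used in this section.

    Hypothetical helper: the sample passes these values inline.
    """
    # Assumption: the checkpoint interval should be a whole multiple of
    # iterations_per_loop so that a checkpoint lands on a loop boundary.
    if save_checkpoints_steps % iterations_per_loop != 0:
        raise ValueError(
            "save_checkpoints_steps (%d) is not a multiple of "
            "iterations_per_loop (%d)"
            % (save_checkpoints_steps, iterations_per_loop))
    return {
        "model_dir": flags_obj.model_dir,
        "session_config": session_config,
        "save_checkpoints_steps": save_checkpoints_steps,
        "enable_data_pre_proc": True,
        "iterations_per_loop": iterations_per_loop,
        "precision_mode": "allow_mix_precision",
        "hcom_parallel": True,
    }


def loops_per_checkpoint(kwargs):
    """Number of training loops that run between two checkpoints."""
    return kwargs["save_checkpoints_steps"] // kwargs["iterations_per_loop"]


def make_run_config(kwargs):
    """Return an NPURunConfig if npu_bridge is installed, otherwise None."""
    if importlib.util.find_spec("npu_bridge") is None:
        return None
    from npu_bridge.estimator.npu.npu_config import NPURunConfig
    return NPURunConfig(**kwargs)


# Stand-in for the absl flags object used by resnet_main(); the path is
# a placeholder chosen for this sketch.
example_flags = SimpleNamespace(model_dir="/tmp/resnet_model")
example_kwargs = build_npu_run_config_kwargs(example_flags)
print(loops_per_checkpoint(example_kwargs))  # 115200 // 100 = 1152
```

With the values from the sample, one checkpoint is written every 1152 loops of 100 steps. In resnet_main(), the dictionary would be built from the real flags_obj and session_config and then passed to make_run_config().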