Restrictions

iterations_per_loop specifies the number of training iterations executed on the device per sess.run call. The device runs the specified number of iterations and only then returns the result to the host. This can eliminate unnecessary interactions between the host and the device and reduce the training time.
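The effect on host-device interaction can be illustrated with simple arithmetic. The following is a minimal sketch, not part of any library: the function name sess_run_calls and the step counts are hypothetical and chosen only for illustration. It assumes that each sess.run call executes exactly iterations_per_loop iterations on the device.

```python
def sess_run_calls(total_iterations, iterations_per_loop=1):
    """Return how many sess.run calls (host-device round trips) are needed.

    Hypothetical helper for illustration only. Assumes every sess.run call
    executes exactly iterations_per_loop iterations on the device.
    """
    if iterations_per_loop < 1:
        raise ValueError("iterations_per_loop must be a positive integer")
    if total_iterations % iterations_per_loop != 0:
        raise ValueError("total_iterations must be divisible by iterations_per_loop")
    return total_iterations // iterations_per_loop


# With the default value of 1, every iteration is a separate sess.run call.
print(sess_run_calls(1000, 1))
# With 100 iterations per loop, the host issues far fewer sess.run calls.
print(sess_run_calls(1000, 100))
```

In this sketch, raising iterations_per_loop from 1 to 100 reduces the number of sess.run calls for 1000 iterations from 1000 to 10.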

iterations_per_loop defaults to 1. You can enable the iteration offload feature by setting this parameter to a value greater than 1. Note the following restrictions when using this feature:

  • The training script must read data through the TensorFlow Dataset API and must use an initializable iterator instead of a one-shot iterator, for example, the iterator returned by tf.data.make_initializable_iterator(). Reading data through the Dataset API is a prerequisite for both data preprocessing offload and training iteration loop offload. For details about using Datasets, see the TensorFlow official website.
  • Enable data preprocessing offload by setting enable_data_pre_proc to True. This will generate the GetNext operator that runs on the device, thereby enabling training iteration loop offload.
    • The following is an example of enabling data preprocessing offload in sess.run:
      custom_op.parameter_map["enable_data_pre_proc"].b = True 
    • The following is an example of enabling data preprocessing offload in NPURunConfig:
      config = NPURunConfig(enable_data_pre_proc=True)
  • The total number of training iterations must be evenly divisible by iterations_per_loop.
  • When saving checkpoint data in iteration offload mode, set save_checkpoints_steps to a positive integer multiple of iterations_per_loop, so that checkpoints are saved in strict accordance with save_checkpoints_steps. If iterations_per_loop is greater than 1, summaries and logs may not be saved at the intervals defined by save_summary_steps and log_step_count_steps. In this case, see Log and Summary Operators for a solution.
  • In mixed computing mode (with mix_compile_mode set to True), iteration offload must not be enabled. That is, iterations_per_loop must be set to 1.
  • During network development, you are advised to set iterations_per_loop to 1 so that logs are printed at every iteration. After the network is verified to work correctly, you can increase iterations_per_loop to shorten the training time.
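The numeric restrictions above can be checked before training starts. The following is a minimal sketch in plain Python, not part of the NPU or TensorFlow APIs: the function check_iteration_offload and its total_steps argument are hypothetical, while the other argument names mirror the parameters described in this section.

```python
def check_iteration_offload(total_steps, iterations_per_loop,
                            save_checkpoints_steps=None,
                            enable_data_pre_proc=True,
                            mix_compile_mode=False):
    """Return a list of violated restrictions (empty if none are violated).

    Hypothetical helper for illustration only; it does not call any library.
    """
    if iterations_per_loop < 1:
        return ["iterations_per_loop must be a positive integer"]
    if iterations_per_loop == 1:
        # Iteration offload is disabled, so the restrictions do not apply.
        return []

    problems = []
    if not enable_data_pre_proc:
        problems.append("enable_data_pre_proc must be True")
    if mix_compile_mode:
        problems.append("iterations_per_loop must be 1 when mix_compile_mode is True")
    if total_steps % iterations_per_loop != 0:
        problems.append("total_steps must be divisible by iterations_per_loop")
    if save_checkpoints_steps is not None and (
            save_checkpoints_steps <= 0
            or save_checkpoints_steps % iterations_per_loop != 0):
        problems.append(
            "save_checkpoints_steps must be a positive multiple of iterations_per_loop")
    return problems


# Example values chosen only for illustration.
for message in check_iteration_offload(total_steps=1000, iterations_per_loop=300,
                                       save_checkpoints_steps=500):
    print(message)
```

A script could run such a check once at startup and stop if the returned list is not empty, rather than discovering a mismatch partway through training.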