Setting the Number of Iterations Offloaded to NPU

Determine whether iteration offload is implemented in the script.

The official/vision/image_classification/resnet/common.py script provides two input parameters:

  • steps_per_loop sets the number of training steps in each training loop. As the flag's help text states, only training steps run inside a loop; callbacks are not called there, and the value is capped at the number of steps per epoch.
  • use_tf_while_loop defaults to True, meaning that iteration offload is enabled and each training loop is executed as a While operator.
flags.DEFINE_integer(
    name='steps_per_loop',
    default=None,
    help='Number of steps per training loop. Only training step happens '
    'inside the loop. Callbacks will not be called inside. Will be capped at '
    'steps per epoch.')
flags.DEFINE_boolean(
    name='use_tf_while_loop',
    default=True,
    help='Whether to build a tf.while_loop inside the training loop on the '
    'host. Setting it to True is critical to have peak performance on '
    'TPU.')
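
The effect of steps_per_loop on one epoch can be pictured with a minimal pure-Python sketch. This is not the script's implementation: plan_loops and its arguments are hypothetical names, and treating an unset steps_per_loop as one loop per epoch is an assumption made only for illustration. It shows how an epoch could split into loops, with callbacks able to run only at loop boundaries.

```python
def plan_loops(steps_per_epoch, steps_per_loop=None):
    """Return the number of training steps in each loop of one epoch.

    Hypothetical helper for illustration; not part of common.py.
    """
    # Assumption: an unset steps_per_loop means one loop per epoch.
    if steps_per_loop is None:
        steps_per_loop = steps_per_epoch
    if steps_per_loop < 1 or steps_per_epoch < 1:
        raise ValueError("step counts must be positive integers")

    # The flag help text says the value is capped at steps per epoch.
    loop_size = min(steps_per_loop, steps_per_epoch)

    sizes = []
    remaining = steps_per_epoch
    while remaining > 0:
        current = min(loop_size, remaining)  # the last loop may be shorter
        sizes.append(current)
        remaining -= current
    return sizes


print(plan_loops(steps_per_epoch=10, steps_per_loop=4))  # [4, 4, 2]
```

In this sketch, each entry of the returned list corresponds to one loop that runs without interruption; anything outside the training step, such as a callback, could only run between entries.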

Set the NPU_LOOP_SIZE environment variable according to the steps_per_loop argument. For details about how to set the environment variable, see Starting Single-Device Training.
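
As a minimal sketch, the variable could also be set from Python before training starts. The helper name export_npu_loop_size and the value 100 are hypothetical, and it is an assumption that the variable should carry the same value as steps_per_loop and that it is read only after this point; the referenced section remains the authoritative procedure.

```python
import os


def export_npu_loop_size(steps_per_loop):
    """Set NPU_LOOP_SIZE for the current process and its child processes.

    Hypothetical helper; assumes the variable mirrors steps_per_loop.
    """
    if steps_per_loop < 1:
        raise ValueError("steps_per_loop must be a positive integer")
    # Environment variable values are always strings.
    os.environ["NPU_LOOP_SIZE"] = str(steps_per_loop)
    return os.environ["NPU_LOOP_SIZE"]


# Example value only; use the value passed to --steps_per_loop.
export_npu_loop_size(100)
```

Setting the variable inside the process affects only that process and any children it launches, so a launcher script that exports it in the shell may be the simpler choice when several processes are started.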