Setting the Number of Iterations Offloaded to NPU
Determine if iteration offload is implemented in the script.
The official/vision/image_classification/resnet/common.py script provides two input parameters:
- steps_per_loop sets the number of training steps per loop. As the help text indicates, only training steps run inside the loop; additional operations such as callbacks are not called there.
- use_tf_while_loop defaults to True, meaning that iteration offload is enabled and each training loop is executed as a While operator.
    flags.DEFINE_integer(
        name='steps_per_loop',
        default=None,
        help='Number of steps per training loop. Only training step happens '
        'inside the loop. Callbacks will not be called inside. Will be capped at '
        'steps per epoch.')
    flags.DEFINE_boolean(
        name='use_tf_while_loop',
        default=True,
        help='Whether to build a tf.while_loop inside the training loop on the '
        'host. Setting it to True is critical to have peak performance on '
        'TPU.')
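The help text states that steps_per_loop is capped at the number of steps per epoch. The minimal sketch below illustrates that capping rule. The helper name effective_steps_per_loop is hypothetical, and the handling of an unset flag (falling back to the steps per epoch) is an assumption made for illustration, not code taken from the script.

```python
from typing import Optional


def effective_steps_per_loop(steps_per_loop: Optional[int],
                             steps_per_epoch: int) -> int:
    """Hypothetical helper: cap the requested loop size at steps per epoch."""
    if steps_per_loop is None:
        # Assumption: an unset flag is treated as one loop per epoch.
        return steps_per_epoch
    # The help text says the value "will be capped at steps per epoch".
    return min(steps_per_loop, steps_per_epoch)


# A request smaller than the epoch is kept as is.
print(effective_steps_per_loop(100, 1251))
# A request larger than the epoch is capped.
print(effective_steps_per_loop(5000, 1251))
```

Under this assumption, the effective value rather than the raw flag is the loop size that actually runs, which is worth keeping in mind when choosing the value used in the next step.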
Set the NPU_LOOP_SIZE environment variable according to the steps_per_loop argument. For details about how to set the environment variable, see Starting Single-Device Training.
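One possible way to keep the two settings in step is to derive the environment variable from the same number passed as steps_per_loop. The sketch below assumes that NPU_LOOP_SIZE should carry the same integer value as steps_per_loop; the helper name loop_size_env and the value 100 are hypothetical and chosen only for illustration.

```python
import os
from typing import Dict


def loop_size_env(steps_per_loop: int) -> Dict[str, str]:
    """Hypothetical helper: copy the environment and set NPU_LOOP_SIZE."""
    if steps_per_loop < 1:
        raise ValueError("steps_per_loop must be a positive integer")
    env = dict(os.environ)
    # Assumption: NPU_LOOP_SIZE takes the same value as steps_per_loop.
    env["NPU_LOOP_SIZE"] = str(steps_per_loop)
    return env


# Illustrative value only; use the value passed as steps_per_loop.
env = loop_size_env(100)
print(env["NPU_LOOP_SIZE"])
```

The returned mapping could then be passed as the env argument of subprocess.run when launching the training script with a matching --steps_per_loop value, so that the variable is already set when the training process starts.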