Porting with sess.run
If the original TensorFlow network is built with the sess.run API, this section walks you through the manual porting process.
About sess.run
sess.run is a low-level TensorFlow API. It is more flexible than Estimator, but model implementation with it is also more complex.
Develop your training script with the sess.run API as follows (a minimal skeleton is sketched after this list):
- Preprocess data.
- Construct a model, calculate the loss, and update the gradient.
- Create a session and initialize resources.
- Start training.
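For reference, a minimal TensorFlow 1.x skeleton following these four steps; the synthetic data, layer sizes, and hyperparameters are placeholders, not part of any original script:

import numpy as np
import tensorflow as tf

# 1. Preprocess data (synthetic arrays stand in for a real input pipeline).
features = np.random.rand(1000, 10).astype(np.float32)
labels = np.random.rand(1000, 1).astype(np.float32)
dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(100, drop_remainder=True)
iterator = dataset.make_initializable_iterator()
x, y_ = iterator.get_next()

# 2. Construct the model, calculate the loss, and update the gradient.
y = tf.layers.dense(x, 1)
loss = tf.reduce_mean(tf.square(y - y_))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

# 3. Create a session and initialize resources.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # 4. Start training.
    for epoch in range(5):
        sess.run(iterator.initializer)
        while True:
            try:
                _, train_loss = sess.run([train_op, loss])
            except tf.errors.OutOfRangeError:
                break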
To perform training on the Ascend AI Processor, port your sess.run-based training script as described in the following sections.
Header File
To import the NPU-related libraries, add the following header file reference to the related Python files:
from npu_bridge.npu_init import *
After the preceding header file is imported, the training script is executed on the Ascend AI Processor by default.
Data Preprocessing
The code snippet is ready to use in normal cases. Manual tweaking is required only in the following scenario: when batching the dataset, drop_remainder must be set to True, because each batch delivered to the Ascend AI Processor must have a fixed shape:

dataset = dataset.batch(batch_size, drop_remainder=True)

As a result, when the total number of samples is not an integer multiple of batch_size, the tail samples are discarded. An assertion on the number of processed samples, such as the following one from an inference script, may then fail and needs to be adapted:

assert num_written_lines == num_actual_predict_examples
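Because only full batches reach the device, such a check can be adapted, for example, as in the following sketch (num_actual_predict_examples, num_written_lines, and batch_size are the names from the snippets above):

# With drop_remainder=True, the written predictions cover only the largest
# multiple of batch_size that fits into the sample count.
num_expected_lines = (num_actual_predict_examples // batch_size) * batch_size
assert num_written_lines == num_expected_lines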
Model Construction, Loss Calculation, and Gradient Update
The code snippet is ready to use in normal cases. Manual tweaking is required only in the following scenarios:
- Replace dropout in the original network with the corresponding CANN API for better performance, and pay attention to the impact on accuracy (a combined sketch follows the gelu example below).
- If tf.nn.dropout exists, modify it as follows:
layers = npu_ops.dropout(x, keep_prob)
- If tf.layers.dropout, tf.layers.Dropout, tf.keras.layers.Dropout, tf.keras.layers.SpatialDropout1D, tf.keras.layers.SpatialDropout2D, or tf.keras.layers.SpatialDropout3D exists, add the following header file reference:
from npu_bridge.estimator.npu import npu_convert_dropout
- Replace gelu in the original network with the corresponding CANN API to achieve optimal performance.
Original TensorFlow code:
def gelu(x):
    cdf = 0.5 * (1.0 + tf.tanh((np.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3)))))
    return x * cdf

layers = gelu(x)
Code after porting:
layers = npu_unary_ops.gelu(x)
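Taken together, a minimal sketch of a hidden layer that uses both CANN APIs (the layer size and keep_prob value are placeholders; the explicit imports shown are one way to bring in these symbols, which the header file above also imports):

import tensorflow as tf
from npu_bridge.estimator import npu_ops
from npu_bridge.estimator.npu_unary_ops import npu_unary_ops

def hidden_block(x, keep_prob=0.9):
    # Dense layer followed by gelu from the CANN API.
    h = npu_unary_ops.gelu(tf.layers.dense(x, 1024))
    # Dropout from the CANN API instead of tf.nn.dropout.
    return npu_ops.dropout(h, keep_prob)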
Session Creation and Resource Initialization
When running your training script on the Ascend AI Processor using sess.run, note the following configurations:
- The following configuration option is disabled by default and should remain disabled:
- rewrite_options.disable_model_pruning
- The following configuration options are enabled by default and should remain enabled:
- rewrite_options.function_optimization
- rewrite_options.constant_folding
- rewrite_options.shape_optimization
- rewrite_options.arithmetic_optimization
- rewrite_options.loop_optimization
- rewrite_options.dependency_optimization
- rewrite_options.layout_optimizer
- The following configuration options are enabled by default and must be disabled explicitly:
- rewrite_options.remapping
- rewrite_options.memory_optimization
- If tf.device is used in the original network, add the session configuration allow_soft_placement=True to allow TensorFlow to automatically allocate devices.
Original TensorFlow code:
# Construct an iterator.
iterator = Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
# Obtain the batch data.
next_batch = iterator.get_next()
# Initialize the iterator.
training_init_op = iterator.make_initializer(train_dataset)
# Initialize the variables.
init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)

# Get the number of training/validation steps per epoch.
train_batches_per_epoch = int(np.floor(train_size / batch_size))
Code after porting:
# Construct an iterator.
iterator = Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
# Obtain the batch data.
next_batch = iterator.get_next()
# Initialize the iterator.
training_init_op = iterator.make_initializer(train_dataset)
# Initialize the variables.
init = tf.global_variables_initializer()

# Add allow_soft_placement=True to the session configuration to allow TensorFlow to automatically allocate devices.
config = tf.ConfigProto(allow_soft_placement=True)
# Add an NPU optimizer named NpuOptimizer. During network compilation, the NPU traverses only the session configurations under NpuOptimizer.
custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
custom_op.name = "NpuOptimizer"
# Explicitly disable the remapping and memory_optimization functions of TensorFlow to avoid conflicts with the functions of the NPU.
config.graph_options.rewrite_options.remapping = RewriterConfig.OFF
config.graph_options.rewrite_options.memory_optimization = RewriterConfig.OFF

sess = tf.Session(config=config)
sess.run(init)

# Get the number of training/validation steps per epoch.
train_batches_per_epoch = int(np.floor(train_size / batch_size))
The Ascend platform supports all native functions of tf.Session.
It also allows you to enable functions such as automatic mixed precision. For details, see Session Configuration.
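For example, automatic mixed precision can be enabled through the parameter map of the NpuOptimizer created above; the option name below follows the Session Configuration reference:

# Enable automatic mixed precision on the NPU.
custom_op.parameter_map["precision_mode"].s = tf.compat.as_bytes("allow_mix_precision")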
Training
The code snippet is ready to use. See the following example.
# Start epochs.
for epoch in range(num_epochs):
    # Initialize the iterator with the training dataset.
    sess.run(training_init_op)
    for step in range(train_batches_per_epoch):
        # Get the next batch of data.
        img_batch, label_batch = sess.run(next_batch)
        # Run the training op.
        _, train_loss = sess.run([train_op, loss], feed_dict={x: img_batch, y_: label_batch, is_training: True})
However, if you create a session without a with block, the ported script needs an explicit call to sess.close():
sess = tf.Session(config=config)
sess.run(...)
sess.close()
That is because the GEOP destructor is called in the close method of tf.Session. If you use a with block, __exit__ closes the session automatically, so there is no need to call sess.close():
with tf.Session(config=config) as sess:
    sess.run(...)
In other cases, for example, when the session object is a member of a user-defined class, call sess.close() explicitly to exit the session, as in the sketch below.
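A minimal sketch, assuming a user-defined class (the class and method names are illustrative):

class Trainer:
    def __init__(self, config):
        # The session outlives any single method call, so a with block does not apply.
        self.sess = tf.Session(config=config)

    def close(self):
        # Explicitly close the session to trigger the GEOP destructor.
        self.sess.close()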