What Do I Do If My TensorFlow Network Output Node Is Changed by AMCT?
Symptom
When the AMCT quantize_model API is called to modify the source TensorFlow model, the output node of the output layer changes because a searchN layer is inserted. In the quantization script, you need to replace the output node used for inference with the new output node after graph modification, as prompted. The AMCT quantization log prints both the original output node name and the new output node name after graph modification.
When a graph is modified, the output node of the tail layer changes in the following scenarios:
- Scenario 1: The tail layer of the network model is Add/AddV2, and one of the Add inputs is a one-dimensional tensor, so the Add functions as a BiasAdd.
Figure 1 The tail layer is ADD/ADDV2.
- If the tail layer is Add, the following message is displayed during graph modification:
```
2020-09-01 09:31:04,896 - WARNING - [AMCT]:[replace_add_pass]: Replace ADD at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'Add:0'
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'bias_add/BiasAdd:0'
2020-09-01 09:31:04,979 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'bias_add/BiasAdd:0'    // Name of the network output node before the change
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    // Name of the network output node after the change
```
- If the tail layer is AddV2, the following message is displayed during graph modification:
```
2020-09-01 09:32:42,281 - WARNING - [AMCT]:[replace_add_pass]: Replace ADD at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'add:0'
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'bias_add/BiasAdd:0'
2020-09-01 09:32:42,362 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'bias_add/BiasAdd:0'    // Name of the network output node before the change
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    // Name of the network output node after the change
```
- Scenario 2: The tail layer of the network is BiasAdd, and the layer before BiasAdd is Conv2D, DepthwiseConv2dNative, Conv2DBackpropInput, or MatMul.
Figure 2 The tail layer of the network is BiasAdd, and its front layer is Conv2D.
In this scenario, the following message is displayed when you modify a graph:
```
2020-09-01 09:39:26,130 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'BiasAdd:0'    // Name of the network output node before the change
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    // Name of the network output node after the change
```
- Scenario 3: The tail layer of the network is Conv2D, DepthwiseConv2dNative, Conv2DBackpropInput, MatMul, or AvgPool.
Figure 3 The tail layer of the network is Conv2D.
In this scenario, the following message is displayed when you modify a graph:
```
2020-09-01 09:40:28,717 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'Conv2D:0'    // Name of the network output node before the change
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    // Name of the network output node after the change
```
- Scenario 4: The tail layer of the network is FusedBatchNorm, FusedBatchNormV2, or FusedBatchNormV3, and it is preceded by Conv2D (optionally followed by BiasAdd) or DepthwiseConv2dNative (optionally followed by BiasAdd).
Figure 4 The tail layer is FusedBatchNormV3.
In this scenario, the following message is displayed when you modify a graph:
```
2020-09-01 09:44:08,637 - WARNING - [AMCT]:[conv_bn_fusion_pass]: Fused BN at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'batch_normalization/FusedBatchNormV3:0'
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'bias_add:0'
2020-09-01 09:44:08,717 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'bias_add:0'    // Name of the network output node before the change
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    // Name of the network output node after the change
```
- Scenario 5: The tail layer of the network is a BN small-operator structure, and the input is four-dimensional data.
Figure 5 BN small operator structure as the bottom layer
In this scenario, the following message is displayed when you modify a graph:
```
2020-09-01 09:46:46,426 - WARNING - [AMCT]:[replace_bn_pass]: Replace BN at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'batch_normalization/batchnorm/add_1:0'
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'batch_normalization/batchnorm/bn_replace/batch_normalization/FusedBatchNormV3:0'
2020-09-01 09:46:46,439 - WARNING - [AMCT]:[conv_bn_fusion_pass]: Fused BN at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'batch_normalization/batchnorm/bn_replace/batch_normalization/FusedBatchNormV3:0'
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'bias_add:0'
2020-09-01 09:46:46,518 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'bias_add:0'    // Name of the network output node before the change
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    // Name of the network output node after the change
```
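In every scenario above, the final quantize_model warning carries the node name you must use for inference. If you quantize many models, you can extract the old/new pairs from the captured log text instead of copying them by hand. The following helper is a minimal sketch, not part of AMCT; it assumes the warning text matches the `>>>…old…` / `<<<…new…` format shown above, which may differ between AMCT versions.

```python
import re


def parse_output_node_changes(log_text):
    """Extract (old, new) output node name pairs from AMCT warning text.

    Assumes each change is reported as a '>>>...old...' line followed by
    a '<<<...new...' line, as in the log excerpts above.
    """
    old_names = re.findall(
        r">+The name of the old output node is '([^']+)'", log_text)
    new_names = re.findall(
        r"<+The name of the new output node is '([^']+)'", log_text)
    return list(zip(old_names, new_names))


log = (
    ">>>>>>The name of the old output node is 'bias_add/BiasAdd:0'\n"
    "<<<<<<The name of the new output node is "
    "'search_n_quant/search_n_quant_SEARCHN/Identity:0'\n"
)
# The last pair in the list is the final rename: its second element is the
# node name to use for inference after quantization.
changes = parse_output_node_changes(log)
```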
Solution
When the quantize_model API is called to modify the graph of the original TensorFlow model, the output node of the tail layer changes because the searchN layer is inserted at the end of the network. In this case, modify the quantization script based on the log information and replace the output node name used for network inference with the new node name. The modification method is as follows:
Quantization script before modification (the following script is only an example):
```python
import tensorflow as tf
import amct_tensorflow as amct


def load_pb(model_name):
    with tf.gfile.GFile(model_name, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')


def main():
    # Name of the network .pb file
    model_name = './pb_model/case_1_1.pb'
    # Name of the output node of network quantization inference
    infer_output_name = 'Add:0'
    # Name of the output node of the quantized model
    save_output_name = 'Add:0'

    # Load the .pb file of the network.
    load_pb(model_name)
    # Obtain the network graph structure.
    graph = tf.get_default_graph()

    # Create a quantization configuration file.
    amct.create_quant_config(
        config_file='./configs/config.json',
        graph=graph)

    # Insert quantization operators.
    amct.quantize_model(
        graph=graph,
        config_file='./configs/config.json',
        record_file='./configs/record_scale_offset.txt')

    # Execute the network inference process.
    with tf.Session() as sess:
        output_tensor = graph.get_tensor_by_name(infer_output_name)
        sess.run(tf.global_variables_initializer())
        sess.run(output_tensor)

    # Save the quantized .pb model file.
    amct.save_model(
        pb_model=model_name,
        outputs=[save_output_name[:-2]],
        record_file='./configs/record_scale_offset.txt',
        save_path='./pb_model/case_1_1')


if __name__ == '__main__':
    main()
```
Modified quantization script:
```python
import tensorflow as tf
import amct_tensorflow as amct


def load_pb(model_name):
    with tf.gfile.GFile(model_name, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')


def main():
    # Name of the network .pb file
    model_name = './pb_model/case_1_1.pb'
    # Name of the output node of network quantization inference.
    # Replace it with the new node name printed in the log.
    infer_output_name = 'search_n_quant/search_n_quant_SEARCHN/Identity:0'
    # Name of the output node of the quantized model
    save_output_name = 'Add:0'

    # Load the .pb file of the network.
    load_pb(model_name)
    # Obtain the network graph structure.
    graph = tf.get_default_graph()

    # Create a quantization configuration file.
    amct.create_quant_config(
        config_file='./configs/config.json',
        graph=graph)

    # Insert quantization operators.
    amct.quantize_model(
        graph=graph,
        config_file='./configs/config.json',
        record_file='./configs/record_scale_offset.txt')

    # Execute the network inference process.
    with tf.Session() as sess:
        output_tensor = graph.get_tensor_by_name(infer_output_name)
        sess.run(tf.global_variables_initializer())
        sess.run(output_tensor)

    # Save the quantized .pb model file.
    amct.save_model(
        pb_model=model_name,
        outputs=[save_output_name[:-2]],
        record_file='./configs/record_scale_offset.txt',
        save_path='./pb_model/case_1_1')


if __name__ == '__main__':
    main()
```
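Note that only infer_output_name changes: save_output_name still refers to the original model's output, because save_model re-saves the unmodified source graph. Also note the `save_output_name[:-2]` slice in both scripts: get_tensor_by_name takes a tensor name with an output-index suffix such as ':0', while the outputs argument takes node (operation) names without it, so the script strips the last two characters. If you prefer a slice that does not assume a single-digit index, a small helper (not part of AMCT, shown here only as a sketch) can do the conversion:

```python
def tensor_to_node_name(tensor_name):
    """Strip the output-index suffix (':0', ':1', ...) from a tensor name.

    More robust than tensor_name[:-2], which silently truncates wrong for
    output indices of two or more digits (e.g. 'op:12').
    """
    return tensor_name.rsplit(':', 1)[0]
```

With this helper, the save step becomes `outputs=[tensor_to_node_name(save_output_name)]`.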