save_model_ascend
Function Usage
Saves the source .pb model to be quantized as a .pb model file that can be used for online inference in the NPU (Ascend AI Processor) environment, based on the quantization factor record file record_file.
Constraints
- This API receives only the model file in .pb format. You need to convert the model to be quantized to the .pb format in advance.
- This API requires a quantization factor record file as input. The file is generated in the quantize_model_ascend phase, and its factor values are filled in during the model inference phase.
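The constraints above describe a multi-phase workflow: the quantization factor record file is first generated by quantize_model_ascend, then populated during calibration inference, and only then consumed by save_model_ascend. A pseudocode sketch of that sequence (the quantize_model_ascend arguments and intermediate artifacts shown here are illustrative assumptions, not verified AMCT signatures):

```
# Phase 1: generate the calibration graph and the factor record file
#          (arguments are assumptions for illustration only).
quantize_model_ascend(...)        -> calibration graph, record_file (factors empty)

# Phase 2: run inference on the calibration graph; the quantization
#          factor values are filled into record_file during this run.
run calibration inference on the calibration graph

# Phase 3: combine record_file with the source .pb model to emit the
#          quantized .pb model for NPU online inference.
save_model_ascend(pb_model, outputs, record_file, save_path)
```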
Prototype
save_model_ascend(pb_model, outputs, record_file, save_path)
Parameters
| Parameter | Input/Return | Description | Restriction |
|---|---|---|---|
| pb_model | Input | Source .pb model file to be quantized. | A string. Must be an inference graph containing no training-mode operators. For example, is_training of the FusedBatchNormV3 operator must be False. |
| outputs | Input | List of output operators of the graph. | A list of strings. |
| record_file | Input | Path of the quantization factor record file, including the file name. The quantized model file is generated based on this file, the quantization configuration file, and the source .pb model file. | A string. |
| save_path | Input | Model save path. Must include the prefix of the model name, for example, ./quantized_model/*model. | A string. |
Return Value
None
Outputs
A .pb model file that can be used for online inference in the NPU (Ascend AI Processor) environment. When quantization is performed again, the files previously output by this API are overwritten.
Examples
```python
import tensorflow as tf
import amct_tensorflow as amct

# Perform network inference and complete quantization during the inference.
# calibration_graph, calibration_outputs, inputs, inputs_data, and
# prepare_config are defined by the user's calibration script.
with calibration_graph.as_default():
    sess = tf.Session(config=prepare_config("npu"))
    sess.run(calibration_outputs, feed_dict={inputs: inputs_data})

# Insert the API call to save the quantized model as a .pb file.
amct.save_model_ascend(pb_model="./user_model.pb",
                       outputs=["model/outputs"],
                       record_file="./record_scale_offset.txt",
                       save_path="./inference/model")
```