save_model_ascend
Applicability
Product |
Supported |
|---|---|
x |
|
x |
|
x |
|
x |
|
√ |
Description
Based on the quantization factor record file (record_file), converts the original .pb model to be quantized into a quantized .pb model file and saves it for online inference on the NPU (Ascend AI Processor).
Prototype
```python
save_model_ascend(pb_model, outputs, record_file, save_path)
```
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| pb_model | Input | Original .pb model file to be quantized. A string. Restrictions: Must be an inference graph containing no training-mode operators. For example, is_training of the FusedBatchNormV3 operator must be set to False. |
| outputs | Input | List of output operators in a graph. A list of strings. |
| record_file | Input | Path (including the file name) of the quantization factor record file. A quantized model file is generated based on the record file, quantization configuration file, and original .pb model file. A string. |
| save_path | Input | Model save path. Must include the prefix of the model name, for example, ./quantized_model/*model. A string. |
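Because save_path combines a directory with a model-name prefix, it can help to split and check it before calling the API. The following is a minimal sketch using only the Python standard library. The helper names split_save_path and prepare_save_dir are hypothetical and not part of amct_tensorflow. Whether the API creates a missing directory itself is not stated here, so creating it up front is a defensive assumption.

```python
import os


def split_save_path(save_path):
    """Split a save_path such as './inference/model' into (directory, prefix).

    Hypothetical helper, not part of amct_tensorflow.
    """
    directory, prefix = os.path.split(save_path)
    if not prefix:
        # A path ending in a separator carries no model-name prefix.
        raise ValueError(
            "save_path must end with a model-name prefix, got %r" % save_path)
    # A bare prefix such as 'model' refers to the current directory.
    return directory or ".", prefix


def prepare_save_dir(save_path):
    """Create the target directory if needed and return (directory, prefix)."""
    directory, prefix = split_save_path(save_path)
    os.makedirs(directory, exist_ok=True)
    return directory, prefix
```

For example, split_save_path("./inference/model") yields the directory "./inference" and the prefix "model", while "./inference/" is rejected because it has no prefix.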
Returns
None
Restrictions
- This API accepts only .pb model files as input. Convert models in other formats to .pb in advance.
- This API requires a quantization factor record file as input. The file is generated in the quantize_model_ascend phase, and its factor values are filled in during the model inference phase.
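The two restrictions above can be checked before the call. The sketch below is an illustration only: check_save_inputs is a hypothetical helper, not an AMCT function. Treating an empty record file as "inference has not filled it yet" is an assumption about the file's state rather than documented behavior.

```python
import os


def check_save_inputs(pb_model, record_file):
    """Return a list of problems found with the inputs (empty list if none).

    Hypothetical pre-check, not part of amct_tensorflow.
    """
    problems = []
    if not pb_model.endswith(".pb"):
        problems.append("model is not a .pb file: %s" % pb_model)
    elif not os.path.isfile(pb_model):
        problems.append("model file not found: %s" % pb_model)

    if not os.path.isfile(record_file):
        problems.append("record file not found: %s" % record_file)
    elif os.path.getsize(record_file) == 0:
        # Assumption: an empty file means the inference phase has not run yet.
        problems.append("record file is empty: %s" % record_file)
    return problems
```

A caller might print the returned problems and skip the save step when the list is not empty.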
Example
```python
import tensorflow as tf
import amct_tensorflow as amct

# Perform network inference and complete quantization during the inference.
# calibration_graph, calibration_outputs, inputs, inputs_data, and prepare_config
# are assumed to be defined earlier in the user script; prepare_config is assumed
# to return a session configuration.
with calibration_graph.as_default():
    sess = tf.compat.v1.Session(config=prepare_config("npu"))
    sess.run(calibration_outputs, feed_dict={inputs: inputs_data})

# Call the API to save the quantized model as a .pb file.
amct.save_model_ascend(pb_model="./user_model.pb",
                       outputs=["model/outputs"],
                       record_file="./record_scale_offset.txt",
                       save_path="./inference/model")
```
Output file: a .pb model file for online inference in the NPU (Ascend AI Processor) environment. If quantization is performed again, the file generated by this API is overwritten.
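Since a repeated quantization run overwrites the generated file, earlier results can be copied aside first. The sketch below is one possible approach, not an AMCT feature: back_up_existing_outputs and backup_dir are hypothetical names. The exact name of the generated file is not given here, so the sketch assumes it is a .pb file whose path starts with save_path.

```python
import glob
import os
import shutil


def back_up_existing_outputs(save_path, backup_dir):
    """Copy existing .pb files that start with save_path into backup_dir.

    Hypothetical helper. Returns the list of backup copies that were created.
    """
    # Assumption: generated files are named '<save_path>*.pb'.
    existing = sorted(glob.glob(glob.escape(save_path) + "*.pb"))
    if not existing:
        return []
    os.makedirs(backup_dir, exist_ok=True)
    copies = []
    for path in existing:
        target = os.path.join(backup_dir, os.path.basename(path))
        shutil.copy2(path, target)  # copy2 also preserves file timestamps
        copies.append(target)
    return copies
```

Calling back_up_existing_outputs("./inference/model", "./inference/backup") before save_model_ascend would keep a copy of any earlier result under that assumption.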