save_model_ascend

Applicability

Product	Supported
Atlas A3 training series products/Atlas A3 inference series products	x
Atlas A2 training products/Atlas A2 inference products	x
Atlas 200I/500 A2 inference product	x
Atlas inference series products	x
Atlas training products	√

Description

Based on the quantization factor record file (record_file), generates a quantized .pb model file from the original .pb model. The generated file is used for online inference on the NPU (Ascend AI Processor).

Prototype

save_model_ascend(pb_model, outputs, record_file, save_path)

Parameters

Parameter	Input/Output	Description
pb_model	Input	Original .pb model file to be quantized. A string. Restrictions: Must be an inference graph containing no training-mode operators. For example, is_training of the FusedBatchNormV3 operator must be set to False.
outputs	Input	List of output operators in a graph. A list of strings.
record_file	Input	Path (including the file name) of the quantization factor record file. A quantized model file is generated based on the record file, quantization configuration file, and original .pb model file. A string.
save_path	Input	Model save path. Must include the prefix of the model name, for example, ./quantized_model/model. A string.

Returns

None

Restrictions

This API accepts only a .pb model file as input. Convert your model to .pb format in advance.
This API requires a quantization factor record file as input. The file is generated in the quantize_model_ascend phase, and its factor values are filled in during the model inference phase.
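The restrictions above can be checked before the API is called. The following is a minimal sketch that uses only the Python standard library. check_save_model_inputs is a hypothetical helper and is not part of amct_tensorflow. Treating an empty outputs list or an empty record file as an error is an assumption, not documented behavior.

```python
import os


def check_save_model_inputs(pb_model, outputs, record_file, save_path):
    """Hypothetical pre-flight check; not part of amct_tensorflow.

    Returns a list of problem descriptions (empty if none were found).
    """
    problems = []

    # The API accepts only a .pb model file as input.
    if not pb_model.endswith(".pb"):
        problems.append("pb_model is not a .pb file: %s" % pb_model)
    elif not os.path.isfile(pb_model):
        problems.append("pb_model not found: %s" % pb_model)

    # outputs is documented as a list of strings.
    # Assumption: an empty list is treated as a mistake.
    if not (isinstance(outputs, list) and outputs
            and all(isinstance(name, str) for name in outputs)):
        problems.append("outputs must be a non-empty list of strings")

    # The record file must exist. Assumption: an empty file suggests that
    # inference has not yet filled in the quantization factors.
    if not os.path.isfile(record_file):
        problems.append("record_file not found: %s" % record_file)
    elif os.path.getsize(record_file) == 0:
        problems.append("record_file is empty: %s" % record_file)

    # save_path must end with a model name prefix, not a directory separator.
    if not os.path.basename(save_path):
        problems.append("save_path must include a model name prefix")

    return problems
```

An empty return value means that none of these checks failed. It does not guarantee that the model itself meets the inference-graph requirement on pb_model.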

Example

import tensorflow as tf
import amct_tensorflow as amct
# calibration_graph, prepare_config, calibration_outputs, inputs, and inputs_data
# are user-defined and must be prepared before this snippet runs.
# Perform network inference and complete quantization during the inference.
with calibration_graph.as_default():
    sess = tf.Session(config=prepare_config("npu"))
    sess.run(calibration_outputs, feed_dict={inputs: inputs_data})
# Call the API to save the quantized model as a .pb file.
amct.save_model_ascend(pb_model="./user_model.pb",
                       outputs=["model/outputs"],
                       record_file="./record_scale_offset.txt",
                       save_path="./inference/model")

Output file: a .pb model file for online inference in the NPU (Ascend AI Processor) environment. If quantization is performed again, the file previously generated by this API is overwritten.
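Because a rerun overwrites the generated file, it can help to list existing outputs before calling the API again. The sketch below is an illustration only: existing_outputs is a hypothetical helper, and since the exact name of the generated file is not stated here, it simply matches any .pb file whose path starts with the save_path prefix.

```python
import glob
import os


def existing_outputs(save_path):
    """Hypothetical helper: list .pb files that start with the save_path prefix."""
    # glob.escape protects special characters in the prefix itself;
    # the trailing wildcard stands in for whatever suffix the API appends.
    return sorted(glob.glob(glob.escape(save_path) + "*.pb"))


if __name__ == "__main__":
    for path in existing_outputs(os.path.join(".", "inference", "model")):
        print("may be overwritten:", path)
```

Copying or renaming the listed files first preserves an earlier quantization result.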
