save_model_ascend

Applicability

| Product | Supported |
| --- | --- |
| Atlas A3 training series products/Atlas A3 inference series products | x |
| Atlas A2 training products/Atlas A2 inference products | x |
| Atlas 200I/500 A2 inference product | x |
| Atlas inference series products | x |
| Atlas training products | |

Description

Based on the quantization factor record file (record_file), saves the original .pb model to be quantized as a quantized .pb model file that can be used for online inference on the NPU (Ascend AI Processor).

Prototype

save_model_ascend(pb_model, outputs, record_file, save_path)

Parameters

| Parameter | Input/Output | Description |
| --- | --- | --- |
| pb_model | Input | Original .pb model file to be quantized. Data type: string. Restrictions: Must be an inference graph containing no training-mode operators. For example, is_training of the FusedBatchNormV3 operator must be set to False. |
| outputs | Input | List of output operators in the graph. Data type: list of strings. |
| record_file | Input | Path (including the file name) of the quantization factor record file. The quantized model file is generated based on the record file, the quantization configuration file, and the original .pb model file. Data type: string. |
| save_path | Input | Model save path. Must include the prefix of the model name, for example, ./quantized_model/*model. Data type: string. |
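The restriction on pb_model (no training-mode operators) can be checked before calling the API. The following is a minimal sketch, not part of AMCT: find_training_mode_nodes and load_graph_nodes are hypothetical helper names, and the scan only looks at the is_training attribute, so it will not detect other kinds of training-mode operators. The demo uses stand-in objects instead of a real graph so that it runs without TensorFlow.

```python
from types import SimpleNamespace


def find_training_mode_nodes(nodes):
    """Return (name, op) pairs for nodes whose is_training attribute is True."""
    flagged = []
    for node in nodes:
        # Check membership first: indexing a protobuf map with a missing key
        # would insert a default entry.
        if "is_training" in node.attr and node.attr["is_training"].b:
            flagged.append((node.name, node.op))
    return flagged


def load_graph_nodes(pb_path):
    """Load the nodes of a frozen .pb file (requires TensorFlow)."""
    import tensorflow as tf  # imported lazily; the scan above does not need it

    graph_def = tf.compat.v1.GraphDef()
    with open(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    return graph_def.node


# Stand-in nodes that mimic the NodeDef fields used above (name, op, attr).
demo_nodes = [
    SimpleNamespace(name="bn1", op="FusedBatchNormV3",
                    attr={"is_training": SimpleNamespace(b=True)}),
    SimpleNamespace(name="bn2", op="FusedBatchNormV3",
                    attr={"is_training": SimpleNamespace(b=False)}),
    SimpleNamespace(name="conv1", op="Conv2D", attr={}),
]
print(find_training_mode_nodes(demo_nodes))  # [('bn1', 'FusedBatchNormV3')]
```

With a real model, find_training_mode_nodes(load_graph_nodes("./user_model.pb")) would list the nodes to fix before quantization; an empty list means no is_training=True attribute was found.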

Returns

None

Restrictions

  • This API supports only a .pb model file as input, and therefore you need to convert your model into .pb format in advance.
  • This API requires a quantization factor record file as input. The file is generated in the quantize_model_ascend phase, and its factor values are filled in during the model inference phase.
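The two restrictions above lend themselves to a quick preflight check before save_model_ascend is called. The sketch below is illustrative only: preflight_problems is a hypothetical helper, the .pb test is a file-name check rather than a format check, and the empty-file test is a heuristic (a non-empty record file does not prove that the factor values were filled in).

```python
import os
import tempfile


def preflight_problems(pb_model, record_file):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if not pb_model.endswith(".pb"):
        problems.append("pb_model does not have a .pb extension: " + pb_model)
    elif not os.path.isfile(pb_model):
        problems.append("pb_model not found: " + pb_model)
    if not os.path.isfile(record_file):
        problems.append("record_file not found: " + record_file)
    elif os.path.getsize(record_file) == 0:
        # Heuristic: an empty file cannot contain any quantization factors.
        problems.append("record_file is empty: " + record_file)
    return problems


# Demo with temporary files: a model file and a record file that is still empty.
with tempfile.TemporaryDirectory() as tmp:
    model = os.path.join(tmp, "user_model.pb")
    record = os.path.join(tmp, "record_scale_offset.txt")
    open(model, "wb").close()
    open(record, "w").close()
    demo_problems = preflight_problems(model, record)

print(len(demo_problems))  # 1 (the empty record file)
```

Returning a list instead of raising on the first problem lets a caller report every issue in one pass.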

Example

import tensorflow as tf
import amct_tensorflow as amct

# calibration_graph, calibration_outputs, inputs, inputs_data, and prepare_config
# are assumed to be defined by the earlier quantization steps; prepare_config is
# assumed to return a session configuration.
# Perform network inference and complete quantization during the inference.
with calibration_graph.as_default():
    sess = tf.compat.v1.Session(config=prepare_config("npu"))
    sess.run(calibration_outputs, feed_dict={inputs: inputs_data})
# Call the API to save the quantized model as a .pb file.
amct.save_model_ascend(pb_model="./user_model.pb",
                       outputs=["model/outputs"],
                       record_file="./record_scale_offset.txt",
                       save_path="./inference/model")

Output file: a .pb model file for online inference on the NPU (Ascend AI Processor). If quantization is performed again, the file previously generated by this API is overwritten.
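Because a repeated run overwrites the generated file, it may be worth copying the previous result aside first. The following sketch shows one way to do that with the standard library. It assumes, without confirmation from this page, that the generated file name starts with the save_path prefix and ends in .pb; backup_existing_outputs is a hypothetical helper, and the file name in the demo is made up.

```python
import glob
import os
import shutil
import tempfile
import time


def backup_existing_outputs(save_path, backup_root):
    """Copy files matching '<save_path>*.pb' into a timestamped folder.

    Returns the list of backup file paths (empty if nothing matched).
    """
    matches = sorted(glob.glob(glob.escape(save_path) + "*.pb"))
    if not matches:
        return []
    target_dir = os.path.join(backup_root, time.strftime("%Y%m%d_%H%M%S"))
    os.makedirs(target_dir, exist_ok=True)
    copies = []
    for src in matches:
        dst = os.path.join(target_dir, os.path.basename(src))
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        copies.append(dst)
    return copies


# Demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "inference")
    os.makedirs(out_dir)
    # Made-up file name for the demo; the real name is chosen by the API.
    open(os.path.join(out_dir, "model_demo.pb"), "wb").close()
    demo_copies = backup_existing_outputs(os.path.join(out_dir, "model"),
                                          os.path.join(tmp, "backup"))
    demo_names = [os.path.basename(p) for p in demo_copies]

print(demo_names)  # ['model_demo.pb']
```

Calling the helper with the same save_path value that is passed to save_model_ascend, just before the call, keeps one copy of the earlier result per run.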