save_model

Applicability

Product

Supported

Atlas A3 training series products/Atlas A3 inference series products

Atlas A2 training products/Atlas A2 inference products

Atlas 200I/500 A2 inference product

Atlas inference series products

Atlas training products

Description

Inserts operators such as AscendQuant and AscendDequant into the original .pb model based on the quantization factor record file (record_file), and generates a .pb model file that can be used for both accuracy simulation in the TensorFlow environment and inference deployment on the Ascend AI Processor.

Prototype

save_model(pb_model, outputs, record_file, save_path)

Parameters

| Parameter | Input/Output | Description |
|---|---|---|
| pb_model | Input | Original .pb model file for quantization. A string. |
| outputs | Input | List of output operators in a graph. A list of strings. |
| record_file | Input | Path (including the file name) of the quantization factor record file. A quantized model file is generated based on the record file, quantization configuration file, and original .pb model file. A string. |
| save_path | Input | Model save path. Must include the prefix of the model name, for example, ./quantized_model/*model. A string. |
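
The argument types listed above can be checked before the call is made. The following is a minimal sketch, not part of AMCT: check_save_model_args is a hypothetical helper, and its checks only mirror the parameter descriptions (string paths, a list of strings for outputs, and a save_path that ends with a model-name prefix).

```python
import os


def check_save_model_args(pb_model, outputs, record_file, save_path):
    """Hypothetical pre-flight check; not part of amct_tensorflow.

    Returns (save_dir, prefix) split from save_path, or raises ValueError.
    """
    # All three path arguments are documented as strings.
    for name, value in (("pb_model", pb_model),
                        ("record_file", record_file),
                        ("save_path", save_path)):
        if not isinstance(value, str) or not value:
            raise ValueError(f"{name} must be a non-empty string")
    if not pb_model.endswith(".pb"):
        raise ValueError("pb_model must point to a .pb file")
    # outputs is documented as a list of strings.
    if (not isinstance(outputs, list) or not outputs
            or not all(isinstance(o, str) for o in outputs)):
        raise ValueError("outputs must be a non-empty list of strings")
    # save_path must end with a model-name prefix, not with a directory.
    save_dir, prefix = os.path.split(save_path)
    if not prefix:
        raise ValueError("save_path must end with a model-name prefix")
    return save_dir, prefix
```

With the values from the example on this page, the helper splits "./inference/model" into the directory "./inference" and the prefix "model"; a save_path such as "./inference/" is rejected because it carries no prefix.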

Returns

None

Restrictions

  • Call this API only after batch_num forward passes are completed. Otherwise, the quantization factors may be incorrect, leading to an unsatisfactory quantization result.
  • This API supports only a .pb model file as input, and therefore you need to convert your model into .pb format in advance.
  • This API requires the input of a quantization factor record file, which is generated in the quantize_model phase and has its factor values filled in the model inference phase.
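
The first restriction can be enforced in the calling script. The sketch below is an assumption-laden illustration rather than AMCT functionality: ForwardPassCounter is a hypothetical class that counts forward passes and refuses to proceed until batch_num of them have completed.

```python
class ForwardPassCounter:
    """Hypothetical guard; not part of amct_tensorflow."""

    def __init__(self, batch_num):
        if batch_num < 1:
            raise ValueError("batch_num must be at least 1")
        self.batch_num = batch_num
        self.completed = 0

    def record_pass(self):
        """Call once after each completed forward pass."""
        self.completed += 1

    def ready_to_save(self):
        """True once at least batch_num forward passes have been recorded."""
        return self.completed >= self.batch_num

    def require_ready(self):
        """Raise if save_model would be called too early."""
        if not self.ready_to_save():
            raise RuntimeError(
                f"only {self.completed} of {self.batch_num} "
                "forward passes completed"
            )
```

In a calibration loop, record_pass() would follow each sess.run call, and require_ready() would be called immediately before save_model.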

Example

import amct_tensorflow as amct
# sess, outputs, inputs, inputs_data, and batch_num are assumed to be defined earlier in the script.
# Run batch_num forward passes; quantization is completed during inference.
for _ in range(batch_num):
    sess.run(outputs, feed_dict={inputs: inputs_data})
# Insert the quantization operators and save the quantized model as a .pb file.
amct.save_model(pb_model="./user_model.pb",
                outputs=["model/outputs"],
                record_file="./record_scale_offset.txt",
                save_path="./inference/model")

Output file: a quantized .pb model file that can be used for accuracy simulation in the TensorFlow environment or offline inference on the Ascend AI Processor. If quantization is performed again, the file previously output by this API is overwritten.
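
Because a repeated quantization run overwrites the earlier output, it can be useful to copy existing files aside first. The following is a minimal sketch under stated assumptions: back_up_previous_outputs and the default backup directory are hypothetical, and no exact output file name is assumed, so every regular file whose path starts with the save_path prefix is copied.

```python
import glob
import os
import shutil
import time


def back_up_previous_outputs(save_path, backup_root="./quantized_backup"):
    """Hypothetical helper; not part of amct_tensorflow.

    Copies every regular file whose path starts with save_path into a
    timestamped directory under backup_root. Returns the copied paths.
    """
    pattern = glob.escape(save_path) + "*"
    matches = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    if not matches:
        return []
    target_dir = os.path.join(backup_root, time.strftime("%Y%m%d-%H%M%S"))
    os.makedirs(target_dir, exist_ok=True)
    # copy2 preserves file metadata and returns the destination path.
    return [shutil.copy2(p, target_dir) for p in matches]
```

Calling back_up_previous_outputs("./inference/model") before re-running quantization would preserve any earlier files that share that prefix.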