quantize_model

Applicability

Product

Supported

Atlas A3 training series products/Atlas A3 inference series products

Atlas A2 training products/Atlas A2 inference products

Atlas 200I/500 A2 inference product

Atlas inference series products

Atlas training products

Description

Quantizes a graph according to the quantization configuration file: inserts the weight and activation quantization operators, generates the quantization factor record file record_file, and saves an ONNX model ready for calibration to modified_onnx_file.
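To make "quantization factor" concrete, the sketch below shows how a scale and an offset can map floating-point values to int8. It assumes a common affine scheme, q = round(x / scale) + offset, clamped to the int8 range. This page does not state the formula AMCT uses or the layout of the record file, and the function names here are hypothetical.

```python
def quantize_int8(values, scale, offset):
    """Map floats to int8 using q = round(x / scale) + offset (assumed scheme)."""
    quantized = []
    for x in values:
        q = round(x / scale) + offset
        # Clamp to the signed 8-bit range.
        quantized.append(max(-128, min(127, q)))
    return quantized


def dequantize_int8(q_values, scale, offset):
    """Approximate inverse of quantize_int8: x ~ (q - offset) * scale."""
    return [(q - offset) * scale for q in q_values]


# Illustrative factors only; real factors come from calibration.
q = quantize_int8([-1.0, 0.0, 0.8, 100.0], scale=0.5, offset=3)
restored = dequantize_int8(q, scale=0.5, offset=3)
```

Values outside the representable range (100.0 here) saturate at the int8 limit, which is why the factors are chosen from representative calibration data.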

Prototype

quantize_model(config_file, model_file, modified_onnx_file, record_file)

Parameters

Parameter

Input/Output

Description

config_file

Input

User-generated quantization configuration file, which specifies the configuration of the quantization layers in the model network.

A string.

model_file

Input

Original ONNX model file or updated model generated by the create_quant_config API.

A string.

modified_onnx_file

Input

Name of the file for storing the ONNX calibration model for activation quantization.

A string.

record_file

Input

Path (including the file name) of the quantization factor record file.

A string.
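Because all four parameters are plain path strings, it can help to check them before the call. The following is a minimal sketch of such a check. check_quantize_args is a hypothetical helper, not part of AMCT, and creating the output directories up front is an assumption, since this page does not say whether quantize_model creates missing directories itself.

```python
from pathlib import Path


def check_quantize_args(config_file, model_file, modified_onnx_file, record_file):
    """Validate the four path strings intended for quantize_model.

    Hypothetical helper for illustration; it does not call AMCT.
    """
    args = {
        "config_file": config_file,
        "model_file": model_file,
        "modified_onnx_file": modified_onnx_file,
        "record_file": record_file,
    }
    # Every parameter is documented as a string.
    for name, value in args.items():
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a string, got {type(value).__name__}")
    # The two input files must already exist.
    for name in ("config_file", "model_file"):
        if not Path(args[name]).is_file():
            raise FileNotFoundError(f"{name} not found: {args[name]}")
    # Assumption: make sure the directories for the two output files exist.
    for name in ("modified_onnx_file", "record_file"):
        Path(args[name]).parent.mkdir(parents=True, exist_ok=True)
    return config_file, model_file, modified_onnx_file, record_file
```

The returned tuple is in the same order as the prototype, so it can be unpacked directly into the call, for example amct.quantize_model(*check_quantize_args(...)).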

Returns

None

Example

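import os

import amct_onnx as amct

# TMP is not defined in the original example; "./tmp" is an assumed output directory.
TMP = "./tmp"
os.makedirs(TMP, exist_ok=True)

model_file = "resnet101.onnx"
scale_offset_record_file = os.path.join(TMP, 'scale_offset_record.txt')
modified_model = os.path.join(TMP, 'modified_model.onnx')
config_file = "./configs/config.json"
# Insert the quantization operators; the calibration model and record file are written to the paths above.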
amct.quantize_model(config_file,
                    model_file,
                    modified_model,
                    scale_offset_record_file)