quantize_model

Applicability

Product | Supported
Atlas A3 training series products/Atlas A3 inference series products | x
Atlas A2 training products/Atlas A2 inference products | x
Atlas 200I/500 A2 inference product |
Atlas inference series products |
Atlas training products |

Description

Performs quantization on a graph based on the quantization configuration file config_file, inserts weight and activation quantization layers, and saves the modified network to a new model file.

Prototype

quantize_model(graph, modified_model_file, modified_weights_file)

Parameters

Parameter | Input/Output | Description | Restriction
graph | Input | Graph structure parsed by the init API from the user model. | An AMCT-defined Graph.
modified_model_file | Input | Name of the resulting Caffe model definition file (.prototxt), which stores the model with the quantization layers inserted. | A string.
modified_weights_file | Input | Name of the resulting Caffe model weight file (.caffemodel), which stores the weights of the model with the quantization layers inserted. | A string.
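
Because both file-name parameters are plain strings, they can be derived from a single output directory. The following is a minimal sketch only: build_output_paths and its prefix argument are hypothetical names introduced for illustration and are not part of AMCT.

```python
from pathlib import Path


def build_output_paths(output_dir, prefix="modified_model"):
    """Return (modified_model_file, modified_weights_file) as strings.

    Hypothetical helper, not part of AMCT: it only builds the two
    file names and does not create any file or directory.
    """
    out = Path(output_dir)
    model_file = out / f"{prefix}.prototxt"       # Caffe model definition file
    weights_file = out / f"{prefix}.caffemodel"   # Caffe model weight file
    return str(model_file), str(weights_file)


model_file, weights_file = build_output_paths("./quantized_model")
```

The two returned strings can then be passed as modified_model_file and modified_weights_file.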

Returns

None

Example

from amct_caffe import quantize_model
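# graph is the Graph object returned by the init API.
# Insert the quantization layers and save the modified model and weight files.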
quantize_model(graph=graph,
               modified_model_file="./quantized_model/modified_model.prototxt",
               modified_weights_file="./quantized_model/modified_model.caffemodel")

Output files written to disk:

  • scale_offset_record_file: quantization factor record file, which records the weight quantization factors (scale_w and offset_w) of each layer to be quantized. See init.
  • modified_model_file: definition file of the modified model, with quantization layers inserted into the original model.
  • modified_weights_file: weight file of the modified model, with quantization layers inserted into the original model.

If quantization is performed again, the output files listed above are overwritten.
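
Since a rerun overwrites the earlier output, it can be useful to copy the existing files aside before calling the API again. The following is a minimal sketch using only the Python standard library; backup_existing_outputs and the backup directory layout are assumptions made for illustration and are not part of AMCT.

```python
import shutil
import time
from pathlib import Path


def backup_existing_outputs(paths, backup_dir):
    """Copy output files that already exist into a timestamped folder.

    Hypothetical helper, not part of AMCT. Returns the list of copies
    that were created; paths that do not exist yet are skipped.
    """
    existing = [Path(p) for p in paths if Path(p).is_file()]
    if not existing:
        return []
    # One subfolder per run, named after the current local time.
    target = Path(backup_dir) / time.strftime("%Y%m%d-%H%M%S")
    target.mkdir(parents=True, exist_ok=True)
    copies = []
    for src in existing:
        dst = target / src.name
        shutil.copy2(src, dst)  # copies the content and file metadata
        copies.append(dst)
    return copies


# Example: protect the files from a previous run before quantizing again.
backup_existing_outputs(
    ["./quantized_model/modified_model.prototxt",
     "./quantized_model/modified_model.caffemodel"],
    "./quantized_model_backup",
)
```

The record file path passed to init could be added to the list in the same way. Files that share a name are written to the same backup folder, so this sketch assumes the file names are distinct.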