quantize_model

Function Usage

Quantizes the graph structure according to the quantization configuration file (config_file) set by the user. The function inserts a weight quantization layer into each layer specified in config_file to complete weight quantization, inserts data quantization layers, and saves the modified network as new model files.

Prototype

quantize_model(graph, modified_model_file, modified_weights_file)

Parameters

  • graph: Input. Graph structure parsed from the user model by the init API. Restriction: an AMCT-defined Graph.
  • modified_model_file: Input. File name of the Caffe model definition file (.prototxt) saved after quantization. Restriction: a string.
  • modified_weights_file: Input. File name of the Caffe model weights file (.caffemodel) saved after quantization. Restriction: a string.

Returns

None

Outputs

  • Quantization factor record file (scale_offset_record_file): records the weight quantization factors (scale_w and offset_w) of each layer to be quantized. See the init API.
  • modified_model_file: definition file of the modified model, with quantization layers inserted into the original model.
  • modified_weights_file: weight file of the modified model, with quantization layers inserted into the original model.

If quantization is performed again, the files listed above are overwritten.
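
Because a rerun overwrites these files, any earlier results that should be kept must be copied aside first. Below is a minimal sketch of such a backup step, assuming the output paths used in the example at the end of this section; the .bak suffix is an arbitrary choice, not part of the API:

import os
import shutil

# Output paths as used in the Examples section; adjust to your setup.
outputs = ["./quantized_model/modified_model.prototxt",
           "./quantized_model/modified_model.caffemodel"]

# Copy existing outputs aside before rerunning quantize_model,
# since a rerun overwrites them.
for path in outputs:
    if os.path.exists(path):
        shutil.copyfile(path, path + ".bak")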

Examples

from amct_caffe import quantize_model
# Insert quantization layers into the graph and save the modified model files.
quantize_model(graph=graph,
               modified_model_file="./quantized_model/modified_model.prototxt",
               modified_weights_file="./quantized_model/modified_model.caffemodel")