quantize_model

Function Usage

Quantizes the graph structure according to the quantization configuration file (config_file) set by the user. The function inserts a weight quantization layer into each layer specified in config_file to complete weight quantization, inserts data quantization layers, and saves the modified network as new model files.

Prototype

quantize_model(graph, modified_model_file, modified_weights_file)

Parameters

  • graph: Input. Graph structure parsed from the user model by the init API. Restriction: an AMCT-defined Graph.
  • modified_model_file: Input. File name of the Caffe model definition file (.prototxt) saved after quantization. Restriction: a string.
  • modified_weights_file: Input. File name of the Caffe model weights file (.caffemodel) saved after quantization. Restriction: a string.

Returns

None

Outputs

  • Quantization factor record file (scale_offset_record_file): records the weight quantization factors (scale_w and offset_w) of each layer to be quantized. See the init API.
  • modified_model_file: definition file of the modified model, with quantization layers inserted into the original model.
  • modified_weights_file: weight file of the modified model, with quantization layers inserted into the original model.

If quantization is performed again, the files listed above are overwritten.
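
Because a rerun overwrites these files, any earlier results that should be kept must be copied aside first. Below is a minimal sketch of such a backup step, assuming the output paths used in the example at the end of this section; the .bak suffix is an arbitrary choice, not part of the API:

import os
import shutil

# Output paths as used in the Examples section; adjust to your setup.
outputs = ["./quantized_model/modified_model.prototxt",
           "./quantized_model/modified_model.caffemodel"]

# Copy existing outputs aside before rerunning quantize_model,
# since a rerun overwrites them.
for path in outputs:
    if os.path.exists(path):
        shutil.copyfile(path, path + ".bak")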

Examples

from amct_caffe import quantize_model
# Insert quantization layers into the graph and save the modified model files.
quantize_model(graph=graph,
               modified_model_file="./quantized_model/modified_model.prototxt",
               modified_weights_file="./quantized_model/modified_model.caffemodel")