quantize_model
Function Usage
Quantizes the graph structure based on the quantization configuration file set by the user. This function inserts weight quantization layers into the layers specified in config_file to complete weight quantization, inserts data quantization layers, and saves the modified network as new model files.
Prototype
quantize_model(graph, modified_model_file, modified_weights_file)
Parameters
| Parameter | Input/Return | Description | Restriction |
|---|---|---|---|
| graph | Input | Graph structure parsed from the user model by the init API. | An AMCT-defined Graph object. |
| modified_model_file | Input | File name of the Caffe model definition file after quantization (.prototxt). | A string. |
| modified_weights_file | Input | File name of the Caffe model weights file after quantization (.caffemodel). | A string. |
Returns
None
Outputs
- A quantization factor record file (scale_offset_record_file): records the weight quantization factors (scale_w and offset_w) of each layer to be quantized. For details, see the init API.
- modified_model_file: definition file of the modified model, with the quantization layers inserted into the original model.
- modified_weights_file: weights file of the modified model, with the quantization layers inserted into the original model.
If quantization is performed again, the preceding files output by the API are overwritten.
Examples
```python
from amct_caffe import quantize_model

# Insert the quantization layers and save the modified model.
quantize_model(graph=graph,
               modified_model_file="./quantized_model/modified_model.prototxt",
               modified_weights_file="./quantized_model/modified_model.caffemodel")
```
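After quantize_model returns, the inserted data quantization layers typically need to observe real activations, which means running inference on the modified model with calibration data. The following is a minimal sketch of that step, assuming standard pycaffe is installed, that the file names match the example above, and that the input blob name ("data") and the preprocessed array `calibration_batch` are user-supplied placeholders, not part of this API.

```python
import caffe

# Load the modified network produced by quantize_model
# (sketch; file names taken from the example above).
net = caffe.Net("./quantized_model/modified_model.prototxt",
                "./quantized_model/modified_model.caffemodel",
                caffe.TEST)

# Run a forward pass with user-prepared calibration data so the inserted
# data quantization layers can observe activation values.
# "data" and `calibration_batch` are hypothetical placeholders.
net.blobs["data"].data[...] = calibration_batch
net.forward()
```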