quantize_model
Applicability
| Product | Supported |
|---|---|
| | x |
| | x |
| | √ |
| | √ |
| | √ |
Description
Performs quantization on a graph based on the quantization configuration file config_file, inserts weight and activation quantization layers, and saves the modified network to a new model file.
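To make the weight quantization factors scale_w and offset_w more concrete, the following is a minimal sketch of generic 8-bit affine quantization in plain Python. It is an illustration under assumptions, not AMCT's actual implementation: the helper names `quantize` and `dequantize`, the rounding mode, and the way the scale is chosen are all hypothetical, and only the terms scale_w and offset_w come from this page.

```python
def quantize(weights, scale, offset=0, num_bits=8):
    """Map floats to signed integers: q = clamp(round(w / scale) + offset)."""
    q_min = -(2 ** (num_bits - 1))
    q_max = 2 ** (num_bits - 1) - 1
    return [max(q_min, min(q_max, round(w / scale) + offset)) for w in weights]


def dequantize(q_values, scale, offset=0):
    """Map integers back to approximate floats: w ~ (q - offset) * scale."""
    return [(q - offset) * scale for q in q_values]


# Hypothetical weights of one layer.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]

# Assumption: symmetric quantization, so the largest magnitude maps to 127
# and the offset is zero.
scale_w = max(abs(w) for w in weights) / 127
offset_w = 0

q_weights = quantize(weights, scale_w, offset_w)
restored = dequantize(q_weights, scale_w, offset_w)
max_error = max(abs(r - w) for r, w in zip(restored, weights))
```

In this sketch the round-trip error of each weight stays within about half of scale_w, which is the usual trade-off between the chosen scale and the precision of the quantized values.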
Prototype
    quantize_model(graph, modified_model_file, modified_weights_file)
Parameters
| Parameter | Input/Output | Restriction |
|---|---|---|
| graph | Input | Graph structure parsed by the init API from the user model. An AMCT-defined Graph. |
| modified_model_file | Input | Name of the resultant Caffe model definition file (.prototxt), which stores the model with the quantization layers inserted. A string. |
| modified_weights_file | Input | Name of the resultant Caffe model weight file (.caffemodel), which stores the model with the quantization layers inserted. A string. |
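Because both file parameters are plain strings with fixed extensions, the paths can be built in one place. The sketch below is a hypothetical helper (`build_output_paths` is not part of AMCT) that derives the two names from an output directory and a prefix. This page does not say whether quantize_model creates a missing directory, so the helper creates it up front as a precaution.

```python
import tempfile
from pathlib import Path


def build_output_paths(output_dir, prefix="modified_model"):
    """Return (model_file, weights_file) as strings; create output_dir if missing."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    model_file = out / f"{prefix}.prototxt"        # model definition file
    weights_file = out / f"{prefix}.caffemodel"    # model weight file
    return str(model_file), str(weights_file)


# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    model_file, weights_file = build_output_paths(Path(tmp) / "quantized_model")
    dir_created = Path(model_file).parent.is_dir()
```

The two returned strings can then be passed as modified_model_file and modified_weights_file.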
Returns
None
Example
    from amct_caffe import quantize_model

    # graph is the Graph returned by the init API.
    # Call the API to insert the quantization layers and save the modified model.
    quantize_model(graph=graph,
                   modified_model_file="./quantized_model/modified_model.prototxt",
                   modified_weights_file="./quantized_model/modified_model.caffemodel")
Output files:
- A quantization factor record file (scale_offset_record_file), which records the weight quantization factors (scale_w and offset_w) of each layer to be quantized. See init.
- modified_model_file: definition file of the modified model, with quantization layers inserted into the original model.
- modified_weights_file: weight file of the modified model, with quantization layers inserted into the original model.
When quantization is performed again, the preceding files output by the API will be overwritten.
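Since a repeated run overwrites the previous outputs, it can be useful to copy them aside first. The following is a minimal sketch using only the Python standard library. The helper `backup_existing_outputs` and the `.bak` naming scheme are assumptions made for illustration and are not provided by AMCT.

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_existing_outputs(paths, suffix=None):
    """Copy each existing file to '<name>.<suffix>.bak'; return the backup paths."""
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    backups = []
    for path in map(Path, paths):
        if path.is_file():  # files that do not exist yet are skipped
            target = path.with_name(f"{path.name}.{suffix}.bak")
            shutil.copy2(path, target)
            backups.append(target)
    return backups


# Demonstration with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    model = Path(tmp) / "modified_model.prototxt"
    weights = Path(tmp) / "modified_model.caffemodel"  # never created here
    model.write_text("layer {}")
    backups = backup_existing_outputs([model, weights], suffix="run1")
    backup_names = [b.name for b in backups]
    backup_text = backups[0].read_text()
```

In practice the list passed to the helper would hold the quantization factor record file, the modified model file, and the modified weights file, and the helper would be called before quantization is run again.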