convert_model
Function Usage
Converts a Caffe model, based on user-defined quantization factors, into two models: a fake-quantized model for accuracy simulation in the Caffe environment, and a deployable model for online inference on the Ascend AI Processor.
Constraints
- The user model must match the quantization factor record file. For example, if the Conv+BN+Scale composite is fused before computation to generate the quantization factors, the Conv+BN+Scale composite in the Caffe model to be converted also needs to be fused in advance.
- The format and content of the quantization factor record file must comply with the AMCT requirements defined in Record Files.
- AMCT can quantize the following layers: InnerProduct (quantization not supported if transpose = true or axis != 1), Convolution (using a 4 x 4 filter), Deconvolution (using a 4 x 4 filter with dilation = 1 and group = 1), and AVE Pooling.
- This API can fuse the Conv+BN+Scale composite, controlled by a per-layer fusion switch.
- Only a standard floating-point Caffe model can be passed to this API. Secondary quantization of an already-quantized model (one containing Quant, DeQuant, or AntiQuant layers, or whose parameters have already been quantized to the INT8 or INT32 data type) is not supported.
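The last constraint can be checked before calling the API. The sketch below is a hypothetical helper (not part of AMCT) that scans a prototxt for quantization layer types, assuming the layer types are named Quant, DeQuant, and AntiQuant as described above:

```python
# Hypothetical pre-check: reject prototxt files that already contain AMCT
# quantization layers, since convert_model only accepts standard
# floating-point Caffe models.
QUANT_LAYER_TYPES = {"Quant", "DeQuant", "AntiQuant"}

def is_already_quantized(prototxt_text):
    """Return True if any layer in the prototxt has a quantization type."""
    for line in prototxt_text.splitlines():
        line = line.strip()
        if line.startswith("type:"):
            layer_type = line.split(":", 1)[1].strip().strip('"')
            if layer_type in QUANT_LAYER_TYPES:
                return True
    return False

# Illustrative model definitions (not real networks).
float_model = '''
layer {
  name: "conv1"
  type: "Convolution"
}
'''

quantized_model = '''
layer {
  name: "conv1_quant"
  type: "Quant"
}
layer {
  name: "conv1"
  type: "Convolution"
}
'''
```

Only the floating-point model would be a valid input to convert_model.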
Prototype
convert_model(model_file, weights_file, scale_offset_record_file, save_path)
Parameters
| Parameter | Input/Return | Description | Restrictions |
|---|---|---|---|
| model_file | Input | Definition file of the Caffe model (.prototxt). | A string. For inference layers, the LayerParameter settings in model_file must meet inference requirements. For example, use_global_stats of the BatchNorm layer must be set to 1. |
| weights_file | Input | Weight file of the Caffe model (.caffemodel). | A string. |
| scale_offset_record_file | Input | Computed quantization factor record file (.txt). | A string. |
| save_path | Input | Model save path. Must include the prefix of the model name, for example, ./quantized_model/*model. | A string. |
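Since save_path combines a directory with a model-name prefix, it can help to build it explicitly. The sketch below assumes the save directory must exist before the call (the API reference does not state whether it is created automatically); prepare_save_path is a hypothetical helper, not part of AMCT:

```python
import os
import tempfile

def prepare_save_path(directory, prefix):
    """Ensure the save directory exists and return the full path prefix."""
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, prefix)

# A temporary directory is used here for illustration; in practice this
# would be something like "./quantized_model" with the prefix "model".
base = tempfile.mkdtemp()
save_path = prepare_save_path(os.path.join(base, "quantized_model"), "model")
```

The returned value can then be passed directly as the save_path argument.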
Returns
None
Outputs
- A fake-quantized model for accuracy simulation in the Caffe environment and its weight file, with names containing the fake_quant keyword.
- A deployable model and its weight file, with names containing the deploy keyword. The model can be deployed on the Ascend AI Processor after being converted by the ATC tool.
- A quantization information file that records the locations of the quantization layers inserted by AMCT and operator fusion information, used for accuracy analysis of the quantized model.
If the conversion is performed again, the preceding files output by this API will be overwritten.
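Because the two model variants are distinguished by the fake_quant and deploy keywords in their file names, the outputs can be sorted programmatically. The file names below are illustrative assumptions; only the presence of the keywords is stated by this document:

```python
import os
import tempfile

def group_outputs(save_dir):
    """Group files in save_dir by the keyword their names contain."""
    groups = {"fake_quant": [], "deploy": []}
    for name in sorted(os.listdir(save_dir)):
        for keyword in groups:
            if keyword in name:
                groups[keyword].append(name)
    return groups

# Simulate a save directory populated with assumed output file names.
save_dir = tempfile.mkdtemp()
for fname in ("model_fake_quant_model.prototxt",
              "model_fake_quant_weights.caffemodel",
              "model_deploy_model.prototxt",
              "model_deploy_weights.caffemodel"):
    open(os.path.join(save_dir, fname), "w").close()

groups = group_outputs(save_dir)
```

The deploy pair is what gets passed on to the ATC tool; the fake_quant pair stays in the Caffe environment for accuracy simulation.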
Examples
```python
from amct_caffe import convert_model

convert_model(model_file='ResNet-50-deploy.prototxt',
              weights_file='ResNet-50-weights.caffemodel',
              scale_offset_record_file='record.txt',
              save_path='./quantized_model/model')
```