convert_model

Applicability

| Product | Supported |
| --- | --- |
| Atlas A3 training series products/Atlas A3 inference series products | x |
| Atlas A2 training products/Atlas A2 inference products | x |
| Atlas 200I/500 A2 inference product | |
| Atlas inference series products | |
| Atlas training products | |

Description

Converts a Caffe model into two models, based on user-defined quantization factors: a fake-quantized model for accuracy simulation in the Caffe environment, and a deployable model for online inference on the Ascend AI Processor.

Prototype

convert_model(model_file, weights_file, scale_offset_record_file, save_path)

Parameters

| Parameter | Input/Output | Description |
| --- | --- | --- |
| model_file | Input | Definition file (.prototxt) of the Caffe model. A string. Restriction: for layers used in inference, the LayerParameter settings in model_file must meet inference requirements. For example, use_global_stats of the BatchNorm layer must be set to 1. |
| weights_file | Input | Weight file (.caffemodel) of the trained Caffe model. A string. |
| scale_offset_record_file | Input | Quantization factor record file (.txt) containing the factors calculated by the user. A string. |
| save_path | Input | Model save path. Must include the prefix of the model name, for example, ./quantized_model/*model. A string. |
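The parameter descriptions above can be turned into a quick pre-flight check before calling convert_model. The following is a minimal sketch, not part of AMCT: check_convert_inputs is a hypothetical helper, and treating the listed file extensions as mandatory is an assumption made here for illustration.

```python
from pathlib import Path

# Extensions taken from the parameter descriptions. Whether AMCT itself
# enforces them is not stated, so this strictness is an assumption.
EXPECTED_SUFFIXES = {
    "model_file": ".prototxt",
    "weights_file": ".caffemodel",
    "scale_offset_record_file": ".txt",
}


def check_convert_inputs(model_file, weights_file,
                         scale_offset_record_file, save_path):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    inputs = {
        "model_file": model_file,
        "weights_file": weights_file,
        "scale_offset_record_file": scale_offset_record_file,
    }
    for name, value in inputs.items():
        path = Path(value)
        expected = EXPECTED_SUFFIXES[name]
        if path.suffix != expected:
            problems.append(f"{name}: expected a {expected} file, got {path.name!r}")
        if not path.is_file():
            problems.append(f"{name}: file not found: {value}")
    # save_path must end with a model name prefix, not with a bare directory.
    text = str(save_path)
    if text.endswith(("/", "\\")) or Path(text).name in ("", ".", ".."):
        problems.append("save_path: must include the prefix of the model name")
    return problems
```

An empty list means the basic checks passed. It does not confirm that the model and the record file actually match each other.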

Returns

None

Restrictions

  • The user model must match the quantization factor record file. For example, if the "Conv+BN+Scale" composite is fused before computation to generate the quantization factors, the "Conv+BN+Scale" composite in the Caffe model to be converted also needs to be fused in advance.
  • The format and content of the quantization factor record file must comply with the AMCT requirements defined in Record Files.
  • AMCT can quantize the following layers: InnerProduct (quantization not supported if transpose = true or axis != 1), Convolution (using a 4 × 4 filter), Deconvolution (using a 1-dilated 4 × 4 filter with group = 1), and AVE Pooling.
  • This API supports fusion of the "Conv+BN+Scale" composite, with a fusion switch that can be set per layer.
  • Only an original floating-point model can be adapted. Secondary quantization of an already quantized model is not supported. A model counts as quantized if a Quant, DeQuant, or AntiQuant layer has been inserted into it, or if its parameters have been quantized to the INT8 or INT32 data type.
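The last restriction can be partly screened for before conversion. The sketch below is a rough text scan of the .prototxt content, not an AMCT feature: layer_types and looks_already_quantized are hypothetical names, and the regular expression is a simplification that does not parse the protobuf text format. It detects only inserted Quant, DeQuant, or AntiQuant layers, not parameters already stored as INT8 or INT32.

```python
import re

# Layer types that, per the restriction above, mark an already quantized model.
QUANT_LAYER_TYPES = {"Quant", "DeQuant", "AntiQuant"}


def layer_types(prototxt_text):
    """Collect every value of a `type: "..."` field in the text.

    This also picks up non-layer types such as filler types, which is
    harmless here because only the three names above are checked.
    """
    return re.findall(r'\btype\s*:\s*"([^"]+)"', prototxt_text)


def looks_already_quantized(prototxt_text):
    """Hypothetical check: True if any quantization layer type appears."""
    return any(t in QUANT_LAYER_TYPES for t in layer_types(prototxt_text))
```

A False result only means that no such layer type was found in the text. It is not proof that the model is an original floating-point model.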

Example

from amct_caffe import convert_model
convert_model(model_file='ResNet-50-deploy.prototxt',
              weights_file='ResNet-50-weights.caffemodel',
              scale_offset_record_file='record.txt',
              save_path='./quantized_model/model')

Output files:

  • A fake-quantized model file for accuracy simulation in the Caffe environment and its weight file, with names containing the fake_quant keyword.
  • A deployable model file and its weight file, with names containing the deploy keyword. The model can be deployed on the Ascend AI Processor after being converted by ATC.
  • A quantization information file that records the locations of the quantization layers inserted by AMCT and operator fusion information, used for accuracy analysis of the quantized model.

Running the conversion again overwrites the files previously output by this API.
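Because existing outputs are overwritten, it can help to see what a previous run left behind before converting again. The following is a minimal sketch under stated assumptions: group_outputs is a hypothetical helper, and it assumes the output files sit in the directory part of save_path with names that start with the model name prefix. Only the fake_quant and deploy keywords come from the description above. Exact file names are not assumed.

```python
from pathlib import Path


def group_outputs(save_path):
    """Hypothetical helper: group files that share the save_path prefix.

    Files are sorted by name and classified by the keyword found in the
    part of the file name that follows the model name prefix.
    """
    prefix = Path(save_path)
    groups = {"fake_quant": [], "deploy": [], "other": []}
    if not prefix.parent.is_dir():
        return groups
    for entry in sorted(prefix.parent.iterdir()):
        if not entry.is_file() or not entry.name.startswith(prefix.name):
            continue
        # Ignore the prefix itself so a prefix such as "deploy_v2" is not misread.
        rest = entry.name[len(prefix.name):]
        if "fake_quant" in rest:
            groups["fake_quant"].append(entry.name)
        elif "deploy" in rest:
            groups["deploy"].append(entry.name)
        else:
            groups["other"].append(entry.name)
    return groups
```

For example, group_outputs('./quantized_model/model') returns the file names from an earlier run, which can then be copied elsewhere before convert_model is called again.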