convert_model
Applicability
| Product | Supported |
|---|---|
| | √ |
| | √ |
| | √ |
| | √ |
| | √ |
Description
Converts a TensorFlow model, based on user-defined quantization factors, into a model that can be used for both accuracy simulation in the TensorFlow environment and inference on the Ascend AI Processor.
Prototype
convert_model(pb_model, outputs, record_file, save_path)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| pb_model | Input | Original .pb model file for adaptation. A string. |
| outputs | Input | List of output operators in a graph. A list. |
| record_file | Input | Path of the quantization factor record file (.txt) computed by the user. A string. |
| save_path | Input | Model save path. Must include the prefix of the model name, for example, ./quantized_model/*model. A string. |
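
The parameter requirements above can be checked before the API is called. The following is a minimal sketch, not part of AMCT: `check_convert_args` is a hypothetical helper, and the rules that `outputs` must be non-empty and contain strings are assumptions made for illustration.

```python
import os


def check_convert_args(pb_model, outputs, record_file, save_path):
    """Hypothetical pre-flight check; returns a list of problem descriptions."""
    problems = []

    # pb_model: a string path to an existing .pb file.
    if not isinstance(pb_model, str) or not pb_model.endswith(".pb"):
        problems.append("pb_model must be a string path to a .pb file")
    elif not os.path.isfile(pb_model):
        problems.append("pb_model file not found: " + pb_model)

    # outputs: a list of operator names (non-empty and all-strings are assumptions).
    if (not isinstance(outputs, list) or not outputs
            or not all(isinstance(name, str) for name in outputs)):
        problems.append("outputs must be a non-empty list of operator names")

    # record_file: a string path to an existing record file.
    if not isinstance(record_file, str) or not os.path.isfile(record_file):
        problems.append("record_file not found: %r" % (record_file,))

    # save_path: must end with a model name prefix, not with a directory separator.
    if not isinstance(save_path, str) or not os.path.basename(save_path):
        problems.append("save_path must end with a model name prefix")

    return problems
```

An empty list means the arguments look plausible; it does not guarantee that the record file matches the model.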
Returns
None
Restrictions
- The user model must match the quantization factor record file. For example, if "Conv+BN" fusion was performed before the quantization factors of Conv were computed, "Conv+BN" fusion must also be performed on the TensorFlow model before it is converted.
- The format and content of the quantization factor record file must comply with the AMCT requirements defined in Record Files.
- AMCT quantizes the following layers: Conv2D, MatMul, DepthwiseConv2dNative (dilation = 1), Conv2DBackpropInput (dilation = 1), and AvgPool.
- This API supports fusion of the "Conv+BN", "Depthwise_Conv+BN", and "Group_Conv+BN" structures in the user model. Fusion can be switched on or off per layer.
- Only an original floating-point model can be adapted. The model cannot be adapted if it contains any of the following custom quantization layers: QuantIfmr, QuantArq, SearchN, AscendQuant, AscendDequant, AscendAntiQuant, and AscendWeightQuant.
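
The last restriction can be screened for ahead of time. The sketch below is illustrative only: `find_custom_quant_ops` and `quantizable_candidates` are hypothetical helpers, and it assumes that the operator type strings in the graph match the layer names listed above. With TensorFlow, the `(name, op_type)` pairs could be built from a loaded `GraphDef` as `[(n.name, n.op) for n in graph_def.node]`.

```python
# Custom quantization layers that the restrictions above say must not be present.
CUSTOM_QUANT_OPS = frozenset([
    "QuantIfmr", "QuantArq", "SearchN", "AscendQuant",
    "AscendDequant", "AscendAntiQuant", "AscendWeightQuant",
])

# Layer types listed above as quantizable (the dilation = 1 condition is NOT checked here).
QUANTIZABLE_OPS = frozenset([
    "Conv2D", "MatMul", "DepthwiseConv2dNative", "Conv2DBackpropInput", "AvgPool",
])


def find_custom_quant_ops(nodes):
    """Return the (name, op_type) pairs whose type is a custom quantization layer."""
    return [(name, op) for name, op in nodes if op in CUSTOM_QUANT_OPS]


def quantizable_candidates(nodes):
    """Return the (name, op_type) pairs whose type is in the quantizable list."""
    return [(name, op) for name, op in nodes if op in QUANTIZABLE_OPS]
```

If `find_custom_quant_ops` returns anything, the model has already been quantized in some form and, per the restriction, cannot be adapted.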
Example
import amct_tensorflow as amct
amct.convert_model(pb_model='./user_model.pb',
                   outputs=["model/outputs"],
                   record_file='./record_quantized.txt',
                   save_path='./quantized_model/model')
Output file: a .pb model file that can be used for accuracy simulation in the TensorFlow environment or offline inference on the Ascend AI Processor.
If adaptation is performed again, the files previously output by the API are overwritten.
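
Because existing output is overwritten, it may be worth copying earlier results aside before converting again. The following is a minimal sketch under one assumption: that the output files are named with the `save_path` prefix followed by some suffix. `back_up_outputs` is a hypothetical helper, not an AMCT function.

```python
import glob
import os
import shutil


def back_up_outputs(save_path, backup_dir):
    """Copy files whose paths start with the save_path prefix into backup_dir.

    Returns the list of copied destination paths (empty if nothing matched).
    """
    # glob.escape keeps special characters in the prefix from acting as wildcards.
    pattern = glob.escape(save_path) + "*"
    matches = sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
    if matches:
        os.makedirs(backup_dir, exist_ok=True)
    return [shutil.copy2(path, backup_dir) for path in matches]
```

For example, `back_up_outputs('./quantized_model/model', './quantized_model_backup')` would be called before the second conversion run.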