Model adaptation using convert_model API
You can quantize your source TensorFlow model using quantization factors that you have calculated yourself. However, a model quantized with user-defined quantization factors cannot be converted directly by ATC. Before using ATC to generate an offline model adapted to the Ascend AI Processor, use the API described in this section to convert the model.
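For context, "quantization factors calculated yourself" typically means a scale (and, for asymmetric quantization, an offset) derived from the observed value range of weights or activations. The following is a minimal sketch of how such factors might be computed for uniform INT8 quantization; it is an illustration, not AMCT's implementation:

```python
import numpy as np

def compute_quant_factors(data, num_bits=8, symmetric=True):
    """Derive a uniform-quantization scale/offset from observed min/max values.

    This is a generic illustration of user-defined quantization factors;
    the exact factors AMCT expects are defined by the AMCT documentation.
    """
    data = np.asarray(data, dtype=np.float32)
    qmax = 2 ** (num_bits - 1) - 1            # 127 for INT8
    if symmetric:
        # Symmetric: zero offset, scale covers the largest magnitude.
        scale = max(float(np.abs(data).max()), 1e-8) / qmax
        offset = 0
    else:
        # Asymmetric: scale covers the full [min, max] range with an offset.
        dmin, dmax = float(data.min()), float(data.max())
        scale = max(dmax - dmin, 1e-8) / (2 ** num_bits - 1)
        offset = int(round(-dmin / scale)) - 2 ** (num_bits - 1)
    return scale, offset
```

The resulting scale/offset pairs are the kind of values recorded, per layer, in the quantization factor record file consumed later by convert_model.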
Adaptation Principles
Figure 1 (API call sequence for uniform quantization) shows the adaptation principles. The operations in blue are implemented by the user; those in gray are implemented by AMCT's convert_model API. Specifically, import the AMCT package into the TensorFlow network inference code and call the API where appropriate. For an adaptation example, see the Sample List.
Calling Example
This example demonstrates how to use the convert_model API to adapt a source model.
- Take the following steps to get started. Update the sample code based on your situation.
- To reuse the following code for quantizing a different model, prepare the source model and build the quantization factor record file from your user-defined quantization factors yourself.
- Import the AMCT package and set the log level.
import amct_tensorflow as amct
amct.set_logging_level(print_level='info', save_level='info')
- (Optional) Run inference on the source model in the TensorFlow environment based on the test dataset to validate the inference script and environment setup. (Update the sample code based on your situation.)
This step is recommended because it confirms that the source model runs correctly and delivers acceptable accuracy. You can use a subset of the test dataset to improve efficiency.
user_do_inference(ori_model, test_data)
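The inference routine user_do_inference is implemented by the user, as indicated in the call sequence. A minimal sketch of what it might look like for a frozen .pb model is shown below; the tensor names 'input:0' and 'output:0' are placeholders for your model's actual input and output tensor names:

```python
import tensorflow as tf

def user_do_inference(model_path, test_data,
                      input_name='input:0', output_name='output:0'):
    """Run a frozen .pb graph on test_data and return the model outputs.

    Hypothetical helper: replace input_name/output_name with the tensor
    names of your own model.
    """
    # Load the serialized GraphDef from the .pb file.
    graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(model_path, 'rb') as f:
        graph_def.ParseFromString(f.read())
    # Import the graph and run a session over the test data.
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
        with tf.compat.v1.Session(graph=graph) as sess:
            return sess.run(
                graph.get_tensor_by_name(output_name),
                feed_dict={graph.get_tensor_by_name(input_name): test_data})
```

In practice this function would also compute an accuracy metric over the test dataset rather than return raw outputs.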
- Call AMCT's convert_model API. This API parses the .pb model into a graph, preprocesses the graph, parses the input quantization factor record file, inserts operators such as AscendQuant and AscendDequant based on the quantization factors and the modified graph structure, and saves the result as a quantized model.
quant_model_path = './result/user_model'
record_file = './result/record.txt'
amct.convert_model(pb_model=ori_model,
                   outputs=ori_model_outputs,
                   record_file=record_file,
                   save_path=quant_model_path)
- (Optional) Run inference on the fake-quantized model user_model_quantized.pb in the TensorFlow environment based on the test dataset to test the accuracy. (Update the sample code based on your situation.)
Evaluate the accuracy loss of the fake-quantized model by comparing its accuracy with that of the source model (see 2).
quant_model = './result/user_model_quantized.pb'
user_do_inference(quant_model, test_data)
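To make the comparison in the final step concrete, the two sets of inference results can be reduced to a single accuracy-drop figure. A minimal sketch, assuming both inference runs return arrays of per-sample logits (hypothetical helper, not part of AMCT):

```python
import numpy as np

def accuracy_drop(ori_outputs, quant_outputs, labels):
    """Top-1 accuracy difference between source and fake-quantized outputs.

    ori_outputs/quant_outputs: arrays of shape (num_samples, num_classes);
    labels: array of ground-truth class indices.
    """
    ori_acc = np.mean(np.argmax(ori_outputs, axis=1) == labels)
    quant_acc = np.mean(np.argmax(quant_outputs, axis=1) == labels)
    return ori_acc - quant_acc
```

A small positive return value indicates acceptable quantization accuracy loss; a large one suggests the user-defined quantization factors need revisiting.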