What Do I Do If Model Quantization Fails Because the Model Contains Layers That Do Not Support Quantization?
Symptom
During model conversion using ATC, the --compression_optimize_conf parameter is used to configure model quantization (quantizing the model weights from float32 to int8). The conversion fails with the following error message:
ATC start working now, please wait for a moment.
[ERROR][ProcessScale][52] Not support scale greater than 1 / FLT_EPSILON.
[ERROR][WtsArqCalibrationCpuKernel][188] ArqQuantCPU scale is illegal.
[ERROR][ArqQuant][301] WtsArqCalibrationCpuKernel of format CO_CI_KH_KW failed.
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[weight_algorithm.cpp:137]Default/network-DeepLabV3/resnet-Resnet/layer4-SequentialCell/0-Bottleneck/downsample-SequentialCell/0-Conv2d/Conv2D-op311 arq weight fake quant failed!
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[weight_calibration_pass.cpp:90]Fail to excute WeightFakeQuant without trans!
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[weight_calibration_pass.cpp:185]layer Default/network-DeepLabV3/resnet-Resnet/layer4-SequentialCell/0-Bottleneck/downsample-SequentialCell/0-Conv2d/Conv2D-op311 run WeightFakeQuantArq failed
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[graph_optimizer.cpp:43]pass run failed
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[quantize_api.cpp:227]Do GenerateCalibrationGraph optimizer pass failed.
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[quantize_api.cpp:363]Generate calibration Graph failed.
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:22[inner_graph_calibration.cpp:78]Failed to excute InnerQuantizeGraph failed.
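For context, a conversion that hits this error is typically launched with a command of the following shape. This is a minimal sketch: the model file name, framework value, output name, and SoC version are placeholder assumptions, not values from the original report.
# Hypothetical ATC invocation; replace the placeholders with your own values.
# --framework=1 selects MindSpore (the layer names above are MindSpore-style).
atc --model=deeplabv3.air \
    --framework=1 \
    --output=deeplabv3_quant \
    --soc_version=Ascend310 \
    --compression_optimize_conf=./compression_optimize.cfg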
Solutions
The error message layer xxxxxx run WeightFakeQuantArq failed indicates that some weight-bearing layers in the current model do not support quantization. You can configure the quantization tool to skip these layers.
- Add a configuration to skip the layers that do not support quantization.
Create a configuration file with the file name extension .cfg, for example, simple_config.cfg. The file content is as follows (the quoted string is the name of the layer that does not support quantization, copied from the error message):
skip_layers:"Default/network-DeepLabV3/resnet-Resnet/layer4-SequentialCell/0-Bottleneck/downsample-SequentialCell/0-Conv2d/Conv2D-op311"
In addition, add the config_file parameter, pointing to this file, to the quantization configuration file specified by the --compression_optimize_conf parameter (a combined sketch of both files follows this procedure):
calibration: {
    input_data_dir: xxxxxx
    config_file: simple_config.cfg
    input_shape: xxxxxx
    infer_soc: xxxxxx
}
- Perform model conversion again.
- Perform inference again.
If skipping the layers that do not support quantization degrades the inference result, adjust the model structure and then quantize the model again.
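For reference, the two configuration files fit together as sketched below. The calibration values are placeholders, and listing several layers by repeating skip_layers is an assumption about this configuration format rather than a confirmed feature; verify both against your tool version.
# simple_config.cfg -- layers to exclude from quantization.
# Each unsupported layer is assumed to go on its own skip_layers line.
skip_layers:"Default/network-DeepLabV3/resnet-Resnet/layer4-SequentialCell/0-Bottleneck/downsample-SequentialCell/0-Conv2d/Conv2D-op311"

# compression_optimize.cfg -- the file passed to --compression_optimize_conf.
calibration: {
    input_data_dir: ./calib_data        # placeholder: directory of calibration samples
    config_file: simple_config.cfg      # points to the skip-layer file above
    input_shape: "x:1,3,513,513"        # placeholder: input name and shape
    infer_soc: Ascend310                # placeholder: target SoC
}
With this configuration, the skipped Conv2D layer keeps its original precision while the remaining layers are quantized, which should allow the conversion to complete.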