What Do I Do If Model Quantization Fails Because the Model Contains Layers That Do Not Support Quantization?

Symptom

During model conversion using ATC, the --compression_optimize_conf parameter is used to configure model quantization options (quantizing model weights from float32 to int8). The following error message is displayed:

ATC start working now, please wait for a moment.
[ERROR][ProcessScale][52] Not support scale greater than 1 / FLT_EPSILON.
[ERROR][WtsArqCalibrationCpuKernel][188] ArqQuantCPU scale is illegal.
[ERROR][ArqQuant][301] WtsArqCalibrationCpuKernel of format CO_CI_KH_KW failed.
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[weight_algorithm.cpp:137]Default/network-DeepLabV3/resnet-Resnet/layer4-SequentialCell/0-Bottleneck/downsample-SequentialCell/0-Conv2d/Conv2D-op311 arq weight fake quant failed!
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[weight_calibration_pass.cpp:90]Fail to excute WeightFakeQuant without trans!
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[weight_calibration_pass.cpp:185]layer Default/network-DeepLabV3/resnet-Resnet/layer4-SequentialCell/0-Bottleneck/downsample-SequentialCell/0-Conv2d/Conv2D-op311 run WeightFakeQuantArq failed
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[graph_optimizer.cpp:43]pass run failed
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[quantize_api.cpp:227]Do GenerateCalibrationGraph optimizer pass failed.
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:19[quantize_api.cpp:363]Generate calibration Graph failed.
[ERROR] AMCT(14815,atc.bin):2023-04-14-12:23:22[inner_graph_calibration.cpp:78]Failed to excute InnerQuantizeGraph failed.

Solutions

The error message layer xxxxxx run WeightFakeQuantArq failed indicates that some weight-bearing layers in the current model do not support quantization. You can skip such layers through configuration.

  1. Add a configuration to skip the layers that do not support quantization.

    Add a configuration file whose file name extension is .cfg, for example, simple_config.cfg. The file content is as follows (the quoted layer name is the unsupported layer reported in the error message):

    skip_layers:"Default/network-DeepLabV3/resnet-Resnet/layer4-SequentialCell/0-Bottleneck/downsample-SequentialCell/0-Conv2d/Conv2D-op311"

    Then add the config_file parameter to the quantization configuration file specified by the --compression_optimize_conf parameter:

    calibration:
    {
        input_data_dir: xxxxxx
        config_file: simple_config.cfg
        input_shape: xxxxxx
        infer_soc: xxxxxx
    }
  2. Perform model conversion again.
  3. Perform inference again.

    If skipping the layers that do not support quantization degrades the inference result, adjust the model and then quantize it again.
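
The steps above can be partially automated with a small helper that pulls the failing layer names out of the ATC log and writes one skip_layers entry per layer. This is a sketch, not part of the ATC/AMCT toolchain: the regex is an assumption derived from the log excerpt above, and the function and file names are illustrative.

```python
import re

# Matches log lines of the form (assumed from the error output above):
#   [ERROR] AMCT(...):...[weight_calibration_pass.cpp:185]layer <name> run WeightFakeQuantArq failed
LAYER_RE = re.compile(r"layer (\S+) run WeightFakeQuantArq failed")


def extract_failed_layers(log_text: str) -> list[str]:
    """Collect the layer names that failed weight fake quantization."""
    return LAYER_RE.findall(log_text)


def write_skip_config(layers: list[str], path: str = "simple_config.cfg") -> None:
    """Write a skip_layers entry for each unsupported layer."""
    with open(path, "w") as f:
        for name in layers:
            f.write(f'skip_layers:"{name}"\n')
```

After generating simple_config.cfg this way, reference it via the config_file parameter in the calibration block and rerun the conversion.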