Simplified Configuration File

The following table describes the simplified configuration file. The file is located at /amct_tensorflow/proto/calibration_config_ascend_tf.proto under the installation directory.

Table 1 Parameters in calibration_config_ascend_tf.proto

| Message | Label | Type | Parameter | Description |
| --- | --- | --- | --- | --- |
| AMCTConfig | - | - | - | Simplified post-training quantization (PTQ) configuration of AMCT. |
| | optional | bool | activation_offset | Whether to quantize activations with offset; a global parameter. true: with offset (activations are asymmetrically quantized); false: without offset (activations are symmetrically quantized). |
| | repeated | string | skip_layers | Layers for which quantization is skipped. |
| | repeated | string | skip_layer_types | Layer types for which quantization is skipped. |
| | optional | CalibrationConfig | common_config | Common quantization configuration; a global parameter. Applied to any layer not overridden by override_layer_types or override_layer_configs. Priority: override_layer_configs > override_layer_types > common_config. |
| | repeated | OverrideLayerType | override_layer_types | Overrides the quantization configuration for all layers of the specified types, so that those layers can be quantized differently (for example, changing the quantization factor search step from 0.01 to 0.02). Priority: override_layer_configs > override_layer_types > common_config. |
| | repeated | OverrideLayer | override_layer_configs | Overrides the quantization configuration for individual layers, so that those layers can be quantized differently (for example, changing the quantization factor search step from 0.01 to 0.02). Priority: override_layer_configs > override_layer_types > common_config. |
| | optional | bool | do_fusion | BN fusion switch. Defaults to true (BN fusion enabled). |
| | repeated | string | skip_fusion_layers | Layers for which BN fusion is skipped. |
| OverrideLayerType | - | - | - | Quantization configuration override by layer type. |
| | required | string | layer_type | Quantizable layer type to override. |
| | required | CalibrationConfig | calibration_config | Quantization configuration to apply. |
| OverrideLayer | - | - | - | Quantization configuration override by layer name. |
| | required | string | layer_name | Name of the layer to override. |
| | required | CalibrationConfig | calibration_config | Quantization configuration to apply. |
| CalibrationConfig | - | - | - | Calibration-based quantization configuration. |
| | - | ARQuantize | arq_quantize | Weight quantization algorithm: ARQ algorithm configuration. |
| | - | FMRQuantize | ifmr_quantize | Activation quantization algorithm: IFMR algorithm configuration. Currently, only the IFMR algorithm is supported. |
| ARQuantize | - | - | - | ARQ algorithm configuration. For details about the algorithm, see ARQ Algorithm. |
| | optional | bool | channel_wise | Whether to use a separate quantization factor for each channel. |
| FMRQuantize | - | - | - | IFMR algorithm configuration. For details about the algorithm, see ifmr: IFMR algorithm for activation quantization. |
| | optional | float | search_range_start | Start of the quantization factor search range. |
| | optional | float | search_range_end | End of the quantization factor search range. |
| | optional | float | search_step | Step of the quantization factor search. |
| | optional | float | max_percentile | Percentile used as the upper bound when searching for the maximum activation value. |
| | optional | float | min_percentile | Percentile used as the lower bound when searching for the minimum activation value. |
| | optional | bool | asymmetric | Whether to perform asymmetric quantization; used to select the layer-wise quantization algorithm. true: asymmetric quantization; false: symmetric quantization. If this parameter is set in override_layer_configs, override_layer_types, or common_config, or if activation_offset is set, the priority is: override_layer_configs > override_layer_types > common_config > activation_offset. |
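The priority chain above can be sketched in Python. This is an illustrative helper, not part of AMCT; the function name and the dictionary layout for the parsed configuration are hypothetical:

```python
def resolve_asymmetric(layer_name, layer_type,
                       override_layer_configs, override_layer_types,
                       common_config, activation_offset):
    """Resolve the effective `asymmetric` setting for one layer, following:
    override_layer_configs > override_layer_types > common_config > activation_offset.
    Each config source is a dict that may or may not contain "asymmetric"."""
    for source in (override_layer_configs.get(layer_name),
                   override_layer_types.get(layer_type),
                   common_config):
        if source is not None and "asymmetric" in source:
            return source["asymmetric"]
    return activation_offset  # global fallback

# Example: the layer-type override wins over common_config.
setting = resolve_asymmetric(
    "conv1", "Conv2D",
    override_layer_configs={},
    override_layer_types={"Conv2D": {"asymmetric": False}},
    common_config={"asymmetric": True},
    activation_offset=True,
)
```

If the same layer also appeared in override_layer_configs, that entry would win instead, mirroring the documented priority order.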

The following is an example of the simplified configuration file (quant.cfg) for post-training quantization:

# global quantize parameter
activation_offset : true
skip_layers : "Opname"
skip_layer_types : "Optype"
do_fusion: true
skip_fusion_layers : "Opname"
common_config : {
    arq_quantize : {
        channel_wise : true
    }
    ifmr_quantize : {
        search_range_start : 0.7
        search_range_end : 1.3
        search_step : 0.01
        max_percentile : 0.999999
        min_percentile : 0.999999
        asymmetric : true
    }
}
 
override_layer_types : {
    layer_type : "Optype"
    calibration_config : {
        arq_quantize : {
            channel_wise : false
        }
        ifmr_quantize : {
            search_range_start : 0.8
            search_range_end : 1.2
            search_step : 0.02
            max_percentile : 0.999999
            min_percentile : 0.999999
            asymmetric : true
        }
    }
}
 
override_layer_configs : {
    layer_name : "Opname"
    calibration_config : {
        arq_quantize : {
            channel_wise : true
        }
        ifmr_quantize : {
            search_range_start : 0.8
            search_range_end : 1.2
            search_step : 0.02
            max_percentile : 0.999999
            min_percentile : 0.999999
            asymmetric : true
        }
    }
}
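To illustrate what the ifmr_quantize parameters control, here is a simplified sketch of a percentile-plus-search loop. It is not the actual IFMR implementation, only an approximation of the role each field plays: max_percentile picks a robust initial maximum, and the search scales it by every ratio in [search_range_start, search_range_end] with step search_step, keeping the ratio with the smallest symmetric int8 quantization error:

```python
import numpy as np

def sketch_ifmr_max(activations, search_range=(0.7, 1.3), search_step=0.01,
                    max_percentile=0.999999, num_bits=8):
    """Pick a clipping maximum for activation quantization (illustrative only)."""
    data = np.asarray(activations, dtype=np.float64).ravel()
    base_max = np.quantile(data, max_percentile)  # robust initial maximum
    start, end = search_range
    qmax = 2 ** (num_bits - 1) - 1                # 127 for int8
    best_max, best_err = base_max, np.inf
    for ratio in np.arange(start, end + 1e-9, search_step):
        clip_max = base_max * ratio
        scale = clip_max / qmax                   # symmetric quantization scale
        quantized = np.clip(np.round(data / scale), -qmax - 1, qmax)
        err = np.mean((quantized * scale - data) ** 2)  # reconstruction error
        if err < best_err:
            best_max, best_err = clip_max, err
    return best_max
```

In a real deployment the error metric and the handling of min_percentile / asymmetric offsets differ, but the three search fields in the example configuration map onto the loop bounds and step size in the same way.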