Simplified QAT Configuration File

Table 1 describes the fields in the retrain_config_caffe.proto file, which is located at /amct_caffe/proto/retrain_config_caffe.proto under the AMCT installation path.

Table 1 Parameter description

Each message below is listed with a short description, followed by its fields in the form field_name (rule type), where the rule is required, optional, or repeated.

AMCTRetrainConfig: Simplified QAT configuration of AMCT.

  • skip_layers (repeated string): Layers to skip, specified by layer name. Globally effective: it provides the same functionality as the feature-specific skip fields. If this field is set, you can omit quant_skip_layers and the other xxx_skip_layers fields. If both skip_layers and quant_skip_layers are set, their union is used.

  • skip_layer_types (repeated string): Layers to skip, specified by layer type (not supported in the current version). Globally effective: it provides the same functionality as the feature-specific skip fields. If this field is set, you can omit quant_skip_types and the other xxx_skip_types fields. If both skip_layer_types and quant_skip_types are set, their union is used.

  • override_layer_configs (repeated RetrainOverrideLayer): Overrides the configuration of individual layers by layer name, that is, applies differentiated compression to those layers. For example, if the global quantization configuration sets the initial upper and lower bounds to [-0.6, 0.6], you can use this parameter to set [-0.3, 0.3] for selected layers. Parameter priority:
    • Quantization scenario: override_layer_configs > override_layer_types > retrain_data_quant_config/retrain_weight_quant_config
    • Sparsity scenario: override_layer_configs > override_layer_types > prune_config

  • override_layer_types (repeated RetrainOverrideLayerType): Overrides the configuration of layers by layer type, that is, applies differentiated compression to all layers of those types. For example, if the global quantization configuration sets the initial upper and lower bounds to [-0.6, 0.6], you can use this parameter to set [-0.3, 0.3] for selected layer types. Parameter priority:
    • Quantization scenario: override_layer_configs > override_layer_types > retrain_data_quant_config/retrain_weight_quant_config
    • Sparsity scenario: override_layer_configs > override_layer_types > prune_config
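The priority rules above amount to a first-match lookup: a layer takes the name-specific override if one exists, then the type-specific override, and otherwise the global configuration. The following sketch is illustrative only; the function name and the dictionary-based representation are assumptions, not AMCT code.

```python
def effective_config(layer_name, layer_type,
                     override_layer_configs,   # {layer_name: config}
                     override_layer_types,     # {layer_type: config}
                     global_config):
    """Resolve a layer's quantization config by priority:
    override_layer_configs > override_layer_types > global config."""
    if layer_name in override_layer_configs:
        return override_layer_configs[layer_name]
    if layer_type in override_layer_types:
        return override_layer_types[layer_type]
    return global_config

# Hypothetical example: "fc_5" is overridden by name; all other
# InnerProduct layers fall back to the type-level override.
by_name = {"fc_5": "per-tensor ARQ"}
by_type = {"InnerProduct": "per-channel ARQ"}
print(effective_config("fc_5", "InnerProduct", by_name, by_type, "global"))
# prints: per-tensor ARQ
```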

  • batch_num (optional uint32): Number of batches used for quantization.

  • retrain_data_quant_config (required RetrainDataQuantConfig): Global data (activation) quantization configuration for QAT. Parameter priority: override_layer_configs > override_layer_types > retrain_data_quant_config/retrain_weight_quant_config.

  • retrain_weight_quant_config (required RetrainWeightQuantConfig): Global weight quantization configuration for QAT. Parameter priority: override_layer_configs > override_layer_types > retrain_data_quant_config/retrain_weight_quant_config.

  • quant_skip_layers (repeated string): Layers to exclude from quantization, specified by layer name. Used in the quantization scenario. If both skip_layers and quant_skip_layers are set, their union is used.

  • quant_skip_types (repeated string): Layers to exclude from quantization, specified by layer type (not supported in the current version). Used in the quantization scenario. If both skip_layer_types and quant_skip_types are set, their union is used.
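The union rule for the skip fields can be illustrated with plain set arithmetic (the layer names here are hypothetical):

```python
# If both the global skip list and the quantization-specific skip list
# are set, the effective skip set is their union.
skip_layers = {"conv_1", "conv_2"}
quant_skip_layers = {"conv_2", "fc_5"}

effective_skips = skip_layers | quant_skip_layers
print(sorted(effective_skips))  # ['conv_1', 'conv_2', 'fc_5']
```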

RetrainDataQuantConfig: Data (activation) quantization configuration for QAT.

  • ulq_quantize (ULQuantize): Activation quantization algorithm. Currently, only ULQ is supported.

ULQuantize: ULQ quantization algorithm configuration. For details about the algorithm, see ULQ Algorithm for Activation Quantization.

  • clip_max_min (optional ClipMaxMin): Initial upper and lower bounds. If omitted, IFMR is used for initialization by default.
  • fixed_min (optional bool): Whether to fix the lower bound at 0. Set to true for ReLU activations and false for other activation functions.

ClipMaxMin: Initial upper and lower bounds.

  • clip_max (required float): Initial upper bound.
  • clip_min (required float): Initial lower bound.
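To see the effect of clip_max and clip_min, the following sketch applies generic uniform fake quantization with fixed clip bounds. It is a simplified stand-in for illustration, not AMCT's ULQ implementation; the function name and num_bits default are assumptions.

```python
import numpy as np

def fake_quantize(x, clip_min, clip_max, num_bits=8):
    """Clip to [clip_min, clip_max], map onto 2**num_bits uniform
    levels, then dequantize back to float."""
    levels = 2 ** num_bits - 1
    scale = (clip_max - clip_min) / levels
    q = np.round((np.clip(x, clip_min, clip_max) - clip_min) / scale)
    return q * scale + clip_min

# Values outside [-6, 6] saturate at the bounds; values inside are
# snapped to the nearest quantization level.
x = np.array([-8.0, -0.3, 0.0, 5.9, 7.5])
print(fake_quantize(x, clip_min=-6.0, clip_max=6.0))
```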

RetrainWeightQuantConfig: Weight quantization configuration for QAT.

  • arq_retrain (ARQRetrain): Weight quantization algorithm. Currently, only ARQ is supported.

ARQRetrain: ARQ algorithm configuration. For details about the algorithm, see ARQ Algorithm.

  • channel_wise (required bool): Whether to enable channel-wise (per-channel) ARQ quantization.
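The channel_wise switch controls whether one quantization scale is computed per output channel or a single scale for the whole weight tensor. A minimal sketch of the difference, assuming a Caffe-style weight layout with the output channel on the first axis; the symmetric int8 scaling shown here is illustrative, not the exact ARQ formula.

```python
import numpy as np

def weight_scales(w, channel_wise):
    """Return symmetric int8 scale(s) for a weight tensor whose
    first axis is the output channel."""
    if channel_wise:
        # One scale per output channel (per-channel quantization).
        max_abs = np.abs(w.reshape(w.shape[0], -1)).max(axis=1)
    else:
        # A single scale for the whole tensor (per-tensor quantization).
        max_abs = np.abs(w).max()
    return max_abs / 127.0

w = np.array([[0.5, -1.0], [2.0, 0.1]])
print(weight_scales(w, channel_wise=True))   # one scale per output channel
print(weight_scales(w, channel_wise=False))  # a single scalar scale
```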

RetrainOverrideLayer: Per-layer override configuration.

  • layer_name (required string): Name of the layer to override.
  • retrain_data_quant_config (required RetrainDataQuantConfig): Activation quantization configuration to apply to the layer.
  • retrain_weight_quant_config (required RetrainWeightQuantConfig): Weight quantization configuration to apply to the layer.

RetrainOverrideLayerType: Per-layer-type override configuration.

  • layer_type (required string): Type of the layers to override.
  • retrain_data_quant_config (required RetrainDataQuantConfig): Activation quantization configuration to apply.
  • retrain_weight_quant_config (required RetrainWeightQuantConfig): Weight quantization configuration to apply.

The following is an example simplified QAT configuration file (quant.cfg):

# global quantization parameters
retrain_data_quant_config: {
    ulq_quantize: {
        clip_max_min: {
            clip_max: 6.0
            clip_min: -6.0
        }
    }
}

retrain_weight_quant_config: {
    arq_retrain: {
        channel_wise: true
    }
}

# skip quantization of the layer named "conv_1"
skip_layers: "conv_1"

# per-tensor weight quantization for all InnerProduct layers
override_layer_types: {
    layer_type: "InnerProduct"
    retrain_weight_quant_config: {
        arq_retrain: {
            channel_wise: false
        }
    }
}

# per-tensor weight quantization for the layer named "fc_5"
override_layer_configs: {
    layer_name: "fc_5"
    retrain_weight_quant_config: {
        arq_retrain: {
            channel_wise: false
        }
    }
}