Simplified QAT Configuration File
Table 1 describes the fields in the retrain_config_caffe.proto file. The file is located at /amct_caffe/proto/retrain_config_caffe.proto under the AMCT installation path.
| Message | Required | Type | Field | Description |
|---|---|---|---|---|
| AMCTRetrainConfig | - | - | - | Simplified QAT configuration of AMCT. |
| | repeated | string | skip_layers | Layers to skip, specified by layer name. Takes effect globally and provides the same functionality as the extended quant_skip_layers and xxx_skip_layers fields, so setting it lets you omit those extended settings. If both skip_layers and quant_skip_layers are set, their union is used. |
| | repeated | string | skip_layer_types | Layers to skip, specified by layer type (not supported in the current version). Takes effect globally and provides the same functionality as the extended quant_skip_types and xxx_skip_types fields, so setting it lets you omit those extended settings. If both skip_layer_types and quant_skip_types are set, their union is used. |
| | repeated | RetrainOverrideLayer | override_layer_configs | Overrides the configuration of specific layers by layer name, that is, applies differentiated compression to them. For example, if the global quantization configuration sets the initial upper and lower bounds to [-0.6, 0.6], this parameter can set them to [-0.3, 0.3] for selected layers. For the priority among parameters, see Parameter Priority. |
| | repeated | RetrainOverrideLayerType | override_layer_types | Overrides the configuration of specific layers by layer type, that is, applies differentiated compression to them. For example, if the global quantization configuration sets the initial upper and lower bounds to [-0.6, 0.6], this parameter can set them to [-0.3, 0.3] for selected layer types. For the priority among parameters, see Parameter Priority. |
| | optional | uint32 | batch_num | Number of batches used for quantization. |
| | required | RetrainDataQuantConfig | retrain_data_quant_config | Global activation (data) quantization configuration for QAT. Parameter priority: override_layer_configs > override_layer_types > retrain_data_quant_config/retrain_weight_quant_config. |
| | required | RetrainWeightQuantConfig | retrain_weight_quant_config | Global weight quantization configuration for QAT. Parameter priority: override_layer_configs > override_layer_types > retrain_data_quant_config/retrain_weight_quant_config. |
| | repeated | string | quant_skip_layers | Layers to exclude from quantization, specified by layer name. Used in the quantization scenario. If both skip_layers and quant_skip_layers are set, their union is used. |
| | repeated | string | quant_skip_types | Layers to exclude from quantization, specified by layer type (not supported in the current version). Used in the quantization scenario. If both skip_layer_types and quant_skip_types are set, their union is used. |
| RetrainDataQuantConfig | - | - | - | Activation (data) quantization parameter configuration for QAT. |
| | - | ULQuantize | ulq_quantize | Activation quantization algorithm. Currently, only ULQ is supported. |
| ULQuantize | - | - | - | ULQ quantization algorithm configuration. For details about the algorithm, see ULQ Algorithm for Activation Quantization. |
| | optional | ClipMaxMin | clip_max_min | Initial upper and lower clipping bounds. If not set, IFMR is used for initialization by default. |
| | optional | bool | fixed_min | Whether to fix the lower bound at 0. Set to true for ReLU activations; otherwise, set to false. |
| ClipMaxMin | - | - | - | Initial upper and lower clipping bounds. |
| | required | float | clip_max | Initial upper bound. |
| | required | float | clip_min | Initial lower bound. |
| RetrainWeightQuantConfig | - | - | - | Weight quantization parameter configuration for QAT. |
| | - | ARQRetrain | arq_retrain | Weight quantization algorithm. Currently, only ARQ is supported. |
| ARQRetrain | - | - | - | ARQ algorithm configuration. For details about the algorithm, see ARQ Algorithm. |
| | required | bool | channel_wise | Whether to enable channel-wise ARQ quantization. |
| RetrainOverrideLayer | - | - | - | Per-layer override configuration. |
| | required | string | layer_name | Name of the layer to override. |
| | required | RetrainDataQuantConfig | retrain_data_quant_config | Activation quantization configuration to apply. |
| | required | RetrainWeightQuantConfig | retrain_weight_quant_config | Weight quantization configuration to apply. |
| RetrainOverrideLayerType | - | - | - | Per-layer-type override configuration. |
| | required | string | layer_type | Type of the layers to override. |
| | required | RetrainDataQuantConfig | retrain_data_quant_config | Activation quantization configuration to apply. |
| | required | RetrainWeightQuantConfig | retrain_weight_quant_config | Weight quantization configuration to apply. |
The following is an example simplified QAT configuration file (quant.cfg):
```
# global quantize parameters
retrain_data_quant_config: {
  ulq_quantize: {
    clip_max_min: {
      clip_max: 6.0
      clip_min: -6.0
    }
  }
}
retrain_weight_quant_config: {
  arq_retrain: {
    channel_wise: true
  }
}
skip_layers: "conv_1"
override_layer_types: {
  layer_type: "InnerProduct"
  retrain_weight_quant_config: {
    arq_retrain: {
      channel_wise: false
    }
  }
}
override_layer_configs: {
  layer_name: "fc_5"
  retrain_weight_quant_config: {
    arq_retrain: {
      channel_wise: false
    }
  }
}
```
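How the priority rule override_layer_configs > override_layer_types > global configuration resolves this example can be sketched as follows. The effective_channel_wise function and the dictionaries are hypothetical illustrations, not the AMCT implementation; the values mirror the quant.cfg example above.

```python
# Hypothetical sketch (not the AMCT API) of how the priority rule
# override_layer_configs > override_layer_types > global configuration
# resolves the effective channel_wise flag, using the values from the
# quant.cfg example above.
GLOBAL_CHANNEL_WISE = True               # retrain_weight_quant_config
TYPE_OVERRIDES = {"InnerProduct": False} # override_layer_types
NAME_OVERRIDES = {"fc_5": False}         # override_layer_configs
SKIP_LAYERS = {"conv_1"}                 # skip_layers

def effective_channel_wise(name, layer_type):
    """Return None for skipped layers, else the resolved channel_wise flag."""
    if name in SKIP_LAYERS:
        return None                   # layer is not quantized at all
    if name in NAME_OVERRIDES:        # highest priority: override by name
        return NAME_OVERRIDES[name]
    if layer_type in TYPE_OVERRIDES:  # next: override by type
        return TYPE_OVERRIDES[layer_type]
    return GLOBAL_CHANNEL_WISE        # fall back to the global setting

print(effective_channel_wise("conv_1", "Convolution"))  # None (skipped)
print(effective_channel_wise("conv_2", "Convolution"))  # True (global)
print(effective_channel_wise("fc_5", "InnerProduct"))   # False (name override)
```

Note that fc_5 resolves to False through the name-level override, which would win even if the InnerProduct type override specified a different value.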