Simplified Distillation Configuration File
Table 1 describes the fields of the distill_config_pytorch.proto file, which is located in /amct_pytorch/proto/distill_config_pytorch.proto under the AMCT installation directory.
| Message | Required | Type | Parameter | Description |
|---|---|---|---|---|
| AMCTDistillConfig | - | - | - | Simplified configuration of AMCT distillation. |
| | optional | uint32 | batch_num | Number of distillation batches, used by IFMR to accumulate data and compute quantization factors. |
| | optional | uint32 | group_size | Minimum number of distillation units in a distillation block. |
| | optional | bool | data_dump | Whether to dump block inputs and outputs of the teacher network. |
| | repeated | DistillGroup | distill_group | User-defined distillation structure. |
| | optional | DistillDataQuantConfig | distill_data_quant_config | Activation quantization configuration for distillation. |
| | optional | DistillWeightQuantConfig | distill_weight_quant_config | Weight quantization configuration for distillation. |
| | repeated | DistillOverrideLayer | distill_override_layers | Layers whose quantization configuration is overridden. |
| | repeated | DistillOverrideLayerType | distill_override_layer_types | Layer types whose quantization configuration is overridden. |
| | repeated | string | quant_skip_layers | Layers that are distilled but do not need to be quantized. |
| | repeated | string | quant_skip_layer_types | Operator types that are distilled but do not need to be quantized. |
| DistillGroup | - | - | - | User-defined distillation structure. Only operators of the torch.nn.Module type are supported. |
| | required | string | start_layer_name | Start layer of the user-defined distillation structure. |
| | required | string | end_layer_name | End layer of the user-defined distillation structure. |
| DistillDataQuantConfig | - | - | - | Activation quantization configuration for distillation. |
| | - | ActULQquantize | ulq_quantize | Activation quantization algorithm. Currently, only ULQ is supported. |
| ActULQquantize | - | - | - | ULQ algorithm for activation quantization. |
| | optional | ClipMaxMin | clip_max_min | Initial upper and lower clip bounds. By default, they are initialized by IFMR. |
| | optional | bool | fixed_min | Whether to fix the lower bound at 0. Set to true for ReLU activations and false otherwise. |
| | optional | DataType | dst_type | Quantization bit width, INT8 or INT4. Defaults to INT8. Currently, only INT8 is supported. |
| ClipMaxMin | - | - | - | Initial upper and lower clip bounds. |
| | required | float | clip_max | Initial upper bound. |
| | required | float | clip_min | Initial lower bound. |
| DistillWeightQuantConfig | - | - | - | Weight quantization configuration for distillation. |
| | - | ARQDistill | arq_distill | ARQ algorithm for weight quantization. |
| | - | WtsULQDistill | ulq_distill | ULQ algorithm for weight quantization. |
| ARQDistill | - | - | - | ARQ algorithm for weight quantization. For details about the algorithm, see ARQ Algorithm. |
| | optional | DataType | dst_type | Quantization bit width, INT8 or INT4. Defaults to INT8. Currently, only INT8 is supported. |
| | optional | bool | channel_wise | Whether to enable channel-wise ARQ. |
| WtsULQDistill | - | - | - | ULQ algorithm for weight quantization. For details about the algorithm, see ULQ Algorithm for Activation Quantization. |
| | optional | DataType | dst_type | Quantization bit width, INT8 or INT4. Defaults to INT8. |
| | optional | bool | channel_wise | Whether to enable channel-wise ULQ. |
| DistillOverrideLayer | - | - | - | Per-layer overriding configuration. |
| | required | string | layer_name | Layer name. |
| | optional | DistillDataQuantConfig | distill_data_quant_config | Activation quantization configuration to apply. |
| | optional | DistillWeightQuantConfig | distill_weight_quant_config | Weight quantization configuration to apply. |
| DistillOverrideLayerType | - | - | - | Per-layer-type overriding configuration. |
| | required | string | layer_type | Layer type. |
| | optional | DistillDataQuantConfig | distill_data_quant_config | Activation quantization configuration to apply. |
| | optional | DistillWeightQuantConfig | distill_weight_quant_config | Weight quantization configuration to apply. |
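The ULQ activation quantizer above is driven entirely by its clip bounds (clip_max_min) and bit width (dst_type). As a rough illustration of how a clipped uniform quantizer of this kind behaves, here is a minimal pure-Python sketch; the function name and the exact rounding scheme are assumptions for illustration, not the AMCT implementation:

```python
def ulq_fake_quantize(x, clip_min, clip_max, num_bits=8):
    """Clip each value to [clip_min, clip_max], quantize it uniformly
    onto 2**num_bits - 1 steps, and dequantize it back to float."""
    levels = (1 << num_bits) - 1              # 255 steps for INT8
    scale = (clip_max - clip_min) / levels    # step size of the uniform grid
    out = []
    for v in x:
        v = min(max(v, clip_min), clip_max)   # clip to the configured bounds
        q = round((v - clip_min) / scale)     # integer code in [0, levels]
        out.append(q * scale + clip_min)      # dequantized value
    return out
```

With clip_max: 6.0 and clip_min: -6.0 as in the example below, any activation outside [-6, 6] saturates to the bound, and values inside it incur at most half a quantization step of error.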
The following is an example of a simplified distillation configuration file:

```
batch_num: 1
group_size: 1
data_dump: true
distill_group: {
  start_layer_name: "layer1"
  end_layer_name: "layer2"
}
distill_data_quant_config: {
  ulq_quantize: {
    clip_max_min: {
      clip_max: 6.0
      clip_min: -6.0
    }
    fixed_min: true
    dst_type: INT8
  }
}
distill_weight_quant_config: {
  arq_distill: {
    channel_wise: true
    dst_type: INT8
  }
}
quant_skip_layers: "layer3"
quant_skip_layer_types: "type1"
distill_override_layers: {
  layer_name: "layer4"
  distill_data_quant_config: {
    ulq_quantize: {
      clip_max_min: {
        clip_max: 3.0
        clip_min: -3.0
      }
      fixed_min: true
      dst_type: INT8
    }
  }
  distill_weight_quant_config: {
    arq_distill: {
      channel_wise: false
      dst_type: INT8
    }
  }
}
distill_override_layer_types: {
  layer_type: "type2"
  distill_data_quant_config: {
    ulq_quantize: {
      clip_max_min: {
        clip_max: 3.0
        clip_min: -3.0
      }
      fixed_min: true
      dst_type: INT8
    }
  }
  distill_weight_quant_config: {
    ulq_distill: {
      channel_wise: false
      dst_type: INT8
    }
  }
}
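In the example, channel_wise switches the weight quantizer between one scale per output channel and a single scale for the whole tensor. A minimal sketch of what that switch typically means for a symmetric INT8 weight quantizer follows; the function name and details are illustrative assumptions, not the AMCT implementation:

```python
def weight_scales(weights, num_bits=8, channel_wise=True):
    """Symmetric quantization scales for a 2-D weight matrix
    (rows = output channels). With channel_wise, each output channel
    gets its own scale; otherwise one scale covers the whole tensor."""
    qmax = (1 << (num_bits - 1)) - 1  # 127 for INT8
    if channel_wise:
        # Scale each channel by its own absolute maximum.
        return [max(abs(w) for w in row) / qmax for row in weights]
    # Single scale from the global absolute maximum.
    global_max = max(abs(w) for row in weights for w in row)
    return [global_max / qmax] * len(weights)
```

Per-channel scales keep small-magnitude channels from being dominated by one large channel, which is why channel_wise: true is the usual choice for convolution weights.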