Simplified Distillation Configuration File

Table 1 describes the fields in the distill_config_pytorch.proto file. Find the file in /amct_pytorch/proto/distill_config_pytorch.proto under the AMCT installation directory.

Table 1 Parameter description

| Message | Required | Type | Parameter | Description |
| --- | --- | --- | --- | --- |
| AMCTDistillConfig | - | - | - | Simplified configuration of AMCT distillation. |
| | optional | uint32 | batch_num | Number of distillation batches, used by IFMR to accumulate data and compute quantization factors. |
| | optional | uint32 | group_size | Minimum number of distillation units in a distillation block. |
| | optional | bool | data_dump | Whether to dump block inputs and outputs of the teacher network. |
| | repeated | DistillGroup | distill_group | User-defined distillation structure. |
| | optional | DistillDataQuantConfig | distill_data_quant_config | Activation quantization configuration for distillation. |
| | optional | DistillWeightQuantConfig | distill_weight_quant_config | Weight quantization configuration for distillation. |
| | repeated | DistillOverrideLayer | distill_override_layers | Layers whose quantization configuration is overridden. |
| | repeated | DistillOverrideLayerType | distill_override_layer_types | Layer types whose quantization configuration is overridden. |
| | repeated | string | quant_skip_layers | Layers to exclude from quantization during distillation. |
| | repeated | string | quant_skip_layer_types | Operator types to exclude from quantization during distillation. |
| DistillGroup | - | - | - | User-defined distillation structure. Only operators of the torch.nn.Module type are supported. |
| | required | string | start_layer_name | Start layer of the user-defined distillation structure. |
| | required | string | end_layer_name | End layer of the user-defined distillation structure. |
| DistillDataQuantConfig | - | - | - | Activation quantization configuration for distillation. |
| | - | ActULQquantize | ulq_quantize | Activation quantization algorithm. Currently, only ULQ is supported. |
| ActULQquantize | - | - | - | ULQ algorithm for activation quantization. |
| | optional | ClipMaxMin | clip_max_min | Initial upper and lower bounds. By default, IFMR is used for initialization. |
| | optional | bool | fixed_min | Whether to fix the lower bound at 0. Set to true for ReLU and false otherwise. |
| | optional | DataType | dst_type | Quantization bit width, INT8 or INT4. Defaults to INT8. Currently, only INT8 is supported. |
| ClipMaxMin | - | - | - | Initial upper and lower bounds. |
| | required | float | clip_max | Initial upper bound. |
| | required | float | clip_min | Initial lower bound. |
| DistillWeightQuantConfig | - | - | - | Weight quantization configuration for distillation. |
| | - | ARQDistill | arq_distill | ARQ algorithm for weight quantization. |
| | - | WtsULQDistill | ulq_distill | ULQ algorithm for weight quantization. |
| ARQDistill | - | - | - | ARQ algorithm for weight quantization. For details, see ARQ Algorithm. |
| | optional | DataType | dst_type | Quantization bit width, INT8 or INT4. Defaults to INT8. Currently, only INT8 is supported. |
| | optional | bool | channel_wise | Whether to enable channel-wise ARQ quantization. |
| WtsULQDistill | - | - | - | ULQ algorithm for weight quantization. For details, see ULQ Algorithm for Activation Quantization. |
| | optional | DataType | dst_type | Quantization bit width, INT8 or INT4. Defaults to INT8. |
| | optional | bool | channel_wise | Whether to enable channel-wise ULQ quantization. |
| DistillOverrideLayer | - | - | - | Layer overriding configuration. |
| | required | string | layer_name | Layer name. |
| | optional | DistillDataQuantConfig | distill_data_quant_config | Activation quantization configuration to apply. |
| | optional | DistillWeightQuantConfig | distill_weight_quant_config | Weight quantization configuration to apply. |
| DistillOverrideLayerType | - | - | - | Layer type overriding configuration. |
| | required | string | layer_type | Layer type. |
| | optional | DistillDataQuantConfig | distill_data_quant_config | Activation quantization configuration to apply. |
| | optional | DistillWeightQuantConfig | distill_weight_quant_config | Weight quantization configuration to apply. |

The following is an example of the simplified configuration file (quant.cfg) for distillation:
batch_num: 1
group_size: 1
data_dump: true

distill_group: {
    start_layer_name: "layer1"
    end_layer_name: "layer2"
}

distill_data_quant_config: {
    ulq_quantize: {
        clip_max_min: {
            clip_max: 6.0
            clip_min: -6.0
        }
        fixed_min: true
        dst_type: INT8
    }
}

distill_weight_quant_config: {
    arq_distill: {
        channel_wise: true
        dst_type: INT8
    }
}

quant_skip_layers: "layer3"
quant_skip_layer_types: "type1"

distill_override_layers : {
    layer_name: "layer4"
    distill_data_quant_config: {
        ulq_quantize: {
            clip_max_min: {
                clip_max: 3.0
                clip_min: -3.0
            }
            fixed_min: true
            dst_type: INT8
        }
    }
    distill_weight_quant_config: {
        arq_distill: {
            channel_wise: false
            dst_type: INT8
        }
    }
}

distill_override_layer_types : {
    layer_type: "type2"
    distill_data_quant_config: {
        ulq_quantize: {
            clip_max_min: {
                clip_max: 3.0
                clip_min: -3.0
            }
            fixed_min: true
            dst_type: INT8
        }
    }
    distill_weight_quant_config: {
        ulq_distill: {
            channel_wise: false
            dst_type: INT8
        }
    }
}
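The channel_wise switch in ARQDistill and WtsULQDistill decides whether each output channel gets its own quantization scale or the whole weight tensor shares one. The sketch below illustrates that difference for symmetric INT8 scales; arq_scales is a hypothetical helper written for this example, not part of AMCT:

```python
# Illustrative sketch only: the effect of channel_wise on symmetric INT8
# weight scales. arq_scales is a hypothetical helper, not AMCT code.

def arq_scales(weights, channel_wise):
    """Symmetric INT8 scales for a 2-D weight matrix (rows = output channels)."""
    int8_max = 127  # symmetric signed 8-bit range [-127, 127]
    if channel_wise:
        # channel_wise: true -> one scale per output channel (row)
        return [max(abs(w) for w in row) / int8_max for row in weights]
    # channel_wise: false -> a single scale shared by the whole tensor
    flat = [w for row in weights for w in row]
    return [max(abs(w) for w in flat) / int8_max]

w = [[0.5, -1.27],   # a "large" output channel
     [0.02, 0.01]]   # a "small" output channel
print(arq_scales(w, channel_wise=True))   # two scales, one per row
print(arq_scales(w, channel_wise=False))  # one shared scale, set by the max weight
```

With a shared scale, the small channel's weights land on only a few of the available integer levels; per-channel scales avoid that loss of resolution, which is why channel-wise quantization is commonly enabled for convolution and linear weights.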