Record Files
A record file is a serialized data structure file based on Protobuf. It records the scale and offset factors for quantization. You can generate a compressed model file by using the record file, quantization configuration file, and original network model file.
Record Prototype Definition
The Protobuf prototype of the record file is defined as follows:
message SingleLayerRecord {
    optional float scale_d = 1;
    optional int32 offset_d = 2;
    repeated float scale_w = 3;
    repeated int32 offset_w = 4;
    repeated uint32 shift_bit = 5;
    optional uint32 channels = 6;
    optional uint32 height = 7;
    optional uint32 width = 8;
    optional bool skip_fusion = 9 [default = false];
}
message ScaleOffsetRecord {
    message MapFiledEntry {
        optional string key = 1;
        optional SingleLayerRecord value = 2;
    }
    repeated MapFiledEntry record = 1;
}
The parameters are described as follows.
| Message | Field Rule | Type | Parameter | Description |
|---|---|---|---|---|
| ScaleOffsetRecord | - | - | - | Map structure, declared as a repeated list of key-value entries (rather than a native map field) to ensure compatibility. |
| | repeated | MapFiledEntry | record | Quantization factor record per layer, consisting of two members: key (the layer name, a string) and value (the quantization factors of that layer, a SingleLayerRecord). |
| SingleLayerRecord | - | - | - | Quantization factors. |
| | optional | float | scale_d | Scale factor for activation quantization. Only unified activation quantization is supported. |
| | optional | int32 | offset_d | Offset factor for activation quantization. Only unified activation quantization is supported. |
| | repeated | float | scale_w | Scale factor for weight quantization. Two quantization modes are supported: scalar (one factor for all weights of the current layer) and vector (one factor per channel of the current layer). Channel-wise quantization applies only to the Conv2D, DepthwiseConv2dNative, and Conv2DBackpropInput layers. |
| | repeated | int32 | offset_w | Offset factor for weight quantization. Like scale_w, it supports scalar and vector modes, and it must have the same number of elements as scale_w. Weight quantization with offset is currently not supported, so every offset_w value must be 0. |
| | repeated | uint32 | shift_bit | Shift factor. Reserved for the convert_model API. |
| | optional | uint32 | channels | Size of the input channel dimension. Because AMCT does not support network-wide shape inference (infer_shape), you must configure the input shape of the current layer. |
| | optional | uint32 | height | Size of the input height dimension. Because AMCT does not support network-wide shape inference (infer_shape), you must configure the input shape of the current layer. |
| | optional | uint32 | width | Size of the input width dimension. Because AMCT does not support network-wide shape inference (infer_shape), you must configure the input shape of the current layer. |
| | optional | bool | skip_fusion | Whether to skip Conv+BN+Scale+Bias, Deconv+BN+Scale+Bias, BN+Scale+Conv, and FC+BN+Scale+Bias fusion at the current layer. Defaults to false, meaning that these fusions are performed. |
Note that Protobuf does not report an error if an optional field is set more than once. In that case, the most recent setting is used.
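
The constraints listed in the table can be checked before a record is used. The following is a minimal sketch, not part of AMCT: the function name `check_layer_record`, the plain-dict representation of a layer record, and the `has_weight` flag are all assumptions made for illustration.

```python
def check_layer_record(name, rec, has_weight=True):
    """Return a list of problems found in one layer record (hypothetical helper).

    `rec` is a plain dict that mirrors SingleLayerRecord, with repeated
    fields (scale_w, offset_w, shift_bit) stored as lists.
    """
    problems = []

    # Activation factors and the input shape must be configured per layer.
    for field in ("scale_d", "offset_d", "channels", "height", "width"):
        if field not in rec:
            problems.append(f"{name}: missing {field}")

    scale_w = rec.get("scale_w", [])
    offset_w = rec.get("offset_w", [])

    if has_weight:
        if not scale_w:
            problems.append(f"{name}: scale_w is empty")
        # offset_w must have the same dimension as scale_w (scalar or vector).
        if len(offset_w) != len(scale_w):
            problems.append(
                f"{name}: offset_w has {len(offset_w)} values, "
                f"scale_w has {len(scale_w)}"
            )
        # Weight quantization with offset is not supported.
        if any(o != 0 for o in offset_w):
            problems.append(f"{name}: offset_w must be 0")
    elif scale_w or offset_w:
        # Assumption: layers without weights carry no weight factors.
        problems.append(f"{name}: scale_w/offset_w set on a layer without weights")

    return problems


if __name__ == "__main__":
    record = {
        "scale_d": 0.01424, "offset_d": -128,
        "scale_w": [0.43213, 0.78163, 1.03213],
        "offset_w": [0, 0, 0],
        "channels": 3, "height": 144, "width": 144,
    }
    print(check_layer_record("conv1", record))
```

Returning a list of messages rather than raising on the first problem makes it possible to report every issue in a record at once.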
Record Files
The generated record file is a text file (record.txt). Configure the following parameters for general quantization layers: scale_d, offset_d, scale_w, offset_w, channels, height, width, and shift_bit. The scale_w and offset_w parameters do not apply to AVE Pooling layers because these layers have no weights. Here is an example of a quantization factor record file:
record {
    key: "conv1"
    value: {
        scale_d: 0.01424
        offset_d: -128
        scale_w: 0.43213
        scale_w: 0.78163
        scale_w: 1.03213
        offset_w: 0
        offset_w: 0
        offset_w: 0
        shift_bit: 1
        shift_bit: 1
        shift_bit: 1
        channels: 3
        height: 144
        width: 144
        skip_fusion: true
    }
}
record {
    key: "pool1"
    value: {
        scale_d: 0.532532
        offset_d: 13
        channels: 256
        height: 32
        width: 32
    }
}
record {
    key: "fc1"
    value: {
        scale_d: 0.37532
        offset_d: -67
        scale_w: 0.876221
        offset_w: 0
        shift_bit: 1
        channels: 1024
        height: 1
        width: 1
    }
}
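
To inspect such a file without extra dependencies, the text can be read into a dictionary keyed by layer name. The following is a minimal sketch that handles only the layout shown in the example above (one field per line). The function name `parse_record_text` is hypothetical, and the Protobuf library with classes generated from the prototype definition would normally be the more robust choice. Repeated fields are collected into lists, and an optional field that appears more than once keeps its most recent value.

```python
import re

# Field classification taken from the SingleLayerRecord definition.
REPEATED_FIELDS = {"scale_w", "offset_w", "shift_bit"}
FLOAT_FIELDS = {"scale_d", "scale_w"}
BOOL_FIELDS = {"skip_fusion"}

FIELD_RE = re.compile(r"^(\w+)\s*:\s*(.+)$")


def _convert(name, token):
    """Convert a text token to the Python type of the named field."""
    if name in BOOL_FIELDS:
        return token == "true"
    if name in FLOAT_FIELDS:
        return float(token)
    return int(token)


def parse_record_text(text):
    """Parse record-file text into {layer_name: {field: value}}."""
    records = {}
    key, value, depth = None, None, 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("{"):  # opens "record {" or "value: {"
            depth += 1
            if depth == 1:
                key, value = None, {}
            continue
        if line == "}":
            depth -= 1
            if depth == 0 and key is not None:
                records[key] = value
            continue
        match = FIELD_RE.match(line)
        if not match or depth == 0:
            raise ValueError(f"unrecognized line: {raw!r}")
        name, token = match.group(1), match.group(2).strip()
        if name == "key":
            key = token.strip('"')
        elif name in REPEATED_FIELDS:
            value.setdefault(name, []).append(_convert(name, token))
        else:
            value[name] = _convert(name, token)  # most recent setting wins
    return records


if __name__ == "__main__":
    with open("record.txt", encoding="utf-8") as f:
        for layer, factors in parse_record_text(f.read()).items():
            print(layer, factors)
```

A layer without weights, such as pool1 in the example, simply produces a dictionary with no scale_w or offset_w entries.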