Record Files

A record file is a serialized data structure file based on Protobuf. It records the scale and offset factors for quantization. You can generate a compressed model file by using the record file, quantization configuration file, and original network model file.

Record Prototype Definition

The Protobuf prototype is defined as follows. The definition is in the /amct_onnx/proto/scale_offset_record_onnx.proto file under the AMCT installation directory.

syntax = "proto2";

message SingleLayerRecord {
    optional float scale_d = 1;
    optional int32 offset_d = 2;
    repeated float scale_w = 3;
    repeated int32 offset_w = 4;
    repeated uint32 shift_bit = 5;
    optional bool skip_fusion = 6 [default = true];
    optional bool is_tensor_quantize = 10 [default = false];
    repeated float tensor_balance_factor = 13;
    optional string op_data_type = 15;
    optional string act_type = 20 [default = 'INT8'];
    optional string wts_type = 21 [default = 'INT8'];

}

message MapFiledEntry {
    optional string key = 1;
    optional SingleLayerRecord value = 2;

}

message ScaleOffsetRecord {
    repeated MapFiledEntry record = 1;
}

The parameters are described as follows.

Message	Field Rule	Type	Parameter	Description
SingleLayerRecord	-	-	-	Quantization factors.
	optional	float	scale_d	Scale factor for activation quantization. Only unified activation quantization is supported.
	optional	int32	offset_d	Offset factor for activation quantization. Only unified activation quantization is supported.
	repeated	float	scale_w	Scale factor for weight quantization. Scalar (quantizing the weight of the current layer in a unified manner) and vector (quantizing the weight of the current layer in channel-wise mode) modes are supported. Only the Conv2d type supports the channel-wise quantization mode.
	repeated	int32	offset_w	Offset factor for weight quantization. Similar to scale_w, it also supports scalar and vector modes and the dimension configuration must be the same as that of scale_w. Currently, weight quantization with offset is not supported, and offset_w must be 0.
	repeated	uint32	shift_bit	Shift factor. shift_bit is written to the record file only when joint_quant is configured in Simplified PTQ Configuration File.
	optional	bool	skip_fusion	Whether to skip Conv+BN fusion at the current layer. Defaults to false, meaning that the fusion is performed.
	optional	bool	is_tensor_quantize	Flag for indicating tensor quantization records in the current record file. The default value is false, indicating non-tensor quantization records.
	repeated	float	tensor_balance_factor	Balanced quantization factor. This field is used only in pre-balancing activation quantization.
	optional	string	op_data_type	Input data type of an operator. The data type can be FLOAT16 or FLOAT32.
	optional	string	act_type	Activation quantization bit width: INT8 or INT16. Currently, only INT8 quantization is supported.
	optional	string	wts_type	Weight quantization bit width. Currently, the quantization factors after INT6 and INT7 quantization are still saved as the INT8 type.
ScaleOffsetRecord	-	-	-	Map structure. The discrete map structure is used to ensure compatibility.
ScaleOffsetRecord	repeated	MapFiledEntry	record	Quantization factor record per layer, consisting of two members: key (the layer name) and value (the quantization factors defined by SingleLayerRecord).
MapFiledEntry	optional	string	key	Layer name.
MapFiledEntry	optional	SingleLayerRecord	value	Quantization factor configuration.
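
The constraints listed in the table can be checked mechanically. The following is a minimal sketch, assuming a single-layer record has already been loaded into a plain Python dict whose keys match the field names above. The helper name check_layer_record and the dict representation are hypothetical and are not part of AMCT.

```python
# Hypothetical helper (not an AMCT API): checks one layer record, given as a
# plain dict, against the constraints stated in the parameter table.
def check_layer_record(rec):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    scale_w = rec.get("scale_w", [])
    offset_w = rec.get("offset_w", [])
    # offset_w must have the same dimension as scale_w.
    if len(offset_w) != len(scale_w):
        problems.append("offset_w and scale_w differ in length")
    # Weight quantization with offset is not supported, so offset_w must be 0.
    if any(o != 0 for o in offset_w):
        problems.append("offset_w must be 0")
    # Only INT8 activation quantization is currently supported.
    # The proto default for act_type is INT8, so a missing field passes.
    if rec.get("act_type", "INT8") != "INT8":
        problems.append("act_type must be INT8")
    # op_data_type, when present, can be FLOAT16 or FLOAT32.
    if "op_data_type" in rec and rec["op_data_type"] not in ("FLOAT16", "FLOAT32"):
        problems.append("op_data_type must be FLOAT16 or FLOAT32")
    return problems


conv_record = {
    "scale_d": 0.0798481479,
    "offset_d": 1,
    "scale_w": [0.007364662, 0.0069018262],
    "offset_w": [0, 0],
    "op_data_type": "FLOAT32",
    "act_type": "INT8",
}
print(check_layer_record(conv_record))
```

The function collects every violation instead of stopping at the first one, so a single pass over a record reports all of its problems.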

Note that Protobuf does not report an error if an optional field is set more than once. In that case, the most recent setting is used.

Record Files

A record file is generated as a text file (record.txt). Depending on the feature used, record files are classified as follows:

Quantization record file

For common quantization layers, the record must include the scale_d, offset_d, scale_w, and offset_w parameters. shift_bit is included only when joint_quant is configured in the simplified PTQ configuration file. The following is an example:

record {
  key: "conv"
  value {
    shift_bit: 1              # The shift_bit information is recorded in the record file only when the joint_quant parameter is set in the simplified PTQ configuration file.
    scale_d: 0.0798481479
    offset_d: 1
    op_data_type: 'FLOAT32'
    scale_w: 0.007364662
    scale_w: 0.0069018262
    offset_w: 0
    offset_w: 0
    skip_fusion: true
    act_type: "INT8"
    wts_type: "INT8"
  }
}
record {
  key: "maxpool_ld_default:0"
  value {
    scale_d: 0.00392156886
    offset_d: -128
    op_data_type: 'FLOAT32'
    is_tensor_quantize: true
  }
}
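
To make the role of scale_d and offset_d concrete, the following sketch applies the activation factors of the "maxpool_ld_default:0" record. It assumes the common affine convention q = round(x / scale_d) + offset_d with clipping to the INT8 range. This section does not state the formula, so treat it as an assumption. The record's values (scale_d of about 1/255 and offset_d of -128) are consistent with mapping the float range [0, 1] onto [-128, 127] under this convention. The function names are hypothetical.

```python
# Assumed convention (not confirmed by this section):
#   q = clip(round(x / scale_d) + offset_d, -128, 127)
def quantize_activation(x, scale_d, offset_d, qmin=-128, qmax=127):
    """Map a float activation to an INT8 value under the assumed convention."""
    q = round(x / scale_d) + offset_d
    return max(qmin, min(qmax, q))


def dequantize_activation(q, scale_d, offset_d):
    """Approximate inverse of quantize_activation."""
    return (q - offset_d) * scale_d


# Factors taken from the "maxpool_ld_default:0" record.
scale_d, offset_d = 0.00392156886, -128

low = quantize_activation(0.0, scale_d, offset_d)   # lower end of [0, 1]
high = quantize_activation(1.0, scale_d, offset_d)  # upper end of [0, 1]
print(low, high)
print(dequantize_activation(high, scale_d, offset_d))
```

Values outside the representable range are clipped, so any input greater than 1.0 maps to 127 with these factors.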

Activation quantization balance preprocessing record file. The following is an example:

record {
  key: "matmul_1"
  value {
    scale_d: 0.00784554612
    offset_d: -1
    op_data_type: 'FLOAT32'
    scale_w: 0.00778095098
    offset_w: 0
    shift_bit: 2                   # The shift_bit information is recorded in the record file only when the joint_quant parameter is set in Simplified PTQ Configuration File.
    tensor_balance_factor: 0.948409557
    tensor_balance_factor: 0.984379828
  }
}
record {
  key: "conv_1"
  value {
    scale_d: 0.00759239076
    offset_d: -4
    op_data_type: 'FLOAT32'
    scale_w: 0.0075149606
    offset_w: 0
    shift_bit: 1
    tensor_balance_factor: 1.04744744
    tensor_balance_factor: 1.44586647
  }
}
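
The record files above are in Protobuf text format. To load one in practice, the usual route is the Protobuf library's text_format module together with a message class generated from the .proto file. For quick inspection without that dependency, the following sketch reads the subset of the format used in these examples into plain dicts. The names parse_records and REPEATED are hypothetical, and the parser is an illustration only: it does not handle nested messages beyond record/value, and it treats any "#" or "//" as the start of a comment, even inside a quoted string.

```python
import re

# Fields declared as "repeated" in SingleLayerRecord; their values accumulate.
REPEATED = {"scale_w", "offset_w", "shift_bit", "tensor_balance_factor"}


def _convert(token):
    """Convert a text-format scalar to str, bool, int, or float."""
    token = token.strip()
    if token[0] in "\"'":
        return token[1:-1]
    if token in ("true", "false"):
        return token == "true"
    try:
        return int(token)
    except ValueError:
        return float(token)


def parse_records(text):
    """Return {layer_name: {field: value}} for the record blocks in text."""
    records = {}
    key, fields, depth = None, None, 0
    for raw in text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = re.split(r"#|//", raw, maxsplit=1)[0].strip()
        if not line:
            continue
        if line.endswith("{"):          # "record {" or "value {"
            depth += 1
            if depth == 1:
                key, fields = None, {}
            continue
        if line == "}":
            depth -= 1
            if depth == 0:              # end of one record block
                records[key] = fields
            continue
        name, _, value = line.partition(":")  # split on the first colon only
        name, value = name.strip(), _convert(value)
        if name == "key" and depth == 1:
            key = value
        elif name in REPEATED:
            fields.setdefault(name, []).append(value)
        else:
            fields[name] = value        # optional field: the last value wins
    return records


SAMPLE = """
record {
  key: "conv_1"
  value {
    scale_d: 0.00759239076
    offset_d: -4
    op_data_type: 'FLOAT32'
    scale_w: 0.0075149606
    offset_w: 0
    shift_bit: 1
    tensor_balance_factor: 1.04744744
    tensor_balance_factor: 1.44586647
  }
}
"""

records = parse_records(SAMPLE)
print(records["conv_1"]["tensor_balance_factor"])
```

Repeated fields are collected into lists in file order, while an optional field that appears more than once keeps only its last value, mirroring the behavior described for optional fields earlier in this section.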
