create_quant_config
Description
Finds all quantizable layers in a graph, creates a quantization configuration file, and writes the quantization configuration of the quantizable layers to the file.
Prototype
create_quant_config(config_file, graph, skip_layers=None, batch_num=1, activation_offset=True, config_defination=None)
Parameters
| Parameter | Input/Return | Description | Restriction |
|---|---|---|---|
| config_file | Input | Path and name of the quantization configuration file. Any existing file at this path is overwritten by the call. | A string. |
| graph | Input | A tf.Graph of the model to be quantized. | A tf.Graph. |
| skip_layers | Input | Names of the layers in graph that do not need to be quantized. | A list of strings, for example, ['op1','op2','op3']. Default: None. Restriction: if a simplified quantization configuration file is used as the input, this parameter must be set in the configuration file; the value passed to the API does not take effect. |
| batch_num | Input | Number of batches used for quantization, that is, the number of batches used to generate quantization factors. | An int greater than or equal to 0. Default: 1. |
| activation_offset | Input | Whether to quantize activations with offset. | A bool. Default: True. Restriction: if a simplified quantization configuration file is used as the input, this parameter must be set in the configuration file; the value passed to the API does not take effect. |
| config_defination | Input | Path of the simplified PTQ configuration file quant.cfg, created from the calibration_config_tf.proto file in /amct_tensorflow/proto/calibration_config_tf.proto in the AMCT installation path. For details about the parameters in calibration_config_tf.proto and the generated quant.cfg, see Simplified PTQ Configuration File. | A string. Default: None. Restriction: if None, the configuration file is generated from the remaining arguments (skip_layers, batch_num, and activation_offset); otherwise, a configuration file in JSON format is generated from this file. |
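Conceptually, when config_defination is None the API walks the graph, skips the layers named in skip_layers, and writes one JSON entry per remaining quantizable layer. A stdlib-only sketch of that behavior (the helper name, layer names, and emitted fields are illustrative; this is not the AMCT implementation):

```python
import json

def sketch_create_quant_config(config_file, layer_names, skip_layers=None,
                               batch_num=1, activation_offset=True):
    """Illustrative stand-in for amct.create_quant_config: emits one JSON
    entry per quantizable layer that is not listed in skip_layers."""
    skip = set(skip_layers or [])
    config = {"version": 1,
              "batch_num": batch_num,
              "activation_offset": activation_offset}
    for name in layer_names:
        if name in skip:
            continue
        config[name] = {
            "quant_enable": True,
            "activation_quant_params": {"num_bits": 8, "act_algo": "ifmr"},
            "weight_quant_params": {"num_bits": 8, "wts_algo": "arq_quantize"},
        }
    # Like the real API, overwrite any existing file at config_file.
    with open(config_file, "w") as f:
        json.dump(config, f, indent=4)
    return config

cfg = sketch_create_quant_config("config.json", ["conv1", "conv2"],
                                 skip_layers=["conv2"])
```

Here "conv2" is excluded from the generated configuration, while "conv1" gets a default per-layer entry.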
Return Value
None
Outputs
A quantization configuration file in JSON format. (When quantization is performed again, this API will overwrite the existing configuration file in the output directory.)
An example of a generated configuration file:

```json
{
    "version": 1,
    "batch_num": 1,
    "activation_offset": true,
    "joint_quant": false,
    "do_fusion": true,
    "skip_fusion_layers": [],
    "tensor_quantize": [
        {
            "layer_name": "MaxPool",
            "input_index": 0,
            "activation_quant_params": {
                "num_bits": 8,
                "act_algo": "hfmg",
                "num_of_bins": 4096,
                "asymmetric": false
            }
        }
    ],
    "MobilenetV2/Conv/Conv2D": {
        "quant_enable": true,
        "dmq_balancer_param": 0.5,
        "activation_quant_params": {
            "num_bits": 8,
            "max_percentile": 0.999999,
            "min_percentile": 0.999999,
            "search_range": [0.7, 1.3],
            "search_step": 0.01,
            "act_algo": "ifmr",
            "asymmetric": false
        },
        "weight_quant_params": {
            "num_bits": 8,
            "wts_algo": "arq_quantize",
            "channel_wise": true
        }
    },
    "MobilenetV2/Logits/AvgPool": {
        "quant_enable": true,
        "dmq_balancer_param": 0.5,
        "activation_quant_params": {
            "num_bits": 8,
            "max_percentile": 0.999999,
            "min_percentile": 0.999999,
            "search_range": [0.7, 1.3],
            "search_step": 0.01,
            "act_algo": "ifmr",
            "asymmetric": false
        },
        "weight_quant_params": {
            "num_bits": 8,
            "wts_algo": "arq_quantize",
            "channel_wise": false
        }
    }
}
```
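Because the output is plain JSON, it can be inspected or adjusted with the standard library before the subsequent quantization step. A minimal sketch (the layer name, file path, and in-code sample config are illustrative stand-ins for a file actually produced by create_quant_config):

```python
import json

# Stand-in for a file produced by create_quant_config; in practice you
# would open the path you passed as config_file.
sample_config = {
    "version": 1,
    "batch_num": 1,
    "activation_offset": True,
    "MobilenetV2/Conv/Conv2D": {
        "quant_enable": True,
        "activation_quant_params": {"num_bits": 8, "act_algo": "ifmr"},
        "weight_quant_params": {"num_bits": 8, "wts_algo": "arq_quantize"},
    },
}
with open("config.json", "w") as f:
    json.dump(sample_config, f, indent=4)

# Disable quantization for a single layer, then write the file back.
with open("config.json") as f:
    config = json.load(f)
config["MobilenetV2/Conv/Conv2D"]["quant_enable"] = False
with open("config.json", "w") as f:
    json.dump(config, f, indent=4)
```

Edits like this let you exclude individual layers after calibration without regenerating the whole configuration.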
Example
```python
import tensorflow as tf
import amct_tensorflow as amct

# Build a graph of the network to be quantized.
network = build_network()

# Create a quantization configuration file.
amct.create_quant_config(config_file="./configs/config.json",
                         graph=tf.get_default_graph(),
                         skip_layers=None,
                         batch_num=1,
                         activation_offset=True)
```