create_quant_config_ascend

Description

Finds all quantizable layers in a graph, creates a quantization configuration file, and writes the quantization configuration of the quantizable layers to the file.

Prototype

create_quant_config_ascend(config_file, graph, skip_layers=None, activation_offset=True, config_defination=None)

Parameters

config_file (Input)
Path and name of the quantization configuration file. The existing file (if any) at this path is overwritten by this API call.
Type: a string.

graph (Input)
tf.Graph of the model to be quantized.
Type: a tf.Graph.

skip_layers (Input)
Names of the layers in graph that do not need to be quantized.
Default: None
Type: a list of strings.
Restriction: if a simplified quantization configuration file is used as the input, this parameter must be set in the configuration file; in that case, the value passed to this argument does not take effect.

activation_offset (Input)
Whether to quantize activations with offset.
Default: True
Type: a bool.
Restriction: if a simplified quantization configuration file is used as the input, this parameter must be set in the configuration file; in that case, the value passed to this argument does not take effect.

config_defination (Input)
Simplified quantization configuration file quant.cfg, generated based on the calibration_config_ascend_tf.proto template in /amct_tensorflow/proto/ under the AMCT installation directory. For details about the parameters in calibration_config_ascend_tf.proto and the generated quant.cfg file, see Simplified Configuration File Template (calibration_config_ascend_tf.proto).
Default: None
Type: a string.
Restriction: if None, the configuration file is generated based on the remaining arguments (skip_layers and activation_offset); otherwise, a configuration file in JSON format is generated based on this argument.

Returns

None

Outputs

A quantization configuration file in JSON format. (When quantization is performed again, this API will overwrite the existing configuration file in the output directory.)

{
    "version":1,
    "activation_offset":true,
    "do_fusion":true,
    "skip_fusion_layers":[],
    "MobilenetV2/Conv/Conv2D":{
        "quant_enable":true,
        "activation_quant_params":{
            "max_percentile":0.999999,
            "min_percentile":0.999999,
            "search_range":[
                0.7,
                1.3
            ],
            "search_step":0.01,
            "act_algo":"ifmr",
            "asymmetric":false
        },
        "weight_quant_params":{
            "wts_algo":"arq_quantize",
            "channel_wise":true
        }
    },
    "MobilenetV2/Conv_1/Conv2D":{
        "quant_enable":true,
        "activation_quant_params":{
            "max_percentile":0.999999,
            "min_percentile":0.999999,
            "search_range":[
                0.7,
                1.3
            ],
            "search_step":0.01,
            "act_algo":"ifmr",
            "asymmetric":false
        },
        "weight_quant_params":{
            "wts_algo":"arq_quantize",
            "channel_wise":true
        }
    }
}
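Because the output is plain JSON, it can be inspected programmatically, for example to see which layers received per-layer entries. The snippet below is a minimal sketch: the sample dictionary mirrors the structure shown above and is written locally for illustration, not produced by a real AMCT run.

```python
import json

# Minimal sample mirroring the generated file's structure (illustrative only,
# not real AMCT output); global fields sit alongside per-layer entries.
sample = {
    "version": 1,
    "activation_offset": True,
    "do_fusion": True,
    "skip_fusion_layers": [],
    "MobilenetV2/Conv/Conv2D": {
        "quant_enable": True,
        "activation_quant_params": {"act_algo": "ifmr", "asymmetric": False},
        "weight_quant_params": {"wts_algo": "arq_quantize", "channel_wise": True},
    },
}

with open("config.json", "w") as f:
    json.dump(sample, f, indent=4)

# Reload and list the per-layer entries (every key that is not a global field).
with open("config.json") as f:
    cfg = json.load(f)

global_keys = {"version", "activation_offset", "do_fusion", "skip_fusion_layers"}
layers = [k for k in cfg if k not in global_keys]
print(layers)  # ['MobilenetV2/Conv/Conv2D']
```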

Example

import amct_tensorflow as amct
import tensorflow as tf

# Build a graph of the network to be quantized.
network = build_network()
# Create a quantization configuration file.
amct.create_quant_config_ascend(config_file="./configs/config.json",
                                graph=tf.get_default_graph(),
                                skip_layers=None,
                                activation_offset=True)
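Each per-layer entry in the generated file carries a quant_enable flag, so one conceivable workflow is to post-edit the JSON to exclude a layer after generation. Note the hedge: the documented way to exclude layers is the skip_layers argument, and whether AMCT honors a hand-edited flag is an assumption here, not stated behavior. The disable_layer helper below is hypothetical.

```python
import json

# Hypothetical helper: flip a layer's "quant_enable" flag in an
# already-generated config file. The supported route for excluding layers
# is the skip_layers argument; honoring a hand-edited flag is an
# assumption, not documented AMCT behavior.
def disable_layer(config_path, layer_name):
    with open(config_path) as f:
        cfg = json.load(f)
    if layer_name in cfg:
        cfg[layer_name]["quant_enable"] = False
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=4)
    return cfg

# Build a small config to run against (mirrors the sample output above).
with open("config.json", "w") as f:
    json.dump({"version": 1,
               "MobilenetV2/Conv/Conv2D": {"quant_enable": True}}, f)

cfg = disable_layer("config.json", "MobilenetV2/Conv/Conv2D")
print(cfg["MobilenetV2/Conv/Conv2D"]["quant_enable"])  # False
```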