quantize_model

Applicability

Product

Supported

Atlas A3 training series products/Atlas A3 inference series products

Atlas A2 training products/Atlas A2 inference products

Atlas 200I/500 A2 inference product

Atlas inference series products

Atlas training products

Description

Quantizes a graph based on the quantization configuration file, inserts the quantization operators, generates a quantization factor record file record_file, and returns the list of newly added operators.

Prototype

quant_add_ops = quantize_model(graph, config_file, record_file, calib_outputs=None)

Parameters

Parameter

Input/Output

Description

graph

Input

tf.Graph of the model to be quantized.

A tf.Graph.

config_file

Input

User-generated quantization configuration file, which specifies the configuration of each layer to be quantized in the tf.Graph.

A string.

record_file

Input

Path (including the file name) of the quantization factor record file.

A string.

calib_outputs

Input

List of output operators of the graph.

If graph modification (for example, fusion) changes the output nodes, this list is updated accordingly.

A list.

Default: None

quant_add_ops

Returns

List of variables belonging to the operators inserted during quantization.

A list of tf.Variables.

NOTE:

The variables in this list do not exist in the model training parameter file. Restoring the model training parameters directly therefore fails with an error indicating that the variables cannot be found.

Before restoring the model training parameters, remove the variables in the quant_add_ops list from the restore list. For details, see How Do I Restore the Model Training Parameters After Quantization Operators Are Inserted?
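The exclusion step described in the note can be sketched as follows. This is a minimal illustration, not the procedure from the linked FAQ: the helper name variables_to_restore and the FakeVariable stand-in are hypothetical, and the sketch assumes that each variable exposes a unique name attribute, as tf.Variable does.

```python
from collections import namedtuple

# Hypothetical stand-in for tf.Variable so the sketch runs without TensorFlow.
FakeVariable = namedtuple("FakeVariable", ["name"])


def variables_to_restore(all_variables, quant_add_ops):
    """Return the variables that are not in quant_add_ops, compared by name."""
    added_names = {var.name for var in quant_add_ops}
    return [var for var in all_variables if var.name not in added_names]


# Variables that were saved in the training parameter file (names are made up).
trained = [FakeVariable("conv1/kernel:0"), FakeVariable("conv1/bias:0")]
# Variables that quantize_model would have added (names are made up).
quant_add_ops = [FakeVariable("conv1/quant/scale:0")]

restore_list = variables_to_restore(trained + quant_add_ops, quant_add_ops)
print([var.name for var in restore_list])
```

With real TensorFlow objects, the filtered list would typically be passed as var_list to tf.train.Saver (tf.compat.v1.train.Saver in TensorFlow 2.x) before calling restore.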

Returns

Returns a list of quantized layers on the network.

The quantize_model API performs fusion on the graph, which might change the output nodes. For example, Conv+BN (or Conv+BiasAdd+BN) is fused into Conv+BiasAdd, so an output node that used to be the BN node becomes the equivalent BiasAdd node.
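The effect of fusion on an output list can be pictured with a small stand-alone sketch. It does not call AMCT: update_outputs, fused_nodes, and the node names are hypothetical, and the code only illustrates how a list such as calib_outputs could be updated in place when a BN output is replaced by a BiasAdd node.

```python
def update_outputs(outputs, fused_nodes):
    """Replace, in place, each output name that fusion has removed.

    fused_nodes maps an original node name to the node that replaces it.
    """
    for index, name in enumerate(outputs):
        outputs[index] = fused_nodes.get(name, name)
    return outputs


# Before fusion, one network output is a BN node (names are made up).
calib_outputs = ["block1/bn", "logits"]
# Conv+BN is fused into Conv+BiasAdd, so the BN node no longer exists.
fused_nodes = {"block1/bn": "block1/BiasAdd"}

update_outputs(calib_outputs, fused_nodes)
print(calib_outputs)
```

Because the list is modified in place, code that holds a reference to calib_outputs sees the new output names after the call without any reassignment.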

Example

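import tensorflow as tf
import amct_tensorflow as amct

# Build the network to be quantized (build_network stands for your own model-building code).
network = build_network()

# Insert the quantization operators into the default graph.
quant_add_ops = amct.quantize_model(
    graph=tf.get_default_graph(),
    config_file="./configs/config.json",
    record_file="./record_scale_offset.txt")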