accuracy_based_auto_calibration

Applicability

Product

Supported

Atlas A3 training series products/Atlas A3 inference series products

Atlas A2 training products/Atlas A2 inference products

Atlas 200I/500 A2 inference product

Atlas inference series products

Atlas training products

Description

Calibrates the input model according to the input configuration file, searches for a quantization configuration that meets the accuracy requirements, and outputs a model that can be used both for accuracy simulation in the TensorFlow environment and for inference deployment on the Ascend AI Processor.

Prototype

accuracy_based_auto_calibration(model_file, outputs, record_file, config_file, save_dir, evaluator, strategy='BinarySearch', sensitivity='CosineSimilarity')

Parameters

Parameter

Input/Output

Description

model_file

Input

Definition file (.pb) of the original (unquantized) TensorFlow model.

A string.

outputs

Input

List of the names of the model's output nodes.

A list of strings.

config_file

Input

Quantization configuration file generated by the user.

A string.

record_file

Input

Path of the quantization factor record file. The existing file (if any) in the path will be overwritten.

A string.

save_dir

Input

Model save path.

Must include the prefix of the model name, for example, ./quantized_model/*model.

A string.

evaluator

Input

Python instance for automatic quantization calibration and accuracy evaluation.

A Python instance.

strategy

Input

Policy for searching for the quantization configuration that meets the accuracy requirements. Binary search is used by default.

A string or a Python instance.

Default: BinarySearch

sensitivity

Input

Metric used to evaluate how sensitive each layer to be quantized is to quantization. By default, the cosine similarity metric is used.

A string or a Python instance.

Default: CosineSimilarity
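
To illustrate how the strategy and sensitivity parameters could interact, here is a minimal sketch in plain Python. It is not AMCT's implementation: the function names, the toy layer outputs, and the assumption that leaving more layers unquantized never lowers accuracy are all hypothetical. It ranks layers by the cosine similarity between their original and quantized outputs, then uses binary search to find the smallest set of most-sensitive layers to leave unquantized.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_sensitivity(original_outputs, quantized_outputs):
    """Order layer names from most to least sensitive (lowest similarity first)."""
    scores = {
        name: cosine_similarity(original_outputs[name], quantized_outputs[name])
        for name in original_outputs
    }
    return sorted(scores, key=scores.get)


def binary_search_rollback(ranked_layers, meets_accuracy):
    """Return the shortest prefix of ranked_layers that, when left unquantized,
    satisfies meets_accuracy. Assumes (hypothetically) that skipping more
    layers never makes accuracy worse."""
    low, high = 0, len(ranked_layers)
    while low < high:
        mid = (low + high) // 2
        if meets_accuracy(ranked_layers[:mid]):
            high = mid      # requirement met: try skipping fewer layers
        else:
            low = mid + 1   # requirement missed: skip more layers
    return ranked_layers[:low]


# Toy per-layer outputs (hypothetical values, not from a real model).
original = {"conv1": [1.0, 2.0, 3.0], "conv2": [1.0, 0.0, 0.0], "fc": [0.5, 0.5, 0.0]}
quantized = {"conv1": [1.0, 2.0, 3.1], "conv2": [0.0, 1.0, 0.0], "fc": [0.5, 0.4, 0.1]}

ranked = rank_by_sensitivity(original, quantized)
# Stand-in for a real accuracy check: pretend accuracy is fine once conv2 is skipped.
skipped = binary_search_rollback(ranked, lambda layers: "conv2" in layers)
print(ranked, skipped)
```

In a real run the accuracy check would be the expensive step (a full evaluation pass), which is why a search that needs only a logarithmic number of checks is attractive.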

Returns

None

Example

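import os

import tensorflow as tf

import amct_tensorflow as amct
from amct_tensorflow.accuracy_based_auto_calibration import accuracy_based_auto_calibration

# args, args_check, PREDICTIONS, RESULT_DIR, and MobileNetV2Evaluator are
# assumed to be defined elsewhere in the full sample script.

def main():
    args_check(args)
    outputs = [PREDICTIONS]
    record_file = os.path.join(RESULT_DIR, 'record.txt')
    config_file = os.path.join(RESULT_DIR, 'config.json')
    # Load the original .pb model into the default graph.
    with tf.io.gfile.GFile(args.model, mode='rb') as model:
        graph_def = tf.compat.v1.GraphDef()
        graph_def.ParseFromString(model.read())
    tf.import_graph_def(graph_def, name='')
    graph = tf.compat.v1.get_default_graph()
    # Generate the quantization configuration file from the graph.
    amct.create_quant_config(config_file, graph)
    # save_dir includes the model name prefix ('MobileNetV2').
    save_dir = os.path.join(RESULT_DIR, 'MobileNetV2')
    evaluator = MobileNetV2Evaluator(args.dataset, args.keyword, args.num_parallel_reads, args.batch_size)
    # Search for a quantization configuration that meets the accuracy requirements.
    accuracy_based_auto_calibration(args.model, outputs, record_file, config_file, save_dir, evaluator)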
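
The example passes a MobileNetV2Evaluator instance without showing its definition. As a rough sketch of what such an evaluator might look like, the class below covers the two roles named in the parameter description (calibration and accuracy evaluation) plus a comparison step. The class name, method names, signatures, and the expected_loss threshold are all assumptions made for illustration; check the AMCT documentation for the actual interface an evaluator must implement.

```python
class ToyEvaluator:
    """Hypothetical evaluator. Method names and signatures are assumptions,
    not the confirmed AMCT interface."""

    def __init__(self, expected_loss=0.5):
        # Largest accuracy drop (in percentage points) still considered acceptable.
        self.expected_loss = expected_loss

    def calibration(self, model_file, outputs):
        """Run forward passes on calibration data so that quantization
        factors can be collected. Left as a placeholder in this sketch."""
        raise NotImplementedError("feed calibration batches through the model here")

    def evaluate(self, model_file, outputs):
        """Return the model's accuracy on the evaluation dataset.
        Left as a placeholder in this sketch."""
        raise NotImplementedError("run the evaluation dataset and compute accuracy here")

    def metric_eval(self, original_metric, new_metric):
        """Compare accuracy before and after quantization.

        Returns (meets_requirement, loss), where loss is the accuracy drop.
        """
        loss = original_metric - new_metric
        return loss <= self.expected_loss, loss


evaluator = ToyEvaluator(expected_loss=0.5)
ok, loss = evaluator.metric_eval(71.8, 71.5)    # small drop: acceptable
bad, big_loss = evaluator.metric_eval(71.8, 70.0)  # large drop: not acceptable
```

Keeping the pass/fail decision in one method makes the accuracy requirement explicit and easy to change without touching the calibration or evaluation code.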

Output files:

  • A .pb model file that can be used for accuracy simulation in the TensorFlow environment or inference on the Ascend AI Processor.
  • A quantization factor record file, a quantization configuration file, a layer similarity result file, and a history file of the layers that were automatically rolled back to unquantized.