accuracy_based_auto_calibration

Applicability

Product

Supported

Atlas A3 training series products/Atlas A3 inference series products

x

Atlas A2 training products/Atlas A2 inference products

x

Atlas 200I/500 A2 inference product

Atlas inference series products

Atlas training products

Description

Automatically calibrates a model according to the input configuration file, searches for a quantization configuration that meets the accuracy requirement, and outputs two models: a fake-quantized model for accuracy simulation in the Caffe environment and a deployable model for inference on the Ascend AI Processor.

Restrictions

None

Prototype

accuracy_based_auto_calibration(model_file,weights_file,model_evaluator,config_file,record_file,save_dir,strategy='BinarySearch',sensitivity='CosineSimilarity')

Parameters

Parameter

Input/Output

Description

model_file

Input

Definition file (.prototxt) of the Caffe model.

A string.

weights_file

Input

Weight file (.caffemodel) of the trained Caffe model.

A string.

model_evaluator

Input

Python instance for automatic quantization calibration and accuracy evaluation.

A Python instance.

config_file

Input

Quantization configuration file generated by the user.

A string.

record_file

Input

Path of the quantization factor record file. The existing file (if any) in the path will be overwritten.

A string.

save_dir

Input

Model save path. Must include the prefix of the model name, for example, ./quantized_model/*model.

A string.

strategy

Input

Strategy for searching for a quantization configuration that meets the accuracy requirement. Binary search is used by default.

A string or a Python instance.

Default: BinarySearch

sensitivity

Input

Metric used to evaluate how sensitive each layer to be quantized is to quantization. Cosine similarity is used by default.

A string or a Python instance.

Default: CosineSimilarity
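
The sensitivity parameter above names cosine similarity as the default metric. The following is a minimal sketch of what such a metric measures, not the AMCT implementation: it compares a layer's original output with its fake-quantized output. The helper names (cosine_similarity, fake_quantize), the symmetric int8 scheme, and the sample values are hypothetical and chosen only for illustration. The sketch assumes that a lower similarity indicates a layer that is more sensitive to quantization.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equally sized flat sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat an all-zero vector as dissimilar
    return dot / (norm_a * norm_b)


def fake_quantize(values, scale):
    """Hypothetical symmetric int8 fake quantization: round, clamp, rescale."""
    return [max(-128, min(127, round(v / scale))) * scale for v in values]


# Illustrative layer output (made-up numbers).
original = [0.12, -0.50, 0.33, 0.90, -0.07]

# A fine quantization step barely changes the output; a coarse one distorts it.
fine = fake_quantize(original, scale=0.01)
coarse = fake_quantize(original, scale=0.5)

sim_fine = cosine_similarity(original, fine)
sim_coarse = cosine_similarity(original, coarse)
print(sim_fine, sim_coarse)
```

Under the assumption stated above, the coarsely quantized output scores lower and would be ranked as more sensitive.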

Returns

None

Example

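import os

import amct_caffe as amct
from amct_caffe.common.auto_calibration import AutoCalibrationEvaluatorBase
from amct_caffe.common.auto_calibration import BinarySearchStrategy
from amct_caffe.common.auto_calibration import CosineSimilaritySensitivity

# args, run_caffe_model, do_benchmark_test, CALIBRATION_BATCH_NUM, TMP, and
# RESULT are not defined in this snippet; they are assumed to be provided by
# the calling script.


class AutoCalibrationEvaluator(AutoCalibrationEvaluatorBase):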
    def __init__(self):
        """
            evaluate_batch_num is the needed batch num for evaluating
            the model. Larger evaluate_batch_num is recommended, because
            the evaluation metric of input model can be more precise
            with larger eval dataset.
        """
        super().__init__()
 
    def calibration(self, model_file, weights_file):
        """
        Function:
            do the calibration with model
        Parameter:
            model_file: the prototxt model define file of caffe model
            weights_file: the binary caffemodel file of caffe model
        """
        run_caffe_model(args, model_file, weights_file, CALIBRATION_BATCH_NUM)
 
    def evaluate(self, model_file, weights_file):
        """
        Function:
            evaluate the model with batch_num of data, return the eval
            metric of the input model, such as top1 for classification
            model, mAP for detection model and so on.
        Parameter:
            model_file: the prototxt model define file of caffe model
            weights_file: the binary caffemodel file of caffe model
        """
        return do_benchmark_test(args, model_file, weights_file, args.iterations)
 
    def metric_eval(self, original_metric, new_metric):
        """
        Function:
            whether the metric of new fake quant model can satisfy the
            requirement
        Parameter:
            original_metric: the metric of non quantized model
            new_metric: the metric of new quantized model
        """
        # the top-1 accuracy loss must be less than 0.2%
        loss = original_metric - new_metric
        if loss * 100 < 0.2:
            return True, loss
        return False, loss
 
# step 1: create the quant config file
config_json_file = './config.json'
skip_layers = []
batch_num = CALIBRATION_BATCH_NUM
activation_offset = True
amct.create_quant_config(config_json_file, args.model_file, args.weights_file,
                         skip_layers, batch_num, activation_offset)

scale_offset_record_file = os.path.join(TMP, 'scale_offset_record.txt')
result_path = os.path.join(RESULT, 'MobileNetV2')
evaluator = AutoCalibrationEvaluator()

# step 2: start the accuracy_based_auto_calibration process
amct.accuracy_based_auto_calibration(
    args.model_file,
    args.weights_file,
    evaluator,
    config_json_file,
    scale_offset_record_file,
    result_path)
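
The threshold in metric_eval is easier to check with concrete numbers. The standalone sketch below repeats the same comparison outside the evaluator class, assuming the metric is a top-1 accuracy expressed as a fraction between 0 and 1. The max_loss_percent parameter and the sample accuracies are hypothetical and are not part of the AMCT interface.

```python
def metric_eval(original_metric, new_metric, max_loss_percent=0.2):
    """Return (accepted, loss), where loss is the drop in the metric."""
    loss = original_metric - new_metric
    # Convert the fractional loss to percentage points before comparing.
    return loss * 100 < max_loss_percent, loss


# A drop from 71.80% to 71.65% is 0.15 percentage points: accepted.
ok, loss = metric_eval(0.7180, 0.7165)

# A drop from 71.80% to 71.50% is 0.30 percentage points: rejected.
rejected, big_loss = metric_eval(0.7180, 0.7150)

print(ok, rejected)
```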

Output files:

  • A fake-quantized model file for accuracy simulation in the Caffe environment and its weight file, with names containing the fake_quant keyword.
  • A deployable model file and its weight file, with names containing the deploy keyword. The model can be deployed on the Ascend AI Processor after being converted by ATC.
  • A quantization factor record file (record_file), which records the weight quantization factors (scale_w and offset_w) of each layer to be quantized.
  • A quantization information file that records the locations of the quantization layers inserted by AMCT and operator fusion information, used for accuracy analysis of the quantized model.
  • A sensitivity file that records how sensitive each layer is to quantization. It is used to decide which layers are left unquantized.
  • An automatic unquantization history file that records which layers were left unquantized.
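
The sensitivity file and the unquantization history file reflect a search over which layers to leave unquantized. The sketch below shows one way a binary search could perform that selection. It illustrates the general technique and is not AMCT's internal algorithm. It assumes that layers are already ranked from most to least sensitive and that leaving more layers unquantized never lowers accuracy. The layer names, the per-layer penalties, and the functions find_min_rollback, toy_evaluate, and toy_meets are all hypothetical.

```python
def find_min_rollback(layers_by_sensitivity, evaluate, meets_requirement):
    """Binary-search the smallest prefix of the ranked layers that must stay
    unquantized for the accuracy requirement to hold.

    Assumes monotonicity: if skipping k layers is enough, skipping more is too.
    """
    lo, hi = 0, len(layers_by_sensitivity)
    while lo < hi:
        mid = (lo + hi) // 2
        skipped = layers_by_sensitivity[:mid]
        if meets_requirement(evaluate(skipped)):
            hi = mid          # requirement met: try skipping fewer layers
        else:
            lo = mid + 1      # requirement not met: skip more layers
    return layers_by_sensitivity[:lo]


# Toy model: each quantized layer costs a fixed amount of top-1 accuracy.
ORIGINAL_TOP1 = 0.718
layers = ["conv1", "fc", "conv3", "conv2"]  # most sensitive first
penalty = {"conv1": 0.004, "fc": 0.002, "conv3": 0.0005, "conv2": 0.0003}


def toy_evaluate(skipped):
    """Accuracy when every layer outside `skipped` is quantized."""
    lost = sum(p for name, p in penalty.items() if name not in skipped)
    return ORIGINAL_TOP1 - lost


def toy_meets(new_metric):
    """Same rule as the example evaluator: loss below 0.2 percentage points."""
    return (ORIGINAL_TOP1 - new_metric) * 100 < 0.2


unquantized = find_min_rollback(layers, toy_evaluate, toy_meets)
print(unquantized)
```

In this toy setup the search evaluates two candidate configurations instead of trying every prefix, which matters when each evaluation is a full accuracy test.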