accuracy_based_auto_calibration
Applicability
| Product | Supported |
|---|---|
| | √ |
| | √ |
| | √ |
| | √ |
| | √ |
Description
Calibrates the input model according to the supplied quantization configuration file, searches for a quantization configuration that meets the accuracy requirements, and outputs a quantized model that can be used for both accuracy simulation in the TensorFlow environment and inference deployment on the Ascend AI Processor.
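
To make the search idea concrete, the sketch below shows one way a binary search over rolled-back layers could work. It is an illustration only, not AMCT's implementation: it assumes the layers have already been ranked from most to least quantization-sensitive, and that leaving more layers unquantized never lowers accuracy. The names `find_min_rollback`, `toy_evaluate`, and `LAYER_COST`, as well as all accuracy figures, are hypothetical.

```python
from typing import Callable, List


def find_min_rollback(
    ranked_layers: List[str],
    evaluate_with_rollback: Callable[[List[str]], float],
    baseline_accuracy: float,
    max_loss: float,
) -> List[str]:
    """Return the shortest prefix of ranked_layers that, when left
    unquantized, keeps accuracy within max_loss of the baseline.

    ranked_layers is ordered from most to least quantization-sensitive.
    Assumes accuracy does not decrease as more layers are rolled back.
    """
    target = baseline_accuracy - max_loss
    lo, hi = 0, len(ranked_layers)
    # Rolling back all layers is assumed to restore the original model,
    # so hi always denotes a count that meets the target.
    while lo < hi:
        mid = (lo + hi) // 2
        if evaluate_with_rollback(ranked_layers[:mid]) >= target:
            hi = mid        # accuracy is acceptable; try rolling back fewer layers
        else:
            lo = mid + 1    # accuracy is too low; roll back more layers
    return ranked_layers[:lo]


# Toy stand-in for a real evaluation: each quantized layer costs some accuracy.
LAYER_COST = {"conv1": 0.030, "conv2": 0.010, "fc": 0.004, "conv3": 0.001}


def toy_evaluate(rolled_back: List[str]) -> float:
    quantized = [name for name in LAYER_COST if name not in rolled_back]
    return 0.720 - sum(LAYER_COST[name] for name in quantized)


ranked = ["conv1", "conv2", "fc", "conv3"]
kept_unquantized = find_min_rollback(
    ranked, toy_evaluate, baseline_accuracy=0.720, max_loss=0.006
)
print(kept_unquantized)
```

Binary search needs only about log2(N) evaluations for N layers, which matters because each evaluation is a full accuracy run over the dataset.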
Prototype
```python
accuracy_based_auto_calibration(model_file, outputs, record_file, config_file, save_dir, evaluator, strategy='BinarySearch', sensitivity='CosineSimilarity')
```
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| model_file | Input | Path of the definition file (.pb) of the original, unquantized TensorFlow model. A string. |
| outputs | Input | Names of the model's output nodes. A list of strings. |
| record_file | Input | Path of the quantization factor record file. If a file already exists at this path, it is overwritten. A string. |
| config_file | Input | Path of the quantization configuration file generated by the user. A string. |
| save_dir | Input | Path where the model is saved. Must include the model name prefix, for example, ./quantized_model/*model. A string. |
| evaluator | Input | Object that performs calibration and accuracy evaluation during automatic quantization. A Python instance. |
| strategy | Input | Strategy for searching for a quantization configuration that meets the accuracy requirements. Binary search is used by default. A string or a Python instance. Default: BinarySearch |
| sensitivity | Input | Metric used to evaluate how sensitive each quantizable layer is to quantization. Cosine similarity is used by default. A string or a Python instance. Default: CosineSimilarity |
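
As an illustration of the default sensitivity metric, the sketch below computes cosine similarity between a layer's original output and its output after simulated quantization, then ranks layers so that the least similar (most sensitive) come first. How AMCT collects and aggregates these outputs is not described here, so the data layout, the function names `cosine_similarity` and `rank_by_sensitivity`, and the sample values are all assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for this sketch: a zero vector counts as dissimilar
    return dot / (norm_a * norm_b)


def rank_by_sensitivity(
    outputs: Dict[str, Tuple[Sequence[float], Sequence[float]]]
) -> List[str]:
    """Order layer names from most to least sensitive (lowest similarity first)."""
    scores = {
        name: cosine_similarity(original, quantized)
        for name, (original, quantized) in outputs.items()
    }
    return sorted(scores, key=lambda name: scores[name])


# Hypothetical flattened layer outputs: (original, after simulated quantization).
layer_outputs = {
    "conv1": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),  # unchanged by quantization
    "conv2": ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),  # orthogonal, heavily distorted
    "fc": ([1.0, 1.0], [1.0, 0.0]),               # 45 degrees apart
}
ranking = rank_by_sensitivity(layer_outputs)
print(ranking)
```

A ranking like this gives the search strategy an order in which to try leaving layers unquantized.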
Returns
None
Example
```python
import os

import tensorflow as tf

import amct_tensorflow as amct
from amct_tensorflow.accuracy_based_auto_calibration import accuracy_based_auto_calibration

# args, args_check, PREDICTIONS, RESULT_DIR, and MobileNetV2Evaluator are not
# shown in this excerpt; they are assumed to be defined by the surrounding script.


def main():
    args_check(args)
    outputs = [PREDICTIONS]
    record_file = os.path.join(RESULT_DIR, 'record.txt')
    config_file = os.path.join(RESULT_DIR, 'config.json')

    # Load the original model and import it into the default graph.
    with tf.io.gfile.GFile(args.model, mode='rb') as model:
        graph_def = tf.compat.v1.GraphDef()
        graph_def.ParseFromString(model.read())
    tf.import_graph_def(graph_def, name='')
    graph = tf.compat.v1.get_default_graph()

    # Generate the quantization configuration file.
    amct.create_quant_config(config_file, graph)

    # save_dir includes the model name prefix.
    save_dir = os.path.join(RESULT_DIR, 'MobileNetV2')
    evaluator = MobileNetV2Evaluator(args.dataset, args.keyword, args.num_parallel_reads, args.batch_size)
    accuracy_based_auto_calibration(args.model, outputs, record_file, config_file, save_dir, evaluator)
```
Output files:
- A .pb model file that can be used for accuracy simulation in the TensorFlow environment or inference on the Ascend AI Processor.
- A quantization factor record file, a quantization configuration file, a layer similarity result file, and a history file recording the layers that were automatically rolled back to unquantized.
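
The example above passes a MobileNetV2Evaluator instance whose definition is not shown. As a rough sketch of what such an evaluator might provide, the plain-Python class below runs calibration, measures accuracy, and decides whether the accuracy loss is acceptable. The method names `calibration`, `evaluate`, and `metric_eval`, their signatures, and the `ThresholdEvaluator` class itself are assumptions made for illustration; check them against the AMCT evaluator base-class documentation before relying on them. The class does not import AMCT or TensorFlow, and inference is replaced by a stub.

```python
from typing import Callable, List, Tuple


class ThresholdEvaluator:
    """Hypothetical evaluator: runs calibration, measures accuracy, and
    decides whether a quantized model is close enough to the original."""

    def __init__(
        self,
        run_inference: Callable[[object, List[str], int], float],
        calibration_batches: int = 1,
        max_loss: float = 0.01,
    ) -> None:
        self.run_inference = run_inference  # user-supplied inference routine
        self.calibration_batches = calibration_batches
        self.max_loss = max_loss            # tolerated accuracy drop

    def calibration(self, graph: object, outputs: List[str]) -> None:
        """Feed calibration batches through the graph; the result is discarded."""
        self.run_inference(graph, outputs, self.calibration_batches)

    def evaluate(self, graph: object, outputs: List[str]) -> float:
        """Return the accuracy of the graph on the evaluation set."""
        # -1 means "use the whole dataset" (a convention of this sketch only).
        return self.run_inference(graph, outputs, -1)

    def metric_eval(self, original_metric: float, new_metric: float) -> Tuple[bool, float]:
        """Return (whether the requirement is met, accuracy loss)."""
        loss = original_metric - new_metric
        return loss <= self.max_loss, loss


# Usage with a stub inference routine that always reports 71% accuracy.
evaluator = ThresholdEvaluator(lambda graph, outputs, batches: 0.71, max_loss=0.01)
evaluator.calibration(graph=None, outputs=["predictions"])
accuracy = evaluator.evaluate(graph=None, outputs=["predictions"])
ok, loss = evaluator.metric_eval(original_metric=0.715, new_metric=accuracy)
print(ok, round(loss, 3))
```

Keeping the pass/fail decision inside the evaluator lets the accuracy requirement be defined by whatever metric suits the model, such as top-1 accuracy or mAP.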