perf_based_auto_calibration

Product Support

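Product	Supported
Atlas 350 accelerator card	x
Atlas A3 training series products/Atlas A3 inference series products	x
Atlas A2 training series products/Atlas A2 inference series products	x
Atlas 200I/500 A2 inference products	x
Atlas inference series products	x
Atlas training series products	x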

Function Description

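Takes the user's original unquantized model, the quantized model, and a performance sampling configuration file, runs performance tests on the board, searches for a mixed-precision quantized model with better performance, and outputs that model.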
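
This page does not describe the search procedure itself. As a rough illustration only, the general idea of performance-based rollback can be sketched as follows: compare the measured latency of each layer in its unquantized and quantized form, and keep quantization only where it is actually faster. The function `choose_layer_precision`, the layer names, and the latency figures are all hypothetical, and this is not the BatchRollBack implementation.

```python
def choose_layer_precision(original_ms, quantized_ms):
    """Toy rollback rule (hypothetical, not AMCT's algorithm).

    original_ms and quantized_ms map layer name -> measured latency in ms.
    A layer stays quantized only if quantization made it strictly faster;
    otherwise it is rolled back to its unquantized form.
    """
    decision = {}
    for layer, base in original_ms.items():
        # A layer with no quantized measurement is treated as not faster.
        quant = quantized_ms.get(layer, base)
        decision[layer] = "quantized" if quant < base else "rollback"
    return decision


# Made-up latency figures for three made-up layers.
original = {"conv1": 1.20, "conv2": 0.80, "fc": 0.30}
quantized = {"conv1": 0.70, "conv2": 0.95, "fc": 0.30}
plan = choose_layer_precision(original, quantized)
print(plan)
```

In this toy case only `conv1` benefits from quantization, so the resulting model would be mixed precision: one quantized layer and two rolled-back layers.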

Function Prototype

perf_based_auto_calibration(original_model_file, quantize_model_file, sampler_config_file, save_dir, strategy='BatchRollBack')

Parameters

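Parameter	Input/Output	Description
original_model_file	Input	Meaning: definition file of the user's unquantized ONNX model, in .onnx format. Data type: string
quantize_model_file	Input	Meaning: definition file of the user's quantized ONNX model, in .onnx format. Data type: string
sampler_config_file	Input	Meaning: performance sampling configuration file provided by the user. For details about this file and a configuration example, see "Performance Sampling Configuration File". Data type: string
save_dir	Input	Meaning: path where the model is saved after performance-based quantization rollback. Data type: string
strategy	Input	Meaning: strategy used to search for a quantization configuration that meets the performance requirement. Data type: string or Python instance (PerfStrategyBase). Default: BatchRollBack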
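
As a minimal sketch of how the parameters in the table fit together, the helper below checks the input paths and passes `strategy` explicitly by its documented default name. `build_call_args` and `run` are hypothetical helpers, not part of AMCT, and the sketch assumes the function accepts the prototype's parameter names as keyword arguments.

```python
import os


def build_call_args(original_model_file, quantize_model_file,
                    sampler_config_file, save_dir,
                    strategy="BatchRollBack"):
    """Check the inputs and return keyword arguments (hypothetical helper)."""
    # Both model files are documented as .onnx files.
    for path in (original_model_file, quantize_model_file):
        if not path.endswith(".onnx"):
            raise ValueError(f"expected an .onnx file, got: {path}")
    # All three input files must exist before the search starts.
    for path in (original_model_file, quantize_model_file,
                 sampler_config_file):
        if not os.path.isfile(path):
            raise FileNotFoundError(path)
    return {
        "original_model_file": original_model_file,
        "quantize_model_file": quantize_model_file,
        "sampler_config_file": sampler_config_file,
        "save_dir": save_dir,
        "strategy": strategy,
    }


def run(**paths):
    # Imported lazily so build_call_args can be used without AMCT installed.
    import amct_onnx as amct
    amct.perf_based_auto_calibration(**build_call_args(**paths))
```

Validating the paths up front avoids starting an on-board performance test only to fail on a mistyped file name.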

Return Value

None

Example

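import argparse
import amct_onnx as amct

def main():
    # Read the four required paths from the command line.
    parser = argparse.ArgumentParser()
    for name in ('model_file_name', 'quant_model_file_name',
                 'sampler_config_file', 'path_dir'):
        parser.add_argument('--' + name, required=True)
    args = parser.parse_args()

    # Search for a better-performing mixed-precision model (default strategy).
    amct.perf_based_auto_calibration(
        args.model_file_name, args.quant_model_file_name,
        args.sampler_config_file, args.path_dir)

if __name__ == '__main__':
    main()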

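Output files:

Accuracy simulation model file: the model name contains fake_quant; the model can be run in ONNX Runtime for accuracy simulation.
Deployment model file: the model name contains deploy; after conversion with the ATC tool, the model can be deployed to the AI processor.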
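
The naming convention above makes it straightforward to locate the two kinds of output after a run. The sketch below groups the files in a directory by the `fake_quant` and `deploy` markers. `find_output_models` is a hypothetical helper, and it assumes the models are written directly into `save_dir` rather than into a subdirectory.

```python
from pathlib import Path


def find_output_models(save_dir):
    """Group files in save_dir by output kind (hypothetical helper)."""
    found = {"fake_quant": [], "deploy": []}
    # Assumption: output models sit directly in save_dir.
    for path in sorted(Path(save_dir).iterdir()):
        if not path.is_file():
            continue
        for kind in found:
            # The kind is identified by a marker in the file name.
            if kind in path.name:
                found[kind].append(path.name)
    return found
```

A `fake_quant` file found this way could then be loaded with `onnxruntime.InferenceSession` for accuracy simulation, while a `deploy` file would be passed to ATC for conversion.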
