--quant_dumpable

Description

Collects the dump data of the quantization operator.

For details, see "Accuracy Improvement Suggestions for Model Inference" in the CANN AscendCL Application Software Development Guide (C&C++). During accuracy analysis of a model quantized by AMCT, the inputs and outputs of quantization operators may be optimized during graph compilation when the model is converted to an offline OM model, which prevents the dump data of those operators from being exported. For example, for two consecutive quantized convolutions, the intermediate output may be fused into a single quantized int8 output.

To solve this problem, the --quant_dumpable option is introduced. When it is enabled, the inputs and outputs of quantization operators are not fused, and TransData operators are inserted to restore the original model format. In this way, the dump data of the quantization operators can be collected.

See Also

None

Argument

  • 0: The inputs and outputs of quantization operators may be optimized during graph compilation, so their dump data cannot be obtained. This is the default value.
  • 1: The inputs and outputs of quantization operators are preserved, so their dump data can be collected.

Suggestions and Benefits

If data dump is enabled, you are advised to set this option to 1 so that the dump data of the quantization operators can be collected.

Example

--quant_dumpable=1
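A fuller invocation might look like the following sketch. The model file, output name, framework ID, and soc_version value are illustrative placeholders, not values from this page; substitute your own.

```shell
# Hypothetical ATC conversion of an AMCT-quantized TensorFlow model,
# keeping quantization operators dumpable (all paths/values are examples).
atc --model=resnet50_quant.pb \
    --framework=3 \
    --output=resnet50_quant \
    --soc_version=Ascend310 \
    --quant_dumpable=1
```

With --quant_dumpable=1 set at conversion time, a subsequent inference run with data dump enabled can export the inputs and outputs of the quantization operators.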

Applicability

Atlas 200/300/500 Inference Product

Atlas Training Series Product