--quant_dumpable

Applicability

Product

Supported

Atlas A3 training products / Atlas A3 inference products

Atlas A2 training products / Atlas A2 inference products

Atlas 200I/500 A2 inference products

Atlas inference products

Atlas training products

Description

Specifies whether the dump data of quantization operators can be collected.

When you troubleshoot the accuracy of a model quantized with AMCT, the inputs and outputs of the quantization operators may be optimized during graph build when the model is converted to an OM offline model, which affects the export of dump data for those operators. For example, when two quantized convolutions are computed in sequence, the intermediate output is optimized into a quantized int8 output. For details, see "Accuracy Improvement Suggestions for Model Inference" in Application Development Guide (C&C++).

The --quant_dumpable option addresses this problem. When it is enabled, the inputs and outputs of the quantization operators are not fused, and a transdata operator is inserted to restore the original model format, so that the dump data of the quantization operators can be collected.

See Also

None

Argument

  • 0 (default): The inputs and outputs of the quantization operators may be optimized during graph build. In this case, the dump data of the quantization operators cannot be obtained.
  • 1: The inputs and outputs of the quantization operators are not fused during graph build, so the dump data of the quantization operators can be collected.

Suggestions and Benefits

If data dump is enabled, you are advised to set this option to 1 to ensure that the dump data of the quantization operator can be collected.

Example

--quant_dumpable=1
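
The option is normally passed together with the other model conversion options. The following is a minimal sketch of a full command, not a definitive invocation: the model file name, output name, framework value (5, assumed here to mean an ONNX model), and SoC version are all placeholder assumptions and must be replaced with values that match your own model and device.

```shell
#!/bin/sh
# Hypothetical inputs: replace with your own AMCT-quantized model and target.
MODEL="resnet50_quant.onnx"   # assumed file name of the quantized model
OUTPUT="resnet50_quant"       # assumed output name for the OM offline model
SOC_VERSION="Ascend310P3"     # assumed target; use the value for your device

# Assemble the conversion command with quantization-operator dump enabled.
# The values above contain no spaces, so plain word splitting is safe here.
ATC_CMD="atc --model=${MODEL} --framework=5 --output=${OUTPUT} --soc_version=${SOC_VERSION} --quant_dumpable=1"

# Print the command so it can be reviewed before it is run.
echo "${ATC_CMD}"

# Run the conversion only if atc is installed and the model file exists.
if command -v atc >/dev/null 2>&1 && [ -f "${MODEL}" ]; then
    ${ATC_CMD}
fi
```

Printing the command first makes it easy to confirm that --quant_dumpable=1 is present before the conversion starts.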