Overview

  • For how to read the results of this scenario, see Comparison Result Description.
  • This scenario applies only to comparisons between processors of the same type.
  • Ascend NPU vs. Ascend NPU: only two non-quantized offline models, or two quantized offline models, can be compared with each other.
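The pairing rule in the last bullet can be expressed as a small pre-check. The following is a minimal sketch and not part of the comparison tool: the DumpSet type and the check_comparable function are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DumpSet:
    """Hypothetical descriptor of one set of dump data (not a tool API)."""
    label: str       # for example "before iteration" or "Ascend NPU A"
    quantized: bool  # True if the dump came from a quantized offline model


def check_comparable(benchmark: DumpSet, target: DumpSet) -> None:
    """Raise ValueError unless both dump sets come from the same kind of model."""
    if benchmark.quantized != target.quantized:
        raise ValueError(
            f"cannot compare {benchmark.label!r} with {target.label!r}: "
            "one model is quantized and the other is not"
        )


# Two non-quantized models: accepted, no exception is raised.
check_comparable(
    DumpSet("before iteration", quantized=False),
    DumpSet("after iteration", quantized=False),
)
```

Running such a check before collecting dump data avoids preparing a pair of files that the scenario does not support.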

Accuracy Comparison Before and After Version Iteration

Use this comparison to check whether the accuracy of an offline model generated through ATC-based conversion decreases as a result of a CANN version iteration, a model version iteration, or model tuning when the model runs on the Ascend AI Processor. The comparison is performed between two non-quantized models or between two quantized models. Prepare the input data as follows:

Table 1 Input data requirements for comparison between non-quantized models

| File | Description | How to Obtain |
| --- | --- | --- |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (before version iteration) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (after version iteration) | Data to be compared | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |

Table 2 Input data requirements for comparison between quantized models

| File | Description | How to Obtain |
| --- | --- | --- |
| Dump data file of the quantized offline model running on the Ascend AI Processor (before version iteration) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
| Dump data file of the quantized offline model running on the Ascend AI Processor (after version iteration) | Data to be compared | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
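To make the two roles in the tables concrete: the benchmark data is the reference, and the data to be compared is measured against it operator by operator. The sketch below illustrates the idea with two generic measures, cosine similarity and maximum absolute error, applied to flat lists of floats. It is an assumption made for illustration only. The metrics that the tool actually reports are listed in Comparison Result Description, and parsing of dump files is not shown.

```python
import math
from typing import Sequence


def cosine_similarity(benchmark: Sequence[float], target: Sequence[float]) -> float:
    """Cosine of the angle between two tensors flattened to 1-D."""
    if len(benchmark) != len(target):
        raise ValueError("tensors must have the same number of elements")
    dot = sum(b * t for b, t in zip(benchmark, target))
    norm_b = math.sqrt(sum(b * b for b in benchmark))
    norm_t = math.sqrt(sum(t * t for t in target))
    if norm_b == 0.0 or norm_t == 0.0:
        # Convention chosen for this sketch: two all-zero tensors count as
        # identical; a zero tensor against a non-zero one counts as dissimilar.
        return 1.0 if norm_b == norm_t else 0.0
    return dot / (norm_b * norm_t)


def max_absolute_error(benchmark: Sequence[float], target: Sequence[float]) -> float:
    """Largest element-wise absolute difference between the two tensors."""
    if len(benchmark) != len(target):
        raise ValueError("tensors must have the same number of elements")
    return max((abs(b - t) for b, t in zip(benchmark, target)), default=0.0)


# Hypothetical output values of one operator before and after a version iteration.
before = [0.10, 0.20, 0.30]
after = [0.10, 0.21, 0.30]
print(cosine_similarity(before, after), max_absolute_error(before, after))
```

A cosine similarity close to 1 together with a small maximum absolute error suggests that the operator's output did not change materially between the two versions.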

Accuracy Comparison in Scenarios with Inference Processor Switching of a Model

If the inference processor of an offline model is switched, you can compare the dump data of the offline model before the switching with that after the switching to check whether the accuracy decreases.

Perform the following steps to check whether there is a decrease in accuracy:

  1. Obtain the original model and perform ATC-based model conversion again.

    For details, see Offline Model File Preparation. You must specify --soc_version and skip step 3 of that procedure, so that operator fusion is disabled for both models and all operators can be compared.

  2. Obtain the dump data of the offline model.

    For details, see Table 3 or Table 4.

  3. Perform accuracy comparison.

    For details, see Comparison Operation and Analysis. Do not specify the -f and -cf options. Otherwise, the comparison cannot be performed.
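Steps 1 and 3 can be sketched as command lines assembled in Python. This is a hedged illustration, not the documented procedure. The script name msaccucmp.py, its compare subcommand, and the -m, -g, and -out options are assumptions about the tool described in Comparison Operation and Analysis and should be verified there. The ATC options other than --soc_version, all paths, the framework value, and the soc_version placeholders are likewise assumptions, and the helper names are hypothetical. The settings that disable operator fusion are described in Offline Model File Preparation and are not shown.

```python
from typing import List


def build_atc_cmd(model: str, output: str, soc_version: str, framework: int) -> List[str]:
    """Step 1: one ATC conversion per target processor, with --soc_version set."""
    return [
        "atc",
        f"--model={model}",
        f"--framework={framework}",      # assumed framework ID of the original model
        f"--output={output}",
        f"--soc_version={soc_version}",  # must match the processor the model will run on
    ]


def build_compare_cmd(my_dump: str, golden_dump: str, out_dir: str) -> List[str]:
    """Step 3: compare the two sets of dump data; -f and -cf are deliberately omitted."""
    return [
        "python3", "msaccucmp.py", "compare",
        "-m", my_dump,      # data to be compared (Ascend NPU B)
        "-g", golden_dump,  # benchmark data (Ascend NPU A)
        "-out", out_dir,
    ]


# Placeholder values; replace them with the real model, processors, and paths.
atc_cmd_a = build_atc_cmd("model.onnx", "model_npu_a", "<soc_version_of_npu_a>", 5)
atc_cmd_b = build_atc_cmd("model.onnx", "model_npu_b", "<soc_version_of_npu_b>", 5)
compare_cmd = build_compare_cmd("dump_npu_b", "dump_npu_a", "result")

for cmd in (atc_cmd_a, atc_cmd_b, compare_cmd):
    print(" ".join(cmd))
```

Building the commands as argument lists keeps them ready to pass to subprocess.run once the placeholders are filled in, and makes it easy to confirm that neither -f nor -cf has slipped into the comparison command.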

Table 3 Input data requirements for comparison between non-quantized models

| File | Description | How to Obtain |
| --- | --- | --- |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (Ascend NPU A) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (Ascend NPU B) | Data to be compared | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |

Table 4 Input data requirements for comparison between quantized models

| File | Description | How to Obtain |
| --- | --- | --- |
| Dump data file of the quantized offline model running on the Ascend AI Processor (Ascend NPU A) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
| Dump data file of the quantized offline model running on the Ascend AI Processor (Ascend NPU B) | Data to be compared | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |

Accuracy Comparison Based on Model Conversion with Operator Fusion Enabled and Disabled

Operator fusion is generally enabled by default during offline model conversion. To check the accuracy of fused operators, convert the model twice, once with operator fusion enabled and once with it disabled, dump the data from each resulting model, and then compare the two sets of dump data.
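The two conversions can be sketched as follows. This section does not say how fusion is turned off, so the sketch assumes the ATC option --fusion_switch_file with a JSON configuration that sets all graph fusion and UB fusion rules to off. Verify both the option and the file content against Offline Model File Preparation. The file names, the soc_version placeholder, the framework value, and the helper names are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Optional

# Assumed content of a fusion switch file that turns every fusion rule off.
FUSION_OFF = {"Switch": {"GraphFusion": {"ALL": "off"}, "UBFusion": {"ALL": "off"}}}


def write_fusion_switch(path: Path) -> Path:
    """Write the assumed fusion switch configuration and return its path."""
    path.write_text(json.dumps(FUSION_OFF, indent=4), encoding="utf-8")
    return path


def build_conversion_cmd(
    model: str,
    output: str,
    soc_version: str,
    framework: int,
    fusion_switch: Optional[Path] = None,
) -> List[str]:
    """Build an ATC command; pass fusion_switch to disable operator fusion."""
    cmd = [
        "atc",
        f"--model={model}",
        f"--framework={framework}",
        f"--output={output}",
        f"--soc_version={soc_version}",
    ]
    if fusion_switch is not None:
        cmd.append(f"--fusion_switch_file={fusion_switch}")
    return cmd


with tempfile.TemporaryDirectory() as tmp:
    cfg = write_fusion_switch(Path(tmp) / "fusion_switch.cfg")
    written = json.loads(cfg.read_text(encoding="utf-8"))
    # Fusion enabled (default): data to be compared.
    cmd_fusion_on = build_conversion_cmd("model.onnx", "model_fusion_on", "<soc_version>", 5)
    # Fusion disabled: benchmark data.
    cmd_fusion_off = build_conversion_cmd("model.onnx", "model_fusion_off", "<soc_version>", 5, cfg)
    print(" ".join(cmd_fusion_on))
    print(" ".join(cmd_fusion_off))
```

Keeping the two commands identical except for the fusion switch means that any accuracy difference found in the comparison can be attributed to operator fusion.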

The comparison is performed between two non-quantized models or between two quantized models. Prepare the input data as follows:

Table 5 Input data requirements for comparison between non-quantized models

| File | Description | How to Obtain |
| --- | --- | --- |
| Non-quantized offline model file (.om) (with operator fusion disabled) | Model files | Offline Model File Preparation |
| Non-quantized offline model file (.om) (with operator fusion enabled) | Model files | Offline Model File Preparation |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (with operator fusion disabled) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (with operator fusion enabled) | Data to be compared | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |

Table 6 Input data requirements for comparison between quantized models

| File | Description | How to Obtain |
| --- | --- | --- |
| Quantized offline model file (.om) (with operator fusion disabled) | Model files | Offline Model File Preparation |
| Quantized offline model file (.om) (with operator fusion enabled) | Model files | Offline Model File Preparation |
| Dump data file of the quantized offline model running on the Ascend AI Processor (with operator fusion disabled) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
| Dump data file of the quantized offline model running on the Ascend AI Processor (with operator fusion enabled) | Data to be compared | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |