Overview

  • For how to interpret the results of this scenario, see Comparison Result Description.
  • This scenario applies only to comparisons between processors of the same type.
  • NPU vs. NPU: Only accuracy comparisons between two non-quantized offline models and between two quantized offline models are supported.

This scenario includes the following sub-scenarios:

  • Pre- and post-iteration accuracy comparison: Determines whether accuracy issues appear after a CANN software update, a model version iteration, or a model optimization, by comparing two sets of accuracy data.
  • Accuracy comparison for operator fusion: By default, the ATC tool enables operator fusion during offline model conversion. To evaluate the accuracy impact of this optimization, data generated with fusion enabled is compared against data generated with fusion disabled.

Accuracy Comparison Before and After Version Iteration

Use this sub-scenario to check whether the accuracy of an offline model generated through ATC-based conversion decreases on the Ascend AI Processor after a CANN version iteration, a model version iteration, or model tuning. The comparison is performed either between two non-quantized models or between two quantized models. Prepare the comparison data files as follows:

Table 1 Requirements for the non-quantized vs. non-quantized comparison data files

| File | Description | How to Obtain |
|---|---|---|
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (before version iteration) | Benchmark data | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (after version iteration) | Data to be compared | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |

Table 2 Requirements for the quantized vs. quantized comparison data files

| File | Description | How to Obtain |
|---|---|---|
| Dump data file of the quantized offline model running on the Ascend AI Processor (before version iteration) | Benchmark data | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
| Dump data file of the quantized offline model running on the Ascend AI Processor (after version iteration) | Data to be compared | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
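
To make concrete what comparing benchmark data with data to be compared can mean, the following minimal sketch computes two common per-tensor metrics on flattened tensors. It is an illustration only: the function names are hypothetical, the choice of cosine similarity and maximum absolute error is an assumption, and the metrics actually reported by the comparison tool are the ones listed in Comparison Result Description.

```python
import math
from typing import Sequence


def cosine_similarity(benchmark: Sequence[float], candidate: Sequence[float]) -> float:
    """Cosine of the angle between two flattened tensors (1.0 means same direction)."""
    if len(benchmark) != len(candidate):
        raise ValueError("tensors must have the same number of elements")
    dot = sum(b * c for b, c in zip(benchmark, candidate))
    norm_b = math.sqrt(sum(b * b for b in benchmark))
    norm_c = math.sqrt(sum(c * c for c in candidate))
    if norm_b == 0.0 or norm_c == 0.0:
        # Convention chosen for this sketch: two all-zero tensors count as identical.
        return 1.0 if norm_b == norm_c else 0.0
    return dot / (norm_b * norm_c)


def max_absolute_error(benchmark: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flattened tensors."""
    if len(benchmark) != len(candidate):
        raise ValueError("tensors must have the same number of elements")
    return max((abs(b - c) for b, c in zip(benchmark, candidate)), default=0.0)


if __name__ == "__main__":
    # Hypothetical output values of one operator before and after a version iteration.
    before = [0.10, 0.20, 0.70]
    after = [0.10, 0.25, 0.65]
    print("cosine similarity:", cosine_similarity(before, after))
    print("max absolute error:", max_absolute_error(before, after))
```

A cosine similarity close to 1 together with a small maximum absolute error suggests that the iteration did not noticeably change the output of that operator; what counts as acceptable depends on the model and is not defined here.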

Accuracy Comparison Based on Model Conversion with Operator Fusion Enabled and Disabled

The ATC tool enables operator fusion by default when it converts a model to an offline model. To evaluate the accuracy impact of this optimization, obtain the following data:

  • Convert the model with the ATC tool with operator fusion enabled. Then, dump the accuracy data of the resulting offline model.
  • Convert the model with the ATC tool with operator fusion disabled. Then, dump the accuracy data of the resulting offline model.

Compare the accuracy of the two offline models.
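
The two conversions above could be scripted roughly as follows. This is a minimal sketch, not the documented procedure: it assumes that ATC accepts a --fusion_switch_file option pointing at a JSON configuration, and that setting "ALL" to "off" under GraphFusion and UBFusion disables fusion. Confirm both against the ATC reference for your CANN version. The file names, the framework value, and the <soc_version> placeholder are hypothetical. The sketch only builds and prints the command lines; it does not run ATC.

```python
import json
import shlex
from pathlib import Path
from typing import List, Optional

# Assumed layout of a fusion switch configuration that turns all fusion rules off.
FUSION_OFF_CONFIG = {
    "Switch": {
        "GraphFusion": {"ALL": "off"},
        "UBFusion": {"ALL": "off"},
    }
}


def write_fusion_off_config(path: Path) -> Path:
    """Write the assumed fusion-off configuration to `path` and return it."""
    path.write_text(json.dumps(FUSION_OFF_CONFIG, indent=4), encoding="utf-8")
    return path


def build_atc_command(model: str, framework: int, output: str, soc_version: str,
                      fusion_switch_file: Optional[Path] = None) -> List[str]:
    """Build an ATC command line; fusion keeps its default unless a switch file is given."""
    cmd = [
        "atc",
        f"--model={model}",
        f"--framework={framework}",
        f"--output={output}",
        f"--soc_version={soc_version}",
    ]
    if fusion_switch_file is not None:
        cmd.append(f"--fusion_switch_file={fusion_switch_file}")
    return cmd


if __name__ == "__main__":
    # Hypothetical inputs; framework=5 is assumed to denote an ONNX model.
    cfg = write_fusion_off_config(Path("fusion_off.cfg"))
    fusion_on = build_atc_command("model.onnx", 5, "model_fusion_on", "<soc_version>")
    fusion_off = build_atc_command("model.onnx", 5, "model_fusion_off", "<soc_version>", cfg)
    print(shlex.join(fusion_on))
    print(shlex.join(fusion_off))
```

Keeping every other option identical between the two commands means that any accuracy difference between the resulting offline models can be attributed to operator fusion.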

The comparison is performed either between two non-quantized models (operator fusion enabled vs. disabled) or between two quantized models (operator fusion enabled vs. disabled). Prepare the comparison data files as follows:

Table 3 Requirements for the non-quantized (operator fusion enabled) vs. non-quantized (operator fusion disabled) comparison data files

| File | Description | How to Obtain |
|---|---|---|
| Non-quantized offline model file (.om) (with operator fusion disabled) | Model files | Preparing an Offline Model File |
| Non-quantized offline model file (.om) (with operator fusion enabled) | Model files | Preparing an Offline Model File |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (with operator fusion disabled) | Benchmark data | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (with operator fusion enabled) | Data to be compared | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |

Table 4 Requirements for the quantized (operator fusion enabled) vs. quantized (operator fusion disabled) comparison data files

| File | Description | How to Obtain |
|---|---|---|
| Quantized offline model file (.om) (with operator fusion disabled) | Model files | Preparing an Offline Model File |
| Quantized offline model file (.om) (with operator fusion enabled) | Model files | Preparing an Offline Model File |
| Dump data file of the quantized offline model running on the Ascend AI Processor (with operator fusion disabled) | Benchmark data | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
| Dump data file of the quantized offline model running on the Ascend AI Processor (with operator fusion enabled) | Data to be compared | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see Preparing Dump Data of an Offline Model. |
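
Before starting a fusion comparison, it can help to verify that the inputs match the requirements above: both models share the same quantization status, the benchmark side has operator fusion disabled, the side to be compared has it enabled, and each side has an .om file and dump data. The following is a hypothetical pre-flight check, not part of any CANN tool; the class, function, and path names are invented for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class ModelRun:
    """One side of the comparison: an .om file plus the dump data it produced."""
    om_file: Path
    dump_dir: Path
    quantized: bool
    fusion_enabled: bool


def preflight_problems(benchmark: ModelRun, candidate: ModelRun,
                       check_files: bool = True) -> List[str]:
    """Return a list of problems; an empty list means the inputs look consistent."""
    problems: List[str] = []
    if benchmark.quantized != candidate.quantized:
        problems.append("quantized and non-quantized models cannot be compared with each other")
    if benchmark.fusion_enabled:
        problems.append("benchmark data should come from the model with operator fusion disabled")
    if not candidate.fusion_enabled:
        problems.append("data to be compared should come from the model with operator fusion enabled")
    for role, run in (("benchmark", benchmark), ("data to be compared", candidate)):
        if run.om_file.suffix != ".om":
            problems.append(f"{role}: {run.om_file} is not an .om file")
        if check_files and not run.om_file.is_file():
            problems.append(f"{role}: model file {run.om_file} not found")
        if check_files and not run.dump_dir.is_dir():
            problems.append(f"{role}: dump directory {run.dump_dir} not found")
    return problems


if __name__ == "__main__":
    # Hypothetical paths for a non-quantized fusion comparison.
    off = ModelRun(Path("model_fusion_off.om"), Path("dump_fusion_off"), False, False)
    on = ModelRun(Path("model_fusion_on.om"), Path("dump_fusion_on"), False, True)
    for problem in preflight_problems(off, on):
        print("problem:", problem)
```

Returning a list of problems rather than raising on the first one lets a script report every missing input in a single pass.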