Overview
- Check the Comparison Result Description for this scenario.
- This scenario applies only to comparisons between processors of the same type.
- Ascend NPU vs. Ascend NPU: Only accuracy comparisons between two non-quantized offline models or between two quantized offline models are supported.
Accuracy Comparison Before and After Version Iteration
Use this scenario to check whether a CANN version iteration, a model version iteration, or model tuning has decreased the accuracy of an offline model that was generated through ATC-based conversion and runs on the Ascend AI Processor. The comparison is performed between two non-quantized models or between two quantized models. Prepare the input data as follows:
| File | Description | How to Obtain |
|---|---|---|
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (before version iteration) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see the following: |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (after version iteration) | Data to be compared | Same as for the benchmark data. |

| File | Description | How to Obtain |
|---|---|---|
| Dump data file of the quantized offline model running on the Ascend AI Processor (before version iteration) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see the following: |
| Dump data file of the quantized offline model running on the Ascend AI Processor (after version iteration) | Data to be compared | Same as for the benchmark data. |
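To make the idea of "a decrease in accuracy" concrete, the following is a minimal sketch of a per-operator comparison between benchmark data and data to be compared. It is not the comparison tool itself. It assumes that the dump data has already been converted into flat lists of floats keyed by operator name, and the function names, operator names, and the 0.99 cosine-similarity threshold are all hypothetical choices made for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two flattened tensors of equal length."""
    if len(a) != len(b):
        raise ValueError("tensors must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Two all-zero tensors count as identical; one all-zero tensor does not.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)


def max_absolute_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two tensors."""
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def find_regressions(
    benchmark: Dict[str, Sequence[float]],
    compared: Dict[str, Sequence[float]],
    min_cosine: float = 0.99,  # hypothetical threshold, tune per model
) -> List[str]:
    """Return names of operators whose output drifted below min_cosine."""
    flagged = []
    for name, golden in benchmark.items():
        if name not in compared:
            continue  # operator has no counterpart, so it cannot be compared
        if cosine_similarity(golden, compared[name]) < min_cosine:
            flagged.append(name)
    return flagged


# Hypothetical operator outputs before and after a version iteration.
before = {"conv1": [1.0, 2.0, 3.0], "relu1": [0.0, 1.0, 0.5]}
after = {"conv1": [1.0, 2.0, 3.0], "relu1": [1.0, 0.0, 0.5]}
regressed = find_regressions(before, after)
print(regressed)  # prints ['relu1']
```

Only operators present on both sides are compared here, which mirrors why the two models in a comparison need matching structure.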
Accuracy Comparison in Scenarios with Inference Processor Switching of a Model
If the inference processor of an offline model is switched, you can compare the dump data of the offline model before and after the switch to check whether the accuracy decreases.
Perform the following steps to check whether there is a decrease in accuracy:
- Obtain the original model and perform ATC-based model conversion again.
  For details, see Offline Model File Preparation. You must specify --soc_version and skip step 3 of that procedure, so that operator fusion is disabled for both models and all operators can be compared.
- Obtain the dump data of the offline model.
- Perform accuracy comparison.
  For details, see Comparison Operation and Analysis. Do not specify the -f or -cf option. Otherwise, the comparison cannot be performed.
| File | Description | How to Obtain |
|---|---|---|
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (Ascend NPU A) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see the following: |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (Ascend NPU B) | Data to be compared | Same as for the benchmark data. |

| File | Description | How to Obtain |
|---|---|---|
| Dump data file of the quantized offline model running on the Ascend AI Processor (Ascend NPU A) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see the following: |
| Dump data file of the quantized offline model running on the Ascend AI Processor (Ascend NPU B) | Data to be compared | Same as for the benchmark data. |
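The steps in this section can be sketched as a small Python helper that only assembles the command lines and does not run them. Only --soc_version, -f, and -cf are named in this section. The other ATC options (--model, --framework, --output, --fusion_switch_file), the msaccucmp.py compare options (-m, -g, -out), and every path and SoC version value below are assumptions for illustration and should be checked against the documentation for your CANN version.

```python
from typing import List


def build_atc_command(
    model: str,
    output: str,
    soc_version: str,
    fusion_switch_file: str,
    framework: int = 5,  # assumed framework ID; depends on the original model type
) -> List[str]:
    """Assemble an ATC conversion command for one target processor."""
    return [
        "atc",
        f"--model={model}",
        f"--framework={framework}",
        f"--output={output}",
        f"--soc_version={soc_version}",
        # Assumed option: points to a config that disables operator fusion.
        f"--fusion_switch_file={fusion_switch_file}",
    ]


def build_compare_command(compared_dump: str, benchmark_dump: str, out_dir: str) -> List[str]:
    """Assemble a comparison command that deliberately omits -f and -cf."""
    return [
        "python3", "msaccucmp.py", "compare",
        "-m", compared_dump,   # data to be compared (Ascend NPU B)
        "-g", benchmark_dump,  # benchmark data (Ascend NPU A)
        "-out", out_dir,
    ]


# Hypothetical file names and SoC versions.
npu_a = build_atc_command("model.onnx", "model_npu_a", "Ascend310", "fusion_switch.cfg")
npu_b = build_atc_command("model.onnx", "model_npu_b", "Ascend310P3", "fusion_switch.cfg")
compare = build_compare_command("dump/npu_b", "dump/npu_a", "compare_out")

for cmd in (npu_a, npu_b, compare):
    print(" ".join(cmd))
```

Both conversions share the same fusion switch file, so the two models differ only in the target processor.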
Accuracy Comparison Based on Model Conversion with Operator Fusion Enabled and Disabled
Operator fusion is generally enabled by default during offline model conversion. To check the accuracy of fused operators, run the model converted with operator fusion enabled and dump its data. Then disable operator fusion, convert and run the model again, and dump its data. Finally, compare the two sets of dump data.
The comparison is performed between two non-quantized models and between two quantized models. The input data is prepared as follows:
| File | Description | How to Obtain |
|---|---|---|
| Non-quantized offline model file (.om) (with operator fusion disabled); non-quantized offline model file (.om) (with operator fusion enabled) | Model files | |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (with operator fusion disabled) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see the following: |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (with operator fusion enabled) | Data to be compared | Same as for the benchmark data. |

| File | Description | How to Obtain |
|---|---|---|
| Quantized offline model file (.om) (with operator fusion disabled); quantized offline model file (.om) (with operator fusion enabled) | Model files | |
| Dump data file of the quantized offline model running on the Ascend AI Processor (with operator fusion disabled) | Benchmark data | In the offline inference scenario, the methods for obtaining the dump data of the NPU environment are the same for different frameworks. For details, see the following: |
| Dump data file of the quantized offline model running on the Ascend AI Processor (with operator fusion enabled) | Data to be compared | Same as for the benchmark data. |
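As a rough illustration of preparing the two model files listed above, the following sketch writes a fusion switch configuration and assembles two ATC commands, one with fusion disabled (benchmark) and one with fusion left at its default (data to be compared). The JSON layout of the switch file, the --fusion_switch_file option, and all file names and option values are assumptions that are not stated in this document, so verify them against the ATC documentation for your CANN version before use. The commands are only assembled, not executed.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

# Assumed switch-file layout that turns off all graph and UB fusion passes.
FUSION_OFF = {"Switch": {"GraphFusion": {"ALL": "off"}, "UBFusion": {"ALL": "off"}}}


def write_fusion_switch_file(path: Path) -> Path:
    """Write the assumed fusion-off configuration to the given path."""
    path.write_text(json.dumps(FUSION_OFF, indent=4), encoding="utf-8")
    return path


def plan_conversions(model: str, soc_version: str, switch_file: Path) -> Dict[str, List[str]]:
    """Return ATC commands for the fusion-disabled and fusion-enabled models."""
    base = ["atc", f"--model={model}", "--framework=5", f"--soc_version={soc_version}"]
    return {
        # Benchmark: operator fusion disabled through the switch file.
        "benchmark_fusion_off": base
        + ["--output=model_fusion_off", f"--fusion_switch_file={switch_file}"],
        # Data to be compared: operator fusion left enabled (the default).
        "compared_fusion_on": base + ["--output=model_fusion_on"],
    }


# Hypothetical model name and SoC version; a temporary directory keeps the demo clean.
with tempfile.TemporaryDirectory() as tmp:
    switch_file = write_fusion_switch_file(Path(tmp) / "fusion_switch.cfg")
    written = json.loads(switch_file.read_text(encoding="utf-8"))
    plan = plan_conversions("model.onnx", "Ascend310", switch_file)

for role, cmd in plan.items():
    print(role, "->", " ".join(cmd))
```

Keeping every other option identical between the two commands means that any difference in the dump data can be attributed to operator fusion.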