Overview
- Check the Comparison Result Description for this scenario.
- This scenario applies only to comparisons between processors of the same type.
- NPU vs. NPU: Only accuracy comparisons between two non-quantized offline models and between two quantized offline models are supported.
This scenario includes the following sub-scenarios:
- Pre- and post-iteration accuracy comparison: Compares two sets of accuracy data to determine whether accuracy issues appear after a CANN software update, a model version iteration, or model optimization.
- Accuracy comparison for operator fusion: By default, the ATC tool enables operator fusion during offline model conversion. To evaluate the accuracy impact of fusion, data generated with fusion enabled is compared against data generated with fusion disabled.
Accuracy Comparison Before and After Version Iteration
Use this comparison to check whether the accuracy of an offline model generated through ATC-based conversion decreases on the Ascend AI Processor after a CANN version iteration, a model version iteration, or model tuning. Non-quantized models are compared only with non-quantized models, and quantized models only with quantized models. The comparison data files are prepared as follows:
| File | Description | How to Obtain |
|---|---|---|
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (before version iteration) | Benchmark data | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see: |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (after version iteration) | Data to be compared | Same as above. |

| File | Description | How to Obtain |
|---|---|---|
| Dump data file of the quantized offline model running on the Ascend AI Processor (before version iteration) | Benchmark data | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see: |
| Dump data file of the quantized offline model running on the Ascend AI Processor (after version iteration) | Data to be compared | Same as above. |
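To make the benchmark versus data-to-be-compared relationship concrete, here is a minimal Python sketch that scores one tensor from each dump against the other. It assumes the dump data has already been converted into flat lists of floats. The function names are hypothetical, and the two metrics (cosine similarity and maximum absolute error) are common choices for tensor-level comparison rather than a statement of what the comparison tool itself computes.

```python
import math
from typing import Sequence


def cosine_similarity(benchmark: Sequence[float], candidate: Sequence[float]) -> float:
    """Cosine of the angle between two flattened tensors (1.0 means same direction)."""
    if len(benchmark) != len(candidate):
        raise ValueError("tensors must have the same number of elements")
    dot = sum(b * c for b, c in zip(benchmark, candidate))
    norm_b = math.sqrt(sum(b * b for b in benchmark))
    norm_c = math.sqrt(sum(c * c for c in candidate))
    if norm_b == 0.0 or norm_c == 0.0:
        # Convention chosen for this sketch: two all-zero tensors count as a
        # perfect match; a zero tensor against a non-zero one scores 0.0.
        return 1.0 if norm_b == norm_c else 0.0
    return dot / (norm_b * norm_c)


def max_absolute_error(benchmark: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest element-wise absolute difference between the two tensors."""
    if len(benchmark) != len(candidate):
        raise ValueError("tensors must have the same number of elements")
    return max((abs(b - c) for b, c in zip(benchmark, candidate)), default=0.0)


# Hypothetical values standing in for one operator's output before and after
# a version iteration.
before_iteration = [0.10, 0.20, 0.30, 0.40]
after_iteration = [0.10, 0.21, 0.29, 0.40]
print("cosine similarity:", cosine_similarity(before_iteration, after_iteration))
print("max absolute error:", max_absolute_error(before_iteration, after_iteration))
```

A cosine similarity close to 1.0 together with a small maximum absolute error suggests that the iteration did not change that tensor materially. Where to set the thresholds depends on the model and is not specified here.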
Accuracy Comparison Based on Model Conversion with Operator Fusion Enabled and Disabled
The ATC tool enables operator fusion by default when it converts a model to an offline model. To evaluate the accuracy impact of fusion, you need to obtain the following data:
- Enable operator fusion and use the ATC tool to convert the model to an offline model. Then, dump the accuracy data of the converted offline model.
- Disable operator fusion and use the ATC tool to convert the model to an offline model. Then, dump the accuracy data of the converted offline model.
Compare the accuracy of the two offline models.
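The two conversions above can be sketched as a pair of ATC command lines that differ only in whether fusion is switched off. The Python below only builds and prints the commands; it does not run `atc`. It assumes that fusion is disabled by passing a switch file through `--fusion_switch_file` and that the file uses the JSON layout shown. Both should be verified against the ATC documentation for your CANN version. The model path, framework ID, output names, and SoC version are placeholders, and the helper names are hypothetical.

```python
import json
import shlex
import tempfile
from pathlib import Path
from typing import List, Optional, Union


def write_fusion_off_config(path: Path) -> Path:
    """Write a switch file that turns all fusion rules off (assumed file layout)."""
    config = {
        "Switch": {
            "GraphFusion": {"ALL": "off"},
            "UBFusion": {"ALL": "off"},
        }
    }
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return path


def build_atc_command(
    model: str,
    framework: int,
    output: str,
    soc_version: str,
    fusion_switch_file: Optional[Union[str, Path]] = None,
) -> List[str]:
    """Return the atc argument list; fusion stays at its default unless a switch file is given."""
    cmd = [
        "atc",
        f"--model={model}",
        f"--framework={framework}",
        f"--output={output}",
        f"--soc_version={soc_version}",
    ]
    if fusion_switch_file is not None:
        cmd.append(f"--fusion_switch_file={fusion_switch_file}")
    return cmd


# Placeholder arguments. The temporary directory only keeps this illustration
# self-contained; a real run would keep the switch file next to the model.
with tempfile.TemporaryDirectory() as workdir:
    switch_file = write_fusion_off_config(Path(workdir) / "fusion_switch.cfg")
    fusion_on = build_atc_command("model.onnx", 5, "model_fusion_on", "Ascend310")
    fusion_off = build_atc_command(
        "model.onnx", 5, "model_fusion_off", "Ascend310", switch_file
    )
    print(shlex.join(fusion_on))
    print(shlex.join(fusion_off))
```

Keeping every other argument identical between the two commands means that any accuracy difference in the dumped data can be attributed to fusion rather than to another conversion setting.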
The comparison is performed between two non-quantized models with operator fusion enabled versus disabled, and between two quantized models with operator fusion enabled versus disabled. The comparison data files are prepared as follows:
| File | Description | How to Obtain |
|---|---|---|
| Non-quantized offline model file (.om) (with operator fusion disabled); Non-quantized offline model file (.om) (with operator fusion enabled) | Model files | |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (with operator fusion disabled) | Benchmark data | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see: |
| Dump data file of the non-quantized offline model running on the Ascend AI Processor (with operator fusion enabled) | Data to be compared | Same as above. |

| File | Description | How to Obtain |
|---|---|---|
| Quantized offline model file (.om) (with operator fusion disabled); Quantized offline model file (.om) (with operator fusion enabled) | Model files | |
| Dump data file of the quantized offline model running on the Ascend AI Processor (with operator fusion disabled) | Benchmark data | The methods of obtaining dump data in the NPU environment in offline inference scenarios are the same for different frameworks. For details, see: |
| Dump data file of the quantized offline model running on the Ascend AI Processor (with operator fusion enabled) | Data to be compared | Same as above. |