Comparison Operation and Analysis
Description
- The computational graph file and directory names in this section are only examples. Replace them with the actual ones. Ensure that the user running the command has read and write permissions on the result path specified by -out.
- If "MemoryError" is displayed during comparison, the process has run out of memory because too much data was loaded. Split the dump files on the NPU into separate directories and compare them one by one.
- If a data file specified for comparison exceeds 1 GB or the .json file exceeds 100 MB, the comparison may take a long time, and the system displays the message "The size ( %d) of %s more than the XX, it needs more time to run."
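The size limits above can be checked before starting a comparison. The following is a minimal sketch, not part of the tool: the helper names is_slow_input and find_slow_inputs are hypothetical, and treating 1 GB as 1024^3 bytes and 100 MB as 100 x 1024^2 bytes is an assumption, since the documentation does not say which unit convention the tool uses.

```python
import os

GIB = 1024 ** 3  # assumed meaning of "1 GB" for data files
MIB = 1024 ** 2  # assumed unit for the "100 MB" .json limit


def is_slow_input(size_bytes, filename):
    """Return True if a file of this size is likely to slow down the comparison."""
    # .json graph files use the 100 MB limit; all other files use the 1 GB limit.
    limit = 100 * MIB if filename.endswith(".json") else GIB
    return size_bytes > limit


def find_slow_inputs(root):
    """Walk a directory tree and list the files that exceed the limits."""
    slow = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if is_slow_input(os.path.getsize(path), name):
                slow.append(path)
    return sorted(slow)
```

Running find_slow_inputs on the dump directory before the comparison gives a list of files that might be worth moving into a separate directory.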
Prerequisites
Ensure that operations in Before You Start have been completed.
Network-wide Comparison Using All Algorithms and Corresponding Expert Suggestions Output
This mode compares the accuracy of all operators involved in computation in a network model. The procedure is as follows:
- Log in to the CANN tool installation environment.
- Generate a computational graph file in .json format.
atc --mode=5 --om=ge_proto_00005_Build.txt --json=ge_proto_00005_Build.txt.json
For details about how to obtain a computational graph file in .txt format, see Preparing Dump Data and Computational Graph Files on NPU.
- Go to the ${INSTALL_DIR}/tools/operator_cmp/compare directory. Replace ${INSTALL_DIR} with the actual CANN component directory. If the Ascend-CANN-Toolkit package is installed as the root user, the CANN component directory is /usr/local/Ascend/ascend-toolkit/latest.
- Run the comparison command. Example of network-wide comparison using all algorithms:
python3 msaccucmp.py compare -m $HOME/output/20200808163566/0/ge_default_20200808163719_121/11/0 -g $HOME/output/Standard_tf/resnet50 -f $HOME/data/ge_proto_00005_Build.txt.json -out $HOME/result -advisor
- The preceding command shows only the parameters required in the current scenario. If the range of potential accuracy issues is known, or if the output file of a large network model is too large, you can configure additional parameters to reduce the output data volume. For details about more parameters, see Command Syntax.
- Install pandas 1.3 or later. Otherwise, the -advisor option cannot output expert suggestions.
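The pandas requirement can be verified before running the command with -advisor. The following is a minimal sketch using only the Python standard library (importlib.metadata, available in Python 3.8 and later). The helper names version_at_least and pandas_ready_for_advisor are hypothetical and are not part of msaccucmp.py.

```python
from importlib import metadata


def version_at_least(version, minimum=(1, 3)):
    """Compare the leading numeric components of a version string with minimum."""
    parts = []
    for piece in version.split(".")[:len(minimum)]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as "rc1"
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < len(minimum):
        parts.append(0)  # treat "1" as "1.0"
    return tuple(parts) >= tuple(minimum)


def pandas_ready_for_advisor():
    """Return True if pandas is installed and its version is 1.3 or later."""
    try:
        return version_at_least(metadata.version("pandas"))
    except metadata.PackageNotFoundError:
        return False
```

If pandas_ready_for_advisor() returns False, install or upgrade pandas before using -advisor.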
Table 1 Command-line options
| Option | Description | Mandatory (Yes/No) |
| --- | --- | --- |
| -m, --my_dump_path | Directory for storing the dump data files generated when the training or online inference network runs on the Ascend AI Processor. Specify the parent directory of the dump data files. | Yes |
| -g, --golden_dump_path | Directory for storing the .npy data files generated when the original network runs on the GPU. Specify the parent directory of the .npy data files. | Yes |
| -f, --fusion_rule_file | Network-wide information file. It is the .json file converted from the computational graph file using ATC in step 2. | Yes |
| -out, --output | Path of the comparison result. Defaults to the current path. To avoid privilege escalation risks, do not specify a directory that belongs to a user other than the current user. | No |
| -advisor | After tensor comparison is complete, analyzes the comparison result and provides expert suggestions. For details, see Advisor Suggestions on Comparison Results. | No |
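When the comparison is launched from a script, the options in Table 1 can be assembled into an argument list. The following is a minimal sketch: the function build_compare_command is hypothetical, it uses only the options listed in Table 1, and it assumes the script is started from the directory that contains msaccucmp.py.

```python
def build_compare_command(my_dump_path, golden_dump_path, fusion_rule_file,
                          output=None, advisor=False):
    """Build the argument list for a network-wide comparison."""
    # Mandatory options: -m, -g, and -f.
    cmd = ["python3", "msaccucmp.py", "compare",
           "-m", my_dump_path,
           "-g", golden_dump_path,
           "-f", fusion_rule_file]
    # Optional options: -out defaults to the current path when omitted.
    if output is not None:
        cmd += ["-out", output]
    if advisor:
        cmd.append("-advisor")
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True). Passing a list rather than a single shell string avoids quoting problems when a path contains spaces.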
Figure 1 shows the comparison result. Parameters in a Complete Model Comparison Result describes the fields in the comparison result.
- Analyze the comparison result.
For details, see Comparison Result Analysis. If the comparison fails or exceptions occur (for example, NaN in the results), see Comparison Result Description.
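A quick scan for NaN values can help decide whether to consult Comparison Result Description. The following is a minimal sketch under a stated assumption: it assumes the comparison result is written as a CSV file with a header row, which this section does not confirm. The function rows_with_nan is hypothetical and does not depend on specific column names.

```python
import csv


def rows_with_nan(csv_path):
    """Return (row_number, column_names) for each data row that has a NaN cell."""
    hits = []
    with open(csv_path, newline="") as fh:
        reader = csv.DictReader(fh)
        # Row numbers start at 1 for the first data row after the header.
        for row_number, row in enumerate(reader, start=1):
            bad = [col for col, val in row.items()
                   if isinstance(val, str) and val.strip().lower() == "nan"]
            if bad:
                hits.append((row_number, bad))
    return hits
```

An empty list means no cell contains the text NaN. Any other outcome lists the affected rows and columns for closer inspection.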
