Parsing Dump Data
- Run the following command to parse the original dump binary data file into an .npy file that can be read by NumPy:
find precision_data/npu/debug_0 -type f -name "*" | xargs -i python3 /usr/local/Ascend/ascend-toolkit/latest/tools/operator_cmp/compare/msaccucmp.py convert -d {} -out dump_data_npy/ -v 2
This command converts the format of the files in the precision_data/npu/debug_0 directory using the msaccucmp.py script and saves the converted files to the dump_data_npy directory. In the preceding command, /usr/local/Ascend/ascend-toolkit/ is the CANN installation directory, which can be changed as required. For details about how to use the msaccucmp.py script, see "Converting Dump File Formats" in CANN Accuracy Debugging Tool User Guide.
- Search for the NaN source in the .npy files.
The converted .npy data is stored in the dump_data_npy directory and includes the input and output data of all operators on the network. The fifth field separated by periods (.) in each file name is the timestamp. Sort all files in ascending order of timestamp and check each file for NaN to find the data file where NaN first appears. The script prints the name of the first file that contains NaN and then exits the loop. If no file name is printed, none of the files contain NaN. In this case, check whether the dump step was executed correctly. For details about the file naming format, see "Data Format Requirements" in CANN Accuracy Debugging Tool User Guide.
Run the python3 find_nan.py command. The content of find_nan.py is as follows:
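import glob
import numpy as np

# Collect all converted dump files.
files = glob.glob("dump_data_npy/*")
# Sort by the timestamp, which is the fifth period-separated field of the name.
files.sort(key=lambda x: int(x.split(".")[4]))
for i in files:
    f = np.load(i)
    # Print the first file that contains NaN, then stop.
    if np.isnan(f).any():
        print(i)
        break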
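The search described above can also be written as a reusable function. The following is a minimal sketch, not part of the CANN tooling: the function names timestamp_of and find_first_nan are hypothetical, and it assumes that every converted file ends in .npy and carries the timestamp in the fifth period-separated field of its name. Compared with find_nan.py, it takes the timestamp from the file name only (so a period in the directory path does not shift the field index), skips arrays whose dtype cannot hold NaN, and reports how many NaN values the first affected file contains.

```python
import os
import tempfile

import numpy as np


def timestamp_of(path):
    """Return the fifth period-separated field of the file name as an int."""
    return int(os.path.basename(path).split(".")[4])


def find_first_nan(dump_dir):
    """Return (path, nan_count) for the earliest file containing NaN, or None."""
    paths = [
        os.path.join(dump_dir, name)
        for name in os.listdir(dump_dir)
        if name.endswith(".npy")
    ]
    for path in sorted(paths, key=timestamp_of):
        data = np.load(path)
        # Only floating-point and complex arrays can hold NaN; skip the rest.
        if not np.issubdtype(data.dtype, np.inexact):
            continue
        nan_count = int(np.isnan(data).sum())
        if nan_count:
            return path, nan_count
    return None


if __name__ == "__main__":
    # Synthetic demo: the file names below are made up for illustration only.
    with tempfile.TemporaryDirectory() as demo_dir:
        np.save(os.path.join(demo_dir, "Add.Add_1.3.7.2000.output.0.npy"),
                np.array([1.0, np.nan]))
        np.save(os.path.join(demo_dir, "Mul.Mul_2.5.7.1000.output.0.npy"),
                np.array([np.nan, np.nan, 0.0]))
        print(find_first_nan(demo_dir))
```

To use it on real data, call find_first_nan("dump_data_npy"). A return value of None corresponds to the case where find_nan.py prints nothing.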