Data Storing Directories
The structure of the directory for storing profile data is as follows:
- Directory structure when the tensorboard_trace_handler function is called:
- The profile data files output by MindSpore and PyTorch in this scenario are largely the same. The directory tree below covers both frameworks, and the comments note where they differ.
- You do not need to open the following data files directly. To view and analyze the profile data, use MindStudio Insight as described in the MindStudio Insight User Guide.
- If the step ID in the kernel_details.csv file is empty, you can view the step information of the operator in the trace_view.json file or collect profile data again.
- Which of the following files are generated depends on the actual environment. If a condition is not met, the corresponding data or file is not generated. For example, if the model has no AI CPU operators, the data_preprocess.csv file is not generated even though profiling has been performed.
├── localhost.localdomain_139247_20230628101435_ascend_pt // Profile data result directory. The naming format is {worker_name}_{timestamp}_ascend_{framework}. By default, {worker_name} is {hostname}_{pid}, {timestamp} is the timestamp, and {framework} is the abbreviation of MindSpore (ms) and PyTorch (pt).
│   ├── profiler_info_{Rank_ID}.json // File that records the metadata related to the profiler. In PyTorch single-rank scenarios, {Rank_ID} is not displayed in the file name.
│   ├── profiler_metadata.json // File that stores the information added by the user through the add_metadata API and other metadata related to the profiler.
│   ├── ASCEND_PROFILER_OUTPUT // Directory of profile data collected and parsed by the MindSpore Profiler or Ascend PyTorch Profiler APIs.
│   │   ├── analysis.db // Generated by default in PyTorch multi-rank or cluster scenarios involving communication.
│   │   ├── api_statistic.csv // Generated when profiler_level is set to Level0 (only for MindSpore), Level1, or Level2.
│   │   ├── ascend_mindspore_profiler_{Rank_ID}.db // Generated when export_type is set to Db in MindSpore scenarios.
│   │   ├── ascend_pytorch_profiler_{Rank_ID}.db // Generated by default in PyTorch scenarios. In single-rank scenarios, {Rank_ID} is not displayed in the file name.
│   │   ├── communication_analyzer.db // Generated when export_type is set to Db in MindSpore multi-rank or cluster scenarios involving communication.
│   │   ├── communication.json // Generated when profiler_level is set to Level1 or Level2 in multi-rank or cluster scenarios involving communication. This file provides data basis for visualizing profile data.
│   │   ├── communication_matrix.json // Generated when profiler_level is set to Level1 or Level2 in multi-rank or cluster scenarios involving communication. This file records basic information about small communication operators and provides data basis for visualizing profile data.
│   │   ├── dataset.csv // Generated when activities is set to CPU in MindSpore scenarios.
│   │   ├── data_preprocess.csv // Generated when profiler_level is set to Level2.
│   │   ├── hccs.csv // Generated when sys_interconnection is set to True.
│   │   ├── kernel_details.csv // Generated when activities is set to NPU.
│   │   ├── l2_cache.csv // Generated when l2_cache is set to True.
│   │   ├── memory_record.csv // Generated when profile_memory is set to True.
│   │   ├── minddata_pipeline_raw_{Rank_ID}.csv // Generated when data_process is set to True and the mindspore.dataset module is called in MindSpore scenarios.
│   │   ├── minddata_pipeline_summary_{Rank_ID}.csv // Generated when data_process is set to True and the mindspore.dataset module is called in MindSpore scenarios.
│   │   ├── minddata_pipeline_summary_{Rank_ID}.json // Generated when data_process is set to True and the mindspore.dataset module is called in MindSpore scenarios.
│   │   ├── nic.csv // Generated when sys_io is set to True.
│   │   ├── npu_module_mem.csv // Generated when profile_memory is set to True.
│   │   ├── operator_details.csv // Generated when activities is set to CPU and record_shapes is set to True in MindSpore scenarios. In PyTorch scenarios, this file is automatically generated by default.
│   │   ├── operator_memory.csv // Generated when profile_memory is set to True.
│   │   ├── op_statistic.csv // Number of times that the AI Core and AI CPU operators are called and their time consumption.
│   │   ├── pcie.csv // Generated when sys_interconnection is set to True.
│   │   ├── roce.csv // Generated when sys_io is set to True.
│   │   ├── step_trace_time.csv // Time statistics of computation and communication in iterations.
│   │   └── trace_view.json // Time information of the entire AI task.
│   ├── FRAMEWORK // Raw profile data on the framework side, which can be ignored.
│   ├── logs // Parsing process logs.
│   └── PROF_000001_20230628101435646_FKFLNPEPPRRCFCBA // Profile data at the CANN layer, named in the format of PROF_{Number}_{Timestamp}_{Character string}. When data_simplification is set to True, only the raw profile data in this directory is retained, and other data is deleted.
│       ├── analyze // Generated when profiler_level is set to Level1 or Level2 in multi-rank or cluster scenarios involving communication.
│       ├── device_{Rank_ID} // Raw profile data collected by CANN Profiling on the device.
│       ├── host // Raw profile data collected by CANN Profiling on the host.
│       ├── mindstudio_profiler_log // Log file parsed by CANN Profiling.
│       └── mindstudio_profiler_output // Profile data parsed by CANN Profiling.
└── localhost.localdomain_139247_20230628101435_ascend_pt_op_arg // Directory for storing operator statistics files in PyTorch scenarios. It is generated when record_op_args is set to True.
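The naming format of the result directory, {worker_name}_{timestamp}_ascend_{framework}, can be split back into its parts when scripts need to sort or filter many result directories. The following is a minimal sketch using only the Python standard library. The function name parse_result_dir_name is hypothetical, and the split of {worker_name} into host name and process ID assumes the default {hostname}_{pid} form; a custom worker name is returned unsplit.

```python
import re

# Pattern for {worker_name}_{timestamp}_ascend_{framework}; framework is "ms" or "pt".
_RESULT_DIR_RE = re.compile(
    r"(?P<worker_name>.+)_(?P<timestamp>\d+)_ascend_(?P<framework>ms|pt)"
)


def parse_result_dir_name(name):
    """Split a result directory name into its parts, or return None if it does not match."""
    match = _RESULT_DIR_RE.fullmatch(name)
    if match is None:
        return None
    info = match.groupdict()
    # Assumption: worker_name uses the default {hostname}_{pid} form.
    hostname, sep, pid = info["worker_name"].rpartition("_")
    if sep and pid.isdigit():
        info["hostname"] = hostname
        info["pid"] = int(pid)
    return info


print(parse_result_dir_name("localhost.localdomain_139247_20230628101435_ascend_pt"))
```

Directory names with a suffix after the framework abbreviation, such as the _op_arg directory, do not match the pattern, so the function returns None for them.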
MindSpore Profiler and Ascend PyTorch Profiler APIs associate and integrate data on the framework side with data collected by CANN Profiling to form profile data files such as trace, kernel, and memory. Files are stored in the ASCEND_PROFILER_OUTPUT directory, including timeline and summary data (in .json and .csv formats). For details, see Timeline and Summary Data, Data in ascend_pytorch_profiler_{Rank_ID}.db, and Data in analysis.db.
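Because each file in ASCEND_PROFILER_OUTPUT is generated only when its condition is met, it can be useful to check which files a given run actually produced before analyzing them. The following is a minimal sketch under stated assumptions: the helper name list_outputs is hypothetical, and EXPECTED_FILES is only a small selection of the file names listed in the directory tree, not a complete list.

```python
from pathlib import Path

# A small, illustrative selection of file names from the directory tree; extend as needed.
EXPECTED_FILES = (
    "kernel_details.csv",
    "trace_view.json",
    "step_trace_time.csv",
    "op_statistic.csv",
    "memory_record.csv",
    "operator_memory.csv",
)


def list_outputs(result_dir):
    """Map each expected file name to True if it exists in ASCEND_PROFILER_OUTPUT."""
    output_dir = Path(result_dir) / "ASCEND_PROFILER_OUTPUT"
    return {name: (output_dir / name).is_file() for name in EXPECTED_FILES}
```

A file reported as missing does not necessarily indicate an error. For example, memory_record.csv is absent whenever profile_memory is not set to True.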
The PROF directory stores the profile data collected by CANN Profiling, mainly in the mindstudio_profiler_output directory and the msprof_*.db file. For details about the data, see Profile Data File References.
- In the PyTorch scenario, when the export_chrome_trace method is called, the Ascend PyTorch Profiler APIs write the parsed trace data to a *.json file, where * stands for the file name. If the file does not exist, it is created automatically in the specified path.
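The exported trace is intended for a viewer such as MindStudio Insight, but a quick scripted check is sometimes convenient. The following is a minimal sketch that assumes the file follows the common Chrome trace event format, in which the top level is either a list of events or an object with a traceEvents key, and complete events (ph equal to "X") carry a dur field in microseconds. The function name summarize_trace is hypothetical, and the exact fields written by the profiler are not confirmed here.

```python
import json
from collections import Counter


def summarize_trace(path):
    """Sum the duration of complete ("X") events per event name in a Chrome-style trace."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: the top level is a list of events or an object holding "traceEvents".
    events = data.get("traceEvents", []) if isinstance(data, dict) else data
    totals = Counter()
    for event in events:
        # Only complete events carry a duration; skip metadata and other phases.
        if event.get("ph") == "X":
            totals[event.get("name", "<unnamed>")] += event.get("dur", 0)
    return totals
```

Calling summarize_trace(path).most_common(10) lists the ten event names with the largest total duration.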