Using msprof-analyze to Analyze Profile Data
Prerequisites
- You have performed operations in Environment Setup.
- You have performed operations in Profile Data Collection and obtained the profile data of the Ascend NPU environment.
- Run the following command to install msprof-analyze:
pip install msprof-analyze
If the following information is displayed, the installation is successful:
Successfully installed msprof-analyze-{version}
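If the installation message has already scrolled out of view, the package can be checked again later. The following is a minimal sketch, assuming that pip belongs to the same Python environment in which msprof-analyze was installed. The function name check_msprof_analyze is a hypothetical helper, not part of the tool.

```shell
# Hypothetical helper: report whether pip can see the msprof-analyze package.
# "pip show" exits with a non-zero status when the package is not installed.
check_msprof_analyze() {
    if pip show msprof-analyze >/dev/null 2>&1; then
        echo "installed"
    else
        echo "missing"
    fi
}

# Example: check_msprof_analyze
```

If the result is "missing", run the installation command again in the active environment.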
Using msprof-analyze to Perform Analysis
This section describes only the operating procedure and does not interpret specific analysis data.
msprof-analyze can analyze the iteration duration, communication time, and communication matrix in the communication domain to identify slow cards, nodes, and links.
The operations are as follows:
- Prepare data.
- Perform profile data analysis.
msprof-analyze -m all -d $HOME/profiling_data/
The analysis result is written to the cluster_analysis_output folder under the directory specified by the -d parameter, and a cluster_analysis.db file is generated.
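After the command finishes, a quick check can confirm that the output exists. The following is a minimal sketch, assuming that cluster_analysis.db is placed inside the cluster_analysis_output folder. The function name check_cluster_output is a hypothetical helper.

```shell
# Hypothetical helper: check for the cluster analysis output.
# $1: the directory that was passed to the -d parameter.
# Assumption: cluster_analysis.db sits inside cluster_analysis_output.
check_cluster_output() {
    db_file="$1/cluster_analysis_output/cluster_analysis.db"
    if [ -f "$db_file" ]; then
        echo "found: $db_file"
    else
        echo "not found: $db_file"
        return 1
    fi
}

# Example: check_cluster_output "$HOME/profiling_data"
```

A non-zero return status indicates that the file is not at the assumed location, for example because the analysis failed or a different directory was passed to -d.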
For more information, see msprof-analyze.
The output files of the cluster analysis tool are visualized with MindStudio Insight. For details, see Using MindStudio Insight to Visualize Profile Data.
Using advisor to Perform Analysis
The advisor function of msprof-analyze analyzes the profile data collected and parsed by Ascend PyTorch Profiler and provides performance tuning suggestions.
The command is as follows:
msprof-analyze advisor all -d $HOME/profiling_data/
Brief suggestions are printed to the terminal, and the mstt_advisor_{timestamp}.html and /log/mstt_advisor_{timestamp}.xlsx files are generated in the command execution directory for preview.
advisor generates analysis results to provide expert suggestions on possible performance problems.
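Because each run adds a new timestamped report, several reports can accumulate in the same directory. The following is a minimal sketch for finding the most recent HTML report. It selects by file modification time rather than by parsing the timestamp in the file name, and the function name latest_advisor_report is a hypothetical helper.

```shell
# Hypothetical helper: print the most recently modified advisor HTML report.
# $1: the directory in which the advisor command was executed.
# "ls -t" sorts by modification time, newest first.
# Prints nothing when no report is present.
latest_advisor_report() {
    ls -t "$1"/mstt_advisor_*.html 2>/dev/null | head -n 1
}

# Example: latest_advisor_report .
```

The printed path can then be opened in a browser for preview.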
Using compare_tools to Compare Performance
compare_tools compares the performance of a training project before and after it is migrated from the GPU to the Ascend NPU environment, or compares the profile data of two software versions in the Ascend NPU environment.
The operations are as follows:
- Copy the profile data in the GPU environment to the Ascend NPU environment.
- Perform profile data comparison.
msprof-analyze compare -d $HOME/npu/profiling_data/*_ascend_pt -bp $HOME/gpu/profiling_data/*_ascend_pt --output_path ./compare_result/profiler_compare
The analysis result is output to the terminal, and the performance_comparison_result_{timestamp}.xlsx file is generated in the path specified by --output_path for preview.
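The comparison command uses shell wildcards such as *_ascend_pt. The shell expands a wildcard before msprof-analyze receives it, so several matching directories become several arguments. The following is a minimal sketch that resolves the wildcard to exactly one directory first. The function name single_ascend_pt_dir is a hypothetical helper, and how the tool itself handles multiple matches is not covered here.

```shell
# Hypothetical helper: print the only *_ascend_pt directory under $1.
# Returns 1 when there is no match or more than one match.
single_ascend_pt_dir() {
    count=0
    match=""
    for d in "$1"/*_ascend_pt; do
        # An unmatched pattern stays literal and is skipped here.
        [ -d "$d" ] || continue
        count=$((count + 1))
        match="$d"
    done
    if [ "$count" -ne 1 ]; then
        echo "expected one *_ascend_pt directory under $1, found $count" >&2
        return 1
    fi
    echo "$match"
}

# Example: npu_dir=$(single_ascend_pt_dir "$HOME/npu/profiling_data")
```

The resolved path can be passed to the -d or -bp parameter in place of the wildcard.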
The performance comparison tool measures overall performance in terms of training duration and memory usage. Training duration is further broken down into three dimensions: operators (including nn.Module), communication, and scheduling. The overall performance metrics help users identify where performance degradation occurs. In addition, the tool lists the execution duration, communication duration, and memory usage of each operator in the performance_comparison_result_{timestamp}.xlsx file. Operators whose DIFF value is greater than 0 are underperforming. No example is provided here. For details, see "Comparison Result Description" in compare_tools.