Performance Analysis in Cluster Training Scenarios
Scenario
A cluster consists of multiple nodes that are managed in a unified manner on the management page. Each node runs an independent system. In cluster scenarios, the tool collects profile data on each node, generates a PROF_XXX directory on each node, and then pre-parses and summarizes all PROF_XXX directories to OBS. You need to manually copy all PROF_XXX directories summarized in OBS to an environment where cluster data can be displayed and analyzed.
Currently, the following tool supports cluster data display and analysis: MindStudio Insight.
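The manual copy step can be scripted once the OBS content has been downloaded to local disk. The following is a minimal sketch, not part of the tool: it assumes a hypothetical layout in which each node's data sits in its own subdirectory (download_root/<node_name>/PROF_XXX), and the function name and both path arguments are illustrative only.

```python
import shutil
from pathlib import Path


def gather_prof_dirs(download_root, analysis_root):
    """Copy each <node_name>/PROF_* directory into analysis_root.

    Assumed (hypothetical) layout: download_root/<node_name>/PROF_XXX.
    The node subdirectory name is kept in the destination path so that
    PROF_* directories with the same name on different nodes do not
    overwrite each other.
    """
    download_root = Path(download_root)
    analysis_root = Path(analysis_root)
    copied = []
    for prof_dir in sorted(download_root.glob("*/PROF_*")):
        if not prof_dir.is_dir():
            continue  # ignore stray files whose names start with PROF_
        target = analysis_root / prof_dir.parent.name / prof_dir.name
        # dirs_exist_ok lets the copy be re-run after a partial transfer.
        shutil.copytree(prof_dir, target, dirs_exist_ok=True)
        copied.append(target)
    return copied
```

For example, gather_prof_dirs("/data/obs_download", "/data/cluster_analysis") returns the list of copied directories; both paths here are placeholders. Whether the analysis tool expects the per-node subdirectories or a flat directory of PROF_XXX folders is not stated here, so adjust the target path accordingly.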
Profile Data Collection Process
The following figure shows the overall process of profile data collection.

Environment Setup
Restrictions
In cluster scenarios, profile data of a maximum of 128 nodes can be collected. If eight devices are configured for each node, profile data of a maximum of 1024 devices can be collected.
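The device figure follows directly from the node limit: 128 nodes × 8 devices per node = 1024 devices. As a rough sketch, a job configuration could be checked against the node limit before collection starts. The constant and function names below are illustrative and not part of any tool, and only the 128-node limit is taken from the restriction above.

```python
MAX_NODES = 128  # node limit stated in the restriction above


def total_devices(nodes, devices_per_node):
    """Return the number of devices to be profiled, or raise if the
    node count exceeds the documented cluster limit."""
    if nodes < 1 or devices_per_node < 1:
        raise ValueError("nodes and devices_per_node must be positive")
    if nodes > MAX_NODES:
        raise ValueError(
            f"{nodes} nodes exceeds the limit of {MAX_NODES} nodes"
        )
    return nodes * devices_per_node
```

Calling total_devices(128, 8) gives 1024, which matches the maximum quoted for the eight-device configuration.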
Profile Data Collection
After the environment is set up, you can collect profile data in the cluster scenario as follows:
- Use Profile Data Collection with MindSpore Framework APIs to collect profile data.
- Use the Ascend PyTorch Profiler API to collect PyTorch profile data:
  - Set up a distributed training environment and prepare the distributed training script used after migration. For details, see "Porting Adaptation" in the PyTorch Training Model Porting and Tuning Guide.
  - Refer to Profiling Quick Start (PyTorch Training/Online Inference) to modify the training script and start distributed training for data collection.