Collecting and Parsing Profile Data

MindSpore Profiler is a profiling tool developed for the MindSpore framework. By adding MindSpore Profiler APIs to a MindSpore training script, you can collect profile data during training and have it written out as profile data files for visualization when training completes, which makes profiling more efficient. The MindSpore Profiler APIs can collect complete profile data in MindSpore training scenarios, including information about operators at the MindSpore and CANN layers, bottom-layer NPU operators, and operator memory usage, providing a comprehensive view of performance during MindSpore training.

MindSpore supports the following profiling methods:

Profiling Using mindspore.profiler.profile
The full-featured profiling API. Add it to your code to select which data to profile.
Dynamic Profiling Using mindspore.profiler.DynamicProfilerMonitor
Starts profiling at any point during training.
Profiling Using the Environment Variable
Requires no changes to the user code, which makes it more flexible.
Profiling Using the msprof CLI (non-MindSpore Profiler APIs)
A general-purpose method that requires no changes to the user code, which makes it more flexible. MindSpore 2.0 and later versions support this method.

Restrictions

The MindSpore Profiler APIs support multiple profiling methods, but these methods cannot be enabled at the same time.

Call the MindSpore Profiler APIs in the same process as the service to be profiled.

The MindSpore Profiler APIs profile data by mapping processes to devices as follows:

Multi-process to multi-device: One profiling process for each device.
Single-process to multi-device: Not supported.
Multi-process to single-device: The profiling processes must run serially. That is, they must not start at the same time, and each one must run to completion (from start to stop) before the next begins.

Prerequisites

Ensure that operations in Before You Start have been completed.
Prepare a model trained on MindSpore 2.7.0 and a matched dataset, develop a training script (for example, train_*.py), and train the model in the Ascend AI Processor environment.

Profiling Using mindspore.profiler.profile

This API supports two profiling methods: callback and custom for loop, in both graph and PyNative modes.

For details about the API, see Debugging and Tuning.

Add the following sample code to the training script to configure the profiling parameters, and then start training.

Callback

For the complete sample, see call_back_profiler.

```
import mindspore


class StopAtStep(mindspore.Callback):
    """Profile the training steps from start_step to stop_step, inclusive."""

    def __init__(self, start_step, stop_step):
        super().__init__()
        self.start_step = start_step
        self.stop_step = stop_step
        experimental_config = mindspore.profiler._ExperimentalConfig()
        self.profiler = mindspore.profiler.profile(
            start_profile=False,
            experimental_config=experimental_config,
            schedule=mindspore.profiler.schedule(
                wait=0, warmup=0, active=self.stop_step - self.start_step + 1,
                repeat=1, skip_first=0),
            on_trace_ready=mindspore.profiler.tensorboard_trace_handler("./data"))

    def on_train_step_begin(self, run_context):
        # Start collecting when training reaches start_step.
        cb_params = run_context.original_args()
        step_num = cb_params.cur_step_num
        if step_num == self.start_step:
            self.profiler.start()

    def on_train_step_end(self, run_context):
        cb_params = run_context.original_args()
        step_num = cb_params.cur_step_num
        # Advance the schedule once for every profiled step.
        if self.start_step <= step_num <= self.stop_step:
            self.profiler.step()
        # Stop collecting once stop_step has finished.
        if step_num == self.stop_step:
            self.profiler.stop()
```
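
The control flow of the callback can be checked without MindSpore installed. The following is a minimal dry-run sketch, not part of the official sample: RecordingProfiler and dry_run are hypothetical names, and the sketch assumes that step numbers are 1-based, as cur_step_num is used in the callback above. It shows that the profiler receives exactly stop_step - start_step + 1 step() calls, which is the value passed to active.

```python
class RecordingProfiler:
    """Hypothetical stand-in that records calls instead of collecting data."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def step(self):
        self.calls.append("step")

    def stop(self):
        self.calls.append("stop")


def dry_run(start_step, stop_step, total_steps):
    """Replay the begin/end hooks of the callback for each training step."""
    profiler = RecordingProfiler()
    # Assumption: step numbers are 1-based.
    for step_num in range(1, total_steps + 1):
        # Equivalent of on_train_step_begin.
        if step_num == start_step:
            profiler.start()
        # Equivalent of on_train_step_end.
        if start_step <= step_num <= stop_step:
            profiler.step()
        if step_num == stop_step:
            profiler.stop()
    return profiler.calls


calls = dry_run(start_step=2, stop_step=5, total_steps=10)
print(calls)
```

With start_step=2 and stop_step=5, the recorded sequence is one start, four step calls, and one stop.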

Custom for loop

You can set the schedule and on_trace_ready parameters to control when profile data is collected and how the collected data is handled.

For the complete sample, see for_loop_profiler.

For example, the following schedule settings skip the first two steps, wait for one step, warm up for one step, and then collect profile data for two steps:

```
import mindspore
from mindspore.profiler import ProfilerLevel, ProfilerActivity, AicoreMetrics

# Number of training steps.
steps = 15

# Net and train are placeholders for the network class and the single-step
# training function defined in your own training script.
net = Net()

# Configure extended parameters.
experimental_config = mindspore.profiler._ExperimentalConfig(
    profiler_level=ProfilerLevel.Level0,
    aic_metrics=AicoreMetrics.AiCoreNone,
    l2_cache=False,
    mstx=False,
    data_simplification=False)

# Initialize the profiler.
with mindspore.profiler.profile(
        activities=[ProfilerActivity.CPU, ProfilerActivity.NPU],
        schedule=mindspore.profiler.schedule(wait=1, warmup=1, active=2,
                                             repeat=1, skip_first=2),
        on_trace_ready=mindspore.profiler.tensorboard_trace_handler("./data"),
        profile_memory=False,
        experimental_config=experimental_config) as prof:
    for step in range(steps):
        train(net)
        # Call step() after each training step to advance the schedule.
        prof.step()
```
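
To see which steps a given schedule would collect, it can help to model the schedule in plain Python. The following is an illustrative sketch only, not the MindSpore implementation: schedule_phase and the phase names are hypothetical, and the sketch assumes that steps are counted from zero, that skip_first is applied once before the first cycle, and that repeat=0 means the cycle repeats until training ends.

```python
def schedule_phase(step, *, wait, warmup, active, repeat=0, skip_first=0):
    """Return the assumed phase name for a zero-based step index."""
    if active <= 0:
        raise ValueError("active must be a positive number of steps")
    if step < skip_first:
        return "skip"
    step -= skip_first
    cycle = wait + warmup + active
    # Assumption: repeat=0 means the cycle repeats until training ends.
    if repeat > 0 and step >= cycle * repeat:
        return "idle"
    offset = step % cycle
    if offset < wait:
        return "wait"
    if offset < wait + warmup:
        return "warmup"
    return "active"


# Same settings as the schedule in the sample above, over 15 steps.
phases = [
    schedule_phase(s, wait=1, warmup=1, active=2, repeat=1, skip_first=2)
    for s in range(15)
]
for step, phase in enumerate(phases):
    print(step, phase)
```

Under these assumptions, only the fifth and sixth steps (indices 4 and 5) are in the active phase, and every later step is idle because repeat=1 allows a single cycle.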

Parse profile data.
Both automatic parsing (see tensorboard_trace_handler in the preceding sample code) and offline parsing are supported.
View and analyze the profile data result files.
For details about the profile data result files, see MindSpore & PyTorch Profile Data File References.

For details about how to visualize and analyze the parsed profile data files, see MindStudio Insight User Guide.

You can also use the msprof-analyze tool to analyze the profile data.

Dynamic Profiling Using mindspore.profiler.DynamicProfilerMonitor

If you want to modify the profiling configuration during training and profile with the new settings without interrupting the training process, use the mindspore.profiler.DynamicProfilerMonitor API. This API requires a profiler_config.json file. If you do not provide one, a file with default settings is generated.

For details about the mindspore.profiler.DynamicProfilerMonitor API and parameters in the .json configuration file, see mindspore.profiler.DynamicProfilerMonitor.

Create the profiler_config.json file. The following is an example:

```
{
   "start_step": 2,
   "stop_step": 5,
   "aic_metrics": -1,
   "profiler_level": 0,
   "activities": 0,
   "export_type": 0,
   "profile_memory": false,
   "mstx": false,
   "analyse": true,
   "analyse_mode": 0,
   "parallel_strategy": false,
   "with_stack": false,
   "data_simplification": true
}
```
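
If you prefer to generate the configuration file from a script, the standard json module is sufficient. The following is a minimal sketch, not an official utility: write_profiler_config is a hypothetical helper, the start_step/stop_step check is an added sanity check rather than documented behavior, and the sketch assumes that the file must be named profiler_config.json inside the directory that you later pass as cfg_path. The values are copied from the example configuration above.

```python
import json
import tempfile
from pathlib import Path

# Values copied from the example profiler_config.json above.
PROFILER_CONFIG = {
    "start_step": 2,
    "stop_step": 5,
    "aic_metrics": -1,
    "profiler_level": 0,
    "activities": 0,
    "export_type": 0,
    "profile_memory": False,
    "mstx": False,
    "analyse": True,
    "analyse_mode": 0,
    "parallel_strategy": False,
    "with_stack": False,
    "data_simplification": True,
}


def write_profiler_config(cfg_dir, config):
    """Write config as profiler_config.json under cfg_dir and return its path."""
    # Hypothetical sanity check: the step range should not be inverted.
    if config["start_step"] > config["stop_step"]:
        raise ValueError("start_step must not be greater than stop_step")
    cfg_dir = Path(cfg_dir)
    cfg_dir.mkdir(parents=True, exist_ok=True)
    cfg_file = cfg_dir / "profiler_config.json"
    cfg_file.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return cfg_file


# A temporary directory is used here; in practice, use your own directory.
cfg_file = write_profiler_config(tempfile.mkdtemp(), PROFILER_CONFIG)
print(cfg_file.read_text(encoding="utf-8"))
```

Python's True and False are serialized as the JSON literals true and false, so the written file matches the example.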

Add the following sample code to the training script to configure the profiling parameters, and then start training.

For the complete sample, see dynamic_profiler.

```
from mindspore.profiler import DynamicProfilerMonitor

# cfg_path specifies the path to the preceding .json configuration file, and output_path specifies the output path.
dp = DynamicProfilerMonitor(cfg_path="./cfg_path", output_path="./output_path")
STEP_NUM = 15
# Net and train are placeholders for the network class and the single-step
# training function defined in your own training script.
net = Net()
for _ in range(STEP_NUM):
    train(net)
    # Call step() once after each training step.
    dp.step()
```

Parse profile data.
Both automatic parsing (controlled by the analyse parameter in the configuration file) and offline parsing are supported.
View and analyze the profile data result files.
For details about the profile data result files, see MindSpore & PyTorch Profile Data File References.

For details about how to visualize and analyze the parsed profile data files, see MindStudio Insight User Guide.

You can also use the msprof-analyze tool to analyze the profile data.

Profiling Using the Environment Variable

Set the parameters in the MS_PROFILER_OPTIONS environment variable. Profile data will be automatically collected during model training.

For details about the environment variable, see Environment Variables.

This method applies only to single-rank scenarios.
This method does not support the schedule, on_trace_ready, and experimental_config parameters.
Before executing the training script, set device_id using the environment variable. device_id cannot be set by using the set_context function in the training script.

Set the environment variable as follows:

```
export MS_PROFILER_OPTIONS='
{"start": true,
"output_path": "./output_path",
"activities": ["CPU", "NPU"],
"with_stack": true,
"aic_metrics": "AicoreNone",
"l2_cache": false,
"profiler_level": "Level0"}'
```

start must be set to true to start data profiling.
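
When training is launched from a Python wrapper rather than a shell, the same value can be built with json.dumps, which avoids quoting mistakes in hand-written JSON. The following is a minimal sketch under stated assumptions: build_profiler_env is a hypothetical helper, the option values are copied from the export command above, and the commented-out launch line uses a placeholder script name.

```python
import json
import os

# Values copied from the MS_PROFILER_OPTIONS example above.
options = {
    "start": True,
    "output_path": "./output_path",
    "activities": ["CPU", "NPU"],
    "with_stack": True,
    "aic_metrics": "AicoreNone",
    "l2_cache": False,
    "profiler_level": "Level0",
}


def build_profiler_env(options, base_env=None):
    """Return a copy of the environment with MS_PROFILER_OPTIONS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MS_PROFILER_OPTIONS"] = json.dumps(options)
    return env


env = build_profiler_env(options)
print(env["MS_PROFILER_OPTIONS"])

# To launch training with this environment (the script name is a placeholder):
# import subprocess, sys
# subprocess.run([sys.executable, "train_example.py"], env=env, check=True)
```

The environment is passed to the child process rather than set inside the training script, so the variable is already in place when the training process starts.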

Run the training script to complete data profiling.
Parse profile data.
Both automatic parsing and offline parsing are supported.
View and analyze the profile data result files.
For details about the profile data result files, see MindSpore & PyTorch Profile Data File References.

For details about how to visualize and analyze the parsed profile data files, see MindStudio Insight User Guide.

You can also use the msprof-analyze tool to analyze the profile data.

Profiling Using the msprof CLI

MindSpore 2.0 or later supports data profiling using the msprof CLI.

For details about the msprof command line parameters, see Profile Data Collecting and Parsing.

Run the msprof command to start data profiling. The following is an example:
```
msprof --output=./output_path python3 ./train_*.py
```
python3 ./train_*.py is the training execution command and must be added to the end of the msprof command.
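
If the command is assembled by a launcher script, keeping the training command as the trailing arguments can be enforced in code. The following is a minimal sketch: build_msprof_command is a hypothetical helper, train_example.py is a placeholder script name, and only the --output option shown above is used.

```python
import shlex


def build_msprof_command(output_dir, train_command):
    """Return the msprof argument list with the training command at the end."""
    if not train_command:
        raise ValueError("train_command must contain the training command")
    return ["msprof", f"--output={output_dir}", *train_command]


# train_example.py is a placeholder for your own training script.
cmd = build_msprof_command("./output_path", ["python3", "./train_example.py"])
print(shlex.join(cmd))

# To run it on a machine where msprof is installed:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting issues when the script path or its arguments contain spaces.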

After the profiling is complete, the PROF_XXX directory is generated in the directory specified by --output.
Parse profile data.
The mindstudio_profiler_output directory in the PROF_XXX directory contains the profile data that is automatically parsed. For details about how to manually parse the profile data, see Offline Parsing.
View and analyze the profile data result files.
For details about the profile data result files, see MindSpore & PyTorch Profile Data File References.

For details about how to visualize and analyze the parsed profile data files, see MindStudio Insight User Guide.

You can also use the msprof-analyze tool to analyze the profile data.
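
After an msprof run, you may want to confirm which PROF_XXX directories already contain automatically parsed results. The following is a minimal sketch: find_parsed_outputs is a hypothetical helper, and the sketch assumes that result directories match the pattern PROF_* directly under the --output directory. The demonstration uses a throwaway directory tree rather than real profile data.

```python
import tempfile
from pathlib import Path


def find_parsed_outputs(output_dir):
    """Return mindstudio_profiler_output directories under PROF_* directories."""
    results = []
    for prof_dir in sorted(Path(output_dir).glob("PROF_*")):
        parsed = prof_dir / "mindstudio_profiler_output"
        if parsed.is_dir():
            results.append(parsed)
    return results


# Demonstration on a temporary tree: one parsed run and one unparsed run.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "PROF_demo_a" / "mindstudio_profiler_output").mkdir(parents=True)
(demo_root / "PROF_demo_b").mkdir()

found = find_parsed_outputs(demo_root)
print([p.relative_to(demo_root).as_posix() for p in found])
```

A PROF_* directory that is missing from the result has no parsed output yet and is a candidate for offline parsing.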
