Profile Data Collection
Prerequisites
- Currently, MindStudio does not support profile data collection in cluster scenarios. Instead, you can use Merge Reports to import the parent directory of the PROF_XXX directories and display the collected profile data. For details about profile data collection in cluster scenarios, see "Advanced Functions > Profiling in Cluster Scenarios" in the Profiling Instructions.
- To collect data using a training project, add the PROFILING_OPTIONS configuration to the environment variable script file env_*.sh of the training project. The following is an example:
export PROFILING_MODE=true
export PROFILING_OPTIONS='{"output":"/tmp/profiling","training_trace":"on","task_trace":"on","fp_point":"","bp_point":"","aic_metrics":"MemoryL0"}'
The path specified by output stores the profile data collected on the server by Profiling. The data is then copied to the path specified by Project Location, and a .json result file is generated for display in MindStudio.
The PROFILING_OPTIONS field configures the profiling items. Set the options as needed. For details about the options for adding Profiling configurations to the training project script, see "Appendixes > Profiling Options" in the Profiling Instructions.
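The additions to env_*.sh described above can be sketched as a self-contained snippet. The output path /tmp/profiling is an example; the JSON sanity check with python3 -m json.tool is an optional extra for catching quoting mistakes and is not part of MindStudio:

```shell
# Sketch of the lines to add to env_*.sh; replace /tmp/profiling with your own
# output directory as needed.
export PROFILING_MODE=true
export PROFILING_OPTIONS='{"output":"/tmp/profiling","training_trace":"on","task_trace":"on","fp_point":"","bp_point":"","aic_metrics":"MemoryL0"}'

# Optional sanity check: confirm the options string is valid JSON before
# launching the training job (a stray quote here is a common source of errors).
printf '%s' "$PROFILING_OPTIONS" | python3 -m json.tool > /dev/null \
  && echo "PROFILING_OPTIONS is valid JSON"
```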
Procedure
- In the navigation tree on the left of the welcome page, click Projects, then select and open a built project.
- On the menu bar, choose . The profiling configuration window is displayed.
- Under Project Properties, set Project Name and Project Location. See Figure 1. Click Next.
Table 1 Project Properties parameters
Project Name
Profiling project name customized by the user. After the configuration, a folder named after the profiling project name is automatically created in the MindstudioProjects directory of Project Location. The raw profile data directory PROF_XXX and the parsing result .json file generated during profile data collection and parsing are stored in this folder. The parsing result .json file is named in the following format: report_{timestamp}_{device_id}_{model_id}_{iter_id}.json, in which {device_id} indicates the device ID, {model_id} indicates the model ID, and {iter_id} indicates the ID of an iteration.
Project Location
Profile data output path. After profile data collection is complete, a file directory named after the project name is generated in the path.
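The parsing-result file name format described above can be split into its fields with plain shell parameter expansion. This is illustrative only; the file name below is a made-up example:

```shell
# Split report_{timestamp}_{device_id}_{model_id}_{iter_id}.json into fields.
# "report_20230801123000_0_1_5.json" is a hypothetical example file name.
name="report_20230801123000_0_1_5.json"

base=${name%.json}      # drop the .json suffix
base=${base#report_}    # drop the report_ prefix
IFS=_ read -r timestamp device_id model_id iter_id <<EOF
$base
EOF

echo "device=$device_id model=$model_id iteration=$iter_id"
# prints: device=0 model=1 iteration=5
```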
- Access the Executable Properties page, select Remote Run or Local Run, and specify the project directory (Project Path), based on which the project type is identified. See the following figures.
Figure 2 Remote Run (inference application project)
Figure 3 Remote Run (inference operator project)
Figure 4 Remote Run (training project)
After the configuration is complete, the Deployment parameter is bound to the Environment Variables and Remote Toolkit Path parameters. Click Next to save the parameter settings. During re-configuration, if Deployment has already been configured, Environment Variables and Remote Toolkit Path are set automatically and can be modified manually.
Figure 5 Local Run (inference application project)
Figure 6 Local Run (inference operator project)
Figure 7 Local Run (training project)
Table 2 Executable Properties parameters
Run Mode
- Remote Run
- Local Run
In Windows OSs, only Remote Run is supported.
Deployment
Run configuration. This parameter is mandatory and is available only when Remote Run is selected. You can use the Deployment function to synchronize the files and folders in a specified project to a specified directory on a remote device. For details, see Deployment.
Project Path
Application project path. This parameter is mandatory.
The following project types can be identified based on the specified target project:
- Ascend App: The specified target project is an inference application project.
- Ascend Operator: The specified target project is an inference operator project.
- Ascend Training: The specified target project is a training project. In this case, you can click Start to start Profiling.
Executable File
Executable file of the application project for profiling. This parameter is mandatory.
This parameter must be set to an executable file under the Project Path directory.
You can specify the binary script file, Python script file, or shell script file.
Due to the restrictions of the msprof tool, the requirements for specifying a Python script file are as follows:
- The paths in the Python script of the PyACL project must be absolute paths.
- Asynchronous APIs (whose names end with async) cannot be called.
The shell script file is provided by the user and does not need to be saved in the Project Path.
Command Arguments
Application execution parameters. Configure them as needed and separate multiple arguments with spaces. This parameter is empty by default.
Environment Variables
Environment variable configuration. You can configure the environment variables manually or click the icon to configure them in the displayed dialog box. This parameter is optional.
Remote Toolkit Path
Installation path of the Toolkit software package in the remote operating environment. This parameter is mandatory and is available only when Remote Run is selected. Example: ${HOME}/Ascend/ascend-toolkit/{version}/toolkit.
This parameter is bound to the Deployment parameter. After you click Next, the parameter value is saved. During re-configuration, if Deployment has already been configured, this parameter is set automatically and can be modified manually.
CANN Version
CANN software package version. This parameter is displayed when Local Run is selected and is mandatory.
It is specified during project creation in MindStudio. If the version is not specified, click Change to specify the installation path of the CANN software package.
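Before entering the Remote Toolkit Path, it can help to confirm the directory actually exists on the remote host. A minimal check, assuming the version directory "6.0" purely as an example:

```shell
# Hypothetical helper: verify a candidate Remote Toolkit Path before entering
# it in the profiling configuration. The version directory "6.0" is an
# assumption; substitute the version installed on your environment.
TOOLKIT_PATH="${HOME}/Ascend/ascend-toolkit/6.0/toolkit"

if [ -d "$TOOLKIT_PATH" ]; then
  echo "toolkit found: $TOOLKIT_PATH"
else
  echo "toolkit not found: $TOOLKIT_PATH" >&2
fi
```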
- Access the Profiling Options page. In AI Core Profiling, you can configure Task-based or Sample-based collection. See Figure 8 and Figure 9.
Table 3 Profiling Options parameters
AI Core Profiling
Task-based
AI Core profiling switch. It collects profile data task by task. The default value is Pipeline Utilization.
- Pipeline Utilization: percentage of time taken by the compute units and MTEs
- Arithmetic Utilization: percentage of time taken by the cube and vector instructions
- UB/L1/L2/Main Memory Bandwidth: memory read/write bandwidth rate of UB/L1/L2/main memory
- L0A/L0B/L0C Memory Bandwidth: memory read/write bandwidth rate of L0A/L0B/L0C
- UB Memory Bandwidth: UB read/write bandwidth rate of MTE/Vector/Scalar
Sample-based
AI Core profiling switch. It collects profile data at a fixed interval (AI Core-Sampling Interval). The default value is Pipeline Utilization.
- Pipeline Utilization: percentage of time taken by the compute units and MTEs
- Arithmetic Utilization: percentage of time taken by the cube and vector instructions
- UB/L1/L2/Main Memory Bandwidth: memory read/write bandwidth rate of UB/L1/L2/main memory
- L0A/L0B/L0C Memory Bandwidth: memory read/write bandwidth rate of L0A/L0B/L0C
- UB Memory Bandwidth: UB read/write bandwidth rate of MTE/Vector/Scalar
- Frequency (Hz): sampling frequency. The default value is 100 and the value must be in the range [1, 100].
MsprofTX
MsprofTX
Switch that controls whether the MsprofTX user program and the upper-layer framework output profile data. This parameter is optional and is disabled by default.
API Trace
AscendCL API
AscendCL profiling switch. It traces AscendCL API calls. This switch is enabled by default and cannot be disabled.
Runtime API
Runtime profiling switch. It traces Runtime API calls. This parameter is optional and is disabled by default.
OS Runtime API
Switch for tracing function library API and Pthreads API calls during system running. This parameter is optional and is disabled by default.
NOTE: The third-party open-source tools perf and ltrace must be installed to collect data of OS Runtime API calls. For details, see Before You Start. Using ltrace to collect OS Runtime API call data may cause high CPU usage. In addition, because ltrace interacts with the application's pthread locking and unlocking, it may slow down the profiled process.
Graph Engine (GE)
Graph Engine profiling switch. It traces the scheduling information of Graph Engine. This switch is enabled by default and cannot be disabled.
AI CPU Operators
AI CPU profiling switch, which is used to collect enhanced AI CPU profile data. This parameter is optional and is disabled by default.
Device System Profiling
DDR
DDR sampling switch. This parameter is optional and is disabled by default.
When enabled, you can change the sampling frequency (Hz). The value must be in the range [1, 1000]; the default is 50 Hz.
Host System Profiling
CPU
Samples the host CPU usage. This parameter is optional and is disabled by default.
Memory
Samples the host memory usage. This parameter is optional and is disabled by default.
Disk
Samples the host disk usage. This parameter is optional and is disabled by default.
NOTE: The third-party open-source tool iotop must be installed to collect disk data. For details, see Before You Start.
Network
Samples the host network usage. This parameter is optional and is disabled by default.
HCCL
HCCL
HCCL profiling switch. This parameter is optional and is disabled by default.
After profiling is complete, only the data of the first iteration of the model (identified by model ID) with the largest number of iterations is exported by default.
- After the preceding configurations are complete, click Start in the lower right corner of the window to start Profiling.
The profiling results will be automatically displayed in the MindStudio window after the execution is complete.
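As a rough sketch of the output layout described in Table 1 (the project directory and report file below are stand-ins created just for this example), you can list the generated parsing results from the command line:

```shell
# Stand-in project folder mimicking the layout under Project Location.
PROJECT_DIR="$(mktemp -d)/my_profiling_project"
mkdir -p "$PROJECT_DIR"
: > "$PROJECT_DIR/report_20230801123000_0_1_1.json"   # stand-in report file

# Parsing results follow report_{timestamp}_{device_id}_{model_id}_{iter_id}.json
for f in "$PROJECT_DIR"/report_*.json; do
  echo "found report: $f"
done
```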


