Profile Data Collection

Prerequisites

Currently, MindStudio IDE does not support data collection in cluster scenarios. You can use the Import Result function to import the parent directory of PROF_XXX to display the collected cluster profile data.
For details about profile data collection in cluster scenarios, see "Appendixes" > "Performance Analysis in Cluster Training Scenarios" in the Profiling Instructions.
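Before importing, it can help to confirm that the directory you plan to select really contains PROF_XXX subdirectories. The following is a minimal sketch only: `list_prof_dirs` is a hypothetical helper name, and the assumption that raw data directories start with `PROF_` is taken from the PROF_XXX naming above.

```shell
# list_prof_dirs is a hypothetical helper, not part of MindStudio IDE or msprof.
# It prints the name of every PROF_* subdirectory found directly under $1.
list_prof_dirs() {
    parent_dir="$1"
    for d in "$parent_dir"/PROF_*; do
        # When nothing matches, the glob stays unexpanded, so check for a real directory.
        if [ -d "$d" ]; then
            printf '%s\n' "${d##*/}"
        fi
    done
    return 0
}
```

If the helper prints nothing for a directory, that directory is probably not the parent directory expected by Import Result.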
To collect data from a training project, add the PROFILING_OPTIONS configuration to the environment variable script file env_*.sh of the training project. The following is an example:
```
export PROFILING_MODE=true
export PROFILING_OPTIONS='{"output":"/tmp/profiling","training_trace":"on","task_trace":"on","fp_point":"","bp_point":"","aic_metrics":"MemoryL0"}'
```
The path specified by output stores the profile data that Profiling collects on the server. This data is copied to the path specified by Project Location, and a .json result file is generated for MindStudio IDE to display.

The PROFILING_OPTIONS field is used to configure profiling items. Select options as required. For details about the options for adding profiling configurations to the training project script, see "Other Collection Modes" > "Collecting Data Using TensorFlow Framework Interfaces" > "Profiling Options" in the Profiling Instructions.
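If the output path changes between runs, the JSON string can be assembled from a variable instead of being edited by hand. The following is a minimal sketch under stated assumptions: `set_profiling_env` is a hypothetical helper name, and the option keys and values are only those shown in the example above.

```shell
# Hypothetical helper for env_*.sh; the name set_profiling_env is not part of the tool.
set_profiling_env() {
    # $1: directory for raw profile data (the "output" option). Defaults to /tmp/profiling.
    # Assumption: the path contains no double quotes or backslashes, which would
    # need escaping before being embedded in the JSON string.
    profiling_output_dir="${1:-/tmp/profiling}"

    export PROFILING_MODE=true
    export PROFILING_OPTIONS="{\"output\":\"${profiling_output_dir}\",\"training_trace\":\"on\",\"task_trace\":\"on\",\"fp_point\":\"\",\"bp_point\":\"\",\"aic_metrics\":\"MemoryL0\"}"
}

set_profiling_env /tmp/profiling
printf '%s\n' "$PROFILING_OPTIONS"
```

Keeping the path in one argument avoids quoting mistakes when the same script is reused with a different output directory.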

Procedure

In the navigation bar on the left of the welcome page, click Projects, then select and open a built project.
Choose Ascend > System Profiler from the menu bar. The system analysis project page is displayed.
Figure 1 System analysis project page

On the system analysis project page, click New Project on the welcome page or the icon in the upper left corner. The profiling configuration window is displayed, as shown in Figure 2.

Set Project Name and Project Location under Project Properties. Click Next.

**Table 1** **Project Properties** parameters
Parameter		Description
Project Name		Profiling project name customized by the user. After the configuration, a folder named after the project name is automatically created in the directory specified by Project Location. The collected raw profile data directory PROF_XXX and the data parsing result .json file are stored in this folder. NOTE: The parsing result .json file is named in the following format: report_{timestamp}_{device_id}_{model_id}_{iter_id}.json, in which {device_id} indicates the device ID, {model_id} indicates the model ID, and {iter_id} indicates the ID of an iteration.
Project Location		Profile data output path. After profile data collection is complete, a file directory named after the project name is generated in the path.
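The naming format of the parsing result file can be split back into its fields when you need to find the result for a particular device or iteration. The following is a sketch only: `parse_report_name` is a hypothetical helper, the file name in the usage line is invented for illustration, and the code assumes that {device_id}, {model_id}, and {iter_id} contain no underscores (the timestamp may).

```shell
# parse_report_name is a hypothetical helper, not part of MindStudio IDE.
# It splits report_{timestamp}_{device_id}_{model_id}_{iter_id}.json into fields.
# The body runs in a subshell so its variables do not leak into the caller.
parse_report_name() (
    name=$(basename "$1" .json)      # drop directory and .json suffix
    rest=${name#report_}             # drop the fixed prefix

    # Peel fields off from the right, so a timestamp containing
    # underscores is still kept in one piece.
    iter_id=${rest##*_};   rest=${rest%_*}
    model_id=${rest##*_};  rest=${rest%_*}
    device_id=${rest##*_}; timestamp=${rest%_*}

    printf 'timestamp=%s device_id=%s model_id=%s iter_id=%s\n' \
        "$timestamp" "$device_id" "$model_id" "$iter_id"
)

# Illustrative file name only; real timestamps may use a different format.
parse_report_name "report_20220630_3_7_12.json"
```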

Figure 2 Configuring project properties

Access the Executable Properties configuration page, as shown in Figure 3.

Figure 3 Executable Properties

**Table 2** **Executable Properties** parameters
Parameter		Description
Project Path		Path of the target project for profiling. This parameter is mandatory. If the specified target project is a training project, you can click Start to directly start the Profiling tool.
Executable File		Executable file of the target project for profiling. This parameter is mandatory. Set this parameter to an executable file in the Project Path subdirectory. It can be a binary file (such as the main file), a Python script file (such as the train.py file), or a shell script file (such as the npu_set_env_1p.sh file). Due to the restrictions of the msprof tool, a Python script file must meet the following requirements: (1) Paths in the Python script of the pyACL project must be absolute paths. (2) Asynchronous APIs (whose names end with async) cannot be called. The shell script file is provided by the user and does not need to be saved in Project Path.
Command Arguments		Application execution parameters. Configure this as required and separate arguments with spaces. By default, this parameter is left empty.
Environment Variables		Environment variable configuration. You can manually configure the environment variables or click the icon to configure them in the dialog box that is displayed. This parameter is optional.
CANN Version		CANN package version. This parameter is mandatory. It is specified during project creation in MindStudio IDE. If the version is not specified, click Change to specify the installation path of the CANN package.
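A quick check before filling in Executable File is whether the chosen file actually lies under Project Path. The following is a minimal sketch: `is_under_dir` is a hypothetical helper, and it treats a file directly in Project Path or in any of its subdirectories as acceptable, which is an assumption about how the rule in Table 2 is meant. Per Table 2, a user-provided shell script does not have to pass this check.

```shell
# is_under_dir is a hypothetical helper, not part of MindStudio IDE.
# $1: Project Path, $2: candidate Executable File.
# Succeeds only if $2 is a regular file located in $1 or below it.
is_under_dir() (
    # Resolve both directories to physical absolute paths.
    project_dir=$(cd "$1" 2>/dev/null && pwd -P) || exit 1
    file_dir=$(cd "$(dirname "$2")" 2>/dev/null && pwd -P) || exit 1
    [ -f "$2" ] || exit 1

    # The trailing slash prevents /a/bc from matching /a/b.
    case "$file_dir/" in
        "$project_dir"/*) exit 0 ;;
        *) exit 1 ;;
    esac
)
```

For example, `is_under_dir "$HOME/my_project" "$HOME/my_project/src/train.py"` succeeds only if train.py exists at that location; both paths here are illustrative.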

Click Next to obtain the profiling configuration. A dialog box is displayed, as shown in Figure 4.
Figure 4 Obtaining profiling configuration

The Profiling Options page is displayed. You can set Mode under AI Core Profiling to Task-based or Sample-based. See Figure 5 and Figure 6.

Figure 5 Task-based scenario

Figure 6 Sample-based scenario

**Table 3** **Profiling Options** parameters
Parameter			Description
AI Core Profiling	Mode		Task-based: AI Core profiling switch. It collects profile data task by task. The default value is Pipeline Utilization. Sample-based: AI Core profiling switch. It collects profile data at a fixed interval (AI Core-Sampling Interval). The default value is Pipeline Utilization.
	Metrics		When Mode is set to Task-based: (1) Pipeline Utilization: percentage of time taken by the compute units and MTEs; (2) Arithmetic Utilization: percentage of time taken by the cube and vector instructions; (3) UB/L1/L2/Main Memory Bandwidth: memory read/write bandwidth rate of UB/L1/L2/main memory; (4) L0A/L0B/L0C Memory Bandwidth: memory read/write bandwidth rate of L0A/L0B/L0C; (5) UB Memory Bandwidth: UB read/write bandwidth rate of MTE/Vector/Scalar. When Mode is set to Sample-based: (1) Pipeline Utilization: percentage of time taken by the compute units and MTEs; (2) Arithmetic Utilization: percentage of time taken by the cube and vector instructions; (3) UB/L1/L2/Main Memory Bandwidth: memory read/write bandwidth rate of UB/L1/L2/main memory; (4) L0A/L0B/L0C Memory Bandwidth: memory read/write bandwidth rate of L0A/L0B/L0C; (5) UB Memory Bandwidth: UB read/write bandwidth rate of MTE/Vector/Scalar.
	L2Cache		L2 sampling switch in Task-based profiling. This parameter is optional and is disabled by default.
	Frequency(Hz)		Sampling frequency (Hz) in Sample-based profiling. Defaults to 100. Must be in the range [1, 100].
MsprofTX	MsprofTX		Switch that controls whether MsprofTX outputs profile data from user programs and upper-layer framework programs. This parameter is optional and is disabled by default.
API Trace	AscendCL API		AscendCL profiling switch. It traces AscendCL API calls. This parameter is enabled by default.
	Runtime API		Runtime profiling switch. It traces Runtime API calls. This parameter is optional and is disabled by default.
	Graph Engine(GE)		Graph Engine profiling switch. It traces the scheduling information of Graph Engine. This switch is enabled by default and cannot be disabled.
	AICPU Operators		AI CPU profiling switch, which is used to collect enhanced AI CPU profile data. This parameter is optional and is disabled by default.
HCCL	HCCL		HCCL profiling switch. This parameter is optional and is disabled by default. After the profiling is complete, only data of the first iteration of the model (ID) with the largest number of iterations is exported by default.
Device System Profiling	CPU & Memory Usage Profiling		Profiling switch for system CPU usage and system memory. This parameter is optional and is disabled by default. You can change the sampling frequency (Hz). The value must be in the range [1, 10] and defaults to 10.
Host System Profiling	Application Based System Profiling	CPU	Samples the host CPU usage. This parameter is optional and is disabled by default.
		Memory	Samples the host memory usage. This parameter is optional and is disabled by default.
		Disk	Samples the host disk usage. This parameter is optional and is disabled by default. NOTE: The third-party open-source tool iotop must be installed for collecting data of disk calls. For details, see Before You Start.
		Network	Samples the host network usage. This parameter is optional and is disabled by default.
		Syscall & PThreadcall	Samples host-side syscall and pthreadcall. This parameter is optional and is disabled by default.
	System CPU & Memory Usage	CPU	Samples the CPU usage of the host system and all processes. This parameter is optional and is disabled by default.
		Memory	Samples the memory usage of the host system and all processes. This parameter is optional and is disabled by default.
		Frequency(Hz)	CPU and memory usage sampling frequency (Hz). Defaults to 50. Must be in the range [1, 50].

Table 3 lists the configuration options for full collection. The options actually available for a given processor are those shown in the GUI.
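Table 3 gives three different sampling frequency ranges: [1, 100] for AI Core Sample-based profiling, [1, 10] for device CPU and memory usage, and [1, 50] for host system CPU and memory usage. If these values are prepared outside the GUI (for example, in a team checklist or script), a range check along the following lines may help. This is a sketch only, and `in_range` is a hypothetical helper.

```shell
# in_range is a hypothetical helper, not part of MindStudio IDE.
# $1: value, $2: minimum, $3: maximum.
# Succeeds only if the value is a non-negative integer within [minimum, maximum].
in_range() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Ranges taken from Table 3.
if in_range 100 1 100; then echo "AI Core sampling frequency is valid"; fi
if ! in_range 20 1 10; then echo "Device sampling frequency is out of range"; fi
if in_range 50 1 50; then echo "Host sampling frequency is valid"; fi
```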

After the preceding configurations are complete, click Start in the lower right corner of the window to start profile data collection.
The performance analysis results will be automatically displayed in the MindStudio IDE window after the execution is complete.
