Profile Data Collection

Prerequisites

  • Currently, MindStudio does not support profile data collection in cluster scenarios. Instead, you can use Merge Reports to import the parent directory of the PROF_XXX directories and display the collected profile data.

    For details about profile data collection in cluster scenarios, see "Advanced Functions > Profiling in Cluster Scenarios" in the Profiling Instructions.

  • To collect data from a training project, add the PROFILING_OPTIONS configuration to the environment variable script file env_*.sh of the training project. The following is an example:
    export PROFILING_MODE=true
    export PROFILING_OPTIONS='{"output":"/tmp/profiling","training_trace":"on","task_trace":"on","fp_point":"","bp_point":"","aic_metrics":"MemoryL0"}'

    The path specified by output stores the profile data that Profiling collects on the server. After collection, the data is copied to the path specified by Project Location, where a .json result file is generated for display in MindStudio.

    The PROFILING_OPTIONS field configures the profiling items. Enable only the options you need. For details about the options available when adding a Profiling configuration to a training project script, see "Appendixes > Profiling Options" in the Profiling Instructions.
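    Because PROFILING_OPTIONS is a single JSON string, a stray quote or comma in env_*.sh can silently break collection. The sketch below, which assumes python3 is available on the training server, validates the string before the script is sourced (the option values mirror the example above):

    ```shell
    #!/bin/sh
    # Sanity-check the PROFILING_OPTIONS JSON string before sourcing env_*.sh.
    # The option values below mirror the documentation example; adjust to your setup.
    PROFILING_OPTIONS='{"output":"/tmp/profiling","training_trace":"on","task_trace":"on","fp_point":"","bp_point":"","aic_metrics":"MemoryL0"}'

    if printf '%s' "$PROFILING_OPTIONS" | python3 -m json.tool > /dev/null 2>&1; then
        echo "PROFILING_OPTIONS is valid JSON"
    else
        echo "PROFILING_OPTIONS is NOT valid JSON; fix env_*.sh before profiling" >&2
        exit 1
    fi
    ```

    A malformed string fails fast here instead of producing an empty or partial PROF_XXX directory after a full training run.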

Procedure

  1. In the navigation tree on the left of the welcome page, click Projects, and select and open a built project.
  2. On the menu bar, choose Ascend > System Profiler > New Project. The profiling configuration window is displayed.
  3. Under Project Properties, set Project Name and Project Location. See Figure 1. Click Next.
    Figure 1 Configuring project properties
    Table 1 Project Properties parameters

    Parameter

    Description

    Project Name

    Profiling project name customized by the user. After configuration, a folder named after the profiling project is automatically created in the MindstudioProjects directory of Project Location. This folder stores the raw profile data directory PROF_XXX and the parsed .json result file generated during profile data collection and parsing. The .json result file is named report_{timestamp}_{device_id}_{model_id}_{iter_id}.json, where {device_id} is the device ID, {model_id} is the model ID, and {iter_id} is the iteration ID.

    Project Location

    Profile data output path. After profile data collection is complete, a file directory named after the project name is generated in the path.
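    The report_{timestamp}_{device_id}_{model_id}_{iter_id}.json naming convention above can be unpacked with plain shell string operations when you need to sort or filter result files, as in this sketch (the filename is a hypothetical example):

    ```shell
    #!/bin/sh
    # Split a parsing-result filename of the form
    # report_{timestamp}_{device_id}_{model_id}_{iter_id}.json into its fields.
    f="report_20230101123000_0_1_5.json"   # hypothetical example filename

    base=${f%.json}                        # strip the .json suffix
    iter_id=${base##*_};  base=${base%_*}  # last field: iteration ID
    model_id=${base##*_}; base=${base%_*}  # next: model ID
    device_id=${base##*_}                  # next: device ID

    echo "device=$device_id model=$model_id iter=$iter_id"
    # prints: device=0 model=1 iter=5
    ```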

  4. On the Executable Properties page, select Remote Run or Local Run. The project type is identified based on the specified project directory (Project Path). See the following figures.
    Figure 2 Remote Run (inference application project)
    Figure 3 Remote Run (inference operator project)
    Figure 4 Remote Run (training project)

    After the configuration is complete, Deployment is bound to the Environment Variables and Remote Toolkit Path parameters. Click Next to save the parameter settings. During re-configuration, if Deployment has already been configured, the Environment Variables and Remote Toolkit Path parameters are set automatically and can be modified manually.

    Figure 5 Local Run (inference application project)
    Figure 6 Local Run (inference operator project)
    Figure 7 Local Run (training project)
    Table 2 Executable Properties parameters

    Parameter

    Description

    Run Mode

    • Remote Run
    • Local Run

    On Windows, only Remote Run is supported.

    Deployment

    Run configuration. This parameter is mandatory and is available only when Remote Run is selected. You can use the Deployment function to synchronize the files and folders in a specified project to a specified directory on a remote device. For details, see Deployment.

    Project Path

    Application project path. This parameter is mandatory.

    The following project types can be identified based on the specified target project:

    • Ascend App: The specified target project is an inference application project.
    • Ascend Operator: The specified target project is an inference operator project.
    • Ascend Training: The specified target project is a training project. In this case, you can click Start to start Profiling.

    Executable File

    Executable file of the application project for profiling. This parameter is mandatory.

    This parameter must be set to an executable file under the Project Path directory.

    You can specify a binary file, a Python script file, or a shell script file.

    Due to the restrictions of the msprof tool, the requirements for specifying a Python script file are as follows:

    • The paths in the Python script of the PyACL project must be absolute paths.
    • Asynchronous APIs (whose names end with async) cannot be called.

    The shell script file is provided by the user and does not need to be saved in the Project Path.

    Command Arguments

    Application execution arguments. Configure them as needed and separate them with spaces. This parameter is left empty by default.

    Environment Variables

    Environment variable configuration. You can enter the environment variables manually or click to configure them in the dialog box that is displayed. This parameter is optional.

    Remote Toolkit Path

    Installation path of the Toolkit software package in the remote operating environment, available when you select the Remote Run mode. This parameter is mandatory. Example: ${HOME}/Ascend/ascend-toolkit/{version}/toolkit.

    This parameter is bound to the Deployment parameter. After you click Next, the parameter value is saved. During re-configuration, if Deployment has already been configured, this parameter is set automatically and can be modified manually.

    CANN Version

    CANN software package version. This parameter is displayed when Local Run is selected and is mandatory.

    It is specified during project creation in MindStudio. If the version is not specified, click Change to specify the installation path of the CANN software package.
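    Before starting a run, it can save a failed session to confirm that the directory configured in Remote Toolkit Path (or the local CANN installation path) actually exists. A minimal sketch, assuming the example path is adjusted to your installation:

    ```shell
    #!/bin/sh
    # check_toolkit PATH: verify that a Toolkit installation directory exists.
    # Returns 0 if the directory is present, 1 otherwise.
    check_toolkit() {
        if [ -d "$1" ]; then
            echo "Toolkit found at $1"
        else
            echo "Toolkit not found at $1; check the configured Toolkit path" >&2
            return 1
        fi
    }

    # Example call (the path is a placeholder; adjust to your installation):
    check_toolkit "${HOME}/Ascend/ascend-toolkit/latest/toolkit" || true
    ```

    For Remote Run, the same test can be executed over ssh on the remote device before filling in the Remote Toolkit Path field.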

  5. The Profiling Options page is displayed. You can configure Task-based or Sample-based in AI Core Profiling. See Figure 8 and Figure 9.
    Figure 8 Task-based scenario
    Figure 9 Sample-based scenario
    Table 3 Profiling Options parameters

    Parameter

    Description

    AI Core Profiling

    Task-based

    AI Core profiling switch. It collects profile data task by task. The default value is Pipeline Utilization.

    • Pipeline Utilization: percentage of time taken by the compute units and MTEs
    • Arithmetic Utilization: percentage of time taken by the cube and vector instructions
    • UB/L1/L2/Main Memory Bandwidth: memory read/write bandwidth rate of UB/L1/L2/main memory
    • L0A/L0B/L0C Memory Bandwidth: memory read/write bandwidth rate of L0A/L0B/L0C
    • UB Memory Bandwidth: UB read/write bandwidth rate of MTE/Vector/Scalar

    Sample-based

    AI Core profiling switch. It collects profile data at a fixed interval (AI Core-Sampling Interval). The default value is Pipeline Utilization.

    • Pipeline Utilization: percentage of time taken by the compute units and MTEs
    • Arithmetic Utilization: percentage of time taken by the cube and vector instructions
    • UB/L1/L2/Main Memory Bandwidth: memory read/write bandwidth rate of UB/L1/L2/main memory
    • L0A/L0B/L0C Memory Bandwidth: memory read/write bandwidth rate of L0A/L0B/L0C
    • UB Memory Bandwidth: UB read/write bandwidth rate of MTE/Vector/Scalar
    • Frequency (Hz): sampling frequency. The default value is 100, and the value range is [1, 100].

    MsprofTX

    MsprofTX

    Switch that enables MsprofTX to collect the profile data output by user programs and upper-layer frameworks. This parameter is optional and is disabled by default.

    API Trace

    AscendCL API

    AscendCL profiling switch. It traces AscendCL API calls. This switch is enabled by default and cannot be disabled.

    Runtime API

    Runtime profiling switch. It traces Runtime API calls. This parameter is optional and is disabled by default.

    OS Runtime API

    Switch that traces function library API and Pthreads API calls during system running. This parameter is optional and is disabled by default.

    NOTE:

    The third-party open-source tools perf and ltrace must be installed for collecting data of OS Runtime API calls. For details, see Before You Start. Using ltrace to collect OS Runtime API data may cause high CPU usage. In addition, ltrace intercepts the application's pthread lock and unlock calls, which may slow down the process.

    Graph Engine (GE)

    Graph Engine profiling switch. It traces the scheduling information of Graph Engine. This switch is enabled by default and cannot be disabled.

    AI CPU Operators

    AI CPU profiling switch, which is used to collect enhanced AI CPU profile data. This parameter is optional and is disabled by default.

    Device System Profiling

    DDR

    DDR sampling switch. This parameter is optional and is disabled by default.

    When enabled, you can change the sampling frequency (Hz). The value range is [1, 1000], and the default is 50 Hz.

    Host System Profiling

    CPU

    Samples the host CPU usage. This parameter is optional and is disabled by default.

    Memory

    Samples the host memory usage. This parameter is optional and is disabled by default.

    Disk

    Samples the host disk usage. This parameter is optional and is disabled by default.

    NOTE:

    The third-party open-source tool iotop must be installed for collecting data of disk calls. For details, see Before You Start.

    Network

    Samples the host network usage. This parameter is optional and is disabled by default.

    HCCL

    HCCL

    HCCL profiling switch. This parameter is optional and is disabled by default.

    After profiling is complete, only the data of the first iteration of the model (model ID) with the largest number of iterations is exported by default.
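    The notes above name three third-party tools (perf and ltrace for OS Runtime API, iotop for Disk). A quick pre-flight check with command -v, sketched below, reports any that are missing before you enable those options:

    ```shell
    #!/bin/sh
    # Pre-flight check for the third-party tools required by the OS Runtime API
    # (perf, ltrace) and Disk (iotop) profiling options.
    missing=""
    for tool in perf ltrace iotop; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done

    if [ -n "$missing" ]; then
        echo "Missing tools:$missing (see Before You Start)" >&2
    else
        echo "All required profiling tools are installed"
    fi
    ```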

  6. After the preceding configurations are complete, click Start in the lower right corner of the window to start Profiling.

    The profiling results will be automatically displayed in the MindStudio window after the execution is complete.