Importing Profile Data

Overview

MindStudio Insight lets you import profile data files and displays their content graphically. For details about how to collect profile data files, see Profiling Instructions.

Profile data falls into two scenarios: single-card and cluster. For details, see Table 1. For details about how to import the data, see Importing Data.

Table 1 Profile data scenarios

| Scenario | Description |
|---|---|
| Single-card scenario | You can import single-card data into MindStudio Insight for analysis. When the single-card data is imported, the Timeline, Memory, and Operator tab pages are displayed on MindStudio Insight. For details, see Single-Card Scenario. |
| Cluster scenario | Clusters are classified into small clusters and large clusters based on the number of cards. The GUI display varies depending on the imported data. For details, see Cluster Scenario. Simplified cluster data: Cluster data is simplified to display only data of large communication operators and some computing operators. |

If the imported TEXT profile data also contains DB files, MindStudio Insight parses the DB files in preference to the TEXT files. To visualize only the TEXT data, find and delete the DB files in the original profile data folder, and then import the data again. For details about the TEXT and DB profile data files, see Single-Card Scenario.
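
The DB cleanup described above can be sketched in Python as follows. This is a minimal illustration, not part of MindStudio Insight: the function names are hypothetical, and it assumes that DB files can be recognized by the .db extension used in the file tables of this document. It moves the files to a backup folder instead of deleting them, so that the DB data stays recoverable.

```python
import shutil
from pathlib import Path


def find_db_files(profile_dir):
    """Return every *.db file under profile_dir, sorted for stable output."""
    return sorted(p for p in Path(profile_dir).rglob("*.db") if p.is_file())


def move_db_files_aside(profile_dir, backup_dir):
    """Move DB files out of profile_dir, keeping their relative paths.

    backup_dir must be located outside profile_dir; otherwise the DB
    files would still be found when the folder is imported again.
    """
    root = Path(profile_dir)
    backup = Path(backup_dir)
    moved = []
    for db_file in find_db_files(root):
        target = backup / db_file.relative_to(root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(db_file), str(target))
        moved.append(target)
    return moved
```

Calling find_db_files first shows which files would be affected. After move_db_files_aside has run, the original folder contains only TEXT data and can be imported again.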

Single-Card Scenario

In the single-card scenario, profile data can be classified into the following types:

  • PyTorch training/inference data: You can import profile data directories ending with ascend_pt. For details about the profile data files, see Table 2 and Table 3.
    Table 2 PyTorch training/inference profile data files (text)

    | File | Description | GUI |
    |---|---|---|
    | trace_view.json | Includes application layer data, CANN layer data, and bottom-layer NPU data. | Timeline |
    | msprof_*.json | Indicates Timeline reports. If AI Core Freq data exists, the AI Core Freq layer is displayed. | Timeline |
    | operator_details.csv | Collects statistics on the duration of PyTorch operators on the host (delivery) and device (execution). | Timeline |
    | memory_record.csv | Indicates process-level memory allocation information. | Memory |
    | operator_memory.csv | Indicates operator memory allocation information. | Memory |
    | kernel_details.csv | Indicates information about all operators executed on the NPU. | Operator |
    | step_trace_time.csv | Indicates time statistics of computation and communication in a step. | Summary |
    | communication.json | Indicates the file that stores details about communication operators, such as communication duration and bandwidth. | Communication |
    | communication_matrix.json | Indicates the basic information file of small communication operators. | Communication |

    Note: The asterisk (*) indicates the timestamp.

    Table 3 PyTorch training/inference profile data files (db)

    | File | Description | GUI |
    |---|---|---|
    | ascend_pytorch_profiler_{rank_id}.db | Indicates the profile data file collected by Ascend PyTorch Profiler APIs. | Timeline, Memory, Operator, Summary, Communication |
    | analysis.db | Indicates data files collected in scenarios where multiple cards or clusters communicate with each other. | |

    • The memory_record.csv and operator_memory.csv files in Table 2 must both be present in the same directory. The Memory tab page is displayed correctly only after both files are imported successfully.
    • Operator dotting data files can be imported. For details about how to obtain the files, see msprof_tx in "Ascend PyTorch Profiler" in Profiling Instructions. After the files are imported successfully, the dotting data is displayed on the Timeline tab page.
    • When single-card data is imported, the Summary and Communication tab pages are not displayed.
  • MindSpore training/inference data: MindSpore framework profile data can be imported. For details about how to obtain the data, see "Debugging and Tuning" > "Ascend Performance Tuning" in MindSpore Tutorial.

    MindStudio Insight allows you to import profile data directories ending with ascend_ms. For details about the profile data files, see Table 4 and Table 5.

    Table 4 MindSpore training/inference profile data files (text)

    | File | Description | GUI |
    |---|---|---|
    | msprof_*.json | Indicates Timeline reports. If AI Core Freq data exists, the AI Core Freq layer is displayed. | Timeline |
    | trace_view.json | Includes application layer data, CANN layer data, and bottom-layer NPU data. | Timeline |
    | memory_record.csv | Indicates process-level memory allocation information. | Memory |
    | operator_memory.csv | Indicates operator memory allocation information. | Memory |
    | static_op_mem.csv | Indicates memory allocation information in static curve scenarios. | Memory |
    | kernel_details.csv | Indicates information about all operators executed on the NPU. | Operator |
    | step_trace_time.csv | Indicates time statistics of computation and communication in a step. | Summary |
    | communication.json | Indicates the file that stores details about communication operators, such as communication duration and bandwidth. | Communication |
    | communication_matrix.json | Indicates the basic information file of small communication operators. | Communication |

    Note: The asterisk (*) indicates the timestamp.

    Table 5 MindSpore training/inference profile data files (db)

    | File | Description | GUI |
    |---|---|---|
    | ascend_mindspore_profiler_{rank_id}.db | Indicates the profile data file collected by Ascend MindSpore Profiler APIs. | Timeline, Memory, Operator, Summary, Communication |
    | communication_analyzer.db | Indicates data files collected in scenarios where multiple cards or clusters communicate with each other. | |

    • The memory_record.csv and operator_memory.csv files in Table 4 must both be present in the same directory. The Memory tab page is displayed correctly only after both files are imported successfully. If static_op_mem.csv exists, the Memory tab page displays the static curve mode.
    • When single-card data is imported, the Summary and Communication tab pages are not displayed.
    • In graph mode, if the compilation optimization level parameter jit_level is set to O2 and the profile data collected by calling the step API is imported to MindStudio Insight, the Communication tab page is not displayed.
  • Offline inference data: Profile data in the mindstudio_profiler_output directory can be imported. For details about the profile data files, see Table 6 and Table 7.
    Table 6 Offline inference profile data files (text)

    | File | Description | GUI |
    |---|---|---|
    | msprof_*.json | Indicates Timeline reports. | Timeline |
    | fusion_op_*.csv | Indicates operator fusion summary in a model. This profile data file does not exist in single-operator scenarios. | Timeline |
    | api_statistic_*.csv | Indicates time spent by API execution at the CANN layer. | Timeline |
    | memory_record_*.csv | Indicates process-level memory allocation information. | Memory |
    | operator_memory_*.csv | Indicates operator memory allocation information. | Memory |
    | op_summary_*.csv | Indicates AI Core and AI CPU operator data. | Operator |
    | op_statistic_*.csv | Indicates the number of times that the AI Core and AI CPU operators are called and the time consumption. | Operator |
    | prof_rule_0_*.json | Indicates profiling suggestions. | Timeline, Summary, Communication |
    | step_trace_*.csv | Indicates step trace data. This profile data file does not exist in single-operator scenarios. | - |
    | step_trace_*.json | Indicates step trace data, which records the time required for each step. This profile data file does not exist in single-operator scenarios. | - |
    | task_time_*.csv | Indicates task scheduler data. | - |

    Note: The asterisk (*) indicates the timestamp.

    Table 7 Offline inference profile data files (db)

    | File | Description | GUI |
    |---|---|---|
    | msprof_*.db | Indicates a unified .db file. Currently, the data volume of this format is different from that of the data parsed by the TEXT format. | Timeline, Memory, Operator, Summary, Communication |

    Note: The asterisk (*) indicates the timestamp.

    • The memory_record_*.csv and operator_memory_*.csv files in Table 6 must both be present in the same directory. The Memory tab page is displayed correctly only after both files are imported successfully.
    • If the PROF_XXX directory does not contain parsed profile data, use the export function of the msprof command to parse and export the profile data files before displaying the data in MindStudio Insight. For details about how to parse and export profile data files using the msprof command, see "Offline Parsing" in Profiling Instructions.
    • When single-card data is imported, the Summary and Communication tab pages are not displayed.
  • npumonitor data: You can import the profile data collected by npumonitor. For details about the collection method, see npumonitor. For details about the profile data file, see Table 8.
    Table 8 Profile data file details

    | File | Description | GUI |
    |---|---|---|
    | msmonitor_{pid}_{timestamp}_{rank_id}.db | DB file collected by npumonitor | Timeline |

    • pid indicates the process ID.
    • timestamp indicates the timestamp.
    • For cluster data, rank_id is a non-negative integer and starts from 0. For single-device data, rank_id is -1.
    • MindStudio Insight supports importing a single DB file collected by npumonitor. You can also import the parent directory of the DB files, in which case the files are displayed in tile mode. If the data volume is large, import one DB file at a time, because importing all files at once can lead to slow parsing and potential out-of-memory errors.
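
The naming rules in this section can be checked before an import with a short Python sketch such as the one below. It is an illustration only: the function names are hypothetical, and the regular expression assumes that pid and timestamp in the npumonitor file name consist of digits, which this document does not state.

```python
import re
from pathlib import Path

# Assumption: pid and timestamp are digit strings; rank_id may be -1.
MSMONITOR_NAME = re.compile(r"^msmonitor_(\d+)_(\d+)_(-?\d+)\.db$")


def classify_profile_dir(path):
    """Map a directory name to the single-card data type described above."""
    name = Path(path).name
    if name.endswith("ascend_pt"):
        return "pytorch"
    if name.endswith("ascend_ms"):
        return "mindspore"
    if name == "mindstudio_profiler_output":
        return "offline_inference"
    return "unknown"


def memory_pair_dirs(profile_dir):
    """Return the directories that hold both memory CSV files.

    The patterns also match the timestamped offline inference names,
    for example memory_record_*.csv.
    """
    root = Path(profile_dir)
    record_dirs = {p.parent for p in root.rglob("memory_record*.csv")}
    operator_dirs = {p.parent for p in root.rglob("operator_memory*.csv")}
    return sorted(record_dirs & operator_dirs)


def parse_msmonitor_name(filename):
    """Split an npumonitor DB file name into its parts, or return None."""
    match = MSMONITOR_NAME.match(Path(filename).name)
    if match is None:
        return None
    pid, timestamp, rank_id = match.groups()
    return {
        "pid": int(pid),
        "timestamp": timestamp,
        "rank_id": int(rank_id),
        "single_device": int(rank_id) == -1,  # -1 marks single-device data
    }
```

An empty result from memory_pair_dirs suggests that the Memory tab page will not be displayed correctly, because the two memory files are not present together in any directory.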

Cluster Scenario

  • The cluster scenario, also called the multi-card scenario, covers cluster data made up of data from multiple cards. Cluster data is classified into small cluster data and large cluster data. The data to import into MindStudio Insight differs between the two, as shown in Table 9.
    For a large cluster, importing all raw data collected by the performance tuning tool takes a long time, so importing the raw data directly is not recommended.
    • When cluster data is imported, if the profile data contains the cluster_analysis_output directory, the Summary and Communication tab pages display information based on that directory after the import succeeds. If the profile data does not contain the cluster_analysis_output directory, the directory is generated when the data is imported to MindStudio Insight.
    • If profile data collected by Ascend PyTorch Profiler or MindSpore Profiler is to be displayed in MindStudio Insight, you are advised to set repeat to 1. The value 0 is not recommended. If repeat is greater than 1, divide the collected profile data folder into repeat equal parts: store the files in separate folders based on the timestamp in the folder name, and import each folder again. The data can then be displayed properly.
    • If msprof-analyze is already installed on Linux when you use MindStudio Insight to analyze cluster data, check the tool version and upgrade it to the latest version. For details about how to install the latest version of msprof-analyze, see msprof-analyze.
    Table 9 Cluster scenario

    | Scenario | Card Count | Data To Be Imported | GUI Display |
    |---|---|---|---|
    | Small cluster | Up to 32 cards | All collected raw data can be imported. | Timeline, Memory, Operator, Summary, Communication |
    | Large cluster | More than 32 cards, including clusters of thousands or tens of thousands of cards | Data preprocessed with msprof-analyze, as described after this table. | Summary, Communication |

    For a large cluster, use the cluster analysis capability of msprof-analyze in the mstt toolset to preprocess the raw profile data, obtain the communication group-based communication analysis and step duration analysis results, and import the preprocessed data. For details about how to download and use the msprof-analyze tool, see msprof-analyze.

    1. Save all directories whose names end with ascend_pt or ascend_ms to the same folder.
    2. Use the msprof-analyze tool to generate the cluster_analysis_output directory. For details about the data files in the cluster_analysis_output directory, see Table 10.
    3. Copy the generated cluster_analysis_output directory to the local PC and import the directory to MindStudio Insight.
    4. Go to the Communication page and analyze the data. Then import the corresponding small cluster data or single-card data and analyze the data again.

    Table 10 Files in the cluster_analysis_output directory

    | File | Description |
    |---|---|
    | cluster_step_trace_time.csv | Generated when the data parsing mode is communication_matrix, communication_time, or all. |
    | cluster_communication_matrix.json | Generated when the data parsing mode is communication_matrix or all. |
    | cluster_communication.json | Generated when the data parsing mode is communication_time or all. The data is mainly the communication time data. |
    | cluster_analysis.db | Generated during the parsing of analysis.db or ascend_pytorch_profiler_{rank_id}.db. |

  • Simplified cluster data is generated from the ascend_pytorch_profiler_{rank_id}.db file. Large communication operators, key compute functions, and key framework functions are extracted, which saves memory and enables quick global analysis. After the simplified cluster data is imported, only the Timeline tab page is displayed in MindStudio Insight.

    To generate simplified cluster data, run the msprof-analyze tool in the mstt toolset with the -m filter_db option. For details about how to install msprof-analyze, see Installing msprof-analyze. For details about how to set -m filter_db, see "filter_db" in recipe result and cluster_analysis.db deliverable table structure description. The cluster data simplification function supports only the DB scenario.
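
The rules in Table 9 and Table 10 can be expressed as a small Python sketch for checking a data set before import. The helper names are hypothetical and not part of msprof-analyze or MindStudio Insight. The sketch assumes one ascend_pt or ascend_ms directory per card, and it covers only the TEXT outputs in Table 10, not cluster_analysis.db.

```python
from pathlib import Path

SMALL_CLUSTER_MAX_CARDS = 32  # Table 9: up to 32 cards is a small cluster

# Table 10: files expected for each data parsing mode (TEXT outputs only).
FILES_BY_MODE = {
    "communication_matrix": {
        "cluster_step_trace_time.csv",
        "cluster_communication_matrix.json",
    },
    "communication_time": {
        "cluster_step_trace_time.csv",
        "cluster_communication.json",
    },
    "all": {
        "cluster_step_trace_time.csv",
        "cluster_communication_matrix.json",
        "cluster_communication.json",
    },
}


def find_rank_dirs(folder):
    """List the ascend_pt and ascend_ms directories directly inside folder."""
    suffixes = ("ascend_pt", "ascend_ms")
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_dir() and p.name.endswith(suffixes)
    )


def cluster_scale(card_count):
    """Classify a cluster as 'small' or 'large' by its number of cards."""
    if card_count < 1:
        raise ValueError("card_count must be a positive integer")
    return "small" if card_count <= SMALL_CLUSTER_MAX_CARDS else "large"


def missing_output_files(output_dir, mode):
    """Return the Table 10 files that are absent from cluster_analysis_output."""
    present = {p.name for p in Path(output_dir).iterdir()}
    return sorted(FILES_BY_MODE[mode] - present)
```

For example, cluster_scale(len(find_rank_dirs(folder))) indicates whether the raw data can be imported directly or should be preprocessed first, and an empty list from missing_output_files suggests that the cluster_analysis_output directory is complete for the chosen mode.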