Querying Profile Data File Information

This function queries profile data file information so that you can confirm the model ID and iteration ID to specify during export.

To query profile data information, perform the following steps:

  1. Log in to the development environment as the Ascend-CANN-Toolkit running user.
  2. Switch to the directory where the msprof.py script is located.

    The script is located in ${INSTALL_DIR}/tools/profiler/profiler_tool/analysis/msprof. Replace ${INSTALL_DIR} with the actual CANN installation directory. If the Ascend-CANN-Toolkit package was installed as the root user, the CANN installation directory is /usr/local/Ascend/ascend-toolkit/latest.

    Quick tip: As the running user, create an alias for the msprof.py script with the command alias msprof_analysis='python3 msprof.py_script_directory'. You can then start an analysis from any directory with the msprof_analysis shortcut. The alias takes effect only in the current shell session.

  3. To query the profile data information, run the following command. Table 1 lists the options to be configured.
    python3 msprof.py query -dir <dir> 

    Example: python3 msprof.py query -dir /home/HwHiAiUser/profiler_data/PROF_XXX
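    As an illustration, the query invocation can also be assembled programmatically. The following is a minimal sketch, not part of the msprof tool itself; the script and data paths are placeholders to replace with your own:

    ```python
    # Sketch: assemble the argv list for `python3 msprof.py query -dir <dir>`.
    # Paths below are placeholders; substitute your actual install and data paths.

    def build_query_cmd(msprof_script, prof_dir):
        """Return the command line for querying one PROF_XXX directory."""
        return ["python3", msprof_script, "query", "-dir", prof_dir]

    cmd = build_query_cmd(
        "/usr/local/Ascend/ascend-toolkit/latest/tools/profiler/"
        "profiler_tool/analysis/msprof/msprof.py",
        "/home/HwHiAiUser/profiler_data/PROF_XXX",
    )
    print(" ".join(cmd))
    ```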

    Table 1 Command-line options

    -dir, --collection-dir (Required)
        Directory of the collected profile data. The value must be a PROF_XXX directory or the parent directory of PROF_XXX, for example /home/HwHiAiUser/profiler_data/PROF_XXX.

    --data-type (Optional)
        Data type. This option is used for interconnection with MindStudio and does not need to be configured. Possible values:
        • 0: cluster scenario. Queries whether the current data was collected in the cluster scenario.
        • 1: iteration trace data, that is, detailed data of each iteration, including the FP/BP elapsed time, iteration refresh hangover time, and iteration interval.
        • 2: calculation amount, that is, the number of floating-point operations on the AI Core.
        • 3: data preparation, including training data sent to the device and training data read on the device.
        • 4: parallelism optimization suggestions.
        • 5: parallelism data, including the pure communication duration and computation duration.
        • 6: slow-card and slow-link data and optimization suggestions.
        • 7: communication matrix data and optimization suggestions.
        • 8: CPU and memory performance metrics of the host-side system and processes.
        • 9: communication time consumption, collected with key path analysis enabled.
        • 10: communication matrix, collected with key path analysis enabled.

    --id (Optional)
        Rank ID of a cluster node in the cluster scenario, or device ID in the non-cluster scenario. Used for interconnection with MindStudio and does not need to be configured.

    --model-id (Optional)
        Model ID. Used for interconnection with MindStudio and does not need to be configured.

    --iteration-id (Optional)
        Iteration ID for graph-based statistics collection. The iteration ID increases by 1 each time a graph is executed; when a script is compiled into multiple graphs, the iteration ID differs from the step ID at the script layer. The default value is 1. Used for interconnection with MindStudio and does not need to be configured.

    -h, --help (Optional)
        Displays help information.
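    Because -dir accepts either a PROF_XXX directory or its parent, a small helper can resolve which PROF_XXX directories a given path covers. This is an illustrative sketch, not part of the msprof tool (the PROF_ name prefix is taken from the examples above):

    ```python
    # Sketch: resolve PROF_XXX directories from a -dir value, which may be
    # a PROF_XXX directory itself or a parent directory containing several.
    from pathlib import Path

    def resolve_prof_dirs(collection_dir):
        """Return the PROF_* directories covered by a -dir value."""
        p = Path(collection_dir)
        if p.name.startswith("PROF_"):
            return [p]
        return sorted(d for d in p.iterdir()
                      if d.is_dir() and d.name.startswith("PROF_"))
    ```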

    After the command is executed, the query result is displayed.

    Before running the query command, run the import command to parse the profile data. Otherwise, the query result is meaningless.
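    The required ordering (import first, then query) can be captured in a small wrapper. This is a hedged sketch, not part of the msprof tool: it assumes the import subcommand also takes -dir, as query does; check your msprof.py help output for the exact options.

    ```python
    # Sketch: always run the `import` subcommand before `query`, since query
    # results are meaningless until the profile data has been parsed.
    # Assumption: `import` accepts -dir like `query`; verify with -h.
    import subprocess

    def import_then_query(msprof_script, prof_dir, run=subprocess.run):
        """Run import, then query; check=True aborts if import fails."""
        for subcmd in ("import", "query"):  # order matters: import first
            run(["python3", msprof_script, subcmd, "-dir", prof_dir],
                check=True)
    ```

    The `run` parameter only exists so the sequencing can be exercised without a real CANN installation; in normal use the default subprocess.run is fine.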

    Table 2 describes the information obtained by the query function of the msprof tool.

    Table 2 Profile data file information

    Job Info
        Job name.

    Device ID
        Device ID.

    Dir Name
        Folder name.

    Collection Time
        Data collection time.

    Model ID
        Model ID.

    Iteration Number
        Total number of iterations.

    Top Time Iteration
        The five iterations with the longest time consumption.

    Rank ID
        Node ID in the cluster scenario.