MSPTI Samples

This section describes the samples that show how to use the MSPTI APIs.

Prerequisites

  • You have installed the CANN Toolkit package and ops operator package.

    For details, see CANN Software Installation Guide.

  • The samples of MSPTI Python APIs depend on the PyTorch framework and torch_npu plugin. Ensure that they have been installed.

    For details, see "Installing PyTorch" in Ascend Extension for PyTorch Software Installation Guide.

Sample Building and Execution

  1. Log in to the environment as the CANN running user and run source ${install_path}/set_env.sh to set the environment variables. ${install_path} is the CANN installation path, for example, /usr/local/Ascend/ascend-toolkit.
  2. Go to the sample directory.

    The MSPTI sample code ships with the CANN Toolkit package and the ops operator package, under ${INSTALL_DIR}/tools/mspti/samples.

    Replace ${INSTALL_DIR} with the actual CANN component directory. If the Ascend-CANN-Toolkit package is installed as the root user, the CANN component directory is /usr/local/Ascend/ascend-toolkit/latest.

    Example:

    cd ${INSTALL_DIR}/tools/mspti/samples/callback_domain
  3. Execute sample_run.sh in the sample directory.
    bash sample_run.sh
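
The three steps above can also be driven from a script. The following is a minimal sketch, assuming the example root-user paths quoted in this section. The helper names build_sample_command and run_sample are hypothetical and not part of CANN. The steps are chained into a single bash invocation because variables set by source only persist inside the shell that ran it.

```python
import shlex
import subprocess

# Example paths quoted in this section; adjust them to your installation.
DEFAULT_INSTALL_PATH = "/usr/local/Ascend/ascend-toolkit"
DEFAULT_INSTALL_DIR = "/usr/local/Ascend/ascend-toolkit/latest"


def build_sample_command(sample, install_path=DEFAULT_INSTALL_PATH,
                         install_dir=DEFAULT_INSTALL_DIR):
    """Return the bash command line that runs one MSPTI sample.

    Hypothetical helper: it only chains the three documented steps.
    """
    env_script = f"{install_path}/set_env.sh"
    sample_dir = f"{install_dir}/tools/mspti/samples/{sample}"
    return " && ".join([
        f"source {shlex.quote(env_script)}",  # step 1: set environment variables
        f"cd {shlex.quote(sample_dir)}",      # step 2: go to the sample directory
        "bash sample_run.sh",                 # step 3: run the sample
    ])


def run_sample(sample, **paths):
    """Run the sample in a single bash process and return its exit code."""
    command = build_sample_command(sample, **paths)
    return subprocess.run(["bash", "-c", command], check=False).returncode


if __name__ == "__main__":
    # Only print the command here; call run_sample() on a machine with CANN installed.
    print(build_sample_command("callback_domain"))
```

Calling run_sample("callback_domain") executes the sample and returns the exit code of sample_run.sh, so a nonzero value indicates that one of the three steps failed.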

The following tables describe the samples currently provided.

Table 1 Callback API samples

| Sample | Description | Applicability |
|---|---|---|
| callback_domain | Demonstrates the Callback API. You can call msptiEnableDomain to have callbacks invoked before and after runtime API calls. | Atlas A2 Training Series Product/Atlas 800I A2 Inference Product; Atlas A3 Training Series Product |
| callback_mstx | 1. Demonstrates combining the callback and mstx APIs to profile operator data before and after the runtime launches a kernel. 2. Demonstrates the use of userdata in callbacks. You can use userdata to pass configurations or specific run parameters through to the callback. | Atlas A2 Training Series Product/Atlas 800I A2 Inference Product; Atlas A3 Training Series Product |
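
The callback-plus-userdata pattern described for the samples in Table 1 can be modeled in plain Python. This is a toy model under stated assumptions, not the MSPTI API: ToySubscriber, UserData, on_api and fake_launch_kernel are hypothetical names, and the real samples use the MSPTI C interfaces. It only illustrates that a callback fires before and after a wrapped call once the domain is enabled, and that caller-owned userdata is handed back unchanged on every invocation.

```python
import time
from dataclasses import dataclass, field


@dataclass
class UserData:
    """Caller-owned state passed through to every callback invocation."""
    tag: str
    events: list = field(default_factory=list)


def on_api(userdata, api_name, phase):
    """Callback: record which API fired, in which phase, and when."""
    userdata.events.append((userdata.tag, api_name, phase, time.perf_counter()))


class ToySubscriber:
    """Toy stand-in for a callback subscription (not the MSPTI API)."""

    def __init__(self, callback, userdata):
        self.callback = callback
        self.userdata = userdata
        self.enabled = False

    def enable_domain(self):
        """Turn callbacks on; until then wrapped calls run unobserved."""
        self.enabled = True

    def call(self, api_name, func, *args):
        """Invoke func, firing the callback before and after when enabled."""
        if self.enabled:
            self.callback(self.userdata, api_name, "enter")
        result = func(*args)
        if self.enabled:
            self.callback(self.userdata, api_name, "exit")
        return result


def fake_launch_kernel(n):
    """Placeholder for a runtime API call."""
    return sum(range(n))


state = UserData(tag="run-0")
subscriber = ToySubscriber(on_api, state)
subscriber.call("launch_kernel", fake_launch_kernel, 10)  # disabled: no events
subscriber.enable_domain()
result = subscriber.call("launch_kernel", fake_launch_kernel, 10)
print(result, [(name, phase) for _, name, phase, _ in state.events])
```

Because the subscriber never inspects userdata, the caller is free to put run parameters, counters or output buffers in it, which is the pass-through behavior the callback_mstx description refers to.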

Table 2 Activity API samples

| Sample | Description | Applicability |
|---|---|---|
| mspti_activity | 1. Demonstrates the basic functions of the Activity API by profiling kernels and memory. 2. Demonstrates basic Activity API usage, including memory allocation for the activity buffer and the buffer consumption logic. | Atlas A2 Training Series Product/Atlas 800I A2 Inference Product; Atlas A3 Training Series Product |
| mspti_correlation | 1. Demonstrates the basic functions of the Activity API. It shows how to use the correlationId field to correlate API records with kernel data. 2. Demonstrates the correlation between runtime API delivery and actual kernel execution data. Once correlated, operator delivery and execution map one-to-one, which facilitates performance bottleneck analysis. | Atlas A2 Training Series Product/Atlas 800I A2 Inference Product; Atlas A3 Training Series Product |
| mspti_external_correlation | 1. Demonstrates the MSPTI external correlation function. 2. Demonstrates the usage of msptiActivityPushExternalCorrelationId and msptiActivityPopExternalCorrelationId. You can use them to correlate various APIs and trace function call stacks. | Atlas A2 Training Series Product/Atlas 800I A2 Inference Product; Atlas A3 Training Series Product |
| mspti_hccl_activity | Demonstrates the basic functions of the Activity API by profiling HCCL communication data. | Atlas A2 Training Series Product/Atlas 800I A2 Inference Product; Atlas A3 Training Series Product |
| mspti_mstx_activity_domain | 1. Demonstrates how MSPTI controls the mstxDomain function. You can enable or disable the function to control domain profiling. 2. Demonstrates using the MSPTI switch to enable or disable profiling in real time, which reduces performance loss. | Atlas A2 Training Series Product/Atlas 800I A2 Inference Product; Atlas A3 Training Series Product |
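
The one-to-one mapping that the mspti_correlation description in Table 2 refers to amounts to a join on correlationId. The following is a minimal sketch in plain Python, not MSPTI code: only the correlationId field name comes from this section, while the record classes, the other field names and the sample values are hypothetical. It also assumes, for illustration, that API and kernel timestamps lie on a common timeline.

```python
from dataclasses import dataclass


@dataclass
class ApiRecord:
    correlationId: int
    name: str
    start: int  # hypothetical timestamp, ns
    end: int


@dataclass
class KernelRecord:
    correlationId: int
    name: str
    start: int  # hypothetical timestamp, ns
    end: int


def correlate(api_records, kernel_records):
    """Pair each API record with the kernel record sharing its correlationId."""
    kernels = {k.correlationId: k for k in kernel_records}
    pairs = []
    for api in api_records:
        kernel = kernels.get(api.correlationId)
        if kernel is not None:  # APIs that launch no kernel are skipped
            pairs.append((api, kernel))
    return pairs


# Made-up records; kernel records may arrive in a different order than API records.
apis = [
    ApiRecord(1, "launch", 100, 120),
    ApiRecord(2, "launch", 130, 145),
    ApiRecord(3, "memcpy", 150, 160),
]
kernels = [
    KernelRecord(2, "add", 400, 900),
    KernelRecord(1, "matmul", 200, 380),
]

for api, kernel in correlate(apis, kernels):
    gap = kernel.start - api.end            # delay between delivery and execution
    duration = kernel.end - kernel.start    # time spent executing the kernel
    print(api.correlationId, kernel.name, gap, duration)
```

Separating the delivery-to-execution gap from the kernel duration is what makes the correlated view useful for bottleneck analysis: a large gap points at scheduling or queuing, a large duration at the operator itself.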

Table 3 Python API samples

| Sample | Description | Applicability |
|---|---|---|
| python_monitor | Demonstrates the basic usage of Monitor. You can use KernelMonitor and HcclMonitor to obtain the time consumed by compute operators and communication operators. | Atlas A2 Training Series Product/Atlas 800I A2 Inference Product; Atlas A3 Training Series Product |
| python_mstx_monitor | Demonstrates the basic usage of MstxMonitor. You can use mstx to collect the time consumed by a specific operator (for example, matmul). | Atlas A2 Training Series Product/Atlas 800I A2 Inference Product; Atlas A3 Training Series Product |