Collecting Profile Data of Ascend C Operators

This section shows how to use msProf to tune a vector operator on the board. The operator adds two vectors and outputs the result.

The procedure for collecting profile data is basically the same in all three operator calling scenarios (kernel launch, single-operator API call, and PyTorch framework). This section uses the kernel launch scenario as an example.

Prerequisites

  • You have obtained the sample project from the link, and prepared for onboard operator simulation tuning.
    • This example project does not support Atlas A3 Training Series Product.
    • When downloading the code sample, run the following command to specify the branch version:
      git clone https://gitee.com/ascend/samples.git -b master
  • Environment variables have been configured as instructed in Before You Start.
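
The prerequisites can be sanity-checked before starting. The following is a minimal sketch, not part of the sample project: check_tools is a hypothetical helper, and it assumes that a correctly configured environment exposes git, msprof, and npu-smi on PATH (npu-smi is only expected on a server with an Ascend AI Processor).

```shell
# Hypothetical helper: report whether each named tool is on PATH.
# Returns 0 if every tool is found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf 'found: %s\n' "$tool"
        else
            printf 'missing: %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: these are the tools this procedure relies on.
check_tools git msprof npu-smi || echo "Configure the environment before continuing."
```

A missing entry usually means the environment setup has not been applied in the current shell session.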

Procedure

  1. Prepare for operator compilation by following the sample project description and the instructions in "Directly Debugging a Kernel Based on a Sample Project" in Ascend C Operator Development Guide.
  2. Build a single-operator executable file.

    The following uses the Add operator as an example. In the ${git_clone_path}/samples/operator/ascendc/0_introduction/3_add_kernellaunch/AddKernelInvocationNeo directory of the sample project, run the following command to build an executable file:

    bash run.sh -r npu -v <soc_version> # Operator running on the Ascend device
    bash run.sh -r sim -v <soc_version> # Operator running on the simulator

    After the one-click script finishes building and running, the NPU-side executable file ascendc_kernels_bbit is generated in the project directory.

    • The executable file name (ascendc_kernels_bbit) is only an example. Use the name of the executable file actually built in your project.
    • To obtain <soc_version>, run the npu-smi info command on the server where the Ascend AI Processor is installed and read the Chip Name field. The value to use is Ascend followed by the Chip Name. For example, if Chip Name is xxxyy, use Ascendxxxyy.
  3. Import environment variables.
    export LD_LIBRARY_PATH=${git_clone_path}/samples/operator/ascendc/0_introduction/3_add_kernellaunch/AddKernelInvocationNeo/out/lib/:$LD_LIBRARY_PATH
  4. Collect operator profile data.

    For operators running on an Ascend device, run the following command to collect the msprof op profile data and refined tuning data:

    msprof op ascendc_kernels_bbit

    For operators running on the simulator, run the following command to collect the msprof op simulator profile data, pipeline data, and heatmap data:

    msprof op simulator --soc-version=Ascendxxxyy ascendc_kernels_bbit  # xxxyy is the type of the processor in use.
  5. View the operator profile data. For details, see Tool Usage.
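
The command lines used in steps 2 and 4 can be assembled as in the following dry-run sketch. It only prints the commands and does not run them. The helper names soc_version_from_chip_name and msprof_command are hypothetical, and xxxyy is the placeholder Chip Name used above, not a real processor type.

```shell
# Hypothetical helper: prefix the Chip Name reported by `npu-smi info`
# with "Ascend" to form the <soc_version> value.
soc_version_from_chip_name() {
    printf 'Ascend%s\n' "$1"
}

# Hypothetical helper: print the msprof command line for a run mode.
#   $1 = mode (npu or sim), $2 = soc version, $3 = executable file
msprof_command() {
    mode="$1"
    soc="$2"
    exe="$3"
    case "$mode" in
        npu) printf 'msprof op %s\n' "$exe" ;;
        sim) printf 'msprof op simulator --soc-version=%s %s\n' "$soc" "$exe" ;;
        *)   printf 'unknown mode: %s\n' "$mode" >&2; return 1 ;;
    esac
}

# Placeholder Chip Name; replace it with the value shown by npu-smi info.
soc_version="$(soc_version_from_chip_name xxxyy)"

# Step 2: build command for the on-board run.
printf 'bash run.sh -r npu -v %s\n' "$soc_version"

# Step 4: profiling commands for the device and for the simulator.
msprof_command npu "$soc_version" ascendc_kernels_bbit
msprof_command sim "$soc_version" ascendc_kernels_bbit
```

Printing the commands first makes it easy to confirm the soc version string and the executable name before running them by hand.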