Collecting Profile Data of Ascend C Operators (Kernel Launch)
This section shows how to use msProf to tune a vector operator on the board. The operator adds two vectors and outputs the result.
The procedure for collecting performance data is similar across direct kernel debugging, single-operator API execution, and the PyTorch framework. This section uses direct kernel debugging as the example.
Prerequisite
- Click Link to obtain the sample project to prepare for onboard and simulation-based tuning.
- When downloading the code sample, run the following command to specify the branch version:
git clone https://gitee.com/ascend/samples.git -b v0.2-8.0.0.beta1
- Configure environment variables by referring to Before You Start.
Procedure
- Prepare for operator compilation according to the sample project description and by referring to Kernel Launch.
- Build a single-operator executable file.
The following uses the Add operator as an example. In the ${git_clone_path}/samples/operator/ascendc/0_introduction/3_add_kernellaunch/AddKernelInvocationNeo directory of the sample project, run the following command to build an executable file:
bash run.sh -r npu -v <soc_version> # Operator running on the Ascend device
bash run.sh -r sim -v <soc_version> # Operator running on the simulator
After the script finishes building and running, the NPU-side executable file ascendc_kernels_bbit is generated in the project directory.
- The executable file name ascendc_kernels_bbit is only an example. Use the name of the executable that the build script in your project actually produces.
- To obtain <soc_version>, run the npu-smi info command on the server where the Ascend AI Processor is installed and read the Chip Name field. The value to use is Ascend followed by the Chip Name. For example, if Chip Name is xxxyy, the value is Ascendxxxyy. Where Ascendxxxyy appears as part of a code sample path, use the lowercase form ascendxxxyy.
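The naming rule above can be sketched in shell. This is only an illustration: chip_name is a placeholder to be replaced with the Chip Name reported by npu-smi info, and the variable names are hypothetical, not part of any tool.

```shell
# Placeholder: replace xxxyy with the Chip Name reported by `npu-smi info`.
chip_name="xxxyy"

# Value passed to the build script (-v) and to the simulator (--soc-version).
soc_version="Ascend${chip_name}"

# Lowercase form, for places where the value appears in a code sample path.
soc_version_in_path=$(printf '%s' "$soc_version" | tr '[:upper:]' '[:lower:]')

echo "$soc_version"
echo "$soc_version_in_path"
```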
- Import environment variables.
export LD_LIBRARY_PATH=${git_clone_path}/samples/operator/ascendc/0_introduction/3_add_kernellaunch/AddKernelInvocationNeo/out/lib/:$LD_LIBRARY_PATH
- Collect operator profile data.
For operators running on an Ascend device, run the following command to collect the msprof op profile data and refined tuning data:
msprof op ascendc_kernels_bbit
For an operator running on the simulator, run the following command to collect the msprof op simulator profile data:
msprof op simulator --soc-version=Ascendxxxyy ascendc_kernels_bbit # xxxyy indicates the type of the processor used by the user.
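When both collection modes are used regularly, the choice between the two commands above can be wrapped in a small helper. The following is a minimal sketch, not part of msProf: build_msprof_cmd is a hypothetical function that only prints the command for a given mode, using exactly the two command forms shown above, so it can be checked before anything is run.

```shell
# Hypothetical helper (not provided by msProf): prints the msprof op command
# for the requested mode. It does not run msprof itself.
#   $1: mode, "npu" (on-board) or "sim" (simulator)
#   $2: operator executable, for example ascendc_kernels_bbit
#   $3: SoC version such as Ascendxxxyy (required for "sim" only)
build_msprof_cmd() {
    mode="$1"
    exe="$2"
    soc_version="${3:-}"
    case "$mode" in
        npu)
            printf 'msprof op %s\n' "$exe"
            ;;
        sim)
            if [ -z "$soc_version" ]; then
                echo "sim mode requires a SoC version such as Ascendxxxyy" >&2
                return 1
            fi
            printf 'msprof op simulator --soc-version=%s %s\n' "$soc_version" "$exe"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

build_msprof_cmd npu ascendc_kernels_bbit
build_msprof_cmd sim ascendc_kernels_bbit Ascendxxxyy
```

Printing the command rather than executing it keeps the sketch safe to try on a machine without the toolkit installed. Once the printed command looks right, run it directly in the shell where the environment variables were imported.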
- View the operator profile data. For details, see Tool Usage.