Debugging a Template Library Operator on the Board
This section shows how to use msDebug to debug a template library operator (matmul) on the board. The operator can multiply two matrices and output the result.
Prerequisites
- Obtain the sample project for operator debugging.
- Configure environment variables by referring to Before You Start.
Procedure
- Compile the operator from the sample project in Prerequisites. Run the following command. After the compilation is complete, the executable file 00_basic_matmul is generated in the build/bin directory.
bash ./scripts/build.sh 00_basic_matmul --debug --msdebug
- Start msDebug with the operator program as its target to enter the debugging interface.
msdebug ./build/bin/00_basic_matmul 256 512 1024 0
(msdebug) target create "./build/bin/00_basic_matmul"
Current executable set to '/home/mindstudio/projects/ascendc-templates/build/bin/00_basic_matmul' (aarch64).
(msdebug)
- Set a breakpoint. In this sample, the kernel function is implemented in basic_matmul.hpp. Set NPU breakpoints on the required code lines in this file.
(msdebug) b basic_matmul.hpp:121
Breakpoint 1: 2 locations.
(msdebug)
- Run the operator program and wait until the breakpoint is hit.
The program runs until the first breakpoint (basic_matmul.hpp:121) is hit. The launch message shows that msDebug has detected the NPU kernel function starting on device 0.
_ZN7Catlass13KernelAdapterINS_4Gemm6Kernel11BasicMatmulINS1_5Blo is the kernel name of the template library operator. Only the first 64 characters of the name are displayed in the launch message.
(msdebug) run
Process 3344307 launched: '/home/mindstudio/projects/ascendc-templates/build/bin/00_basic_matmul' (aarch64)
[Launch of Kernel _ZN7Catlass13KernelAdapterINS_4Gemm6Kernel11BasicMatmulINS1_5Blo on Device 0]
Process 3344307 stopped
[Switching to focus on Kernel _ZN7Catlass13KernelAdapterINS_4Gemm6Kernel11BasicMatmulINS1_5Blo, CoreId 21, Type aic]
* thread #1, name = '00_basic_matmul', stop reason = breakpoint 1.1
    frame #0: 0x0000000000001c38 device_debugdata`_ZN7Catlass13KernelAdapterINS_4Gemm6Kernel11BasicMatmulINS1_5Block9BlockMmadINS1_19MmadAtlasA2PingpongILb1EEENS_9GemmShapeILj128ELj256ELj256EEENS8_ILj128ELj256ELj64EEENS1_8GemmTypeIDhNS_6layout8RowMajorELN7AscendC9TPositionE0EEESG_SG_vNS1_4Tile8TileCopyINS_4Arch7AtlasA2ESG_SG_SG_vEENSH_8TileMmadISK_SG_SG_vEEEEvNS4_24GemmIdentityBlockSwizzleILj3ELj0EEEEEEEvNT_6ParamsE_mix_aic at basic_matmul.hpp:121:71
   118
   119      for (uint32_t loopIdx = AscendC::GetBlockIdx(); loopIdx < coreLoops; loopIdx += AscendC::GetBlockNum()) {
   120          // Compute block location
-> 121          GemmCoord blockCoord = matmulBlockScheduler.GetBlockCoord(loopIdx);
   122          GemmCoord actualBlockShape = matmulBlockScheduler.GetActualBlockShape(blockCoord);
   123
   124          // Compute initial location in logical coordinates
(msdebug)
- Review information.
For details about other debugging operations, see Printing Memory and Variables, Displaying the Debugging Information, and Switching Cores.
- Run the ascend info cores command to query NPU core information.
(msdebug) ascend info cores
  CoreId  Type  Device  Stream  Task  Block  PC              stop reason
* 21      aic   0       48      0     0      0x12c0c00d6c38  breakpoint 1.1
  22      aic   0       48      0     1      0x12c0c00d6c38  breakpoint 1.1
  23      aic   0       48      0     2      0x12c0c00d6c38  breakpoint 1.1
  24      aic   0       48      0     3      0x12c0c00d6c38  breakpoint 1.1
(msdebug)
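The four blocks listed by ascend info cores are consistent with how the loop at basic_matmul.hpp:119 hands out work: each block starts at its own block index and advances by the total number of blocks. The Python sketch below illustrates only that block-stride pattern. It rests on assumptions that this section does not state: that the arguments 256 512 1024 are M, N, and K, that one output tile is 128 x 256 (read from GemmShape<128, 256, 256> in the mangled kernel name), and that the kernel is launched with 4 blocks. The helper name tiles_per_block is hypothetical and is not part of the template library, and the mapping from a loop index to a tile coordinate (the block swizzle) is not modeled.

```python
def tiles_per_block(m, n, tile_m, tile_n, block_num):
    """Return (core_loops, {block index: [loop indices handled by that block]})."""
    # Number of output tiles, rounding up in both dimensions (assumed definition of coreLoops).
    core_loops = -(-m // tile_m) * -(-n // tile_n)
    assignment = {}
    for block_idx in range(block_num):
        # Mirrors: for (loopIdx = GetBlockIdx(); loopIdx < coreLoops; loopIdx += GetBlockNum())
        assignment[block_idx] = list(range(block_idx, core_loops, block_num))
    return core_loops, assignment


# Assumed values: M=256, N=512, tile 128 x 256, 4 blocks.
core_loops, assignment = tiles_per_block(m=256, n=512, tile_m=128, tile_n=256, block_num=4)
print(core_loops)   # 4
print(assignment)   # {0: [0], 1: [1], 2: [2], 3: [3]}
```

Under these assumptions every block handles exactly one tile. With a larger problem size, a block would pick up additional loop indices spaced block_num apart.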
- Run the print command to print the gmA variable information.
(msdebug) print gmA
(AscendC::GlobalTensor<__fp16>) $0 = {
  AscendC::BaseGlobalTensor<__fp16> = {
    address_ = 0x000012c0c0013000
    oriAddress_ = 0x000012c0c0013000
  }
  bufferSize_ = 0
  cacheMode_ = CACHE_MODE_NORMAL
}
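The address_ field is the GM address where the tensor data starts, so it can be used to work out which address to pass to a memory read. The Python sketch below computes the byte address of a single element. It assumes that gmA holds matrix A as a row-major array of 2-byte float16 values with 1024 elements per row (K taken from the third command-line argument, layout taken from RowMajor in the mangled kernel name). The function element_address is a hypothetical helper, not an msDebug or AscendC API.

```python
FP16_BYTES = 2  # size of one __fp16 element in bytes


def element_address(base, row, col, row_stride):
    """Byte address of element (row, col) in a row-major float16 matrix."""
    return base + (row * row_stride + col) * FP16_BYTES


base = 0x12C0C0013000  # address_ printed for gmA
print(hex(element_address(base, 0, 0, 1024)))  # 0x12c0c0013000
print(hex(element_address(base, 1, 0, 1024)))  # 0x12c0c0013800
```

If the assumptions hold, the second address is where row 1 of A begins.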
- Run the memory read command to print the data that gmA points to in global memory (GM).
(msdebug) memory read -m GM 0x12c0c0013000 -f float16[] -s 256 -c 1
0x12c0c0013000: {3.40234 -1.05664 2.83008 2.98438 4.11719 -3.02539 -1.64746 2.68164 -2.22266 0.539551 -0.226074 1.28906 -1.35254 0.134033 4.52344 4.16016 1.35742 2.17383 -3.58398 1.06934 -4.83594 -2.57031 -3.62695 3.04102 -3.43359 -0.990723 -3.70117 -3.91211 4.98828 -2.81836 0.129272 3.39062 1.12598 -2.03906 1.37598 0.24292 -0.0641479 4.72656 -2.07422 2.71289 0.267334 2.69922 -0.997559 3.91602 -2.16602 -1.47559 3.07812 4.19141 -4.30078 4.49219 0.26001 -4.14062 -3.07812 1.63184 3.90234 -1.51074 -4.35938 -4.80078 -0.423096 -4.36719 -2.61719 4.70703 4.02344 3.50977 -2.33398 0.397705 -1.24805 2.60156 0.125366 1.67676 0.316162 -4.60547 -0.623535 4.31641 4.30859 2.20898 -2.15625 2.38477 1.39941 -1.45996 1.87891 -3.33984 -0.599121 3.80078 3.29297 -1.69629 -2.71094 3.93359 -1.49609 1.86621 4.56641 0.88623 1.57324 3.58594 -0.604492 4.23828 -1.01562 3.14844 1.8418 4.10938 -0.175049 -2.8418 4.50391 4.20312 -3.52344 3.81055 1.41113 -0.680664 1.19629 -2.18945 2.85938 -1.92578 -0.529785 -2.73828 -3.125 -2.23828 0.564453 -0.834961 -3.30469 4.06641 -3.96875 -3.73828 -0.0455627 2.60547 4.84766 4.35156 1.84473 -1.16797}
(msdebug)
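The dump above contains 128 values, which matches 256 bytes interpreted as 2-byte float16 elements. The Python sketch below shows that interpretation using the standard struct module, whose "e" format code denotes IEEE 754 half precision. It assumes little-endian byte order, and decode_fp16 is a hypothetical helper for offline inspection of raw bytes, not an msDebug feature. The sample bytes are packed locally for illustration and are not read from the device.

```python
import struct


def decode_fp16(raw, little_endian=True):
    """Interpret a bytes object as a sequence of IEEE 754 half-precision floats."""
    if len(raw) % 2 != 0:
        raise ValueError("float16 data must have an even number of bytes")
    count = len(raw) // 2
    byte_order = "<" if little_endian else ">"
    return list(struct.unpack(f"{byte_order}{count}e", raw))


# Three values that are exactly representable in float16 (illustrative data only).
sample = struct.pack("<3e", 3.40234375, -1.056640625, 2.830078125)
print(decode_fp16(sample))            # [3.40234375, -1.056640625, 2.830078125]
print(len(decode_fp16(bytes(256))))   # 128
```

The debugger prints these values rounded to about six significant digits, which is why 3.40234375 appears as 3.40234 in the dump.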
- Switch to another AIC core and print the required information.
(msdebug) ascend aic 24 // Select the core ID corresponding to block 3 in ascend info cores. In this example, the core ID is 24.
[Switching to focus on Kernel _ZN7Catlass13KernelAdapterINS_4Gemm6Kernel11BasicMatmulINS1_5Blo, CoreId 24, Type aic]
* thread #1, name = '00_basic_matmul', stop reason = breakpoint 1.1
    frame #0: 0x0000000000001c38 device_debugdata`_ZN7Catlass13KernelAdapterINS_4Gemm6Kernel11BasicMatmulINS1_5Block9BlockMmadINS1_19MmadAtlasA2PingpongILb1EEENS_9GemmShapeILj128ELj256ELj256EEENS8_ILj128ELj256ELj64EEENS1_8GemmTypeIDhNS_6layout8RowMajorELN7AscendC9TPositionE0EEESG_SG_vNS1_4Tile8TileCopyINS_4Arch7AtlasA2ESG_SG_SG_vEENSH_8TileMmadISK_SG_SG_vEEEEvNS4_24GemmIdentityBlockSwizzleILj3ELj0EEEEEEEvNT_6ParamsE_mix_aic at basic_matmul.hpp:121:71
   118
   119      for (uint32_t loopIdx = AscendC::GetBlockIdx(); loopIdx < coreLoops; loopIdx += AscendC::GetBlockNum()) {
   120          // Compute block location
-> 121          GemmCoord blockCoord = matmulBlockScheduler.GetBlockCoord(loopIdx);
   122          GemmCoord actualBlockShape = matmulBlockScheduler.GetActualBlockShape(blockCoord);
   123
   124          // Compute initial location in logical coordinates
(msdebug) p loopIdx
(uint32_t) $1 = 0
- Query and delete the breakpoint, then resume program execution.
(msdebug) breakpoint list
Current breakpoints:
1: file = 'basic_matmul.hpp', line = 121, exact_match = 0, locations = 2, resolved = 2, hit count = 1
  1.1: where = device_debugdata`_ZN7Catlass13KernelAdapterINS_4Gemm6Kernel11BasicMatmulINS1_5Block9BlockMmadINS1_19MmadAtlasA2PingpongILb1EEENS_9GemmShapeILj128ELj256ELj256EEENS8_ILj128ELj256ELj64EEENS1_8GemmTypeIDhNS_6layout8RowMajorELN7AscendC9TPositionE0EEESG_SG_vNS1_4Tile8TileCopyINS_4Arch7AtlasA2ESG_SG_SG_vEENSH_8TileMmadISK_SG_SG_vEEEEvNS4_24GemmIdentityBlockSwizzleILj3ELj0EEEEEEEvNT_6ParamsE_mix_aic + 4748 [inlined] _ZN7Catlass4Gemm6Kernel11BasicMatmulINS0_5Block9BlockMmadINS0_19MmadAtlasA2PingpongILb1EEENS_9GemmShapeILj128ELj256ELj256EEENS7_ILj128ELj256ELj64EEENS0_8GemmTypeIDhNS_6layout8RowMajorELN7AscendC9TPositionE0EEESF_SF_vNS0_4Tile8TileCopyINS_4Arch7AtlasA2ESF_SF_SF_vEENSG_8TileMmadISJ_SF_SF_vEEEEvNS3_24GemmIdentityBlockSwizzleILj3ELj0EEEEclILi1EEEvRKNSQ_6ParamsE_mix_aic + 4632 at basic_matmul.hpp:121:71, address = 0x0000000000001c38, resolved, hit count = 1
  1.2: where = device_debugdata`_ZN7Catlass13KernelAdapterINS_4Gemm6Kernel11BasicMatmulINS1_5Block9BlockMmadINS1_19MmadAtlasA2PingpongILb1EEENS_9GemmShapeILj128ELj256ELj256EEENS8_ILj128ELj256ELj64EEENS1_8GemmTypeIDhNS_6layout8RowMajorELN7AscendC9TPositionE0EEESG_SG_vNS1_4Tile8TileCopyINS_4Arch7AtlasA2ESG_SG_SG_vEENSH_8TileMmadISK_SG_SG_vEEEEvNS4_24GemmIdentityBlockSwizzleILj3ELj0EEEEEEEvNT_6ParamsEm_mix_aic + 4772 [inlined] _ZN7Catlass4Gemm6Kernel11BasicMatmulINS0_5Block9BlockMmadINS0_19MmadAtlasA2PingpongILb1EEENS_9GemmShapeILj128ELj256ELj256EEENS7_ILj128ELj256ELj64EEENS0_8GemmTypeIDhNS_6layout8RowMajorELN7AscendC9TPositionE0EEESF_SF_vNS0_4Tile8TileCopyINS_4Arch7AtlasA2ESF_SF_SF_vEENSG_8TileMmadISJ_SF_SF_vEEEEvNS3_24GemmIdentityBlockSwizzleILj3ELj0EEEEclILi1EEEvRKNSQ_6ParamsE_mix_aic + 4632 at basic_matmul.hpp:121:71, address = 0x000000000000dd54, resolved, hit count = 0
(msdebug) breakpoint delete 1
1 breakpoints deleted; 0 breakpoint locations disabled.
(msdebug) continue
Process 3344307 resuming
Compare success.
Process 3344307 exited with status = 0 (0x00000000)
- When debugging is complete, run the q command and enter Y or y to exit msDebug.
(msdebug) q