You can set line breakpoints in the operator's program, that is, breakpoints at a specific line number of an operator source file.
(msdebug) b add_custom.cpp:85
Breakpoint 1: where = device_debugdata`::add_custom(uint8_t *__restrict, uint8_t *__restrict, uint8_t *__restrict) + 14348 [inlined] KernelAdd::CopyOut(int) + 1700 at add_custom.cpp:85:9, address = 0x000000000000380c
(msdebug)
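| Field | Description |
|---|---|
| device_debugdata | Name of the device-side .o file. |
| add_custom() | Name of the kernel function that contains the breakpoint. |
| 14348 | Offset of the breakpoint address from the add_custom function; that is, 0x380c is at offset 14348 (decimal) from the address of add_custom. |
| KernelAdd::CopyOut(int) + 1700 | Function that contains the code, at offset 1700. |
| address = 0x000000000000380c | Address of the breakpoint, expressed as a logical relative address. |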
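The relationship between the offset and address fields can be checked with simple arithmetic. The following is a minimal sketch, assuming the breakpoint message has exactly the shape shown above; `parse_breakpoint` and its regular expressions are hypothetical helpers written for this illustration and are not part of msdebug.

```python
import re

# Breakpoint message copied from the msdebug session above (one logical line).
MESSAGE = (
    "Breakpoint 1: where = device_debugdata`::add_custom(uint8_t *__restrict, "
    "uint8_t *__restrict, uint8_t *__restrict) + 14348 [inlined] "
    "KernelAdd::CopyOut(int) + 1700 at add_custom.cpp:85:9, "
    "address = 0x000000000000380c"
)


def parse_breakpoint(message):
    """Hypothetical helper: extract both offsets and the address from the message."""
    # Offset relative to the kernel function (the number before "[inlined]").
    kernel_offset = int(re.search(r"\) \+ (\d+) \[inlined\]", message).group(1))
    # Offset relative to the function that contains the code (the number before "at").
    inner_offset = int(re.search(r"\) \+ (\d+) at ", message).group(1))
    # Breakpoint address, printed in hexadecimal.
    address = int(re.search(r"address = (0x[0-9a-fA-F]+)", message).group(1), 16)
    return kernel_offset, inner_offset, address


kernel_offset, inner_offset, address = parse_breakpoint(MESSAGE)

# 0x380c is 14348 in decimal, so the kernel offset equals the address itself,
# which places add_custom at relative address 0 in this example.
kernel_start = address - kernel_offset

# Assumption: the 1700 offset is measured from the start of the CopyOut code.
inlined_start = address - inner_offset

print(hex(address), address)   # 0x380c 14348
print(kernel_start)            # 0
print(hex(inlined_start))
```

Because the address is a logical relative address, the values are only meaningful relative to the device-side object file, not as absolute device memory addresses.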
(msdebug) b /home/xx/op_host/add_custom.cpp:24
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
(msdebug)
Once the operator runs, msdebug finds the actual location and sets the breakpoint automatically.
(msdebug) run
Process 196968 launched: '/home/HwHiAiUser/projects/add_ascendc_sample/add_custom_npu' (aarch64)
[Launch of Kernel add_custom on Device 1]
[Switching to focus on Kernel add_custom, CoreId 1, Type aiv]
Process 196968 stopped
* thread #1, name = 'add_custom_npu', stop reason = breakpoint 1.1
    frame #0: 0x000000000000380c device_debugdata`::add_custom(uint8_t *__restrict, uint8_t *__restrict, uint8_t *__restrict) [inlined] KernelAdd::CopyOut(this=0x000000000016a9b8, progress=0) at add_custom.cpp:85:9
   82           // copy progress_th tile from local tensor to global tensor
   83           DataCopy(zGm[progress * TILE_LENGTH], zLocal, TILE_LENGTH);
   84           // free output tensor for reuse
-> 85           outQueueZ.FreeTensor(zLocal);
   86       }
   87
   88   private:
(msdebug)
"0x000000000000380c" is the PC address at which the breakpoint is located.
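To confirm that the process stopped exactly at the breakpoint, the PC in the `frame #0` line can be compared with the address reported when the breakpoint was set. This is a minimal sketch under the assumption that the frame line looks like the one in the session above; `parse_frame` is a hypothetical helper, not an msdebug API.

```python
import re

# Frame line copied from the stop report above (one logical line).
FRAME_LINE = (
    "frame #0: 0x000000000000380c device_debugdata`::add_custom("
    "uint8_t *__restrict, uint8_t *__restrict, uint8_t *__restrict) [inlined] "
    "KernelAdd::CopyOut(this=0x000000000016a9b8, progress=0) at add_custom.cpp:85:9"
)

# Address printed by msdebug when the breakpoint was created.
BREAKPOINT_ADDRESS = 0x380C


def parse_frame(line):
    """Hypothetical helper: extract the PC and source position from a frame line."""
    match = re.search(
        r"frame #(\d+): (0x[0-9a-fA-F]+) .* at (\S+):(\d+):(\d+)$", line
    )
    if match is None:
        raise ValueError("unrecognized frame line")
    return {
        "index": int(match.group(1)),
        "pc": int(match.group(2), 16),
        "file": match.group(3),
        "line": int(match.group(4)),
        "column": int(match.group(5)),
    }


frame = parse_frame(FRAME_LINE)

# The stop PC matches the breakpoint address, and the source position
# matches the location requested with "b add_custom.cpp:85".
print(frame["pc"] == BREAKPOINT_ADDRESS)  # True
print(frame["file"], frame["line"])       # add_custom.cpp 85
```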