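This example shows how to use the msdebug tool to debug, on the board, an add operator invoked through the PyTorch interface. The add operator adds two vectors and outputs the result.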
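For orientation, the operator's expected behaviour can be written as a small host-side reference. The following is a minimal sketch in plain Python, not the operator's implementation: the names `reference_add` and `matches_reference` and the tolerance value are hypothetical and chosen only for illustration.

```python
import math


def reference_add(x, y):
    """Element-wise sum of two equal-length vectors (host-side reference only)."""
    if len(x) != len(y):
        raise ValueError("input vectors must have the same length")
    return [a + b for a, b in zip(x, y)]


def matches_reference(actual, x, y, tol=1e-3):
    """Return True when `actual` agrees with the reference within `tol`."""
    expected = reference_add(x, y)
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=tol, abs_tol=tol)
        for a, e in zip(actual, expected)
    )
```

A check like this could be applied to the values copied back from the device, after converting them to a Python list.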
# CMakePresets.json
...
"configurePresets": [
    {
        "name": "default",
        ...
        "CMAKE_BUILD_TYPE": {
            "type": "STRING",
            "value": "Debug"    # change the build configuration to Debug
        }
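Before rebuilding, it can be useful to confirm that the preset really is set to Debug. The sketch below is one possible check, assuming the file on disk is valid JSON (the `#` comment in the excerpt above is an annotation only). Because the excerpt elides the keys between the preset and `CMAKE_BUILD_TYPE`, the helper searches recursively instead of assuming a fixed path. The function names are hypothetical.

```python
import json


def find_build_types(node):
    """Recursively collect every CMAKE_BUILD_TYPE value in parsed JSON."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "CMAKE_BUILD_TYPE":
                # The entry may be an object with a "value" field or a plain string.
                found.append(value.get("value") if isinstance(value, dict) else value)
            else:
                found.extend(find_build_types(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_build_types(item))
    return found


def is_debug_build(presets_text):
    """True when at least one CMAKE_BUILD_TYPE exists and all of them are 'Debug'."""
    values = find_build_types(json.loads(presets_text))
    return bool(values) and all(v == "Debug" for v in values)
```

For example, `is_debug_build(open("CMakePresets.json").read())` would return `False` if any preset in the file still carries a non-Debug build type.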
bash build.sh
bash ./build_out/custom_opp_ubuntu_aarch64.run
PytorchInvocation
├── op_plugin_patch
├── run_op_plugin.sh      // used in step 4, when running the sample
└── test_ops_custom.py    // used in step 2, when launching the tool
bash run_op_plugin.sh
-- CMAKE_CCE_COMPILER: /usr/local/Ascend/ascend-toolkit/latest/toolkit/tools/ccec_compiler/bin/ccec
-- CMAKE_CURRENT_LIST_DIR: ${INSTALL_DIR}/AddKernelInvocation/cmake/Modules
-- ASCEND_PRODUCT_TYPE: Ascendxxxyy
-- ASCEND_CORE_TYPE: VectorCore
-- ASCEND_INSTALL_PATH: /usr/local/Ascend/ascend-toolkit/latest
-- The CXX compiler identification is GNU 10.3.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: ${INSTALL_DIR}/AddKernelInvocation/build
Scanning dependencies of target add_npu
[ 33%] Building CCE object cmake/npu/CMakeFiles/add_npu.dir/__/__/add_custom.cpp.o
[ 66%] Building CCE object cmake/npu/CMakeFiles/add_npu.dir/__/__/main.cpp.o
[100%] Linking CCE executable ../../../add_npu
[100%] Built target add_npu
${INSTALL_DIR}/AddKernelInvocation
INFO: compile op on ONBOARD succeed!
INFO: execute op on ONBOARD succeed!
test pass
(msdebug) export LAUNCH_KERNEL_PATH=${INSTALL_DIR}/opp/vendors/customize/op_impl/ai_core/tbe/kernel/SOC_VERSION/add_custom/AddCustom_1e04ee05ab491cc5ae9c3d5c9ee8950b.o
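The kernel object name contains a hash, so it can be convenient to look the file up instead of typing it by hand. The sketch below assumes the directory layout shown in the `LAUNCH_KERNEL_PATH` value above; the helper names, and the `op_dir` and `prefix` defaults, are hypothetical and taken from this example only.

```python
import glob
import os


def find_kernel_objects(install_dir, soc_version,
                        op_dir="add_custom", prefix="AddCustom_"):
    """List kernel .o files under the custom operator package, sorted by name."""
    pattern = os.path.join(
        install_dir, "opp", "vendors", "customize", "op_impl",
        "ai_core", "tbe", "kernel", soc_version, op_dir, prefix + "*.o",
    )
    return sorted(glob.glob(pattern))


def export_line(kernel_path):
    """Format the environment variable assignment for a chosen kernel file."""
    return "export LAUNCH_KERNEL_PATH=" + kernel_path
```

If more than one object file matches, the caller still has to decide which one corresponds to the kernel being debugged; the helper only narrows the search.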
msdebug python3 test_ops_custom.py
(msdebug) target create "python3"
Current executable set to '/home/mindstudio/miniconda3/envs/py37/bin/python3' (aarch64).
(msdebug) settings set -- target.run-args "test_ops_custom.py"
(msdebug)
(msdebug) b /home/xx/op_host/add_custom.cpp:24
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
(msdebug)
Once the operator runs, the tool automatically resolves the actual location and sets the breakpoint there.
(msdebug) b add_custom.cpp:60
Breakpoint 1: where = AddCustom_1e04ee05ab491cc5ae9c3d5c9ee8950b.o`::AddCustom_1e04ee05ab491cc5ae9c3d5c9ee8950b_1(uint8_t *, uint8_t *, uint8_t *, uint8_t *, uint8_t *) + 9912 [inlined] KernelAdd::Compute(int) + 3400 at add_custom.cpp:60:9, address = 0x00000000000026b8
(msdebug) r
Process 197189 launched: '/home/miniconda3/envs/py38/bin/python3' (aarch64)
Process 197189 stopped and restarted: thread 1 received signal: SIGCHLD
...
[Launch of Kernel anonymous on Device 0]
Process 197189 stopped
[Switching to focus on Kernel anonymous, CoreId 8, Type aiv]
* thread #1, name = 'python3', stop reason = breakpoint 2.1
    frame #0: 0x00000000000026b8 AddCustom_1e04ee05ab491cc5ae9c3d5c9ee8950b.o`::AddCustom_1e04ee05ab491cc5ae9c3d5c9ee8950b_1(uint8_t *, uint8_t *, uint8_t *, uint8_t *, uint8_t *) [inlined] KernelAdd::Compute(this=0x000000000020efb8, progress=1) at add_custom.cpp:60:9
   57           LocalTensor<DTYPE_Y> yLocal = inQueueY.DeQue<DTYPE_Y>();
   58           LocalTensor<DTYPE_Z> zLocal = outQueueZ.AllocTensor<DTYPE_Z>();
   59           Add(zLocal, xLocal, yLocal, this->tileLength);
-> 60           outQueueZ.EnQue<DTYPE_Z>(zLocal);
   61           inQueueX.FreeTensor(xLocal);
   62           inQueueY.FreeTensor(yLocal);
   63       }
(msdebug)
(msdebug) q
Quitting LLDB will kill one or more processes. Do you really want to proceed: [Y/n] y