
Single-Step Debugging

After you enter n, msdebug switches to single-core run mode: only the focused core runs, and all other cores stay halted.
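
The effect of this mode can be pictured with a toy model. The sketch below is not msdebug code: the `Core` class and the `step_over` function are hypothetical names used only for illustration, and the PC values are borrowed from the sample output in the procedure. It shows a step moving the PC of the focused core while every other core keeps its PC and stop reason.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Hypothetical stand-in for one AI core as seen by the debugger."""
    core_id: int
    pc: int
    stop_reason: str = "breakpoint 1.1"


def step_over(cores, focused_id, next_pc):
    """Advance only the focused core; all other cores stay where they are."""
    for core in cores:
        if core.core_id == focused_id:
            core.pc = next_pc
            core.stop_reason = "step over"
    return cores


# Three cores stopped at the same breakpoint.
cores = [Core(core_id=i, pc=0x1240C001C390) for i in range(3)]
step_over(cores, focused_id=0, next_pc=0x1240C001C83C)

for core in cores:
    print(core.core_id, hex(core.pc), core.stop_reason)
```

Only core 0 changes in this model, which mirrors the behaviour described above for a single step.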

Prerequisites

The operator must be compiled with the --cce-ignore-always-inline=true compile option.

Procedure

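  1. Set a breakpoint at the location you want to debug, then run the program. For details on setting breakpoints, see Line Breakpoints.
    (msdebug) r                // run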
    Process 2695700 launched: '${INSTALL_DIR}/projects/reduce_sum/add_tik2_npu' (aarch64)
    [Launch of Kernel _Z17reduce_sum_customPhS_S_S_ on Device 0]
    Process 2695700 stopped
    [Switching to focus on Kernel _Z17reduce_sum_customPhS_S_S_, CoreId 0, Type aiv]
    * thread #1, name = 'add_tik2_npu', stop reason = breakpoint 1.1
        frame #0: 0x0000000000001390 device_debugdata`reduce_sum_custom(unsigned char*, unsigned char*, unsigned char*, unsigned char*) [inlined] KernelReduceSum::Compute(this=0x0000000000167258) at reduce_sum_custom.cpp:45:36
       42       }
       43       __aicore__ inline void Compute()
       44       {
    -> 45           LocalTensor<half> xLocal = inQueueX.DeQue<half>();  // breakpoint location
       46           LocalTensor<half> yTmpLocal = yTmp.Get<half>();
       47           LocalTensor<half> workTmpLocal = workTmp.Get<half>();
       48           LocalTensor<int32_t> syncTmpLocal = syncTmp.Get<int32_t>();
  2. Enter n. msdebug switches the run mode to single-core mode.
    (msdebug) n
    Process 2695700 stopped
    [Switching to focus on Kernel _Z17reduce_sum_customPhS_S_S_, CoreId 0, Type aiv]
    * thread #1, name = 'add_tik2_npu', stop reason = step over
    //   The echoed output shows the new PC location, which indicates that the step succeeded
        frame #0: 0x000000000000183c device_debugdata`reduce_sum_custom(unsigned char*, unsigned char*, unsigned char*, unsigned char*) [inlined] KernelReduceSum::Compute(this=0x0000000000167258) at reduce_sum_custom.cpp:46:44
       43       __aicore__ inline void Compute()
       44       {
       45           LocalTensor<half> xLocal = inQueueX.DeQue<half>();
    -> 46           LocalTensor<half> yTmpLocal = yTmp.Get<half>();
       47           LocalTensor<half> workTmpLocal = workTmp.Get<half>();
       48           LocalTensor<int32_t> syncTmpLocal = syncTmp.Get<int32_t>();
       49           LocalTensor<half> secondTmpLocal = secondTmp.Get<half>();
  3. Run the ascend info cores command to view the PC and stop reason of every core.
    (msdebug) ascend info cores
      CoreId  Type  Device Stream Task Block         PC               stop reason
    *   0     aiv      0     47     0     2     0x1240c001c83c         step over             // * marks the core that is currently running
        1     aiv      0     47     0     3     0x1240c001c390         breakpoint 1.1  
        2     aiv      0     47     0     4     0x1240c001c390         breakpoint 1.1
        3     aiv      0     47     0     5     0x1240c001c390         breakpoint 1.1
        4     aiv      0     47     0     6     0x1240c001c390         breakpoint 1.1
        5     aiv      0     47     0     7     0x1240c001c390         breakpoint 1.1
       48     aiv      0     47     0     0     0x1240c001c390         breakpoint 1.1
       49     aiv      0     47     0     1     0x1240c001c390         breakpoint 1.1
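    • If the current core stopped for both a single step and a breakpoint, the stop reason is shown as breakpoint.
    • If the program appears to hang, press Ctrl+C to interrupt it. The hang may have one of the following causes:
      • The user program contains an infinite loop, which must be fixed in the program itself.
      • The operator uses one of the synchronization instructions listed in Table 1.
  4. Use the core-switching function to debug another core.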
    (msdebug) ascend aiv 2
    [Switching to focus on Kernel _Z17reduce_sum_customPhS_S_S_, CoreId 2, Type aiv]
    * thread #1, name = 'add_tik2_npu', stop reason = step over
        frame #0: 0x0000000000001390 device_debugdata`reduce_sum_custom(unsigned char*, unsigned char*, unsigned char*, unsigned char*) [inlined] KernelReduceSum::Compute(this=0x000000000016f258) at reduce_sum_custom.cpp:45:36
       42       }
       43       __aicore__ inline void Compute()
       44       {
    -> 45           LocalTensor<half> xLocal = inQueueX.DeQue<half>();
       46           LocalTensor<half> yTmpLocal = yTmp.Get<half>();
       47           LocalTensor<half> workTmpLocal = workTmp.Get<half>();
       48           LocalTensor<int32_t> syncTmpLocal = syncTmp.Get<int32_t>();
    (msdebug) n
    Process 2695700 stopped
    [Switching to focus on Kernel _Z17reduce_sum_customPhS_S_S_, CoreId 2, Type aiv]
    * thread #1, name = 'add_tik2_npu', stop reason = step over
        frame #0: 0x000000000000183c device_debugdata`reduce_sum_custom(unsigned char*, unsigned char*, unsigned char*, unsigned char*) [inlined] KernelReduceSum::Compute(this=0x000000000016f258) at reduce_sum_custom.cpp:46:44
       43       __aicore__ inline void Compute()
       44       {
       45           LocalTensor<half> xLocal = inQueueX.DeQue<half>();
    -> 46           LocalTensor<half> yTmpLocal = yTmp.Get<half>();
       47           LocalTensor<half> workTmpLocal = workTmp.Get<half>();
       48           LocalTensor<int32_t> syncTmpLocal = syncTmp.Get<int32_t>();
       49           LocalTensor<half> secondTmpLocal = secondTmp.Get<half>();
    (msdebug) ascend info cores
      CoreId  Type  Device Stream Task Block         PC               stop reason
        0     aiv      0     47     0     2     0x1240c001c83c         step over
        1     aiv      0     47     0     3     0x1240c001c390         breakpoint 1.1
    *   2     aiv      0     47     0     4     0x1240c001c83c         step over         // stop reason changes to step over only after the user enters n
        3     aiv      0     47     0     5     0x1240c001c390         breakpoint 1.1
        4     aiv      0     47     0     6     0x1240c001c390         breakpoint 1.1
        5     aiv      0     47     0     7     0x1240c001c390         breakpoint 1.1
       48     aiv      0     47     0     0     0x1240c001c390         breakpoint 1.1
       49     aiv      0     47     0     1     0x1240c001c390         breakpoint 1.1
  5. When debugging is complete, run the q command and enter Y or y to end the session.
    (msdebug) q
    Quitting LLDB will kill one or more processes. Do you really want to proceed: [Y/n] y
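
When many cores are listed, it can help to summarize the ascend info cores table outside the debugger. The following is a minimal sketch, not part of msdebug: it assumes the table text has been copied into a string with the `//` annotations removed, and that each row follows the column layout shown in the procedure. The names `ROW`, `parse_cores`, and `group_by_reason` are hypothetical.

```python
import re
from collections import defaultdict

# Rows copied from the `ascend info cores` output in step 4 (annotations removed).
SAMPLE = """\
  CoreId  Type  Device Stream Task Block         PC               stop reason
    0     aiv      0     47     0     2     0x1240c001c83c         step over
    1     aiv      0     47     0     3     0x1240c001c390         breakpoint 1.1
*   2     aiv      0     47     0     4     0x1240c001c83c         step over
    3     aiv      0     47     0     5     0x1240c001c390         breakpoint 1.1
"""

# Assumed row layout: optional "*", six whitespace-separated fields,
# a hexadecimal PC, then the stop reason up to the end of the line.
ROW = re.compile(
    r"^\s*(?P<focus>\*)?\s*(?P<core>\d+)\s+(?P<type>\w+)\s+(?P<device>\d+)\s+"
    r"(?P<stream>\d+)\s+(?P<task>\d+)\s+(?P<block>\d+)\s+(?P<pc>0x[0-9a-fA-F]+)\s+"
    r"(?P<reason>.+?)\s*$"
)


def parse_cores(text):
    """Return one dict per core row; the header line does not match and is skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append({
                "core": int(match["core"]),
                "focused": match["focus"] is not None,
                "block": int(match["block"]),
                "pc": int(match["pc"], 16),
                "reason": match["reason"],
            })
    return rows


def group_by_reason(rows):
    """Map each stop reason to the list of core IDs that report it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["reason"]].append(row["core"])
    return dict(groups)


rows = parse_cores(SAMPLE)
groups = group_by_reason(rows)
focused = [row["core"] for row in rows if row["focused"]]
print(groups)
print("focused core:", focused)
```

For the sample rows, the stepped cores 0 and 2 end up in one group and the cores still waiting at breakpoint 1.1 in another. The stepped cores sit 0x4ac bytes past the breakpoint PC, the same distance as between the frame addresses 0x1390 and 0x183c in the step output.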