--op_debug_config

Description

Specifies the path and name of the debugging configuration file.

See Also

None

Arguments

Argument: path to the configuration file, including the file name.

Format: The path (including the file name) can contain letters, digits, underscores (_), hyphens (-), periods (.), and Chinese characters.
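The format rule above can be checked before the compiler is invoked. The following is a minimal sketch, not part of the tool: the helper name is_valid_config_path is hypothetical, the Chinese-character range is approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF), and "/" is assumed to be an accepted path separator even though the rule does not list it.

```python
import re

# Assumption: "/" is allowed as a path separator. The documented set only
# lists letters, digits, "_", "-", ".", and Chinese characters.
# Assumption: "Chinese characters" is approximated by U+4E00..U+9FFF.
_ALLOWED_PATH = re.compile(r"[A-Za-z0-9_\-./\u4e00-\u9fff]+")


def is_valid_config_path(path: str) -> bool:
    """Return True if every character of path is in the documented set."""
    return _ALLOWED_PATH.fullmatch(path) is not None
```

For example, a path containing a space, such as "/tmp/gm debug.cfg", is rejected by this check.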

Restrictions:

The configuration file supports the following options. Separate multiple options with commas (,).

  • oom: checks whether memory overwriting occurs in the global memory during operator execution.
    • Configuring this option retains the binary operator file (.o) and operator description file (.json) in the kernel_meta folder under the current execution directory during operator build.
    • If this option is used, the following detection logic is added during operator build. You can use the dump_cce option to view the following code in the generated .cce file:
      inline __aicore__ void CheckInvalidAccessOfDDR(xxx) {
          if (access_offset < 0 || access_offset + access_extent > ddr_size) {
              if (read_or_write == 1) {
                  trap(0X5A5A0001);
              } else {
                  trap(0X5A5A0002);
              }
          }
      }
      

      During inference, if memory overwriting occurs, the error code EZ9999 is reported.

  • dump_bin: retains the binary operator file (.o) and operator description file (.json) in the kernel_meta folder under the current execution directory during operator build.
  • dump_cce: retains the operator CCE file (.cce), binary operator file (.o), and operator description file (.json) in the kernel_meta folder under the current execution directory during operator build.
  • dump_loc: retains the Python-CCE mapping file (*_loc.json) in the kernel_meta folder under the current execution directory during operator build.
  • ccec_O0: enables the CCEC compiler option -O0 during operator build. With this option, the compiler does not optimize the code, which keeps the debugging information intact for later analysis of AI Core errors.
  • ccec_g: enables the CCEC compiler option -g during operator build. With this option, the compiler generates debugging information for later analysis of AI Core errors.
  • check_flag: checks whether pipeline synchronization signals in operators match each other during operator execution.
    • Configuring this option retains the binary operator file (.o) and operator description file (.json) in the kernel_meta folder under the current execution directory during operator build.
    • If this option is used, the following detection logic is added during operator build. You can use the dump_cce option to view the following code in the generated .cce file:
        set_flag(PIPE_MTE3, PIPE_MTE2, EVENT_ID0);
        set_flag(PIPE_MTE3, PIPE_MTE2, EVENT_ID1);
        set_flag(PIPE_MTE3, PIPE_MTE2, EVENT_ID2);
        set_flag(PIPE_MTE3, PIPE_MTE2, EVENT_ID3);
        ...
        pipe_barrier(PIPE_MTE3);
        pipe_barrier(PIPE_MTE2);
        pipe_barrier(PIPE_M);
        pipe_barrier(PIPE_V);
        pipe_barrier(PIPE_MTE1);
        pipe_barrier(PIPE_ALL);
        wait_flag(PIPE_MTE3, PIPE_MTE2, EVENT_ID0);
        wait_flag(PIPE_MTE3, PIPE_MTE2, EVENT_ID1);
        wait_flag(PIPE_MTE3, PIPE_MTE2, EVENT_ID2);
        wait_flag(PIPE_MTE3, PIPE_MTE2, EVENT_ID3);
        ...
      

      During inference, if the pipeline synchronization signals in an operator do not match, a timeout error is reported at the faulty operator and the program terminates. The following is an example of the error message:

      Aicore kernel execute failed, ..., fault kernel_name=operator name,...
      rtStreamSynchronizeWithTimeout execute failed....
  • When ccec_O0 and ccec_g are enabled, the size of the operator kernel file (*.o) increases. In dynamic-shape scenarios, the operator build traverses all possible shapes, so the build may fail because the kernel file becomes too large. In this case, do not enable these CCEC compiler options.

    If the build fails because the operator kernel file is too large, the following log is displayed:

    message:link error ld.lld: error: InputSection too large for range extension thunk ./kernel_meta_xxxxx.o:
    
  • The ccec_O0 and oom options cannot be enabled at the same time. Otherwise, an AI Core error is reported. The following is an example of the error message:
    ...there is an aivec error exception, core id is 49, error code = 0x4 ...
    
  • The oom configuration option cannot be used together with the NPU_COLLECT_PATH environment variable. Otherwise, an error is reported when the compiled operator kernel package is used.
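
The options and restrictions above can be checked before a build is started. The following is a minimal sketch under stated assumptions: parse_options and check_options are hypothetical helpers, not part of the compilation tool, and the file is assumed to hold a single op_debug_config=<options> line as in the example in this section.

```python
import os

# Option names taken from the list in this section.
KNOWN_OPTIONS = {
    "oom", "dump_bin", "dump_cce", "dump_loc",
    "ccec_O0", "ccec_g", "check_flag",
}


def parse_options(config_text):
    """Extract the option list from a line such as 'op_debug_config=ccec_g,oom'."""
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "op_debug_config":
            return [opt.strip() for opt in value.split(",") if opt.strip()]
    return []


def check_options(options, env=None):
    """Return a list of problems found against the documented restrictions."""
    env = os.environ if env is None else env
    problems = []
    for opt in options:
        if opt not in KNOWN_OPTIONS:
            problems.append(f"unknown option: {opt}")
    if "oom" in options and "ccec_O0" in options:
        problems.append("ccec_O0 and oom cannot be enabled at the same time")
    if "oom" in options and "NPU_COLLECT_PATH" in env:
        problems.append("oom cannot be used together with NPU_COLLECT_PATH")
    return problems
```

An empty list from check_options means that none of the restrictions listed above is violated; it does not guarantee that the build itself succeeds.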

Suggestions and Benefits

None

Example

Assume that the configuration file for enabling global memory detection is named gm_debug.cfg and has the following content:

op_debug_config=ccec_g,oom

Upload the file to any directory (for example, $HOME/module) on the server where the compilation tool is located, and then run the following command:

op_compiler --kernel_name=<kernel_name> --op_debug_config=$HOME/module/gm_debug.cfg --soc_version=<soc_version> --log=info --job=128 --output=<output_dir>
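
When the configuration file is generated by a script rather than written by hand, the same example can be produced as sketched below. This is an illustration only: write_debug_config is a hypothetical helper, and it produces only the file and the matching --op_debug_config argument, leaving the remaining op_compiler arguments to the caller.

```python
import tempfile
from pathlib import Path


def write_debug_config(directory, options, filename="gm_debug.cfg"):
    """Write 'op_debug_config=<options>' and return the matching CLI argument."""
    config_path = Path(directory) / filename
    config_path.parent.mkdir(parents=True, exist_ok=True)
    # Options are joined with commas, as required by the configuration file.
    config_path.write_text(
        "op_debug_config=" + ",".join(options) + "\n", encoding="utf-8"
    )
    return f"--op_debug_config={config_path}"


if __name__ == "__main__":
    # A temporary directory stands in for a location such as $HOME/module.
    with tempfile.TemporaryDirectory() as tmp:
        print(write_debug_config(tmp, ["ccec_g", "oom"]))
```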

Applicability

Atlas Training Series Product

Restrictions

None