Before You Start

Setting Up the Environment

Configure environment variables by referring to Environment Setup.

(Optional) Configuring a Compilation Option

Whether to add compilation options depends on the check scope you need. Table 1 compares the two cases.

Table 1 Compilation scenarios

Adding compilation options: No
  Instruction check scope: GM-related transfer instructions
  Exception check scope:
  • Only invalid read/write and unaligned access in the memory check are supported.
  • The call stack information is not displayed in the exception report.
    NOTE:
    • In this scenario, the optimization level of the operator must be O2, and the -q option must be added in the operator linking phase to retain the symbol relocation information. Otherwise, the check does not work.
    • This scenario is not applicable to Atlas Inference Series Product.
    • This scenario applies only to the operator kernel launch symbol scenario.
  Application scenario: Use only to quickly identify invalid read/write and unaligned access exceptions in operator memory.

Adding compilation options: Yes
  Instruction check scope: All instructions
  Exception check scope:
  • Full check.
  • After the -g option is added, the call stack information is displayed in the exception report.
  Application scenario: First locate the abnormal operator quickly without adding compilation options, then add compilation options to run a full check on that operator. For details, see Enabling Full Check.
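The two rows of Table 1 can be read as two stages of one workflow. The following is a minimal shell sketch of that reading; the helper name check_stage_options and the stage names quick and full are hypothetical and are not part of msSanitizer. It only prints the options that Table 1 associates with each stage and does not invoke any compiler.

```shell
# Hypothetical helper: print the options Table 1 associates with a check stage.
#   quick: no sanitizer compile options; -O2 when compiling, -q when linking
#   full : -g and --cce-enable-sanitizer when compiling
check_stage_options() {
    case "$1" in
        quick) echo "compile: -O2 | link: -q" ;;
        full)  echo "compile: -g --cce-enable-sanitizer" ;;
        *)     echo "unknown stage: $1" >&2; return 1 ;;
    esac
}

check_stage_options quick
check_stage_options full
```

The link options for the full check differ between project types, so the sketch lists only the compile options for that stage.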

Enabling Full Check

To enable the full check, add compilation options when compiling the operator code. Where you add them depends on the operator project. The following describes four scenarios: template library, kernel launch symbol, msOpGen operator project compilation, and Triton operator calling.

  • Template library scenario
    Modify the /examples/CMakeLists.txt file in the template library and add the -g and --cce-enable-sanitizer compilation options.
    set(BISHENG_COMPILER_OPTIONS -g --cce-enable-sanitizer)
  • Kernel launch symbol scenario
    1. For the sample project code, see link. Run the following command to download the sample code from the specified branch:
      git clone https://gitee.com/ascend/samples.git -b 8.0.RC2

      This sample project does not support Atlas A3 Training Series Product.

    2. To compile the operator code, add the following compilation options:
      • -g
      • --cce-enable-sanitizer or --sanitizer
      Edit the cmake/npu/CMakeLists.txt file in the sample project directory by referring to the complete sample of kernel function development and running verification.
      target_compile_options(${smoke_testcase}_npu PRIVATE
                           -O2
                           -std=c++17
                           --cce-enable-sanitizer
                           -g
      )

      The --cce-enable-sanitizer or --sanitizer option is added to enable exception check.

      The -g option is added to enable the compiler to generate location information. The specific location of the exception (file name, line number, and call stack) will be output in the exception report.

      • If both --cce-enable-sanitizer and -O0 are enabled, the --cce-ignore-always-inline=false compilation option needs to be added.
      • If the -g compilation option is added, the generated binary file contains debugging information. You are advised to restrict the access permission of user programs with debugging information to ensure that only authorized personnel can access the binary file.
      • The operator binary file generated by adding the --cce-enable-sanitizer compilation option must be used together with the msSanitizer tool. You are not advised to use this binary file independently. Otherwise, unpredictable problems may occur.
      • Due to restrictions of the open-source llvm-symbolizer software, the call stack information may occasionally fail to be obtained. In this case, run the check command again to obtain it.
    3. Add the following link options with target_link_options in the linking phase.
      • Edit the cmake/npu/CMakeLists.txt file in the sample project directory.
        target_link_options(${smoke_testcase}_npu PRIVATE
            --cce-fatobj-link
            --cce-enable-sanitizer
        )
      • Edit the cmake/Modules/CMakeCCEInformation.cmake file in the sample project directory.
        if(NOT CMAKE_CCE_LINK_EXECUTABLE)
          set(CMAKE_CCE_LINK_EXECUTABLE
            "<CMAKE_CCE_COMPILER> ${CMAKE_LIBRARY_CREATE_CCE_FLAGS} ${_CMAKE_COMPILE_AS_CCE_FLAG} <FLAGS> <CMAKE_CCE_LINK_FLAGS> <LINK_FLAGS> <OBJECTS> -o <TARGET> <LINK_LIBRARIES>${__IMPLICIT_LINKS}")
        endif()
    4. When msSanitizer is enabled, the NPU-side executable file <kernel_name>_npu needs to be loaded. For details about how to obtain this file, see "Kernel Launch".
  • msOpGen operator project compilation scenario
    1. Click link and run the install.sh script in the ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch directory to generate a custom operator project.
      When downloading the code sample, run the following command to specify the branch version:
      git clone https://gitee.com/ascend/samples.git -b master
      bash install.sh -v Ascendxxxyy    # xxxyy indicates the processor type.
      
    2. Switch to the custom operator project directory.
      cd CustomOp
      
    3. Edit the op_kernel/CMakeLists.txt file in the sample project directory and add the -sanitizer option to the compilation options. For details, see Supported Custom Compilation Options.
      add_ops_compile_options(ALL OPTIONS -sanitizer)
      
  • Triton operator calling scenario
    The Triton operator is developed in Python and the kernel is compiled in just-in-time (JIT) mode. Before executing the operator script, configure the following environment variable to support full check:
    export TRITON_ENABLE_SANITIZER=1
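Before rebuilding, it can help to confirm that the edited CMake file really contains the options required for the full check. The following is a minimal sketch; the function name has_full_check_flags and the demo file are hypothetical. It covers only the template library and kernel launch symbol scenarios, which use -g together with --cce-enable-sanitizer or --sanitizer. The msOpGen scenario uses -sanitizer and is not matched by this check.

```shell
# Hypothetical helper: succeed only if the file mentions -g as a standalone
# option and one of the two sanitizer options named in this section.
has_full_check_flags() {
    file="$1"
    grep -Eq -- '(^|[[:space:](])-g([[:space:])]|$)' "$file" || return 1
    grep -Eq -- '--cce-enable-sanitizer|--sanitizer' "$file" || return 1
}

# Demo on a throwaway file that mirrors the target_compile_options sample.
demo=$(mktemp)
cat > "$demo" <<'EOF'
target_compile_options(${smoke_testcase}_npu PRIVATE
    -O2
    -std=c++17
    --cce-enable-sanitizer
    -g
)
EOF

if has_full_check_flags "$demo"; then
    echo "full check flags present"
else
    echo "full check flags missing"
fi
```

This is a plain text search, so it also matches options that appear in comments or in a different target. Treat a positive result as a hint, not as proof that the build uses the options.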
    

Starting the Tool

After completing the setup in Environment Setup and (Optional) Configuring a Compilation Option, enable the functions of the msSanitizer tool by referring to Enabling Memory Check, Enabling Contention Check, Enabling Uninitialization Check, and Synchronization Check.

Exception reports are classified into the following levels:
  • WARNING: The risk is uncertain. Whether a reported item, such as multi-core corruption or unused memory allocation, is a real exception depends on the actual situation. Multi-core corruption means that multiple cores operate on the same memory block. Experienced users can avoid it through inter-core synchronization, but for beginners it is a high risk. Currently, WARNING-level reports of multi-core corruption can be generated only for inter-core synchronization of the atomic type.
  • ERROR: The highest severity level. It indicates deterministic memory errors, such as invalid read/write, memory leak, unaligned access, uninitialized memory, and contention exceptions. You are strongly advised to check for exceptions at this level.
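When a check produces a long report, counting findings per level is a quick way to decide what to look at first. The following is a minimal sketch that assumes the report has been saved to a text file and that each finding starts on a line containing its level keyword. The helper count_level and the sample lines are hypothetical placeholders, not actual msSanitizer output.

```shell
# Hypothetical helper: count the lines in a saved report that contain a level keyword.
count_level() {
    grep -c -- "$1" "$2"
}

# Placeholder report; real reports contain more detail per finding.
report=$(mktemp)
cat > "$report" <<'EOF'
ERROR: illegal write (placeholder line)
WARNING: unused memory allocation (placeholder line)
ERROR: unaligned access (placeholder line)
EOF

errors=$(count_level 'ERROR' "$report")
warnings=$(count_level 'WARNING' "$report")
echo "errors=$errors warnings=$warnings"
```

Handling ERROR findings before WARNING findings follows the severity order described above.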