Checking the Operators Called by a PyTorch API

Prerequisites

  • You have downloaded the sample project, which is used for the operator check.
    • This sample project supports only Python 3.9. To run it on other Python versions, change the Python version in the run_op_plugin.sh file in the ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch/PytorchInvocation directory.
    • This sample project does not support the Atlas A3 Training Series Product.
    • When downloading the code sample, run the following command to specify the branch version:
      git clone https://gitee.com/ascend/samples.git -b master
  • You have installed the PyTorch framework and the torch_npu plug-in by referring to the Ascend Extension for PyTorch Software Installation Guide.
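
The prerequisites above can be checked before starting the procedure. The following is a minimal sketch, not part of the sample project: the function name check_prerequisites and its parameters are hypothetical, and it only tests whether the interpreter is Python 3.9 and whether the torch and torch_npu modules can be located, without importing them.

```python
import importlib.util
import sys

# Assumptions taken from the prerequisites: the sample targets Python 3.9
# and needs both the PyTorch framework and the torch_npu plug-in.
REQUIRED_PYTHON = (3, 9)
REQUIRED_MODULES = ("torch", "torch_npu")


def check_prerequisites(version_info=sys.version_info,
                        find_spec=importlib.util.find_spec):
    """Return a list of problems found; an empty list means nothing was flagged.

    version_info and find_spec are injectable so the check can be exercised
    without a real Ascend environment (hypothetical helper, not a CANN API).
    """
    problems = []
    major_minor = tuple(version_info[:2])
    if major_minor != REQUIRED_PYTHON:
        problems.append(
            "Python %d.%d found, but the sample expects %d.%d; "
            "adjust the version in run_op_plugin.sh" % (major_minor + REQUIRED_PYTHON)
        )
    for name in REQUIRED_MODULES:
        # find_spec returns None when a top-level module cannot be located.
        if find_spec(name) is None:
            problems.append("module %r is not installed" % name)
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

An empty result does not prove that the installation works on the device; it only rules out the two most common setup mismatches.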

Procedure

  1. Run the following command to generate a custom operator project and implement the operator on the host and kernel:
    bash install.sh -v Ascendxxxyy    # xxxyy indicates the processor type used by the user.
  2. Compile and deploy the operator by referring to Compiling and Deploying Operators.

    Before compiling, edit the CMakeLists.txt file in the ${git_clone_path}/samples/operator/ascendc/0_introduction/1_add_frameworklaunch/CustomOp/op_kernel directory of the sample project and add the compilation option -sanitizer:

    add_ops_compile_options(ALL OPTIONS -sanitizer)
  3. Go to the PyTorch invocation project (PytorchInvocation), which calls the AddCustom operator from PyTorch, and complete the compilation as instructed. The project is structured as follows:
    PytorchInvocation
    ├── op_plugin_patch         
    ├── run_op_plugin.sh      // Required for executing the sample in Step 4.
    └── test_ops_custom.py    // Required for tool startup in Step 5.
    
  4. Execute the sample. During execution, test data is automatically generated, the PyTorch sample is run, and the running result is verified.
    bash run_op_plugin.sh
    -- CMAKE_CCE_COMPILER: ${INSTALL_DIR}/toolkit/tools/ccec_compiler/bin/ccec
    -- CMAKE_CURRENT_LIST_DIR: ${INSTALL_DIR}/AddKernelInvocation/cmake/Modules
    -- ASCEND_PRODUCT_TYPE:
      Ascendxxxyy
    -- ASCEND_CORE_TYPE:
      VectorCore
    -- ASCEND_INSTALL_PATH:
      /usr/local/Ascend/cann
    -- The CXX compiler identification is GNU 10.3.1
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: /usr/bin/c++ - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Configuring done
    -- Generating done
    -- Build files have been written to: ${INSTALL_DIR}/AddKernelInvocation/build
    Scanning dependencies of target add_npu
    [ 33%] Building CCE object cmake/npu/CMakeFiles/add_npu.dir/__/__/add_custom.cpp.o
    [ 66%] Building CCE object cmake/npu/CMakeFiles/add_npu.dir/__/__/main.cpp.o
    [100%] Linking CCE executable ../../../add_npu
    [100%] Built target add_npu
    ${INSTALL_DIR}/AddKernelInvocation
    INFO: compile op on ONBOARD succeed!
    INFO: execute op on ONBOARD succeed!
    test pass
    
  5. Use the msSanitizer tool to launch the Python program (test_ops_custom.py) for exception detection. For details about how to enable the exception detection function, see Principles for Enabling the Exception Check Function.
  6. Analyze abnormal behavior by referring to Analyzing a Memory Exception Report, Analyzing a Contention Check Report, and Analyzing a Uninitialization Exception Report.
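
Step 5 can be scripted. The sketch below assumes that the msSanitizer executable is named mssanitizer and that it accepts a --tool option with the values memcheck, racecheck, and initcheck, corresponding to the memory, contention, and uninitialization reports in Step 6. Verify both assumptions against the help output of your CANN version. The helper names build_sanitizer_command and run_check are hypothetical.

```python
import shutil
import subprocess
import sys

# Assumed check modes; confirm with the msSanitizer help of your installation.
CHECK_TOOLS = ("memcheck", "racecheck", "initcheck")


def build_sanitizer_command(script="test_ops_custom.py", tool="memcheck",
                            python=sys.executable):
    """Build the argument list that launches a Python script under msSanitizer."""
    if tool not in CHECK_TOOLS:
        raise ValueError("unknown check tool: %r" % tool)
    return ["mssanitizer", "--tool=%s" % tool, python, script]


def run_check(script="test_ops_custom.py", tool="memcheck"):
    """Run the check if mssanitizer is on PATH; otherwise only print the command."""
    cmd = build_sanitizer_command(script, tool)
    if shutil.which(cmd[0]) is None:
        print("mssanitizer not found on PATH; would run:", " ".join(cmd))
        return None
    # check=False: a non-zero exit code is returned to the caller, not raised.
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    run_check()
```

Run the script from the PytorchInvocation directory so that test_ops_custom.py resolves. Keeping command construction separate from execution makes it easy to inspect the exact command line before anything is started on the device.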