Graph Build and Execution
This section uses the single-operator model as an example to describe the build and execution processes in graph mode. Single-operator model execution means that operators are executed based on graph IR. First, build the operator (for example, use the ATC tool to compile the single-operator description file defined by Ascend IR into an .om model file). Then, call an ACL API to load the operator model. Finally, call an ACL API to execute the operator. The following provides only an example and the basic content of single-operator model execution. For more details, see Single-Operator Model Execution.
Environment Requirements
- You have installed the CANN Driver and software and set basic environment variables by referring to Environment Setup.
After the CANN package is installed, log in to the environment as the CANN operating user and run the source ${INSTALL_DIR}/set_env.sh command to set environment variables. Replace ${INSTALL_DIR} with the CANN component directory. For example, if the installation is performed by the root user, the default file storage path is /usr/local/Ascend/cann.
- You have developed and deployed operators by referring to Project-based Operator Development.
Preparing a Verification Code Project
├── input                                 // Directory for storing the input data generated by the script
├── output                                // Directory for storing the output data and truth value generated during operator execution
├── inc                                   // Header file directory
│   ├── common.h                          // Common method class declaration file, used to read binary files
│   ├── operator_desc.h                   // Operator description declaration file, including the operator input and output, operator type, input description, and output description
│   ├── op_runner.h                       // Operator execution information declaration file, including the numbers and sizes of operator input and output
├── src
│   ├── CMakeLists.txt                    // Build script
│   ├── common.cpp                        // Common function file, used to read binary files
│   ├── main.cpp                          // File used to compile a single-operator into an .om file and load the .om file for execution
│   ├── operator_desc.cpp                 // File used to construct the input and output description of the operator
│   ├── op_runner.cpp                     // Function implementation file for building and running a single-operator
├── scripts
│   ├── verify_result.py                  // Truth value comparison file
│   ├── gen_data.py                       // Script for generating the input data and truth value
│   ├── acl.json                          // ACL configuration file
│   ├── add_custom_static_shape.json      // Operator description file, which is used to construct a static-shape single-operator model file
│   ├── add_custom_dynamic_shape.json     // Operator description file, which is used to construct a dynamic-shape single-operator model file
Generating a Single-Operator Offline Model File
- Construct the static-shape single-operator description file add_custom_static_shape.json to describe the input, output, and attributes of the operator.
The following is an example of the description file of the AddCustom static-shape operator:
[
    {
        "op": "AddCustom",
        "input_desc": [
            {
                "name": "x",
                "param_type": "required",
                "format": "ND",
                "shape": [8, 2048],
                "type": "float16"
            },
            {
                "name": "y",
                "param_type": "required",
                "format": "ND",
                "shape": [8, 2048],
                "type": "float16"
            }
        ],
        "output_desc": [
            {
                "name": "z",
                "param_type": "required",
                "format": "ND",
                "shape": [8, 2048],
                "type": "float16"
            }
        ]
    }
]
The following is an example of the description file of the AddCustom dynamic-shape operator:
[
    {
        "op": "AddCustom",
        "input_desc": [
            {
                "name": "x",
                "param_type": "required",
                "format": "ND",
                "shape": [-1, -1],
                "shape_range": [[1,-1],[1,-1]],
                "type": "float16"
            },
            {
                "name": "y",
                "param_type": "required",
                "format": "ND",
                "shape": [-1, -1],
                "shape_range": [[1,-1],[1,-1]],
                "type": "float16"
            }
        ],
        "output_desc": [
            {
                "name": "z",
                "param_type": "required",
                "format": "ND",
                "shape": [-1, -1],
                "shape_range": [[1,-1],[1,-1]],
                "type": "float16"
            }
        ]
    }
]
- Use the ATC tool to compile the operator description file into a single-operator model file (*.om file).
The following is a command example of the ATC tool:
atc --singleop=$HOME/op_verify/run/out/test_data/config/add_custom_static_shape.json --output=op_models/ --soc_version=<soc_version>
The key command-line options are described as follows. For details, see ATC Instructions.
- --singleop: path of the single-operator description file (JSON format).
- --output: directory for storing .om files.
- --soc_version: AI processor version. Replace it with the actual version.
The AI processor model can be obtained in the following ways:
- For the following products: Run the npu-smi info command on the server where the Ascend AI Processor is installed to obtain the Name information. The actual value is AscendName. For example, if Name is xxxyy, the actual value is Ascendxxxyy.
Atlas A2 training products / Atlas A2 inference products
Atlas 200I/500 A2 inference products
Atlas inference products
Atlas training products
- For the following products: Run the npu-smi info -t board -i id -c chip_id command on the server where the Ascend AI Processor is installed to obtain the Chip Name and NPU Name information. The actual value is Chip Name_NPU Name. For example, if the value of Chip Name is Ascendxxx and the value of NPU Name is 1234, the actual value is Ascendxxx_1234. Note that:
- id: device ID, which is the NPU ID obtained by running the npu-smi info -l command.
- chip_id: chip ID, which is obtained by running the npu-smi info -m command.
Atlas A3 training products / Atlas A3 inference products
After the preceding command is executed, an offline model file with the suffix *.om is generated in the path specified by the --output option.
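The description files and the ATC invocation above follow a regular pattern, so they can be generated by a script. The following is a minimal Python sketch under stated assumptions: the helper names (tensor_desc, add_custom_desc, soc_version_from_name, soc_version_from_chip, atc_args) are hypothetical and not part of CANN, the file names are taken from the project layout above, and <soc_version> is left as a placeholder to be replaced with the actual value.

```python
import json


def tensor_desc(name, shape, dynamic=False):
    """Describe one ND float16 tensor in the layout used by the description files."""
    desc = {"name": name, "param_type": "required", "format": "ND"}
    if dynamic:
        # Every dimension is unknown (-1) with range [1, -1], as in the dynamic example.
        desc["shape"] = [-1] * len(shape)
        desc["shape_range"] = [[1, -1] for _ in shape]
    else:
        desc["shape"] = list(shape)
    desc["type"] = "float16"
    return desc


def add_custom_desc(shape, dynamic=False):
    """Build the single-operator description for AddCustom (inputs x, y; output z)."""
    return [{
        "op": "AddCustom",
        "input_desc": [tensor_desc("x", shape, dynamic), tensor_desc("y", shape, dynamic)],
        "output_desc": [tensor_desc("z", shape, dynamic)],
    }]


def soc_version_from_name(name):
    """Name reported by 'npu-smi info' -> Ascend<Name>."""
    return "Ascend" + name


def soc_version_from_chip(chip_name, npu_name):
    """Chip Name and NPU Name reported by 'npu-smi info -t board ...' -> <Chip Name>_<NPU Name>."""
    return f"{chip_name}_{npu_name}"


def atc_args(desc_path, output_dir, soc_version):
    """Assemble the ATC command line using only the three options described above."""
    return ["atc", f"--singleop={desc_path}", f"--output={output_dir}",
            f"--soc_version={soc_version}"]


if __name__ == "__main__":
    with open("add_custom_static_shape.json", "w") as f:
        json.dump(add_custom_desc((8, 2048)), f, indent=4)
    with open("add_custom_dynamic_shape.json", "w") as f:
        json.dump(add_custom_desc((8, 2048), dynamic=True), f, indent=4)
    # Print the command instead of running it; replace <soc_version> first.
    print(" ".join(atc_args("add_custom_static_shape.json", "op_models/", "<soc_version>")))
```

The script only prints the ATC command, so it can be reviewed before being run in an environment where the ATC tool is available.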
Test Data Generation
Run the following command in the sample operator project directory:
python3 scripts/gen_data.py
Two data files input_0.bin and input_1.bin with shape (8, 2048) and data type float16 are generated in the input directory for verifying the AddCustom operator.
A code example is as follows:
import numpy as np
a = np.random.randint(100, size=(8, 2048,)).astype(np.float16)
b = np.random.randint(100, size=(8, 2048,)).astype(np.float16)
a.tofile('input_0.bin')
b.tofile('input_1.bin')
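The example above writes only the two input files to the current directory. The following sketch extends it to place the inputs in the input directory and to produce a truth value file as well. It rests on assumptions: that AddCustom computes the element-wise sum x + y, that the truth value is stored as output/golden.bin (the path used in the verification command later in this section), and that a function named gen_data with these parameters is acceptable; the actual gen_data.py in the sample may differ.

```python
import os

import numpy as np


def gen_data(input_dir="input", output_dir="output", shape=(8, 2048), seed=None):
    """Write two float16 inputs and their element-wise sum (assumed truth value)."""
    rng = np.random.default_rng(seed)
    os.makedirs(input_dir, exist_ok=True)
    os.makedirs(output_dir, exist_ok=True)

    # Integers in [0, 100) are exactly representable in float16, and so is their sum.
    x = rng.integers(0, 100, size=shape).astype(np.float16)
    y = rng.integers(0, 100, size=shape).astype(np.float16)
    golden = (x + y).astype(np.float16)

    x.tofile(os.path.join(input_dir, "input_0.bin"))
    y.tofile(os.path.join(input_dir, "input_1.bin"))
    golden.tofile(os.path.join(output_dir, "golden.bin"))
    return x, y, golden


if __name__ == "__main__":
    gen_data()
```

Passing a fixed seed makes the generated data reproducible, which helps when comparing runs.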
Writing Verification Code
Write the code that loads and executes the single operator by referring to the following example.
The following is a code snippet of key steps only, which is not ready to be built or run. After APIs are called, you need to add exception handling branches and record error logs and info logs.
// 1. Perform initialization.
aclError aclRet = aclInit("../scripts/acl.json");

// 2. Allocate runtime resources.
int deviceId = 0;
aclRet = aclrtSetDevice(deviceId);
// Obtain the run mode of the software stack. Different run modes lead to different API call sequences (for example, whether data transfer is required).
aclrtRunMode runMode;
aclRet = aclrtGetRunMode(&runMode);
bool g_isDevice = (runMode == ACL_DEVICE);

// 3. Load the single-operator model file (*.om file).
// This directory is relative to the directory of the executable file. For example, if the executable file is stored in the output directory, the directory is op_models.
aclRet = aclopSetModelDir("../op_models");

// 4. Set the operator input, allocate memory, read the input data input_0.bin and input_1.bin, and save them to the allocated memory.
// ......

// 5. Create a stream.
aclrtStream stream = nullptr;
aclRet = aclrtCreateStream(&stream);

// 6. Execute the single-operator.
// opType indicates the operator type name, for example, AddCustom.
// numInputs indicates the number of operator inputs. For example, the AddCustom operator has two inputs.
// inputDesc indicates the array of the operator input tensor descriptions, describing the format, shape, and data type of each input.
// inputs indicates the input tensor data of the operator.
// numOutputs indicates the number of operator outputs. For example, the AddCustom operator has one output.
// outputDesc indicates the array of the operator output tensor descriptions, describing the format, shape, and data type of each output.
// outputs indicates the output tensor data of the operator.
// attr indicates the operator attributes. If the operator does not have attributes, call aclopCreateAttr to create data of the aclopAttr type.
// stream is used to maintain the execution sequence of some asynchronous operations.
aclRet = aclopExecuteV2(opType, numInputs, inputDesc, inputs, numOutputs, outputDesc, outputs, attr, stream);

// 7. Block the app until all tasks in the specified stream are complete.
aclRet = aclrtSynchronizeStream(stream);

// 8. Process the output data after the operator is executed, for example, display the data on the screen or write the data to a file. You can implement the processing based on the actual requirements. In this example, the result is written to the output_z.bin file.
// ......

// 9. Destroy the stream.
aclRet = aclrtDestroyStream(stream);

// 10. Destroy runtime allocations.
aclRet = aclrtResetDevice(deviceId);
aclRet = aclFinalize();
// ....
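Step 4 in the snippet above allocates memory for each input and output, and the required size follows from the shape and data type in the operator description. The following is a small Python sketch of that arithmetic. The helper name tensor_nbytes and the DTYPE_SIZES table are hypothetical and used only for illustration; the byte sizes listed are the standard sizes of these data types.

```python
import math

# Bytes per element for a few common data types (illustrative subset).
DTYPE_SIZES = {"float16": 2, "float32": 4, "int32": 4}


def tensor_nbytes(shape, dtype):
    """Return the buffer size in bytes for a dense tensor of the given shape and type."""
    return math.prod(shape) * DTYPE_SIZES[dtype]


# Each AddCustom tensor in the static-shape example is [8, 2048] float16.
print(tensor_nbytes([8, 2048], "float16"))  # 8 * 2048 * 2 = 32768
```

The same value should match the size of each generated input file, which gives a quick sanity check before the data is copied into the allocated memory.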
Running and Verification
- In the development environment, set environment variables and configure the paths of the header files and library files on which the build of the single-operator verification program depends. The following is an example of setting environment variables. Replace ${INSTALL_DIR} with the CANN component directory. For example, if the installation is performed by the root user, the default file storage path is /usr/local/Ascend/cann. {arch-os} indicates the architecture and OS of the operating environment. arch indicates the OS architecture, and os indicates the operating system, for example, x86_64-linux.
export DDK_PATH=${INSTALL_DIR}
export NPU_HOST_LIB=${INSTALL_DIR}/{arch-os}/devlib
- Build the sample project to generate an executable file for single-operator verification.
- Go to the directory of the sample project and run the following command in this directory to create a directory (for example, build) for storing the generated executable file.
mkdir -p build
- Go to the build directory and run the CMake compile command to generate build files.
Command example:
cd build
cmake ../src -DCMAKE_SKIP_RPATH=TRUE
- Run the following command to generate an executable file:
make
The executable file execute_add_op is generated in the output directory of the project.
- Execute the single-operator.
- Copy execute_add_op in the output directory of the sample project in the development environment to any directory in the operating environment as the running user (for example, HwHiAiUser).
Note: If your development environment is the operating environment, skip this step.
- Execute the execute_add_op file in the operating environment to verify the single-operator model file.
chmod +x execute_add_op
./execute_add_op
Check the command output:
[INFO] static op will be called
[INFO] Set device[0] success
[INFO] Get RunMode[1] success
[INFO] aclopSetModelDir op model success
[INFO] Init resource success
[INFO] Set input success
[INFO] Copy input[0] success
[INFO] Copy input[1] success
[INFO] Create stream success
[INFO] Execute AddCustom success
[INFO] Synchronize stream success
[INFO] Copy output[0] success
[INFO] Write output success
[INFO] Run op success
[INFO] Reset Device success
[INFO] Destroy resource success
If Run op success is displayed, the execution is successful and the output_z.bin file is generated in the output directory.
- Compare the truth value files.
Switch to the root directory of the sample project and run the following command:
python3 scripts/verify_result.py output/output_z.bin output/golden.bin
If the following information is displayed, the verification result of the AddCustom operator is correct.
test pass
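The contents of verify_result.py are not shown in this section. The following is a minimal sketch of what such a comparison could look like, assuming both files hold raw float16 data and that a relative and absolute tolerance of 1e-3 is acceptable. The function name, the tolerances, and the failure message are assumptions for illustration, not the behavior of the actual script.

```python
import sys

import numpy as np


def verify_result(output_path, golden_path, rtol=1e-3, atol=1e-3):
    """Return True if the operator output matches the truth value within tolerance."""
    output = np.fromfile(output_path, dtype=np.float16)
    golden = np.fromfile(golden_path, dtype=np.float16)
    if output.size != golden.size:
        return False
    # Compare in float32 so the tolerance arithmetic itself does not lose precision.
    return bool(np.allclose(output.astype(np.float32), golden.astype(np.float32),
                            rtol=rtol, atol=atol))


if __name__ == "__main__" and len(sys.argv) == 3:
    ok = verify_result(sys.argv[1], sys.argv[2])
    # "test pass" matches the message above; the failure text is hypothetical.
    print("test pass" if ok else "test failed")
```

A tolerance-based comparison is used instead of exact equality because float16 results may differ slightly between the device and the reference computation.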