Sample Overview
Obtaining the Sample
Visit the Ascend samples repository on Gitee and download the sample package that matches the CANN version you are using. For the version mapping, see "Release Notes" in the README file. The "acl_operator_add" sample is in the python/level1_single_api/1_acl/4_blas/acl_operator_add directory.
Function Description
This sample verifies a custom operator by converting the custom operator file into a single-operator offline model file, then loading and executing that file through ACL.
This sample implements matrix-matrix addition, C = A + B, where A, B, and C are 8 x 16 matrices of the int32 type.
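As a host-side reference for the operation described above, the following minimal sketch computes C = A + B with NumPy so that a device result could be compared against it. The random input values, the seed, and the variable names are illustrative assumptions and are not taken from the sample code.

```python
import numpy as np

SHAPE = (8, 16)  # matrix shape stated in this README

# Illustrative inputs; the actual sample may fill A and B differently.
rng = np.random.default_rng(seed=0)
a = rng.integers(0, 100, size=SHAPE, dtype=np.int32)
b = rng.integers(0, 100, size=SHAPE, dtype=np.int32)

# Element-wise addition; the result keeps the int32 type.
expected = a + b

# Bytes needed for one tensor: 8 * 16 elements * 4 bytes per int32.
buffer_bytes = a.nbytes
```

The byte count is the size a host or device buffer would need for each of the three tensors.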
Main APIs
| Function | ACL Module | ACL Interface Function | Description |
|---|---|---|---|
| Resource initialization | Initialization | acl.init | Initializes the ACL configuration. |
| Resource initialization | Device management | acl.rt.set_device | Specifies the device for computation. |
| Resource initialization | Context management | acl.rt.create_context | Creates a context. |
| Resource initialization | Stream management | acl.rt.create_stream | Creates a stream. |
| Resource initialization | Operator loading and execution | acl.op.set_model_dir | Sets the directory from which model files are loaded. |
| Data postprocessing | Operator loading and execution | acl.op.create_attr | Creates data of the aclopAttr type. |
| Data postprocessing | -- | acl.create_tensor_desc | Creates data of the aclTensorDesc type. |
| Data postprocessing | -- | acl.get_tensor_desc_size | Obtains the size of a tensor description. |
| Data postprocessing | -- | acl.create_data_buffer | Creates data of the aclDataBuffer type. |
| Data exchange | Memory management | acl.rt.memcpy | Copies data from the host to the device or from the device to the host. |
| Data exchange | Memory management | acl.rt.malloc | Allocates device memory. |
| Data exchange | Memory management | acl.rt.malloc_host | Allocates host memory. |
| Single-operator inference | Operator loading and execution | acl.op.execute | Loads and executes an operator. It is an asynchronous interface. |
| Common module | -- | acl.util.ptr_to_numpy | Obtains a numpy.ndarray object from a pointer address. |
| Common module | -- | acl.util.numpy_to_ptr | Obtains the pointer address of the memory data of a numpy.ndarray object. |
| Resource release | Memory management | acl.rt.free | Frees device memory. |
| Resource release | Memory management | acl.rt.free_host | Frees host memory. |
| Resource release | Stream management | acl.rt.destroy_stream | Destroys a stream. |
| Resource release | Context management | acl.rt.destroy_context | Destroys a context. |
| Resource release | Device management | acl.rt.reset_device | Resets the current device and reclaims the resources on the device. |
| Resource release | Deinitialization | acl.finalize | Deinitializes ACL. |
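The initialization and release rows of the table mirror each other, which suggests acquiring resources in one order and releasing them in reverse. The following is a minimal sketch of that pattern, not the sample's actual code. It assumes that pyACL calls report success with a return code of 0 and that acl.rt.create_context and acl.rt.create_stream return a (handle, return code) pair. The helper names check and acl_session are hypothetical, and the acl module is passed in as a parameter so that the sketch can be read and exercised without an Ascend device.

```python
from contextlib import ExitStack, contextmanager

ACL_SUCCESS = 0  # assumption: pyACL calls return 0 on success


def check(ret, what):
    """Raise if an ACL call reported a non-zero return code."""
    if ret != ACL_SUCCESS:
        raise RuntimeError(f"{what} failed, ret={ret}")


@contextmanager
def acl_session(acl, device_id, config_path, model_dir):
    """Acquire ACL resources in table order; release them in reverse order."""
    with ExitStack() as cleanup:
        check(acl.init(config_path), "acl.init")
        cleanup.callback(acl.finalize)

        check(acl.rt.set_device(device_id), "acl.rt.set_device")
        cleanup.callback(acl.rt.reset_device, device_id)

        # Assumed return shape: (handle, return code).
        context, ret = acl.rt.create_context(device_id)
        check(ret, "acl.rt.create_context")
        cleanup.callback(acl.rt.destroy_context, context)

        stream, ret = acl.rt.create_stream()
        check(ret, "acl.rt.create_stream")
        cleanup.callback(acl.rt.destroy_stream, stream)

        check(acl.op.set_model_dir(model_dir), "acl.op.set_model_dir")
        yield context, stream
        # On exit, ExitStack runs the callbacks last-in, first-out:
        # destroy_stream, destroy_context, reset_device, finalize.
```

On a machine with pyACL installed, this would be used as `with acl_session(acl, 0, config_path, model_dir) as (context, stream): ...` after `import acl`, with the two paths pointing at the acl.json file and the op_models folder. Registering each release step immediately after its matching acquisition means that a failure partway through setup still releases whatever was already acquired.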
Single-Operator Matrix Addition Process
Directory Structure
The following is an example of the directory structure after the model file has been converted. The op_models folder is generated after the conversion.
acl_operator_add
├── src
│   ├── acl_execute_add.py    // Running file
│   └── constant.py           // Constant definition
└── test_data
    ├── config
    │   ├── acl.json          // Configuration file for system initialization
    │   └── add_op.json       // Description information of the matrix-matrix addition operator
    └── op_models
        └── 0_Add_3_2_8_16_3_2_8_16_3_2_8_16.om    // Model file of the matrix addition operator
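The model file name in the tree above appears to encode the operator description. The sketch below rebuilds that name under several assumptions that this README does not state: that the name follows the pattern index, operator type, then one data type, format, and shape group per input and output tensor; that int32 maps to code 3 and the ND format to code 2; and that add_op.json describes two inputs and one output. The add_op dictionary is a hypothetical stand-in for the contents of add_op.json, and the function names are illustrative only.

```python
# Assumed enum codes; verify against the ACL definitions for your CANN version.
ACL_DTYPE_CODES = {"int32": 3}
ACL_FORMAT_CODES = {"ND": 2}


def tensor_token(desc):
    """Encode one tensor description as dtype_format_dim0_dim1..."""
    dims = "_".join(str(d) for d in desc["shape"])
    dtype = ACL_DTYPE_CODES[desc["type"]]
    fmt = ACL_FORMAT_CODES[desc["format"]]
    return f"{dtype}_{fmt}_{dims}"


def model_file_name(index, op):
    """Compose the assumed .om file name for a single-operator description."""
    tensors = op["input_desc"] + op["output_desc"]
    tokens = "_".join(tensor_token(t) for t in tensors)
    return f"{index}_{op['op']}_{tokens}.om"


# Hypothetical description of the 8 x 16 int32 addition operator.
tensor = {"format": "ND", "shape": [8, 16], "type": "int32"}
add_op = {"op": "Add", "input_desc": [tensor, tensor], "output_desc": [tensor]}

name = model_file_name(0, add_op)
```

Under these assumptions, the composed name matches the file listed under op_models, which is consistent with each 3_2_8_16 group describing one 8 x 16 int32 tensor.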
