Sample Overview

Obtaining the Sample

Visit the Ascend samples repository on Gitee and download the sample package that matches the CANN version you are using. For the version mapping, see "Release Notes" in the README file. The "acl_operator_add" sample is in the python/level1_single_api/1_acl/4_blas/acl_operator_add directory.

Function Description

This sample verifies that a custom operator works by converting the custom operator file into a single-operator offline model file and then loading that model file with ACL for execution.

This sample implements matrix addition, C = A + B, where the inputs A and B and the result C are all 8 x 16 matrices of the int32 type.
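To check the device output on the host, the addition can be reproduced in plain Python. The following is a minimal sketch and not part of the sample: the helper names (matrix_add, buffer_size_bytes) and the input values are made up for illustration, and only the 8 x 16 shape and the int32 type come from the description above.

```python
import struct

ROWS, COLS = 8, 16                     # matrix shape stated by the sample
INT32_SIZE = struct.calcsize("<i")     # 4 bytes per int32 element


def matrix_add(a, b):
    """Return the element-wise sum of two equally shaped matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("A and B must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def buffer_size_bytes(rows, cols):
    """Size in bytes of a contiguous rows x cols int32 buffer."""
    return rows * cols * INT32_SIZE


# Illustrative inputs: A counts 0..127 row by row, B is all ones.
a = [[r * COLS + c for c in range(COLS)] for r in range(ROWS)]
b = [[1] * COLS for _ in range(ROWS)]
c = matrix_add(a, b)

print(c[0][:4])                        # first four elements of row 0
print(buffer_size_bytes(ROWS, COLS))   # bytes needed for one matrix
```

The byte count is the size each of the three matrices would occupy in a host or device buffer. Python integers do not wrap, so this reference is only meaningful while every sum stays within the int32 range.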

Main APIs

Table 1 shows the main APIs.
Table 1 Main APIs

Function                  | ACL Module                     | ACL Interface Function   | Description
--------------------------|--------------------------------|--------------------------|------------
Resource initialization   | Initialization                 | acl.init                 | Initializes the ACL configuration.
Resource initialization   | Device management              | acl.rt.set_device        | Specifies the device for computation.
Resource initialization   | Context management             | acl.rt.create_context    | Creates a context.
Resource initialization   | Stream management              | acl.rt.create_stream     | Creates a stream.
Resource initialization   | Operator loading and execution | acl.op.set_model_dir     | Sets the directory from which the model file is loaded.
Data postprocessing       | Operator loading and execution | acl.op.create_attr       | Creates data of the aclopAttr type.
Data postprocessing       | --                             | acl.create_tensor_desc   | Creates data of the aclTensorDesc type.
Data postprocessing       | --                             | acl.get_tensor_desc_size | Obtains the size of a tensor description.
Data postprocessing       | --                             | acl.create_data_buffer   | Creates data of the aclDataBuffer type.
Data exchange             | Memory management              | acl.rt.memcpy            | Copies data from the host to the device or from the device to the host.
Data exchange             | Memory management              | acl.rt.malloc            | Allocates device memory.
Data exchange             | Memory management              | acl.rt.malloc_host       | Allocates memory on the host.
Single-operator inference | Operator loading and execution | acl.op.execute           | Loads and executes an operator. This interface is asynchronous.
Common module             | --                             | acl.util.ptr_to_numpy    | Obtains a numpy.ndarray object from a pointer address.
Common module             | --                             | acl.util.numpy_to_ptr    | Obtains the pointer address of the memory data of a numpy.ndarray object.
Resource release          | Memory management              | acl.rt.free              | Frees device memory.
Resource release          | Memory management              | acl.rt.free_host         | Frees memory on the host.
Resource release          | Stream management              | acl.rt.destroy_stream    | Destroys a stream.
Resource release          | Context management             | acl.rt.destroy_context   | Destroys a context.
Resource release          | Device management              | acl.rt.reset_device      | Resets the current device and reclaims the resources on the device.
Resource release          | Deinitialization               | acl.finalize             | Deinitializes ACL.
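Table 1 lists the initialization calls in the order they are made and the release calls in the reverse order. The sketch below shows one way to keep the two in step using contextlib.ExitStack. It is an illustration and not the sample's code. It assumes the usual pyACL conventions: a return code of 0 means success, and the create_* calls return a (handle, ret) pair. The helper names (check, acl_resources) are made up, and FakeAcl is a hypothetical stand-in that only records call names so the sketch runs without Ascend hardware. On a real system, the imported acl module would be passed instead.

```python
from contextlib import ExitStack, contextmanager


def check(ret, what):
    """Raise if a call reports a non-zero return code (0 is assumed to mean success)."""
    if ret != 0:
        raise RuntimeError(f"{what} failed, ret={ret}")


@contextmanager
def acl_resources(acl, config_path, device_id=0):
    """Acquire ACL resources in the Table 1 order and release them in reverse."""
    with ExitStack() as stack:
        check(acl.init(config_path), "acl.init")
        stack.callback(acl.finalize)

        check(acl.rt.set_device(device_id), "acl.rt.set_device")
        stack.callback(acl.rt.reset_device, device_id)

        context, ret = acl.rt.create_context(device_id)
        check(ret, "acl.rt.create_context")
        stack.callback(acl.rt.destroy_context, context)

        stream, ret = acl.rt.create_stream()
        check(ret, "acl.rt.create_stream")
        stack.callback(acl.rt.destroy_stream, stream)

        # Callbacks registered above run last-in, first-out on exit.
        yield context, stream


class FakeAcl:
    """Hypothetical stand-in for the acl module: records call names only."""

    def __init__(self, calls, prefix="acl"):
        self._calls = calls
        self._prefix = prefix

    def __getattr__(self, name):
        full_name = f"{self._prefix}.{name}"
        if name == "rt":                       # sub-module access, e.g. acl.rt
            return FakeAcl(self._calls, full_name)

        def call(*args):
            self._calls.append(full_name)
            # create_* calls are assumed to return (handle, ret); others return ret.
            return ("handle", 0) if name.startswith("create_") else 0

        return call


calls = []
with acl_resources(FakeAcl(calls), "test_data/config/acl.json") as (context, stream):
    calls.append("<operator work>")            # data exchange and acl.op.execute go here

print("\n".join(calls))
```

Because the release callbacks are registered immediately after each successful acquisition, a failure partway through initialization still releases whatever was already acquired. The configuration path is taken from the directory structure of this sample and is relative to the sample root, which is an assumption about where the script is run.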

Single-Operator Matrix Addition Process

Figure 1 shows the single-operator matrix addition process.
Figure 1 Single-operator matrix addition process

Directory Structure

The following example shows the directory structure after model conversion. The op_models folder is generated by the conversion.

acl_operator_add
├── src
│   ├── acl_execute_add.py    // Running file
│   └── constant.py           // Constant definition
└── test_data
    ├── config
    │   ├── acl.json          // Configuration file for system initialization
    │   └── add_op.json       // Description information of the matrix-matrix addition operator
    └── op_models
        └── 0_Add_3_2_8_16_3_2_8_16_3_2_8_16.om    // Model file of the matrix addition operator
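The tree describes add_op.json only as the description of the matrix addition operator. As a rough sketch of what such a description could contain for this sample, the following Python builds one and prints it as JSON. The key names ("op", "input_desc", "output_desc", "format", "shape", "type") and the "ND" format are assumptions based on the single-operator description format commonly used with the ATC tool, not a copy of the sample's file. The helper names are made up for illustration.

```python
import json

SHAPE = [8, 16]          # matrix shape stated by the sample
DTYPE = "int32"          # element type stated by the sample


def tensor_desc(shape, dtype, fmt="ND"):
    """One tensor description; the "ND" format is an assumption."""
    return {"format": fmt, "shape": list(shape), "type": dtype}


def add_op_description():
    """Description of C = A + B: two inputs and one output of the same shape."""
    return [
        {
            "op": "Add",
            "input_desc": [tensor_desc(SHAPE, DTYPE), tensor_desc(SHAPE, DTYPE)],
            "output_desc": [tensor_desc(SHAPE, DTYPE)],
        }
    ]


text = json.dumps(add_op_description(), indent=2)
print(text)
```

Compare the printed structure with the add_op.json file shipped in the sample before relying on it. The sample's file is the authoritative version.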