Sample Overview

Obtaining the Sample

Click "Implementing Matrix-Matrix Addition Operation" to obtain the sample.

Function Usage

This sample verifies the functionality of a custom operator. The custom operator file is converted into a single-operator offline model file, which is then loaded and executed using pyACL.

This sample implements the matrix-matrix addition operation C = A + B, where A, B, and the result C are all 8 x 16 matrices of the int32 type.
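To check the operator output, the expected result can be computed on the host without pyACL. The following is a minimal sketch in plain Python. The helper names (make_matrix, add_matrices) and the fill values are illustrative assumptions and are not part of the sample.

```python
from array import array

ROWS, COLS = 8, 16   # matrix shape used by this sample
INT32_SIZE = 4       # bytes per int32 element


def make_matrix(start):
    """Return a ROWS x COLS matrix as a flat, row-major array.

    Typecode "i" is a C int, which is 32 bits wide on common platforms.
    """
    return array("i", range(start, start + ROWS * COLS))


def add_matrices(a, b):
    """Element-wise C = A + B on two flat arrays of equal length."""
    if len(a) != len(b):
        raise ValueError("A and B must have the same number of elements")
    return array("i", (x + y for x, y in zip(a, b)))


a = make_matrix(0)       # illustrative input values
b = make_matrix(1000)
c = add_matrices(a, b)

# Size of one matrix in bytes: 8 * 16 * 4.
matrix_bytes = ROWS * COLS * INT32_SIZE
```

Each of A, B, and C occupies matrix_bytes bytes, which is the size a buffer for one matrix would need.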

Main APIs

The main APIs are listed below, grouped by function.

Initialization

  • Call acl.init to initialize the configuration.
  • Call acl.finalize to deinitialize pyACL.

Device management

  • Call acl.rt.set_device to specify the compute device.
  • Call acl.rt.get_run_mode to obtain the running mode of the software stack. The internal processing flow varies according to the running mode.
  • Call acl.rt.reset_device to reset the current device and reclaim the associated resources.
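The initialization and device management calls pair up (init with finalize, set_device with reset_device), so they can be wrapped in a context manager. The following is a minimal sketch, not the sample's own code. It assumes that pyACL calls return 0 on success and that acl.rt.get_run_mode returns a (run_mode, ret) pair. The names device_session and check are hypothetical, and the pyACL module is passed in as a parameter so the block does not depend on the import.

```python
from contextlib import contextmanager

ACL_SUCCESS = 0  # assumption: pyACL calls return 0 on success


def check(ret, what):
    """Raise if a pyACL call reported a non-zero return code."""
    if ret != ACL_SUCCESS:
        raise RuntimeError(f"{what} failed, ret={ret}")


@contextmanager
def device_session(acl, device_id=0, config_path=""):
    """Initialize pyACL, claim a device, and undo both on exit.

    `acl` is the imported pyACL module, passed in explicitly.
    Yields the run mode reported by acl.rt.get_run_mode.
    """
    check(acl.init(config_path), "acl.init")
    try:
        check(acl.rt.set_device(device_id), "acl.rt.set_device")
        try:
            run_mode, ret = acl.rt.get_run_mode()
            check(ret, "acl.rt.get_run_mode")
            yield run_mode
        finally:
            # Release the device even if the body raised.
            check(acl.rt.reset_device(device_id), "acl.rt.reset_device")
    finally:
        # Deinitialize last, mirroring the setup order.
        check(acl.finalize(), "acl.finalize")
```

With pyACL installed, usage would look like `import acl` followed by `with device_session(acl) as run_mode: ...`.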

Stream management

  • Call acl.rt.create_stream to create a stream.
  • Call acl.rt.destroy_stream to destroy a stream.
  • Call acl.rt.synchronize_stream to block the program until all tasks in the specified stream are complete.
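The three stream calls above form a create, synchronize, destroy lifecycle. The following is a minimal sketch of that lifecycle. It assumes that acl.rt.create_stream returns a (stream, ret) pair, that the other two calls return a status code that is 0 on success, and that a device has already been set. The name stream_scope is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def stream_scope(acl):
    """Create a stream, wait for its tasks on normal exit, then destroy it.

    `acl` is the imported pyACL module, passed in explicitly.
    """
    stream, ret = acl.rt.create_stream()
    if ret != 0:
        raise RuntimeError(f"acl.rt.create_stream failed, ret={ret}")
    try:
        yield stream
        # Block until every task queued on the stream has finished.
        ret = acl.rt.synchronize_stream(stream)
        if ret != 0:
            raise RuntimeError(f"acl.rt.synchronize_stream failed, ret={ret}")
    finally:
        # Destroy the stream whether or not the body succeeded.
        ret = acl.rt.destroy_stream(stream)
        if ret != 0:
            raise RuntimeError(f"acl.rt.destroy_stream failed, ret={ret}")
```

Synchronizing inside the context manager means results are safe to read as soon as the with block ends.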

Memory management

  • Call acl.rt.malloc_host to allocate host memory.
  • Call acl.rt.free_host to deallocate host memory.
  • Call acl.rt.malloc to allocate device memory.
  • Call acl.rt.free to deallocate device memory.
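As an illustration of pairing allocation with deallocation, the following sketch allocates several device buffers and releases them again. It assumes that acl.rt.malloc takes a size and an allocation policy and returns a (pointer, ret) pair, that acl.rt.free returns a status code, and that the policy constant ACL_MEM_MALLOC_HUGE_FIRST has the value 0. The helper names alloc_buffers and free_buffers are hypothetical.

```python
ACL_MEM_MALLOC_HUGE_FIRST = 0  # assumption: value of the allocation policy constant


def alloc_buffers(acl, size, count):
    """Allocate `count` device buffers of `size` bytes each.

    Returns a list of device pointers. If any allocation fails, the
    buffers obtained so far are released before raising.
    """
    ptrs = []
    for _ in range(count):
        ptr, ret = acl.rt.malloc(size, ACL_MEM_MALLOC_HUGE_FIRST)
        if ret != 0:
            free_buffers(acl, ptrs)
            raise RuntimeError(f"acl.rt.malloc failed, ret={ret}")
        ptrs.append(ptr)
    return ptrs


def free_buffers(acl, ptrs):
    """Release device buffers previously returned by alloc_buffers."""
    for ptr in ptrs:
        ret = acl.rt.free(ptr)
        if ret != 0:
            raise RuntimeError(f"acl.rt.free failed, ret={ret}")
```

For this sample, three buffers of 8 x 16 x 4 = 512 bytes would cover A, B, and C. Host buffers would follow the same pattern with acl.rt.malloc_host and acl.rt.free_host.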

Data transfer

If your app runs on the host, call the acl.rt.memcpy API to:

  • Transfer the source data from the host to the device.
  • Transfer the execution result from the device to the host.

Data transfer is not required if your app runs in the board environment.
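The run-mode check described above can be sketched as a small helper that copies only when the app runs on the host. This is a sketch under stated assumptions: acl.rt.memcpy is assumed to take (dst, dst_max_size, src, count, kind) and return a status code, and the constant values below are assumed to match the pyACL definitions. The name copy_if_on_host is hypothetical.

```python
ACL_HOST = 1                   # assumption: run-mode value when the app runs on the host
ACL_MEMCPY_HOST_TO_DEVICE = 1  # assumption: memcpy direction constants
ACL_MEMCPY_DEVICE_TO_HOST = 2


def copy_if_on_host(acl, run_mode, dst, src, size, kind):
    """Copy `size` bytes from src to dst when the app runs on the host.

    Returns True if a copy was issued and False if it was skipped
    because the app runs in the board environment.
    """
    if run_mode != ACL_HOST:
        return False
    ret = acl.rt.memcpy(dst, size, src, size, kind)
    if ret != 0:
        raise RuntimeError(f"acl.rt.memcpy failed, ret={ret}")
    return True
```

The same helper serves both directions: pass ACL_MEMCPY_HOST_TO_DEVICE before execution and ACL_MEMCPY_DEVICE_TO_HOST afterwards.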

Single-operator calling

  • Call acl.op.execute_v2 to execute the specified operator.
  • Use the ATC (Ascend Tensor Compiler) tool to compile the description information of the Add operator (including the input and output tensor descriptions and the operator attributes) into an offline model file (*.om file) adapted to the Ascend AI Processor. This file is used to verify the execution result of the matrix addition operator.
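To make the operator description concrete, the following sketch builds a description of the kind ATC compiles for this operator: two 8 x 16 int32 inputs and one 8 x 16 int32 output. The key names ("op", "input_desc", "output_desc", "format", "shape", "type") follow the single-operator description layout as commonly documented for ATC and should be treated as an assumption. The add_op.json file shipped with the sample may differ. The helper names are hypothetical.

```python
import json


def tensor_desc(shape, dtype="int32", fmt="ND"):
    """Describe one tensor: its data format, shape, and element type."""
    return {"format": fmt, "shape": list(shape), "type": dtype}


def add_op_description(shape=(8, 16)):
    """Describe an Add operator with two inputs and one output of `shape`."""
    return [{
        "op": "Add",
        "input_desc": [tensor_desc(shape), tensor_desc(shape)],
        "output_desc": [tensor_desc(shape)],
    }]


# Serialized form, suitable for writing to a description file.
text = json.dumps(add_op_description(), indent=2)

# Assumed ATC invocation (the SoC version depends on the target hardware):
#   atc --singleop=test_data/config/add_op.json \
#       --soc_version=<soc_version> --output=test_data/op_models
```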

Directory Structure

The following is an example of the directory structure after the model file is converted:

acl_operator_add
├── scripts
│   ├── host_version.conf // Version number configuration file
│   └── testcase_300.sh // Run script
├── src
│   ├── acl_execute_add.py // Running file
│   └── constant.py // Constant definition
└── test_data
    ├── config
    │   ├── acl.json // Configuration file for system initialization
    │   └── add_op.json // Description information of the matrix-matrix addition operator
    └── op_models // Directory generated after ATC conversion
        └── 0_Add_3_2_8_16_3_2_8_16_3_2_8_16.om // Model file of the matrix addition operator
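The generated model file name appears to encode the operator signature. The following sketch decodes it under the assumption that the name consists of an index, the operator type, and then, for each tensor, a data type code, a format code, and the shape dimensions. This reading is inferred from the file name together with the assumed enum values 3 for int32 and 2 for ND. It is not stated by the sample. The function parse_om_name is hypothetical, and the tensor rank must be supplied because the name alone does not delimit shapes.

```python
DTYPE_NAMES = {3: "int32"}  # assumption: 3 is the code for the int32 data type
FORMAT_NAMES = {2: "ND"}    # assumption: 2 is the code for the ND format


def parse_om_name(name, rank=2):
    """Split a single-operator model file name into its assumed fields.

    Returns (index, op_type, tensors), where tensors lists the inputs
    followed by the outputs, each with type, format, and shape.
    """
    stem = name[:-3] if name.endswith(".om") else name
    index, op_type, *fields = stem.split("_")
    step = 2 + rank  # dtype code + format code + `rank` dimensions
    if len(fields) % step != 0:
        raise ValueError(f"unexpected field count in {name!r}")
    nums = [int(f) for f in fields]
    tensors = []
    for i in range(0, len(nums), step):
        dtype, fmt, *shape = nums[i:i + step]
        tensors.append({
            "type": DTYPE_NAMES.get(dtype, dtype),
            "format": FORMAT_NAMES.get(fmt, fmt),
            "shape": shape,
        })
    return int(index), op_type, tensors


index, op_type, tensors = parse_om_name("0_Add_3_2_8_16_3_2_8_16_3_2_8_16.om")
```

Under these assumptions, the name describes an Add operator with three 8 x 16 int32 tensors in ND format, which matches the two inputs A and B and the output C.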