API Introduction

Overview

To help models make full use of the available computing power, Compute Architecture for Neural Networks (CANN) provides the Ascend Operator Library (AOL), a set of high-performance operator APIs optimized for Ascend AI Processors. Figure 1 shows the call process. Developers can call the library APIs directly to build and deploy models, which improves both development efficiency and model performance.

Figure 1 API calling process

API Description

This document describes the definitions, functions, parameters, restrictions, and call examples of the operator APIs in each domain so that developers can call them quickly. It also provides the operator specifications defined by the intermediate representation (IR) of different frameworks, which developers can use to build network models.

  • For details about the product models supported by the operator APIs or operator specifications/list, see Table 2.
  • Using operators in scenarios that are not listed in the operator APIs or operator specifications/list (for example, unlisted product models, data types, data formats, or data dimensions) is not recommended. Results in such scenarios are not guaranteed in the current version.
  • Exceptions may occur during operator API calls. To locate and resolve them, see "Troubleshooting Cases > Operator Execution Issues" in Troubleshooting, which covers typical and frequently encountered operator execution problems.

Table 1 AOL API list

API Category

Description

Remarks

Basic APIs

Common meta APIs on which the NN operator and fused operator APIs depend, such as the APIs for creating aclTensor, aclScalar, and aclIntArray.

-

NN Operator APIs

Neural network (NN) operators are the basic built-in CANN operators. They cover the computation types used by deep learning algorithms in frameworks such as TensorFlow, PyTorch, MindSpore, and ONNX, including typical computations such as Softmax, MatMul, and Convolution. The API prefix is aclnnXxx.

Currently, NN operators make up the largest share of the operator library.

These operator APIs are C APIs that can be called directly to execute an operator, with no additional IR definition required. This call mode is known as single-operator API execution. For details, see "Single-Operator Calling" in CANN AscendCL Application Software Development Guide (C&C++).

  • When you call the NN operator and fused operator APIs, the precompiled operators in the operator binary package (Ascend-cann-kernels) are invoked directly, so you do not need to compile the operators again. For details about how to install the operator binary package, see CANN Software Installation Guide.
  • When you call the DVPP operator APIs, you do not need to compile the operator.

Fused Operator APIs

Built-in fused operators of CANN. The API prefix is aclnnXxx. A fused operator is a large operator that combines multiple independent small operators (such as vector and cube operators). It is functionally equivalent to the small operators it combines but outperforms them in speed or memory usage. Common fused operators include Flash Attention and MC2 operators.

NOTE:

In addition to the fused operators described in this document, more fused operators are available in the cann-ops-adv repository on Gitee.

Currently, the fused operators do not support Ascend virtualization instances.

DVPP Operator APIs

Digital Vision Pre-Processing (DVPP) operator APIs. The API prefix is acldvppXxx. These preprocessing APIs provide high-performance video/image encoding and decoding as well as image cropping and resizing.

CANN Operator Specifications

Operator information defined based on the Ascend IR.

-

Operator Specifications of AI Frameworks

Operator information defined based on the native IR of an AI framework (such as TensorFlow or Caffe).

-
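
The single-operator API execution mode described in Table 1 can be illustrated with a small host-only sketch. It is not CANN code. It assumes the two-phase convention that aclnn APIs follow: an aclnnXxxGetWorkspaceSize call that reports the workspace size and returns an executor, followed by an aclnnXxx call that runs the operator. Every name below (MockExecutor, mockAddGetWorkspaceSize, mockAdd, addVectors) is a hypothetical stand-in that operates on plain float arrays on the CPU. Real calls take aclTensor handles, device memory, and a stream, so check each operator's API reference for the actual signatures.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an operator executor. NOT a CANN type. */
typedef struct {
    const float *a;
    const float *b;
    float *out;
    size_t n;
} MockExecutor;

/* Phase 1 (mock): validate the arguments, report how much scratch
 * memory phase 2 needs, and return an executor describing the work. */
static int mockAddGetWorkspaceSize(const float *a, const float *b, float *out,
                                   size_t n, uint64_t *workspaceSize,
                                   MockExecutor **executor)
{
    if (!a || !b || !out || !workspaceSize || !executor) {
        return -1;
    }
    MockExecutor *ex = (MockExecutor *)malloc(sizeof *ex);
    if (!ex) {
        return -1;
    }
    ex->a = a;
    ex->b = b;
    ex->out = out;
    ex->n = n;
    *workspaceSize = (uint64_t)n * sizeof(float);
    *executor = ex;
    return 0;
}

/* Phase 2 (mock): run the computation in the caller-provided workspace,
 * copy the result to the output, and release the executor. */
static int mockAdd(void *workspace, uint64_t workspaceSize,
                   MockExecutor *executor)
{
    if (!executor) {
        return -1;
    }
    uint64_t needed = (uint64_t)executor->n * sizeof(float);
    if (needed > 0 && (!workspace || workspaceSize < needed)) {
        free(executor);
        return -1;
    }
    float *scratch = (float *)workspace;
    for (size_t i = 0; i < executor->n; ++i) {
        scratch[i] = executor->a[i] + executor->b[i];
    }
    for (size_t i = 0; i < executor->n; ++i) {
        executor->out[i] = scratch[i];
    }
    free(executor);
    return 0;
}

/* Caller side: query the workspace size, allocate it, then execute. */
int addVectors(const float *a, const float *b, float *out, size_t n)
{
    uint64_t workspaceSize = 0;
    MockExecutor *executor = NULL;
    if (mockAddGetWorkspaceSize(a, b, out, n, &workspaceSize, &executor) != 0) {
        return -1;
    }
    void *workspace = NULL;
    if (workspaceSize > 0) {
        workspace = malloc((size_t)workspaceSize);
        if (!workspace) {
            free(executor);
            return -1;
        }
    }
    int status = mockAdd(workspace, workspaceSize, executor);
    free(workspace);
    return status;
}
```

Calling addVectors(a, b, out, n) runs both phases in order. Splitting the call this way lets the caller own the workspace allocation, which is why the size query comes first.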

Table 2 Product support table

Product Model

NN Operator APIs

Fused Operator APIs

DVPP Operator APIs

CANN Operator Specifications

TensorFlow Operator List

Caffe Operator List

ONNX Operator List

Atlas 200/300/500 Inference Product

x

x

x

Atlas Training Series Product

√ (Partially supported)

x

x