Plugin Development (ONNX)
Overview
This section describes how to develop an operator plugin that maps an operator trained in a third-party framework to one adapted to the Ascend AI Processor, and how to register the operator information with GE. During the execution of an ONNX-based network, the plugin information in GE is loaded to parse the operators in the network and map them to operators that can run on the Ascend AI Processor.
The operators adapted to Ascend AI Processor are referred to as CANN operators.
Principles
The implementation workflow of an operator plugin includes registering the operator type of the CANN operator together with the corresponding operator type in the original framework, and mapping the operator attributes of the original framework to the attributes of the CANN operator. Operator mapping is implemented by the Parser module. Figure 1 shows the plugin implementation workflow in the network execution scenario.
- GE receives a graph, which represents the topology of an original network model trained under a third-party framework. Then, GE is initialized.
- GE loads the operator plugin .so file through the Register module. The .so file is stored in the opp/built-in/framework/ directory in the CANN component directory.
- The Register module reads operator information in the operator plugin .so file and registers the information with the map file of the operator plugin. (Information about all operator plugins is stored in one map file.)
- GE requests the Parser module to call the Parser method.
- Based on the operator type (OpType), the Parser module obtains the corresponding parse function, registered through ParseParamsByOperatorFn, from the map file of the operator plugin. The Parser module then calls this function to map the attributes of the third-party operator to the attributes of the CANN operator (that is, the attributes defined in the operator prototype). In this way, the third-party operator is mapped to a CANN operator.
- Subsequent operations are performed, including graph preparation, partition, and optimization. Then a network model that adapts to Ascend AI Processor is generated.
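Conceptually, the map file maintained by the Register module behaves like a lookup table from OpType to the registered parse function. The following self-contained sketch illustrates that idea only; Registry, RegisterOp, and Parse are hypothetical names for illustration, not GE APIs.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for the map file: OpType -> registered parse function.
using ParseFn = std::function<bool(const std::string& src, std::string& dest)>;

std::map<std::string, ParseFn>& Registry() {
    static std::map<std::string, ParseFn> registry;
    return registry;
}

// Analogous to REGISTER_CUSTOM_OP(...).ParseParamsByOperatorFn(...):
// associate an OpType with its parse function.
bool RegisterOp(const std::string& op_type, ParseFn fn) {
    return Registry().emplace(op_type, std::move(fn)).second;
}

// Analogous to the Parser module: look up the function by OpType and invoke it.
bool Parse(const std::string& op_type, const std::string& src, std::string& dest) {
    auto it = Registry().find(op_type);
    if (it == Registry().end()) return false;  // no plugin registered for this type
    return it->second(src, dest);
}
```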
Plugin Implementation
GE provides the REGISTER_CUSTOM_OP macro to register an operator based on the specified operator name.
The following code shows how to customize an ONNX operator.
#include "register/register.h"
#include "graph/operator.h"
#include "json.hpp"
namespace domi
{
REGISTER_CUSTOM_OP("OpType")
.FrameworkType(ONNX)
.OriginOpType("OriginOpType")
.ParseParamsByOperatorFn(ParseParamByOpFunc) // Registers a function for parsing operator attributes.
.ImplyType(ImplyType::TVM); // TBE operator: ImplyType::TVM; AI CPU operator: ImplyType::AI_CPU
}
- Add #include directives at the beginning of the plugin implementation file to include the header files required by the plugin implementation functions.
- register.h is stored in include/register/ under the CANN software installation directory. Inclusion of this header file enables calls to the operator registration APIs.
- operator.h (optional) is stored in include/graph/ under the CANN software installation directory. Inclusion of this header file enables calls to the operator APIs, which can be used to obtain the operator information such as the inputs, outputs, and attributes.
- json.hpp is used to parse ONNX parameter definitions of the string type into the JSON format.
Download json.hpp if this file is not provided in the sample project, place it in any subdirectory under the project directory, and include this header file.
- REGISTER_CUSTOM_OP: registers a custom operator. OpType is the operator type registered with GE. The value cannot conflict with existing operator names and must be the same as that in Operator IR Registration.
- FrameworkType: specifies the framework type. ONNX indicates that the original framework is ONNX.
- OriginOpType: indicates the type of the operator in the original framework. For example, if the custom operator OpTypeA corresponds to ONNX opset version 11 (opset_version=11), the original framework type is ai.onnx::11::OpTypeA. The supported ONNX opset versions range from 9 to 15.
- ParseParamsByOperatorFn (ParseParamByOpFunc): registers a function for parsing operator attributes. This callback function must be defined by the user.
ParseParamsByOperatorFn is declared as follows:
Status ParseParamByOpFunc(const ge::Operator& op_src, ge::Operator& op_dest)
- ParseParamByOpFunc: function name, which is user-defined and must be unique.
- op_src: an Operator class object defined by the ONNX framework, including attributes of the operator in the ONNX model. The definition is obtained from the original ONNX model file.
- op_dest: data structure of the CANN operator, storing operator information. For details about the class Operator, see Operator.
In the original ONNX model, the attribute is of the repeated message type, as shown in the following.
message NodeProto {
  repeated string input = 1;    // namespace Value
  repeated string output = 2;   // namespace Value
  string name = 3;              // namespace Node
  string op_type = 4;           // namespace Operator
  string domain = 7;            // namespace Domain
  // Additional named attributes.
  repeated AttributeProto attribute = 5;
}
When parsing attributes of the repeated message type, GE obtains the attribute values using the GetAttr(const char *name, ge::AscendString &attr_value) API, casts the attribute values of type AscendString to strings, and then converts them to the JSON format for attribute field parsing.
The implementation is as follows.
using namespace ge;
using json = nlohmann::json;
namespace domi {
namespace {
const int kTypeFloat = 1;
}
Status ParseOnnxParamsLeakyRelu(const ge::Operator& op_src, ge::Operator& op_dest) {
  // Map op_src to op_dest.
  // If getting a required attribute from op_src fails, return FAILED.
  // If getting an optional attribute from op_src fails, return FAILED or set a default value.
  float negative_slope = 0.01f;
  string negative_slope_str;
  AscendString attrs_string;
  // Obtain the attributes of the ONNX operator by the attribute name "attribute"
  // and assign the value to an AscendString object.
  if (ge::GRAPH_SUCCESS == op_src.GetAttr("attribute", attrs_string)) {
    // Cast to the JSON format.
    json attrs = json::parse(attrs_string.GetString());
    for (json attr : attrs["attribute"]) {
      if (attr["name"] == "alpha" && attr["type"] == kTypeFloat) {
        // The float type in JSON loses accuracy, so the value is stored as a string.
        negative_slope_str = attr["f"];
        negative_slope = atof(negative_slope_str.c_str());
      }
    }
  }
  op_dest.SetAttr("negative_slope", negative_slope);
  return SUCCESS;
}
}  // namespace domi
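For reference, the string obtained through GetAttr("attribute", ...) deserializes into JSON of roughly the following shape. The exact serialization is produced by GE; this fragment is a plausible illustration for the LeakyRelu alpha attribute, not captured output.

```json
{
  "attribute": [
    { "name": "alpha", "type": 1, "f": "0.01" }
  ]
}
```

The parse loop above matches each entry by name and type and reads the value from the "f" field, which is kept as a string to avoid precision loss.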
- The GetAttr and SetAttr APIs of the current version cannot parse fields of type double or uint64 in the original file.
- During model conversion using the ATC tool, strong verification is not performed on the obtaining of attributes. When implementing an operator plugin, it is advisable to add the corresponding processing logic for possible GetAttr call failures. For example, return a failure message for a required attribute or prompt the user to set a default value for an optional attribute.
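The recommended failure handling can be sketched in plain C++. Note that FakeOp, ParseExample, and the attribute names below are hypothetical stand-ins used only to illustrate the control flow; a real plugin calls ge::Operator::GetAttr on op_src.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for ge::Operator, holding named float attributes.
struct FakeOp {
    std::map<std::string, float> attrs;
    bool GetAttr(const std::string& name, float& out) const {
        auto it = attrs.find(name);
        if (it == attrs.end()) return false;
        out = it->second;
        return true;
    }
};

enum Status { SUCCESS = 0, FAILED = 1 };

// Required attribute: fail the parse if it is missing.
// Optional attribute: fall back to a documented default value.
Status ParseExample(const FakeOp& op_src, float& axis, float& slope) {
    if (!op_src.GetAttr("axis", axis)) {
        return FAILED;     // required attribute missing: abort the parse
    }
    if (!op_src.GetAttr("alpha", slope)) {
        slope = 0.01f;     // optional attribute missing: use the default
    }
    return SUCCESS;
}
```

The pattern is the same in a real plugin: a missing required attribute aborts the parse with a failure status, while a missing optional attribute falls back to its documented default.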
- ImplyType: specifies the operator implementation type. ImplyType::TVM indicates a TBE operator; ImplyType::AI_CPU indicates an AI CPU operator.
Mapping Operators to Subgraphs (One-to-Many)
The following scenarios may occur during adaptation development:
- There is no CANN operator that matches a given ONNX operator, but its function can be achieved by combining multiple CANN operators.
- The implementation of an ONNX operator differs from that of the corresponding CANN operator. For example, an attribute of an ONNX operator may be a constant input of the corresponding CANN operator. Take the ONNX ArgMax operator as an example; its prototype is as follows.
Table 1 ONNX ArgMax
Attributes
axis: The axis along which the arg indices are computed.
keepdims: Whether to keep the reduced dimension. Defaults to 1, indicating that the reduced dimension is kept.
select_last_index: Whether to select the last index or the first index if the maximum value appears at multiple indices. Defaults to False (select the first index).
Inputs
data: An input tensor.
Outputs
reduced: Reduced output tensor with the integer data type.
The prototype of the mapped CANN operator ArgMaxV2 is as follows.
Table 2 CANN ArgMaxV2
Inputs
x: A multi-dimensional Tensor of type float16, float32, or int16.
dimension: A scalar of type int32, specifying the dimension (axis) along which the index of the largest value is computed.
Attributes
dtype: The output type, either "int32" or "int64". Defaults to "int64".
Outputs
y: A multi-dimensional Tensor of type int32 or int64, specifying the index with the largest value. The dimension is one less than that of "x".
According to the comparison between the prototypes of the two operators, the axis attribute of the ONNX operator ArgMax is defined as a constant input of the CANN operator ArgMaxV2.
In these scenarios, one operator in the original ONNX framework needs to be mapped to multiple CANN operators. During plugin implementation, you need to build the mapped CANN operators into a subgraph and then map the ONNX operator to that subgraph. You can obtain implementation samples from the cplusplus\level1_single_api\4_op_dev\1_custom_op\framework\onnx_plugin directory in the open-source samples repository to understand the implementation method and verify the mapping result. The following describes how to convert the ONNX AddN operator into two CANN Add operators.
The operator registration code is as follows:
REGISTER_CUSTOM_OP("PartitionedCall")
.FrameworkType(ONNX)
.OriginOpType("ai.onnx::11::AddN")
.ParseParamsByOperatorFn(ParseParamsAddn)
.ParseOpToGraphFn(ParseOpToGraphAddn)
.ImplyType(ImplyType::TVM);
The following describes only the differences from the common registration function:
- REGISTER_CUSTOM_OP("PartitionedCall"): registers the subgraph PartitionedCall, which is a general name of combined operators.
If an operator is mapped to a subgraph, the value is fixed at PartitionedCall.
- ParseParamsByOperatorFn(ParseParamsAddn): registers the ParseParamsAddn function for parsing custom operator parameters. In ParseParamsAddn, the parameters of the original ONNX operator are mapped to those of the PartitionedCall operator. You need to set the number of inputs and outputs of the PartitionedCall node and set the original_type attribute to the operator type in the original framework.
- ParseOpToGraphFn(ParseOpToGraphAddn): registers the ParseOpToGraphAddn function for implementing the one-to-many subgraph mapping of the operator. In ParseOpToGraphAddn, the PartitionedCall operator is mapped to a subgraph, which is constructed according to Ascend Graph construction. For details about the ParseOpToGraphFn API, see ParseOpToGraphFn. For details about Ascend Graph construction, see Ascend Graph Developer Guide.
The following is an example of implementing the callback function for mapping the AddN operator parameters to the PartitionedCall operator parameters:
// Example of the ParseParamsByOperatorFn callback function:
Status ParseParamsAddn(const ge::Operator& op_src, ge::Operator& op_dest) {
// 1. Align the number of inputs and outputs of the PartitionedCall node (op_dest) with that of the original node (op_src).
ge::Operator op_ori = const_cast<ge::Operator&>(op_src);
std::string in_name = "args";
std::string in_value = "in_num";
std::string out_name = "output";
std::string out_value = "out_num";
op_ori.SetAttr(in_value, 3);
op_ori.SetAttr(out_value, 1);
DynamicInputOutputInfo in_values(kInput, in_name.c_str(), in_name.size(), in_value.c_str(), in_value.size());
DynamicInputOutputInfo out_values(kOutput, out_name.c_str(), out_name.size(), out_value.c_str(), out_value.size());
AutoMappingByOpFnDynamic(op_ori, op_dest, {in_values, out_values});
// 2. Inherit attributes, if any, from op_src to op_dest.
...
// 3. Set the original_type attribute to the operator type in the original framework.
op_dest.SetAttr("original_type", "ai.onnx::11::AddN");
return SUCCESS;
}
// Example of the ParseOpToGraphFn callback function:
static Status ParseOpToGraphAddn(const ge::Operator& op, ge::Graph& graph) {
// The index attribute of a Data node indicates which input of the original node (op) it corresponds to.
auto data_0 = ge::op::Data().set_attr_index(0);
auto data_1 = ge::op::Data().set_attr_index(1);
auto data_2 = ge::op::Data().set_attr_index(2);
// Create an add0 operator instance and set the operator inputs to data_0 and data_1.
auto add0 = ge::op::Add("add0")
.set_input_x1(data_0)
.set_input_x2(data_1);
// Create an add1 operator instance and set the operator inputs to data_2 and add0.
auto add1 = ge::op::Add("add1")
.set_input_x1(data_2)
.set_input_x2(add0);
// Set the inputs and outputs of the graph.
std::vector<ge::Operator> inputs{data_0, data_1, data_2};
// The output setting must be the same as that of the original node (op).
std::vector<std::pair<ge::Operator, std::vector<size_t>>> output_indexs;
output_indexs.emplace_back(add1, std::vector<std::size_t>{0});
graph.SetInputs(inputs).SetOutputs(output_indexs);
return SUCCESS;
}
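As a quick sanity check of the decomposition itself (plain C++, no GE types): summing n inputs with AddN is equivalent to chaining two-input Adds the way the subgraph above does.

```cpp
#include <cassert>
#include <vector>

// AddN semantics: the sum of all inputs.
float AddN(const std::vector<float>& xs) {
    float s = 0.0f;
    for (float x : xs) s += x;
    return s;
}

// The chain built in ParseOpToGraphAddn:
//   add0 = data_0 + data_1
//   add1 = data_2 + add0
float AddChain(float a, float b, float c) {
    float add0 = a + b;
    float add1 = c + add0;
    return add1;
}
```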
Verify the one-to-many mapping result by referring to "Verifying the Mapping Between Operators and Subgraphs (One-to-Many)" in the README file under the cplusplus\level1_single_api\4_op_dev\1_custom_op directory in the open-source samples repository. The verification result is as follows:
In the following figure, the original ONNX model containing the AddN operator is displayed on the left, and the converted subgraph structure containing two CANN Add operators is displayed on the right.

