Plugin Development (TensorFlow)

Overview

This section describes how to develop an operator plugin that maps an operator from a third-party framework to one adapted to the Ascend AI Processor. During the execution of a TensorFlow-based network, GE loads the plugin information to parse the operators in the network and map them to operators that adapt to the Ascend AI Processor.

The operators adapted to Ascend AI Processor are referred to as CANN operators.

Principles

The implementation workflow of an operator plugin includes registering the operator types of CANN operators and the operators in the original framework, and mapping the operator attributes in the original framework to attributes of CANN operators. Operator mapping is implemented by the Parser module. Figure 1 shows the plugin implementation workflow in the network execution scenario.

Figure 1 Process for implementing an operator plugin
  1. GE receives a graph, which represents the topology of an original network model trained under a third-party framework. Then, GE is initialized.
  2. GE loads the .so file of the operator plugin through the Register module. The .so file is stored in the opp/built-in/framework/ directory of the CANN component directory.
  3. The Register module reads operator information in the operator plugin .so file and registers the information with the map file of the operator plugin. (Information about all operator plugins is stored in one map file.)
  4. GE calls the parse method of the Parser module.
  5. The Parser module obtains the parse function registered through ParseParamsByOperatorFn from the map file of the operator plugin based on the operator type (OpType), and calls it. The function maps the attributes of the third-party operator to the attributes of the CANN operator (that is, the attributes defined in the operator prototype). In this way, the third-party operator is mapped to a CANN operator.
  6. Subsequent operations are performed, including graph preparation, partitioning, and optimization. A network model adapted to the Ascend AI Processor is then generated.
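The lookup in steps 3 to 5 boils down to a table keyed by operator type. The following is a minimal, self-contained C++ sketch of that dispatch; FakeOperator, RegisterPlugin, and ParseOp are hypothetical stand-ins for the real CANN classes (ge::Operator, the Register and Parser modules), used only to illustrate the mechanism.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for ge::Operator: just a type name and attributes.
struct FakeOperator {
  std::string type;
  std::map<std::string, int> attrs;
};

using ParseFn = std::function<bool(const FakeOperator& src, FakeOperator& dst)>;

// The "map file" of step 3: original-framework OpType -> parse function.
std::map<std::string, ParseFn>& PluginRegistry() {
  static std::map<std::string, ParseFn> registry;
  return registry;
}

// Step 3: the Register module records a parse function under the OpType.
void RegisterPlugin(const std::string& origin_op_type, ParseFn fn) {
  PluginRegistry()[origin_op_type] = std::move(fn);
}

// Steps 4 and 5: the Parser module looks up the parse function by OpType
// and runs it to map the third-party operator to a CANN operator.
bool ParseOp(const FakeOperator& src, FakeOperator& dst) {
  auto it = PluginRegistry().find(src.type);
  if (it == PluginRegistry().end()) {
    return false;  // no plugin registered for this operator type
  }
  return it->second(src, dst);
}
```

Registering a parse function under an original type such as "TopKV2" and then calling ParseOp mirrors steps 3 to 5: the Parser finds the function by OpType and runs it to produce the mapped operator.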

Plugin Implementation

GE provides the REGISTER_CUSTOM_OP macro to register an operator based on the specified operator name.

The following code shows how to customize a TensorFlow operator.
#include "register/register.h"
#include "graph/operator.h"

namespace domi {
REGISTER_CUSTOM_OP("OpType")
    .FrameworkType(TENSORFLOW)
    .OriginOpType("OriginOpType")
    .ParseParamsByOperatorFn(ParseParamByOpFunc)
    .ImplyType(ImplyType::TVM);  // TBE operator: ImplyType::TVM; AI CPU operator: ImplyType::AI_CPU
}  // namespace domi
  • Add #include directives at the beginning of the plugin implementation file to include the header files required by the plugin implementation functions.

    register.h is stored in include/register/ under the CANN software installation directory. Inclusion of this header file enables calls to the operator registration APIs.

    operator.h (optional) is stored in include/graph/ under the CANN software installation directory. Inclusion of this header file enables calls to the operator APIs, which can be used to obtain the operator information such as the inputs, outputs, and attributes.

  • REGISTER_CUSTOM_OP: registers a custom operator. OpType is the type of the operator registered with GE, which must be the same as the operator type in the operator prototype definition.
  • FrameworkType: specifies the framework type. TENSORFLOW indicates that the original framework is TensorFlow.
  • OriginOpType: indicates the type of an operator in the original framework.
  • ParseParamsByOperatorFn: registers a function for parsing operator attributes.
  • ImplyType: specifies the operator implementation type. ImplyType::TVM indicates a TBE operator; ImplyType::AI_CPU indicates an AI CPU operator.
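Registration macros like REGISTER_CUSTOM_OP are commonly implemented through static initialization: loading the plugin .so constructs a static registrar object whose constructor records the operator in a global table. Below is a minimal sketch of that mechanism; OpReg, Registrar, and REGISTER_CUSTOM_OP_SKETCH are hypothetical names, not the real CANN types.

```cpp
#include <map>
#include <string>

// Hypothetical chained builder holding the registration data.
struct OpReg {
  std::string op_type;
  std::string framework;
  std::string origin_op_type;
  explicit OpReg(const std::string& type) : op_type(type) {}
  OpReg& FrameworkType(const std::string& fw) { framework = fw; return *this; }
  OpReg& OriginOpType(const std::string& origin) { origin_op_type = origin; return *this; }
};

// Global table keyed by the CANN operator type (the "map file" of the text).
std::map<std::string, OpReg>& Registry() {
  static std::map<std::string, OpReg> table;
  return table;
}

// The constructor runs during static initialization, i.e. when the plugin
// .so is loaded, so merely loading the library registers the operator.
struct Registrar {
  explicit Registrar(const OpReg& reg) { Registry().emplace(reg.op_type, reg); }
};

#define REGISTER_CUSTOM_OP_SKETCH(unique_name, builder) \
  static Registrar unique_name((builder))

// Chained-builder registration, analogous in shape to REGISTER_CUSTOM_OP.
REGISTER_CUSTOM_OP_SKETCH(g_topk_registrar,
    OpReg("TopK").FrameworkType("TENSORFLOW").OriginOpType("TopKV2"));
```

This is why no explicit registration call appears in plugin code: the chained expression is the initializer of a static object, and the table is populated as a side effect of loading the shared library.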

The following describes the ParseParamsByOperatorFn parsing function in detail and its implementation scenarios.

Table 1 Implementation scenarios of operator parsing functions

  • Scenario: The attributes of the original TensorFlow operator correspond one-to-one to those of the CANN operator. That is, the quantity, names, and definitions of the attributes of both operators are the same.

    Implementation: Implement automatic mapping using the AutoMappingByOpFn callback function:

    .ParseParamsByOperatorFn(AutoMappingByOpFn)

    The AutoMappingByOpFn function also adds additional attributes of the TensorFlow operator to the CANN operator.

  • Scenario: The attributes of the TensorFlow operator and the CANN operator do not completely match, and the attribute values of the CANN operator must be recalculated.

    Implementation: Implement attribute parsing in a custom ParseParamByOpFunc callback function, as described in "Operator attributes not matched".

  • Scenario: For tensors without format information in the original TensorFlow graph, GE sets the format to ND upon receiving the graph. Format-sensitive operators that do not support the ND format need the format to be set in the parsing function of the plugin.

    Implementation: Set the format in the ParseParamByOpFunc callback function, as described in "Format-sensitive operator".
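As a toy illustration of the first two scenarios above, the sketch below models attributes as a plain std::map of strings; the real plugin reads the TensorFlow NodeDef and writes through ge::Operator::SetAttr, and AutoMapAttrs and MapTopKAttrs are hypothetical helpers, not CANN APIs.

```cpp
#include <map>
#include <string>

// Toy attribute store standing in for the operator's attribute table.
using AttrMap = std::map<std::string, std::string>;

// Scenario 1 (attributes fully match): copy every attribute across
// unchanged, which is what automatic mapping does for matching attributes.
void AutoMapAttrs(const AttrMap& src, AttrMap& dst) {
  for (const auto& kv : src) {
    dst[kv.first] = kv.second;
  }
}

// Scenario 2 (attributes do not fully match): auto-map first, then assign
// the CANN-only attributes explicitly. This mirrors the TopK example in
// this section, where dim and largest are set after the automatic mapping.
void MapTopKAttrs(const AttrMap& src, AttrMap& dst) {
  AutoMapAttrs(src, dst);   // "sorted" maps across unchanged
  dst["dim"] = "-1";        // CANN-only attribute: default value
  dst["largest"] = "true";  // CANN-only attribute: default value
}
```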

  • Operator attributes not matched

    A TensorFlow TopKV2 operator is defined as follows.

    REGISTER_OP("TopKV2")
        .Input("input: T")
        .Input("k: int32")
        .Output("values: T")
        .Output("indices: int32")
        .Attr("sorted: bool = true")
        .Attr("T: realnumbertype")
        .SetShapeFn(TopKShapeFn);

    Its CANN counterpart TopK is defined as follows.

    REG_OP(TopK)
        .INPUT(x, TensorType::RealNumberType())
        .INPUT(k, TensorType({DT_INT32}))
        .OUTPUT(values, TensorType::RealNumberType())
        .OUTPUT(indices, TensorType({DT_INT32}))
        .ATTR(sorted, Bool, true)
        .ATTR(largest, Bool, true)
        .ATTR(dim, Int, -1)
        .OP_END_FACTORY_REG(TopK)

    As shown in the preceding definitions, CANN TopK's largest and dim attributes are not defined in the TensorFlow TopKV2 operator. In this case, assign values to these two attributes in the TopKMappingFn callback function of ParseParamsByOperatorFn as follows.

    Status TopKMappingFn(const ge::Operator& op_src, ge::Operator& op) {
      // Call the AutoMappingByOpFn function to map the matching attributes.
      if (AutoMappingByOpFn(op_src, op) != SUCCESS) {
        return FAILED;
      }
      // Initialize the dim attribute of the CANN TopK operator.
      int32_t dim = -1;
      op.SetAttr("dim", dim);
      // Assign an initial value to the largest attribute of the CANN TopK operator.
      bool largest = true;
      op.SetAttr("largest", largest);
      return SUCCESS;
    }
    REGISTER_CUSTOM_OP("TopK")
        .FrameworkType(TENSORFLOW)
        .OriginOpType("TopKV2")
        .ParseParamsByOperatorFn(TopKMappingFn)     // Call the TopKMappingFn function to parse attributes.
        .ImplyType(ImplyType::TVM);
  • Format-sensitive operator
    • For an operator with format definition in the original TensorFlow graph, if the format of the TensorFlow operator is as required in CANN operator specifications, the AutoMappingByOpFn function automatically processes the format.
    • For an operator without format definition in the original TensorFlow graph or an operator whose format is inconsistent with that in CANN operator specifications, forcibly set Format and OriginalFormat of the tensor in the ParseParamByOpFunc callback function. (In the current version, set Format and OriginalFormat to the same value.)
      Take the Conv2D operator as an example. The supported format of the second input (filter) is HWCN according to both the TensorFlow API specifications and CANN operator specifications. However, in an original TensorFlow graph (a graph constructed using the native TensorFlow APIs), the format of the filter input is inherited from the data_format attribute directly. As such, set the expected HWCN as the format of filter using the Parser function. See the following example:
      const int kInputFilter = 1;
      Status ParseParamsConv2D(const ge::Operator& op_src, ge::Operator& op) {
          if (AutoMappingByOpFn(op_src, op) != SUCCESS) {
              return FAILED;
          }
          TensorDesc org_tensor_w = op.GetInputDesc(kInputFilter);
          org_tensor_w.SetOriginFormat(ge::FORMAT_HWCN);
          org_tensor_w.SetFormat(ge::FORMAT_HWCN);
          auto ret = op.UpdateInputDesc(kInputFilter, org_tensor_w);
          if (ret != ge::GRAPH_SUCCESS) {
               return FAILED;
          }
          return SUCCESS;
      }

      Note: The preceding is only for reference. In actual practice, the data format of AI CPU operators must be NHWC.

  • Dynamic-input/output operator
    For operators with dynamic inputs or outputs, use AutoMappingByOpFnDynamic in the ParseParamByOpFunc callback function of the plugin to match TensorFlow operators to CANN operators.
    Status BoostedTreesBucketizeMapping(const ge::Operator& op_src, ge::Operator& op) {
      if (AutoMappingByOpFn(op_src, op) != SUCCESS) {
        return FAILED;
      }

      std::string attr_name = "num_features";
      std::vector<std::string> dynamic_inputs {"float_values", "bucket_boundaries"};
      std::string dynamic_output = "y";

      std::vector<DynamicInputOutputInfo> dynamic_name_attr_value;

      // Describe each dynamic input tensor.
      for (const std::string& input_name : dynamic_inputs) {
          DynamicInputOutputInfo name_attr(kInput, input_name.c_str(), input_name.size(),
                  attr_name.c_str(), attr_name.size());
          dynamic_name_attr_value.push_back(name_attr);
      }

      // Describe the dynamic output tensor.
      DynamicInputOutputInfo name_attr(kOutput, dynamic_output.c_str(), dynamic_output.size(),
              attr_name.c_str(), attr_name.size());
      dynamic_name_attr_value.push_back(name_attr);

      if (AutoMappingByOpFnDynamic(op_src, op, dynamic_name_attr_value) != SUCCESS) {
        return FAILED;
      }
      return SUCCESS;
    }
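Conceptually, the count attribute passed in DynamicInputOutputInfo tells the parser how many concrete tensors a dynamic input or output name expands to. The sketch below illustrates that expansion with a hypothetical ExpandDynamicInput helper and an illustrative name0, name1, ... naming scheme; the real expansion is handled inside AutoMappingByOpFnDynamic.

```cpp
#include <string>
#include <vector>

// Toy expansion of a dynamic input: the count attribute (e.g. num_features)
// determines how many concrete inputs one dynamic input name stands for.
// The name0, name1, ... scheme here is illustrative only.
std::vector<std::string> ExpandDynamicInput(const std::string& name, int count) {
  std::vector<std::string> expanded;
  for (int i = 0; i < count; ++i) {
    expanded.push_back(name + std::to_string(i));
  }
  return expanded;
}
```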

Many-to-One Mapping

To improve computing performance and make full use of hardware for acceleration, multiple small operators on the TensorFlow network need to be fused and mapped to a large CANN operator.

In this scenario, first design and develop the scope fusion pattern, and then implement the plugin of the fusion operator. Its registration differs from that in one-to-one mapping, as shown below:

REGISTER_CUSTOM_OP("OpType")
    .FrameworkType(TENSORFLOW)             // The original framework is TensorFlow.
    .OriginOpType("OriginOpType")   // Type of the operator in the original framework, which is the same as the value of SetType in GenerateFusionResult.
    .FusionParseParamsFn(DecodeBboxV2ParseParams)  // Used to register the function for parsing the attributes of a fusion operator.
    .ImplyType(ImplyType::TVM);           // Specifies the implementation mode of an operator. ImplyType::TVM indicates that the operator is a TBE operator.
  • OpType registered in REGISTER_CUSTOM_OP is the mapped CANN operator type.
  • OriginOpType is registered as the result type of a fusion operator set during scope fusion pattern development.
  • FusionParseParamsFn: registers the function for parsing the attributes of a fusion operator. For details about the API definition, see FusionParseParamsFn (Overload).

This section does not describe how to develop the scope fusion pattern and fusion operator plugin. For details, see TensorFlow Parser Scope Fusion Pattern Developer Guide.

Many-to-Many Mapping

If you need to fuse multiple operators on the TensorFlow network and map them to CANN operators, an operator plugin is not required. You simply need to implement the scope fusion pattern and set the fusion result. For details about the development guide, see TensorFlow Parser Scope Fusion Pattern Developer Guide.

Mapping Operators to Subgraphs (One-to-Many)

The following scenarios may occur during adaptation development:

  • There's no matching CANN operator for a certain TensorFlow operator, but you can achieve its function by combining multiple CANN operators.
  • The implementation of TensorFlow operators is different from that of CANN operators. For example, some attribute of a TensorFlow operator is a constant input of the corresponding CANN operator.

In these scenarios, one operator in the original TensorFlow framework needs to be mapped to multiple operators in CANN. During plugin implementation, you need to build the mapped CANN operators into a subgraph, and then map the TensorFlow operator to the subgraph. The following describes how to convert the TensorFlow AddN operator into two CANN Add operators.

In the scenario where an operator is mapped to a subgraph, the registration code is as follows:

REGISTER_CUSTOM_OP("PartitionedCall")
    .FrameworkType(TENSORFLOW)
    .OriginOpType("AddN")
    .ParseParamsByOperatorFn(ParseParamsAddn)
    .ParseOpToGraphFn(ParseOpToGraphAddn)
    .ImplyType(ImplyType::TVM);

The following describes only the differences from the common registration function:

  • REGISTER_CUSTOM_OP("PartitionedCall"): registers the PartitionedCall operator, the generic name used for an operator that is mapped to a subgraph of combined operators.

    If an operator is mapped to a subgraph, the value is fixed at PartitionedCall.

  • ParseParamsByOperatorFn(ParseParamsAddn): registers the ParseParamsAddn function for parsing custom operator parameters. In ParseParamsAddn, the parameters of the original TensorFlow operator are mapped to those of the PartitionedCall operator. You need to set the number of inputs and outputs of the PartitionedCall node and set the original_type attribute to the operator type in the original framework.
  • ParseOpToGraphFn(ParseOpToGraphAddn): registers the ParseOpToGraphAddn function for implementing the one-to-many subgraph mapping of the operator. In ParseOpToGraphAddn, the PartitionedCall operator is mapped to a subgraph, which is constructed according to Ascend Graph construction. For details about the ParseOpToGraphFn API, see ParseOpToGraphFn. For details about Ascend Graph construction, see Ascend Graph Developer Guide.
The following is an example of implementing the callback function for mapping the AddN operator parameters to the PartitionedCall operator parameters:
// Example of the ParseParamsByOperatorFn callback function:
Status ParseParamsAddn(const ge::Operator& op_src, ge::Operator& op_dest) {
  // 1. Align the number of inputs and outputs of the PartitionedCall node (op_dest) with that of the original node (op_src).
  ge::Operator op_ori = const_cast<ge::Operator&>(op_src);
  std::string in_name = "args";
  std::string in_value = "in_num";
  std::string out_name = "output";
  std::string out_value = "out_num";
  op_ori.SetAttr(in_value, 3);
  op_ori.SetAttr(out_value, 1);
  DynamicInputOutputInfo in_values(kInput, in_name.c_str(), in_name.size(), in_value.c_str(), in_value.size());
  DynamicInputOutputInfo out_values(kOutput, out_name.c_str(), out_name.size(), out_value.c_str(), out_value.size());
  if (AutoMappingByOpFnDynamic(op_ori, op_dest, {in_values, out_values}) != SUCCESS) {
    return FAILED;
  }
  // 2. Inherit attributes, if any, from op_src to op_dest.
  ...
  // 3. Set the original_type attribute to the operator type in the original framework.
  op_dest.SetAttr("original_type", "AddN");
  return SUCCESS;
}

The following is an example of the callback function for mapping an AddN operator to two CANN Add operators. This function constructs a subgraph consisting of two CANN Add operators.

// Example of the ParseOpToGraphFn callback function:
static Status ParseOpToGraphAddn(const ge::Operator& op, ge::Graph& graph) {
  // The index attribute of a Data node indicates which input of the original node (op) it represents.
  auto data_0 = ge::op::Data().set_attr_index(0);
  auto data_1 = ge::op::Data().set_attr_index(1);
  auto data_2 = ge::op::Data().set_attr_index(2);
  // Create an add0 operator instance and set the operator inputs to data_0 and data_1.
  auto add0 = ge::op::Add("add0")
    .set_input_x1(data_0)
    .set_input_x2(data_1);
  // Create an add1 operator instance and set the operator inputs to data_2 and add0.
  auto add1 = ge::op::Add("add1")
    .set_input_x1(data_2)
    .set_input_x2(add0);
  // Set the inputs and outputs of the graph.
  std::vector<ge::Operator> inputs{data_0, data_1, data_2};
  // The output setting must be the same as that of the original node (op).
  std::vector<std::pair<ge::Operator, std::vector<size_t>>> output_indexs;
  output_indexs.emplace_back(add1, std::vector<std::size_t>{0});
  graph.SetInputs(inputs).SetOutputs(output_indexs);
  return SUCCESS;
}

The following figure shows the constructed subgraph.