Constructing an Ascend graph

GE provides Ascend Graph APIs that let you construct, from operator prototypes, a computational graph that can run directly on the Ascend AI Processor. This section outlines the overall process of Ascend graph construction and briefly introduces each step.

  1. Construct a graph based on the operator prototype.

    Each operator has a prototype definition that specifies the constraints on its execution on Ascend AI Processor, including the operator inputs, outputs, and attributes. For details about the prototype definition of a built-in CANN operator, see CANN Operator Specifications or the header file of the operator prototype definition in Ascend-cann-toolkit installation directory/ascend-toolkit/latest/opp/built-in/op_proto/inc.

    For example, the prototype of the Add operator is defined as follows:

    REG_OP(Add)
        .INPUT(x1, TensorType({DT_BOOL, DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_BF16, DT_INT16,
                               DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
                               DT_COMPLEX64, DT_STRING, DT_COMPLEX32}))
        .INPUT(x2, TensorType({DT_BOOL, DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_BF16, DT_INT16,
                               DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
                               DT_COMPLEX64, DT_STRING, DT_COMPLEX32}))
        .OUTPUT(y, TensorType({DT_BOOL, DT_FLOAT, DT_INT32, DT_INT64, DT_FLOAT16, DT_BF16, DT_INT16,
                               DT_INT8, DT_UINT8, DT_DOUBLE, DT_COMPLEX128,
                               DT_COMPLEX64, DT_STRING, DT_COMPLEX32}))
        .OP_END_FACTORY_REG(Add)
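
To make the role of the prototype concrete, the following is a toy model (not GE code) of the kind of constraint the Add prototype above expresses: each named input accepts only a listed set of data types. ToyDType, ToyProto, Accepts, and MakeToyAddProto are hypothetical names invented for this sketch, and only a few of the Add data types are modeled. How GE itself checks types is not shown here.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a few data types; this is NOT the real GE data type enum.
enum class ToyDType { kFloat, kFloat16, kInt32, kUint32 };

// Hypothetical stand-in for an operator prototype: input name -> accepted data types.
struct ToyProto {
    std::map<std::string, std::set<ToyDType>> inputs;
};

// Returns true only if the prototype declares the input and lists the data type for it.
bool Accepts(const ToyProto& proto, const std::string& input, ToyDType dtype) {
    auto it = proto.inputs.find(input);
    return it != proto.inputs.end() && it->second.count(dtype) > 0;
}

// A reduced version of the Add prototype: x1 and x2 share the same type list.
// kUint32 is deliberately left out because DT_UINT32 is not in the Add type list.
ToyProto MakeToyAddProto() {
    const std::set<ToyDType> types = {ToyDType::kFloat, ToyDType::kFloat16, ToyDType::kInt32};
    return ToyProto{{{"x1", types}, {"x2", types}}};
}
```

In this model, Accepts(proto, "x1", ToyDType::kFloat16) is true, while a data type that is not listed, or an input name that the prototype does not declare, is rejected.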
    
    After obtaining the operator prototype definition, perform the following operations to create a graph.
    1. Create an operator instance based on the operator prototype and set its inputs, outputs, and attributes.

      The following is a simple code snippet for creating an Add operator instance.

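      #include <vector>
      #include "all_ops.h"

      using namespace ge;  // TensorDesc, FORMAT_ND, DT_FLOAT16, and op:: are declared in namespace ge.

      // Describe a 1 x 1 x 28 x 28 float16 tensor.
      auto shape_data = std::vector<int64_t>({1, 1, 28, 28});
      TensorDesc desc_data(ge::Shape(shape_data), FORMAT_ND, DT_FLOAT16);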
      
      // Create the data instance of the Data operator.
      auto data = op::Data("data");
      data.update_input_desc_x(desc_data);
      data.update_output_desc_y(desc_data);
      // Create an instance of the Add operator.
      auto add = op::Add("add")
          .set_input_x1(data)
          .set_input_x2(data);
      
    2. Connect operators with edges.

      Edges between operators are either data edges or control edges. A data edge feeds the output of one operator into an input of another, and a control edge constrains the execution order of operators. In the preceding code snippet, calling set_input_x1(data) sets the first input, x1, of the Add operator, which connects the Data operator to the Add operator with a data edge.

      Use data edges and control edges to connect the operators according to the topology of the graph.
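
The difference between the two edge types can be pictured with a toy model. The sketch below is not the GE API: ToyGraph, AddDataEdge, AddControlEdge, and ExecutionOrder are hypothetical names, and the scheduling rule (a plain topological sort that picks ready nodes alphabetically) is an assumption made only so that the result is deterministic. It shows that a control edge carries no tensor but still changes the order in which operators may run.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy graph; this is NOT the GE Graph class.
struct ToyGraph {
    std::set<std::string> nodes;
    // Both maps go from a predecessor to the nodes that must run after it.
    std::map<std::string, std::set<std::string>> data_edges;     // carry a tensor
    std::map<std::string, std::set<std::string>> control_edges;  // ordering only

    void AddDataEdge(const std::string& src, const std::string& dst) {
        nodes.insert(src);
        nodes.insert(dst);
        data_edges[src].insert(dst);
    }
    void AddControlEdge(const std::string& src, const std::string& dst) {
        nodes.insert(src);
        nodes.insert(dst);
        control_edges[src].insert(dst);
    }
};

// Kahn's algorithm over the union of data and control edges.
// Ready nodes are taken in alphabetical order so the result is deterministic.
std::vector<std::string> ExecutionOrder(const ToyGraph& g) {
    std::map<std::string, std::set<std::string>> succ;
    for (const auto& kv : g.data_edges) succ[kv.first].insert(kv.second.begin(), kv.second.end());
    for (const auto& kv : g.control_edges) succ[kv.first].insert(kv.second.begin(), kv.second.end());

    std::map<std::string, int> indegree;
    for (const auto& n : g.nodes) indegree[n] = 0;
    for (const auto& kv : succ)
        for (const auto& dst : kv.second) ++indegree[dst];

    std::set<std::string> ready;
    for (const auto& kv : indegree)
        if (kv.second == 0) ready.insert(kv.first);

    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::string n = *ready.begin();
        ready.erase(ready.begin());
        order.push_back(n);
        auto it = succ.find(n);
        if (it == succ.end()) continue;
        for (const auto& dst : it->second)
            if (--indegree[dst] == 0) ready.insert(dst);
    }
    return order;  // Shorter than g.nodes.size() if the edges form a cycle.
}
```

With only a data edge from "data" to "add", an unrelated node such as "assign" is free to run at any point. Adding a control edge from "add" to "assign" forces "assign" to run after "add", even though no tensor flows between them.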

    3. Create a graph instance.

      After defining the operator instance and related topology, create a graph object and set the input and output operators in the graph.

    This completes the process of constructing an Ascend graph from operator prototypes. For details about graph construction expressions and techniques for various operators, see Graph Construction from Operator Prototypes.

  2. Build and run a graph.
    After a graph is constructed, you can perform the following steps to build and run the graph. (The following describes only key steps and APIs.)
    1. Call the ge::GEInitialize API to initialize the system and request system resources.
    2. Create a session and call the AddGraph API to load the graph object.
    3. Call the aclInit API to initialize AscendCL resources.
    4. (Optional) Call the BuildGraph API to build the graph. If you skip this call, the RunGraphWithStreamAsync API builds the graph instead.
    5. Specify the device (aclrtSetDevice), create a stream (aclrtCreateStream), allocate memory (aclrtMallocHost and aclrtMalloc), and transfer data from the host to the device (aclrtMemcpy) to prepare for graph execution.
    6. Call the RunGraphWithStreamAsync API to execute the graph and output the execution result.
    7. Call ge::GEFinalize to release related resources.
    The preceding figure shows the entire process of building and asynchronously executing a graph. GE also supports the following graph build and run modes. For details, see Graph Build and Run.
    • Build a graph into an offline model that adapts to the Ascend platform, load the offline model through the AscendCL API, and perform inference.
    • Add the constructed graph to a session, and then build and run it synchronously by calling the RunGraph API to obtain the run result.
    • In the case of a foundation model, build a graph into an offline model that is deployable in distributed mode, and load and run the model through the Ascend Graph API.
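
The optional build in step 4 of the numbered procedure can be pictured with a toy model. The sketch below is not the GE Session class: ToySession and its methods are hypothetical stand-ins that only echo the step names. It also assumes that a graph is built at most once and that running a graph that was never added is an error, neither of which this section states.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>

// Hypothetical toy session; this is NOT the GE Session class.
class ToySession {
public:
    // Step 2: register a graph under an ID. Nothing is built yet.
    void AddGraph(uint32_t graph_id) { built_[graph_id] = false; }

    // Step 4 (optional): build ahead of time. Assumption: building twice is a no-op.
    void BuildGraph(uint32_t graph_id) {
        Require(graph_id);
        if (!built_[graph_id]) {
            built_[graph_id] = true;
            ++build_count_;
        }
    }

    // Step 6: run the graph, building it first if step 4 was skipped.
    void RunGraph(uint32_t graph_id) {
        BuildGraph(graph_id);
        ++run_count_;
    }

    int build_count() const { return build_count_; }
    int run_count() const { return run_count_; }

private:
    // Assumption: using a graph ID that was never added is reported as an error.
    void Require(uint32_t graph_id) const {
        if (built_.find(graph_id) == built_.end())
            throw std::logic_error("graph was not added to the session");
    }

    std::map<uint32_t, bool> built_;
    int build_count_ = 0;
    int run_count_ = 0;
};
```

In this model, calling RunGraph twice without an explicit BuildGraph triggers exactly one build, and calling BuildGraph first leaves RunGraph with nothing to build.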