Introduction to Graph Modes

Graph Engine (GE) is the control center for the compilation and execution of computational graphs. It enables neural networks to run on the CANN platform in graph mode.

Graph mode is one execution mode for neural network models. In graph mode, the user first builds a computational graph that describes the model's computation, then uses GE to deliver the whole graph to the Ascend hardware for execution. The other execution mode is single-operator mode, in which each compute operation is executed immediately after it is delivered. Compared with single-operator mode, graph mode lets GE apply techniques such as computational graph optimization, multi-stream parallelism, memory overcommitment, and model offloading to accelerate model execution and reduce model memory usage.

GE provides a unified graph development API, supports access from multiple AI frameworks, and allows developers to customize graph structures. This flexibility helps developers deploy neural network models on Ascend hardware quickly and efficiently.

Figure 1 GE logical architecture

Key Technologies and Benefits

GE provides technologies such as computational graph optimization, multi-stream parallelism, memory overcommitment, and model offloading to achieve optimal model execution performance. For details about the key GE technologies, visit the following website:

In general, graph mode has a global view of the model, which enables deeper compilation optimization, memory management, and lifecycle management, yielding better memory usage and performance.

Use Cases

Based on the GE capabilities, CANN supports the following graph mode use cases:

Enabling the Graph Mode for the PyTorch Framework

Run the PyTorch network script in graph mode.

Enabling the Graph Mode for the TensorFlow Framework

Run the TensorFlow network script in graph mode.

Execution in ONNX/PB Model Graph Mode

If you have a trained ONNX or PB model, you can run it on the Ascend platform in graph mode in either of the following ways:
  • Use the ATC command-line tool to convert the original model into an offline model (.om) file adapted to the Ascend AI Processor, then load the model and perform inference through the AscendCL APIs.
  • Use the GE C++ parser APIs (such as aclgrphParseONNX and aclgrphParseTensorFlow) to convert the original framework model into a CANN model, then execute it on the Ascend hardware in graph mode.
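As a sketch of the first method, an ATC conversion command might look like the following. The model file name and the --soc_version value are placeholders for illustration; set them to match your model file and the Ascend AI Processor you actually target.

```shell
# Convert a trained ONNX model into an Ascend offline model (.om).
# --framework=5 selects ONNX as the source framework
# (use --framework=3 for TensorFlow frozen .pb models).
# resnet50.onnx and Ascend310 are placeholder values.
atc --model=resnet50.onnx \
    --framework=5 \
    --output=resnet50 \
    --soc_version=Ascend310
```

On success, ATC produces resnet50.om in the current directory, which can then be loaded and executed through the AscendCL APIs.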

Ascend Graph Composition

Construct a graph that can run on the Ascend hardware by using the Ascend graph construction APIs provided by GE.

Importing a Custom Operator into the Graph

If, when executing a PyTorch/TensorFlow network or constructing an Ascend graph based on operator prototypes, you encounter an operator that is not supported by the Ascend Operator Library (AOL), you need to develop a custom operator and then import it into the graph.