Specifying the Internal Formats of Graph Inputs and Outputs
Overview
This feature builds on two concepts: the original format and the internal format (also called the runtime format).
The original format is the format of the data as written in the script, without any conversion, such as NCHW or NHWC. The internal format is the format used for computation on the device, such as NC1HWC0, FRACTAL_NZ, or FRACTAL_Z. It is introduced so that operators run efficiently on hardware with different specifications. For example, if a script developer uses a tensor in NCHW format in the model, the graph build framework may convert the tensor to the runtime format NC1HWC0 during optimization. For details about the formats, see "Format".
Converting between the original format and the runtime format adds overhead. To reduce it, this feature lets you specify the internal format of the model inputs and outputs. Fewer format conversions are then needed when tensors cross the graph boundary, which improves performance.
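To make the relationship between the two formats concrete, the following is a minimal sketch of how an NC1HWC0 shape can be derived from an NCHW shape. The helper NchwToNc1hwc0Shape is hypothetical and is not part of the graph API. It assumes that NC1HWC0 splits the C axis into C1 blocks of C0 channels each, with the last block padded. The value of C0 is passed in as a parameter because it depends on the data type and hardware; 16 is used below only as an illustrative assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a GE API): derives the NC1HWC0 shape from an
// NCHW shape. c0 is the size of the innermost channel block; its actual
// value depends on the data type and device, so it is a parameter here.
std::vector<int64_t> NchwToNc1hwc0Shape(const std::vector<int64_t> &nchw, int64_t c0) {
  assert(nchw.size() == 4 && c0 > 0);
  const int64_t n = nchw[0];
  const int64_t c = nchw[1];
  const int64_t h = nchw[2];
  const int64_t w = nchw[3];
  // C1 = ceil(C / C0); the last block is padded when C is not a multiple of C0.
  const int64_t c1 = (c + c0 - 1) / c0;
  return {n, c1, h, w, c0};
}
```

Under these assumptions, NchwToNc1hwc0Shape({1, 3, 224, 224}, 16) yields {1, 1, 224, 224, 16}: the three channels fit into a single block of 16, and the remaining 13 positions are padding. This padding and reordering is the work that a format conversion at the graph boundary has to perform.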
The following figure illustrates the API call sequence in this feature.

After a graph instance is created, call SetStorageFormat to set the runtime format for the output TensorDesc of the Data node or the output TensorDesc of the model, call SetExpandDimsRule to set the dimension expansion rule, and then set the input and output operators in the graph to complete graph build.
Restrictions:
- Currently, only the Data and RefData nodes support the setting of the internal format of the graph input.
- This feature applies only to Graph Build and Run, not to Graph Build to an Offline Model.
- If the original shape has fewer than four dimensions, you need to specify the dimension expansion rule during the conversion to the internal format. The layout of data in the internal format varies according to the rule.
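The last restriction can be illustrated with a small sketch of why the dimension expansion rule changes the data layout. The helper ExpandDims below is hypothetical and is not the framework's implementation. It assumes that each character in the rule names the axis of the original format that the corresponding existing dimension occupies, and that every axis not named in the rule is filled with 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a GE API). `format` names the axes of the 4-D
// original format (for example "NHWC"). `rule` names, in order, the axes
// that the existing dimensions of `shape` occupy (for example "NC").
// Axes that the rule does not mention are filled with 1.
std::vector<int64_t> ExpandDims(const std::vector<int64_t> &shape,
                                const std::string &format,
                                const std::string &rule) {
  assert(shape.size() == rule.size());
  std::vector<int64_t> expanded(format.size(), 1);
  for (std::size_t i = 0; i < rule.size(); ++i) {
    const std::size_t axis = format.find(rule[i]);
    assert(axis != std::string::npos);  // every rule axis must exist in the format
    expanded[axis] = shape[i];
  }
  return expanded;
}
```

Under this interpretation, a 2-D shape {4, 50} with original format NHWC expands to {4, 1, 1, 50} with the rule "NC" but to {1, 4, 50, 1} with the rule "HW". The same two dimensions therefore land on different axes, and the resulting internal-format layout differs accordingly.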
Procedure
This procedure shows how to set the internal format for the input RefData node and for the output node during graph construction. An example of the constructed graph is shown below.

The sample code is as follows:
// 1. Construct the input RefData node of the entire graph and set the runtime format.
// 1.1 Construct the tensor desc of RefData.
std::vector<int64_t> weight_shape = {4, 50};
TensorDesc weight_desc = TensorDesc(ge::Shape(weight_shape), FORMAT_NHWC, DT_FLOAT);
weight_desc.SetStorageFormat(FORMAT_NC1HWC0);       // Set the runtime format of the tensor desc.
weight_desc.SetExpandDimsRule(AscendString("NC"));  // Set the dimension expansion rule of the tensor desc.
// 1.2 Construct the RefData node.
auto weight = op::RefData("weight").set_attr_index(1);
// 1.3 Set the input and output tensor desc for the RefData node.
weight.update_input_desc_x(weight_desc);
weight.update_output_desc_y(weight_desc);

// 2. Construct other nodes in the graph.
auto fm_data = op::Data("data").set_attr_index(0);

// 3. Construct the output Conv2D node of the entire graph and set the runtime format of the output.
// 3.1 Construct the output tensor desc of Conv2D.
std::vector<int64_t> conv2d_out_shape = {4, 50};
TensorDesc conv2d_out_desc = TensorDesc(ge::Shape(conv2d_out_shape), FORMAT_NHWC, DT_FLOAT);
conv2d_out_desc.SetStorageFormat(FORMAT_NC1HWC0);       // Set the runtime format of the tensor desc.
conv2d_out_desc.SetExpandDimsRule(AscendString("NC"));  // Set the dimension expansion rule of the tensor desc.
// 3.2 Construct the Conv2D node.
auto conv2d = op::Conv2D("conv2d").set_input_x(fm_data).set_input_filter(weight);
// 3.3 Set the output tensor desc for the Conv2D node.
conv2d.update_output_desc_y(conv2d_out_desc);

// 4. Construct the entire graph and set the input and output of the entire network.
ge::Graph graph("demo_graph");
graph.SetInputs({fm_data, weight}).SetOutputs({{conv2d, 0}});