Dump Graph Details
Before model conversion, set the following environment variables:
export DUMP_GE_GRAPH=1      # Determines the size of a dump graph.
export DUMP_GRAPH_LEVEL=1   # Determines the number of dump graphs.
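When the conversion is launched from a script instead of an interactive shell, the same two variables can be placed in the environment of the child process. The following is a minimal Python sketch, assuming that the caller supplies its own complete atc command line. The helper names `dump_env` and `run_with_dump` are hypothetical and are not part of any Ascend tool.

```python
import os
import subprocess


def dump_env(ge_graph="1", graph_level="1", base=None):
    """Return a copy of the environment with the two dump variables set."""
    env = dict(os.environ if base is None else base)
    env["DUMP_GE_GRAPH"] = str(ge_graph)
    env["DUMP_GRAPH_LEVEL"] = str(graph_level)
    return env


def run_with_dump(command, workdir="."):
    """Run `command` (a list of arguments, such as your atc invocation) with dumping on.

    The dump files are generated in the directory where the command is
    executed, so `workdir` is also where to look for them afterwards.
    """
    return subprocess.run(command, cwd=workdir, env=dump_env(), check=True)
```

Building the environment as a copy keeps the variables out of the parent process, so later commands in the same script are not affected.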
The following files are generated in the directory where the atc command is executed. For details about the environment variables, see 2.
- ge_onnx*.pbtxt: model description structure based on ONNX. You can open this file using visualizer software such as Netron.
- ge_proto*.txt: text file stored in Protobuf format. You can convert it into a JSON file to facilitate fault locating. This file is generated as a pair with the ge_onnx*.pbtxt file, but it has more string-type attributes than the ge_onnx*.pbtxt file, which makes it more comprehensive. You can open either of them.
Compared with the ge_onnx*.pbtxt file, the ge_proto*.txt file has a smaller size. Therefore, setting DUMP_GE_GRAPH to 2 or 3 has the same effect on the ge_proto*.txt file, that is, dumping information without data such as weights.
Each of the preceding files corresponds to a step in the model build process. For example, the build begins with the step dumped as ge_onnx_00000001_graph_0_PreRunBegin.pbtxt and ends with the step dumped as ge_onnx_00000078_graph_0_PreRunAfterBuild.pbtxt. Each file contains all operators involved in the corresponding step. For details about the subgraph functions in each phase of the dump graph, see Table 1. (The dump subgraphs may vary between models, but the workflow is basically the same.)
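Because the sequence number in each file name follows the build order, sorting on it reconstructs the workflow. The sketch below assumes that file names follow the pattern of the example above (`ge_onnx_<sequence>_graph_<id>_<step>.pbtxt`, and the same shape for ge_proto*.txt). The regular expression and the helper names are illustrative and are not taken from the tool, so check them against your own dump files.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from ge_onnx_00000001_graph_0_PreRunBegin.pbtxt.
DUMP_NAME = re.compile(
    r"^(?P<kind>ge_onnx|ge_proto)_(?P<seq>\d+)_graph_(?P<graph>\d+)_(?P<step>.+)"
    r"\.(?:pbtxt|txt)$"
)


def parse_dump_name(name):
    """Split a dump file name into (kind, sequence, graph id, step), or None."""
    match = DUMP_NAME.match(name)
    if match is None:
        return None
    return match["kind"], int(match["seq"]), int(match["graph"]), match["step"]


def steps_in_order(directory=".", kind="ge_proto"):
    """List (sequence number, step name) pairs of one file kind in build order."""
    parsed = (parse_dump_name(p.name) for p in Path(directory).iterdir())
    return sorted(
        (seq, step) for k, seq, _, step in filter(None, parsed) if k == kind
    )
```

Calling `steps_in_order(".")` in the directory where atc was executed gives a list of steps that can be read alongside Table 1.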
| Subgraph | Component | Description |
|---|---|---|
| ge_proto_xxxx_FlowGraphPreRunBegin.txt | GE | Graph before FlowModelBuild |
| ge_proto_xxxx_AfterFlowGraphPartition.txt | GE | Graph after flow partitioning (flow partitioning: a partitioning method used in DataFlow) |
| ge_proto_xxxx_AfterParallelPartitioner.txt | GE | Graph after pipeline parallel partitioning (The "pipeline" refers to the PP in the backend inference scenario. Currently, it is not supported in commercial scenarios.) |
| ge_proto_xxxx_PreRunBegin.txt | GE | Graph structure obtained after custom optimization. |
| ge_proto_xxxx_RunCustomPassBegin.txt | GE | Custom pass input graph |
| ge_proto_xxxx_RunCustomPassEnd.txt | GE | Custom pass output graph |
| ge_proto_xxxx_PreRunAfterInitPreparation.txt | FE | Graph structure obtained after all initialization in the graph preparation phase. |
| ge_proto_xxxx_PrepareAfterCheckAndUpdateInput.txt | GE | Graph structure obtained after the graph input is checked and updated. |
| ge_proto_xxxx_PrepareAfterPropagateFormatIfNeed.txt | GE | Graph structure obtained after format inference on a single operator. |
| ge_proto_xxxx_OptimizeGraph_TagNoConstFoldingAfter.txt | FE | Used for quantization scenarios. The FE adds a tag for an operator to indicate no constant folding. When the GE checks that the tag exists, it does not perform constant folding. |
| ge_proto_xxxx_PreRunAfterOptimizeGraphPrepare.txt | GE | Graph structure obtained after the original graphs in the operator information library are prepared (by calling the OptimizeGraphPrepare API). |
| ge_proto_xxxx_PreRunAfterHandleSummaryOp.txt | GE | Graph structure obtained after Summary node processing. |
| ge_proto_xxxx_PrepareAfterGraphEquivalentTransformation.txt | GE | Graph structure obtained after a for loop is equivalently transformed into a while loop. |
| ge_proto_xxxx_PrepareAfterProcessOutput.txt | GE | Graph structure obtained after graph data processing. |
| ge_proto_xxxx_PrepareAfterOptimizeAfterGraphNormalization.txt | GE | Output graph of graph optimization after graph normalization |
| ge_proto_xxxx_PrepareAfterProcessMultiBatch.txt | GE | Graph structure obtained after processing in the presence of dynamic batch size profiles. |
| ge_proto_xxxx_PrepareAfterInsertAipp.txt | GE | Graph structure obtained after AIPP processing. |
| ge_proto_xxxx_PrepareAfterProcessAippNodesDataFormat.txt | GE | Output graph of the AIPP node with format updated |
| ge_proto_xxxx_PreRunAfterNormalizeGraph.txt | GE | Output graph of graph normalization |
| ge_proto_xxxx_PreRunAfterOptimizeGraphInit.txt | GE | Output graph of graph optimization after graph initialization |
| ge_proto_xxxx_PrepareAfterProcessBeforeInfershape.txt | GE | Graph structure obtained after "dead edges" of the conditional operator are eliminated. |
| ge_proto_xxxx_after_first_inferformat.txt | GE | Graph structure obtained after format inference on the entire graph. |
| ge_proto_xxxx_after_infershape.txt | GE | Graph structure obtained after shape inference on the entire graph, with constant folding. |
| ge_proto_xxxx_PrepareAfterInferFormatAndShape.txt | GE | Graph structure obtained after format and shape inference. This graph has undergone the second format inference, compared with the preceding graph. |
| ge_proto_xxxx_PrepareAfterCtrlFlowPreProcess.txt | GE | Graph structure obtained after the conditional operator is preprocessed. |
| ge_proto_xxxx_PrepareAfterGetDynamicOutputShape.txt | GE | Graph structure obtained after graph output processing in the presence of dynamic batch size profiles. |
| ge_proto_xxxx_PrepareAfterProcessAippStage2.txt | GE | Graph structure obtained after graph input processing in AIPP mode. |
| ge_proto_xxxx_PrepareAfterPrepareOptimize.txt | GE | Graph structure obtained after optimization in the graph preparation phase. |
| ge_proto_xxxx_PreRunAfterPrepare.txt | GE | Graph structure obtained after graph preparation, the same as the preceding graph. |
| ge_proto_xxxx_OptimizeQuantGraph_FeGraphFusionAfter.txt | FE | Graph structure obtained after quantization in the graph optimization phase ends. |
| ge_proto_xxxx_OptimizeOriginalGraph_FeGraphFusionAfter.txt | FE | Graph structure obtained after graph fusion ends. |
| ge_proto_xxxx_OptimizeOriginalGraph_FeTopoSortingAfter.txt | FE | Graph structure obtained after graph fusion and topology sorting, used to check whether a ring (cycle) is formed. |
| ge_proto_xxxx_PreRunAfterOptimizeOriginalGraph.txt | GE | Graph structure obtained after the original graphs in the operator information library are optimized (by calling the OptimizeOriginalGraph API). |
| ge_proto_xxxx_PrepareAfterUpdateInputOutputByUserOptions.txt | GE | Graph structure obtained after the graph input and output are processed based on the user's command-line options. |
| ge_proto_xxxx_PrepareAfterUpdateVariableFormats.txt | GE | Graph structure obtained after the variable formats are processed. |
| ge_proto_xxxx_PreRunAfterPrepareRunningFormatRefiner.txt | GE | Same as the preceding graph. |
| ge_proto_xxxx_BeforeOptimizeOriginalGraphJudgeInsert.txt | FE | Input graph of the op_judge process |
| ge_proto_xxxx_OptimizeOriginalGraph_FeOpJudgeAfter.txt | FE | Graph structure obtained after the operator judging process. |
| ge_proto_xxxx_OptimizeOriginalGraph_FeDistHeavyFormatAfter.txt | FE | Graph structure obtained after diffusion of heavy operators. |
| ge_proto_xxxx_OptimizeOriginalGraph_FeInsertTransNodeAfter.txt | FE | Graph structure obtained after the transform operator is inserted. |
| ge_proto_xxxx_PreRunAfterRefineRunningFormat.txt | GE | Graph structure obtained after each operator information library is optimized (by calling the OptimizeOriginalGraphJudgeInsert API). |
| ge_proto_xxxx_PreRunAfterSubexpressionMigration.txt | GE | Graph structure after the common subexpression is extracted in the scenario of dynamic dimension size profiles. |
| ge_proto_xxxx_before_SameTransdataBreadthFusionPass.txt | GE | Input graph of SameTransdataBreadthFusionPass |
| ge_proto_xxxx_after_SameTransdataBreadthFusionPass.txt | GE | Output graph of SameTransdataBreadthFusionPass |
| ge_proto_xxxx_OptimizeStage1_1.txt | GE | Graph structure obtained after graph optimization stage 1_1. |
| ge_proto_xxxx_OptimizeStage1_2.txt | GE | Graph structure obtained after graph optimization stage 1_2. |
| ge_proto_xxxx_PreRunAfterOptimize1.txt | GE | Graph structure obtained after optimization stage 1 for all graphs. |
| ge_proto_xxxx_PreRunAfterOptimizeAfterStage1.txt | GE | Graph structure obtained after each operator information library is optimized (by calling the OptimizeAfterStage1 API). |
| ge_proto_xxxx_PreRunAfterInferShape2.txt | GE | Graph structure obtained after the second shape inference. |
| ge_proto_xxxx_AfterPipelinePartition.txt | GE | Graph structure obtained after graph partitioning for the local pipeline, which is used in the helper scenario. |
| ge_proto_xxxx_BeforeStagePartition.txt | GE | Graph before stage partitioning (Stage partitioning is mainly used in the early embedding scenario. It is uncertain whether it is still used.) |
| ge_proto_xxxx_AfterStagePartition.txt | GE | Graph after stage partitioning |
| ge_proto_xxxx_AfterEnginePlacer.txt | GE | Graph after engine selection |
| ge_proto_xxxx_Before_DSP.txt | GE | Graph before dynamic and static model splitting |
| ge_proto_xxxx_After_DSP.txt | GE | Graph after dynamic and static model splitting |
| ge_proto_xxxx_AfterDynamicShapePartition.txt | GE | Graph structure obtained after graph partitioning with a dynamic shape. |
| ge_proto_xxxx_MergedComputeGraphAfterCompositeEnginePartition.txt | GE | Structure of the merged graph obtained after composite-engine subgraph partitioning and subgraph optimization. |
| ge_proto_xxxx_partition0_rank0_inputNodeGraph_AtomicEnginePartitioning.txt | GE | Structure of the input node subgraph obtained after graph partitioning based on the atomic engine rules. |
| ge_proto_xxxx_partition0_rank1_new_sub_graph1_AtomicEnginePartitioning.txt | GE | Structure of subgraph 1 obtained after graph partitioning based on the atomic engine rules. |
| ge_proto_xxxx_partition0_rank2_new_sub_graph110_AtomicEnginePartitioning.txt | GE | Structure of subgraph 110 obtained after graph partitioning based on the atomic engine rules. |
| ge_proto_xxxx_OptimizeSubgraphPreProc.txt | GE | Output graph of subgraph preprocessing optimization |
| ge_proto_xxxx_DNN_VM_RTS_OptimizeSubGraphBefore.txt | RTS | - |
| ge_proto_xxxx_DNN_VM_RTS_OptimizeSubGraphAfter.txt | RTS | - |
| ge_proto_xxxx_AIcoreEngine_OptimizeSubGraphBefore.txt | FE | - |
| ge_proto_xxxx_OptimizeSubGraphBefore.txt | GE | Subgraph structure obtained before optimization. Subgraphs have the same name but different sequence numbers, depending on the number of subgraphs. |
| ge_proto_xxxx_OptimizeSubGraphAfter.txt | GE | Subgraph structure obtained after optimization. Subgraphs have the same name but different sequence numbers, depending on the number of subgraphs. |
| ge_proto_xxxx_partition0_rank1_new_sub_graph1_lxfusion_input.txt | AOE | SGAT input graph in the ATC and AOE baseline scenarios. |
| ge_proto_xxxx_partition0_rank1_new_sub_graph1_after_rebuild.txt | AOE | UB fusion graph of the AOE SGAT internal process. |
| ge_proto_xxxx_AIcoreEngine_OptimizeSubGraphAfter.txt | FE | - |
| ge_proto_xxxx_OptimizeSubgraphPostProc.txt | GE | Output graph of subgraph postprocessing optimization |
| ge_proto_xxxx_mergedComputeGraph.txt | GE | Structure of the merged graph, the same as the preceding graph. |
| ge_proto_xxxx_MergedComputeGraphAfterAtomicEnginePartition.txt | GE | Structure of the merged graph obtained after atomic-engine subgraph partitioning and subgraph optimization. |
| ge_proto_xxxx_PreRunAfterOptimizeSubgraph.txt | GE | Subgraph structure obtained after optimization. |
| ge_proto_xxxx_OptimizeWholeGraphaicpu_tf_optimizer.txt | GE | Graph information obtained after the original graph optimization API of each engine is called. OptimizeWholeGraph is followed by the engine name. |
| ge_proto_xxxx_OptimizeWholeGraphaicpu_ascend_optimizer.txt | GE | Graph information obtained after the original graph optimization API of each engine is called. OptimizeWholeGraph is followed by the engine name. |
| ge_proto_xxxx_OptimizeWholeGraphdvpp_graph_optimizer.txt | GE | Output graph after DVPP entire graph optimization |
| ge_proto_xxxx_OptimizeWholeGraphAIcoreEngine.txt | GE | Graph information obtained after the original graph optimization API of each engine is called. OptimizeWholeGraph is followed by the engine name. |
| ge_proto_xxxx_OptimizeWholeGraphDNN_VM_RTS_GRAPH_OPTIMIZER_STORE.txt | GE | Graph information obtained after the original graph optimization API of each engine is called. OptimizeWholeGraph is followed by the engine name. |
| ge_proto_xxxx_OptimizeWholeGraphDNN_VM_HOST_CPU_OPTIMIZER.txt | GE | Graph information obtained after the original graph optimization API of each engine is called. OptimizeWholeGraph is followed by the engine name. |
| ge_proto_xxxx_PreRunAfterOptimizeWholeGraph.txt | GE | Graph structure obtained after each operator information library is optimized (by calling the OptimizeWholeGraph API). |
| ge_proto_xxxx_BeforeHandleMemConflict.txt | GE | Graph before the memory conflict is handled |
| ge_proto_xxxx_BeforeHandleMemoryLayoutConflict.txt | GE | Input graph of memory allocation conflict handling |
| ge_proto_xxxx_PreRunAfterMemConflictProc.txt | GE | Output graph of memory allocation conflict handling |
| ge_proto_xxxx_PreRunAfterOptimize2.txt | GE | Graph structure obtained after optimization stage 2 for all graphs. |
| ge_proto_xxxx_PreRunAfterOptimizeGraphBeforeBuild.txt | GE | Graph structure obtained after each operator information library is optimized (by calling the OptimizeGraphBeforeBuild API). |
| ge_proto_xxxx_partition0_rank0_inputNodeGraph_SecondPartitioning.txt | GE | Structure of the input node graph obtained after second partitioning. |
| ge_proto_xxxx_partition0_rank1_new_sub_graph1_SecondPartitioning.txt | GE | Structure of subgraph 1 obtained after second partitioning. |
| ge_proto_xxxx_partition0_rank2_new_sub_graph110_SecondPartitioning.txt | GE | Structure of subgraph 110 obtained after second partitioning. |
| ge_proto_xxxx_BeforePreBuildModel.txt | GE | Graph structure obtained after second graph splitting and before graph building. |
| ge_proto_xxxx_AfterPreBuildModel.txt | GE | Graph structure obtained after pre-building. |
| ge_proto_xxxx_AfterCalcOpParam.txt | GE | Graph structure obtained after the tensor sizes of all nodes in the graph are computed. |
| ge_proto_xxxx_BeforeAssignedLogicalStreams.txt | GE | Graph structure obtained before logical streams are assigned. |
| ge_proto_xxxx_AfterAssignedLogicalStreams.txt | GE | Graph structure obtained after logical streams are assigned. |
| ge_proto_xxxx_BeforeRefreshRealStream.txt | GE | Graph structure obtained before the stream sync activation relationship is processed. This graph has undergone memory allocation, compared with the preceding graph. |
| ge_proto_xxxx_AfterRefreshRealStream.txt | GE | Graph structure obtained after the stream sync activation relationship is processed. |
| ge_proto_xxxx_AfterBuildModel.txt | GE | Graph structure obtained after weight combination and generation of basic model data. |
| ge_proto_xxxx_AfterOptimizeStreamedSubGraph.txt | GE | Graph structure obtained after the stream allocation result is optimized. |
| ge_proto_xxxx_GenerateTaskBefore.txt | GE | Graph structure obtained before a task is generated on the node. |
| ge_proto_xxxx_GenerateTaskAfter.txt | GE | Graph structure obtained after a task is generated on the node. The GenerateTask API of each operator information library is called. |
| ge_proto_xxxx_AfterGetTask.txt | GE | Graph structure obtained after all tasks are generated on the node, the same as the preceding graph. |
| ge_proto_xxxx_Build.txt | GE | Graph structure obtained after graph building. |
| ge_proto_xxxx_PreRunAfterBuild.txt | GE | Same as the preceding graph. |
| ge_proto_xxxx_BeforeAttrsCompress.txt | GE | Graph before OM attribute compression |
| ge_proto_xxxx_AfterAttrsCompress.txt | GE | Graph after OM attribute compression |
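
One way to use the step sequence in Table 1 for fault locating is to find the step at which a node disappears or first appears, for example after a fusion pass. The sketch below runs a plain text search for a quoted node name across the ge_proto*.txt dumps in sequence order. It assumes that the sequence number is the first run of digits in the file name and that node names appear in double quotes in the text dump. Verify both assumptions against your own files. The function names are hypothetical.

```python
import re
from pathlib import Path


def sequence_number(path):
    """First run of digits in the file name, assumed to be the dump sequence number."""
    match = re.search(r"\d+", path.name)
    return int(match.group()) if match else -1


def trace_node(directory, node_name, pattern="ge_proto*.txt"):
    """Yield (file name, present) for each dump, in sequence order."""
    needle = f'"{node_name}"'
    for path in sorted(Path(directory).glob(pattern), key=sequence_number):
        text = path.read_text(encoding="utf-8", errors="replace")
        yield path.name, needle in text


def first_change(directory, node_name):
    """Return the first dump file in which the node's presence flips, or None."""
    previous = None
    for name, present in trace_node(directory, node_name):
        if previous is not None and present != previous:
            return name
        previous = present
    return None
```

The step name in the returned file name can then be looked up in Table 1 to see which phase changed the node. Dumping without weight data (DUMP_GE_GRAPH set to 2 or 3) keeps the files small enough for this kind of scan.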