The tool outputs a list of operator combinations that could potentially be fused, as shown in Figure 1.
====UBModel====
UB fusion operators need to be optimized
Operators that require UB fusion.
Identifications of UB fusion operators: the following operators can be used for UB fusion:
UB fusion operator identification: the following operators can be UB-fused:
List of operators that can be fused in subgraph 0
List of operators that can be fused in the subgraph.
Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block2a_dwconv/depthwiseblock2a_activation/Sigmoidblock2a_activation/mul, block2a_se_excite/mul; Fusion Operator Duration: 336.248993;
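Fusion Type: the type of the fusible operators; Fusion Operator Detail: the fusible operators involved; Fusion Operator Duration: the total running time of the fusible operators.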
Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block2b_dwconv/depthwiseblock2b_activation/Sigmoidblock2b_activation/mul, block2b_se_excite/mul; Fusion Operator Duration: 284.063004;
Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block3b_dwconv/depthwiseblock3b_activation/Sigmoidblock3b_activation/mul, block3b_se_excite/mul; Fusion Operator Duration: 220.571999;
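The results are sorted by the total running time of the fusible operators in descending order, with entries of the same fusion type listed together.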
Optimization suggestion:
Fuse the fusible operators listed in the output.
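The record layout and sort order described above can be explored with a short script. The following is a minimal sketch, not part of the tool: it assumes each record sits on one line in exactly the `Fusion Type: ...; Fusion Operator Detail: ...; Fusion Operator Duration: ...;` layout shown in the sample, and it reads "sorted by duration, same type together" as "order the fusion types by their longest record, then order records of each type by duration". The names `FusionRecord`, `parse_line`, and `sort_records` are hypothetical helpers introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class FusionRecord(NamedTuple):
    fusion_type: str
    detail: str
    duration: float


# Assumed layout, taken from the sample UBModel output lines.
LINE_RE = re.compile(
    r"Fusion Type: (?P<type>[^;]+); "
    r"Fusion Operator Detail: (?P<detail>[^;]+); "
    r"Fusion Operator Duration: (?P<duration>[0-9.]+);"
)

# The three sample records from the output above, deliberately shuffled.
SAMPLE_LINES = [
    "Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block2b_dwconv/depthwiseblock2b_activation/Sigmoidblock2b_activation/mul, block2b_se_excite/mul; Fusion Operator Duration: 284.063004;",
    "Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block3b_dwconv/depthwiseblock3b_activation/Sigmoidblock3b_activation/mul, block3b_se_excite/mul; Fusion Operator Duration: 220.571999;",
    "Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block2a_dwconv/depthwiseblock2a_activation/Sigmoidblock2a_activation/mul, block2a_se_excite/mul; Fusion Operator Duration: 336.248993;",
]


def parse_line(line: str) -> Optional[FusionRecord]:
    """Return the record found in one output line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return FusionRecord(m["type"], m["detail"], float(m["duration"]))


def sort_records(records):
    """Group records by fusion type; longest-running type and record first."""
    longest = {}
    for r in records:
        longest[r.fusion_type] = max(longest.get(r.fusion_type, 0.0), r.duration)
    return sorted(
        records,
        key=lambda r: (-longest[r.fusion_type], r.fusion_type, -r.duration),
    )


if __name__ == "__main__":
    parsed = [rec for rec in map(parse_line, SAMPLE_LINES) if rec is not None]
    for rec in sort_records(parsed):
        print(f"{rec.duration:>12.6f}  {rec.fusion_type}  {rec.detail}")
```

Sorting this way puts the fusion candidates with the largest running time at the top, which is where fusing is most likely to pay off.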
The tool outputs a list of operator combinations that could potentially be fused, as shown in Figure 2.
====AippFusionModel====
Fuse Cast/TransData with Conv needs to be optimized
Operators that require AIPP first-layer operator fusion.
Identifications of AIPP fusion operators: the following operators can be used for AIPP fusion:
AIPP fusion operator identification: the following operators can be AIPP-fused:
List of operators that can be fused with aipp
List of operators that can be fused with AIPP.
1. trans_Cast_0+trans_TransData_1+stem_conv/convolutionstem_activation/Sigmoidstem_activation/mul
Optimization suggestion:
Fuse the fusible operators listed in the output.
====L2Model====
L2 fusion operators need to be optimized
L2 fusion operators need to be optimized.
Identifications of L2 fusion operators: the following operators can be used for L2 fusion:
L2 fusion operator identification: the following operators can be used for L2 fusion:
List of operators that can be fused in subgraph
List of operators that can be fused in the subgraph.
1. MobilenetV3/expanded_conv/project/Conv2DMobilenetV3/expanded_conv/add, MobilenetV3/expanded_conv_1/expand/Conv2DMobilenetV3/expanded_conv_1/expand/Relu, MobilenetV3/expanded_conv_1/depthwise/depthwiseMobilenetV3/expanded_conv_1/depthwise/Relu, MobilenetV3/expanded_conv_1/project/Conv2D
Op Info
Operator information.
Op Name: MobilenetV3/expanded_conv/project/Conv2DMobilenetV3/expanded_conv/add; OP Type: Conv2D; Input Shapes: 8,1,112,112,16;1,1,16,16;16;8,1,112,112,16; mac_ratio: 0.063851; vec_ratio: 0.635701; mte2_ratio: 0.980165; mte3_ratio: 0.547420; Hit Rate: 0.989344;
Op Name: MobilenetV3/expanded_conv_1/expand/Conv2DMobilenetV3/expanded_conv_1/expand/Relu; OP Type: Conv2D; Input Shapes: 8,1,112,112,16;1,4,16,16;64; mac_ratio: 0.066542; vec_ratio: 0.608959; mte2_ratio: 0.880165; mte3_ratio: 0.973353; Hit Rate: 0.982700;
Op Name: MobilenetV3/expanded_conv_1/depthwise/depthwiseMobilenetV3/expanded_conv_1/depthwise/Relu; OP Type: DepthwiseConv2D; Input Shapes: 8,4,112,112,16;4,3,3,1,16,16;64; mac_ratio: 0.175088; vec_ratio: 0.119661; mte2_ratio: 0.879285; mte3_ratio: 0.185707; Hit Rate: 0.930730;
Op Name: MobilenetV3/expanded_conv_1/project/Conv2D; OP Type: Conv2D; Input Shapes: 8,4,56,56,16;4,2,16,16;24; mac_ratio: 0.276653; vec_ratio: 0.427420; mte2_ratio: 0.877128; mte3_ratio: 0.668421; Hit Rate: 0.989344;
Recommadation
Optimization suggestions:
1. Open AOE.
Enable the AOE function.
For details about using AOE, see the AOE Tool User Guide (《AOE工具使用指南》).
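The Op Info lines in the L2Model output can be read programmatically to see which reported ratio is largest for each operator. The following is a minimal sketch under stated assumptions: it assumes every Op Info line follows the field order shown in the sample (`Op Name`, `OP Type`, `Input Shapes`, the `*_ratio` fields, `Hit Rate`), and `parse_op_info` is a hypothetical helper, not part of the tool. It only extracts and compares the printed numbers and does not interpret what each ratio measures.

```python
import re

# Two Op Info lines copied from the sample L2Model output above.
OP_INFO = [
    "Op Name: MobilenetV3/expanded_conv/project/Conv2DMobilenetV3/expanded_conv/add; OP Type: Conv2D; Input Shapes: 8,1,112,112,16;1,1,16,16;16;8,1,112,112,16; mac_ratio: 0.063851; vec_ratio: 0.635701; mte2_ratio: 0.980165; mte3_ratio: 0.547420; Hit Rate: 0.989344;",
    "Op Name: MobilenetV3/expanded_conv_1/expand/Conv2DMobilenetV3/expanded_conv_1/expand/Relu; OP Type: Conv2D; Input Shapes: 8,1,112,112,16;1,4,16,16;64; mac_ratio: 0.066542; vec_ratio: 0.608959; mte2_ratio: 0.880165; mte3_ratio: 0.973353; Hit Rate: 0.982700;",
]


def parse_op_info(line: str) -> dict:
    """Split one Op Info line into name, type, input shapes, ratios and hit rate."""
    name = re.search(r"Op Name: ([^;]+);", line).group(1)
    op_type = re.search(r"OP Type: ([^;]+);", line).group(1)
    # Shapes are separated by ';' and dimensions by ','; the shape list is
    # assumed to end right before the mac_ratio field.
    shapes_text = re.search(r"Input Shapes: (.*?); mac_ratio:", line).group(1)
    shapes = [tuple(int(d) for d in s.split(",")) for s in shapes_text.split(";")]
    ratios = {k: float(v) for k, v in re.findall(r"(\w+_ratio): ([0-9.]+)", line)}
    hit_rate = float(re.search(r"Hit Rate: ([0-9.]+);", line).group(1))
    return {
        "name": name,
        "type": op_type,
        "shapes": shapes,
        "ratios": ratios,
        "hit_rate": hit_rate,
        # Name of the ratio field with the largest reported value.
        "dominant": max(ratios, key=ratios.get),
    }


if __name__ == "__main__":
    for line in OP_INFO:
        info = parse_op_info(line)
        top = info["dominant"]
        print(f"{info['type']:<16} {top}={info['ratios'][top]:.6f}  {info['name']}")
```

A summary like this makes it easier to compare the operators in a fusion candidate before and after tuning with AOE.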