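Operator Fusion Recommendations

UB Operator Fusion

The tool outputs a list of operator combinations that could potentially be fused, as shown in Figure 1.

Figure 1 UBModel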

====UBModel====

UB fusion operators need to be optimized

Indicates that there are operators that should undergo UB fusion.

Identifications of UB fusion operators: the following operators can be used for UB fusion:

Identifies the operators eligible for UB fusion, which are listed after this line.

List of operators that can be fused in subgraph 0

The list of operators that can be fused within the subgraph.

Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block2a_dwconv/depthwiseblock2a_activation/Sigmoidblock2a_activation/mul, block2a_se_excite/mul; Fusion Operator Duration: 336.248993;

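Fusion Type: the type of the fusible operators; Fusion Operator Detail: the names of the fusible operators; Fusion Operator Duration: the total running time of the fusible operators.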

Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block2b_dwconv/depthwiseblock2b_activation/Sigmoidblock2b_activation/mul, block2b_se_excite/mul; Fusion Operator Duration: 284.063004;

Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block3b_dwconv/depthwiseblock3b_activation/Sigmoidblock3b_activation/mul, block3b_se_excite/mul; Fusion Operator Duration: 220.571999;

The results are sorted by the total running time of the fusible operators in descending order, and by fusion type.
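To post-process this output, the entries can be parsed and re-sorted in a script. The following is a minimal sketch, assuming each entry follows the exact `Fusion Type: ...; Fusion Operator Detail: ...; Fusion Operator Duration: ...;` layout shown in Figure 1. The names `FusionRecord`, `parse_fusion_lines`, and `sort_records` are hypothetical helpers, not part of the tool, and treating duration as the primary sort key with fusion type as the tie-breaker is one reading of the ordering rule.

```python
import re
from typing import List, NamedTuple


class FusionRecord(NamedTuple):
    fusion_type: str
    detail: str
    duration: float


# Assumed line layout, copied from the sample output in Figure 1.
_ENTRY = re.compile(
    r"Fusion Type:\s*(?P<type>[^;]+);\s*"
    r"Fusion Operator Detail:\s*(?P<detail>[^;]+);\s*"
    r"Fusion Operator Duration:\s*(?P<duration>[0-9.]+);"
)


def parse_fusion_lines(text: str) -> List[FusionRecord]:
    """Extract every fusion entry found in the given tool output."""
    return [
        FusionRecord(m["type"].strip(), m["detail"].strip(), float(m["duration"]))
        for m in _ENTRY.finditer(text)
    ]


def sort_records(records: List[FusionRecord]) -> List[FusionRecord]:
    """Longest total duration first; equal durations are ordered by fusion type."""
    return sorted(records, key=lambda r: (-r.duration, r.fusion_type))


# The three entries from Figure 1, deliberately shuffled.
SAMPLE = """
Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block3b_dwconv/depthwiseblock3b_activation/Sigmoidblock3b_activation/mul, block3b_se_excite/mul; Fusion Operator Duration: 220.571999;
Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block2a_dwconv/depthwiseblock2a_activation/Sigmoidblock2a_activation/mul, block2a_se_excite/mul; Fusion Operator Duration: 336.248993;
Fusion Type: DepthwiseConv2D+Mul; Fusion Operator Detail: block2b_dwconv/depthwiseblock2b_activation/Sigmoidblock2b_activation/mul, block2b_se_excite/mul; Fusion Operator Duration: 284.063004;
"""

ranked = sort_records(parse_fusion_lines(SAMPLE))
for record in ranked:
    print(f"{record.duration:>12.6f}  {record.fusion_type}")
```

Summing `record.duration` over the parsed entries gives a rough upper bound on the time spent in operators that are candidates for fusion.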

Optimization Suggestions

Fuse the fusible operators as indicated by the output.

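First-Layer Operator Fusion

The tool outputs a list of operator combinations that could potentially be fused, as shown in Figure 2.

Figure 2 AippFusionModel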

====AippFusionModel====

Fuse Cast/TransData with Conv needs to be optimized

Indicates that there are operators that should undergo AIPP first-layer fusion.

Identifications of AIPP fusion operators: the following operators can be used for AIPP fusion:

Identifies the operators eligible for AIPP fusion, which are listed after this line.

List of operators that can be fused with aipp

The list of operators that can be fused with AIPP.

1. trans_Cast_0+trans_TransData_1+stem_conv/convolutionstem_activation/Sigmoidstem_activation/mul
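When scripting against this list, each entry can be split into its constituent operators. This is a minimal sketch, assuming that `+` separates the operators in an entry and never appears inside an operator name. `split_fusion_chain` is a hypothetical helper, not a function provided by the tool.

```python
import re
from typing import List


def split_fusion_chain(entry: str) -> List[str]:
    """Split one list entry into operator names, dropping a leading index such as '1. '."""
    body = re.sub(r"^\s*\d+\.\s*", "", entry.strip())
    return [name for name in body.split("+") if name]


# The entry shown in Figure 2.
ENTRY = "1. trans_Cast_0+trans_TransData_1+stem_conv/convolutionstem_activation/Sigmoidstem_activation/mul"

chain = split_fusion_chain(ENTRY)
for position, name in enumerate(chain):
    print(position, name)
```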

Optimization Suggestions

Fuse the fusible operators as indicated by the output.

L2 Fusion (Dynamic Batch Splitting)

====L2Model====

L2 fusion operators need to be optimized

Indicates that there are L2 fusion operators that need to be optimized.

Identifications of L2 fusion operators: the following operators can be used for L2 fusion:

Identifies the operators eligible for L2 fusion, which are listed after this line.

List of operators that can be fused in subgraph

The list of operators that can be fused within the subgraph.

1. MobilenetV3/expanded_conv/project/Conv2DMobilenetV3/expanded_conv/add, MobilenetV3/expanded_conv_1/expand/Conv2DMobilenetV3/expanded_conv_1/expand/Relu, MobilenetV3/expanded_conv_1/depthwise/depthwiseMobilenetV3/expanded_conv_1/depthwise/Relu, MobilenetV3/expanded_conv_1/project/Conv2D

Op Info

Operator information.

Op Name: MobilenetV3/expanded_conv/project/Conv2DMobilenetV3/expanded_conv/add; OP Type: Conv2D; Input Shapes: 8,1,112,112,16;1,1,16,16;16;8,1,112,112,16; mac_ratio: 0.063851; vec_ratio: 0.635701; mte2_ratio: 0.980165; mte3_ratio: 0.547420; Hit Rate: 0.989344;

Op Name: MobilenetV3/expanded_conv_1/expand/Conv2DMobilenetV3/expanded_conv_1/expand/Relu; OP Type: Conv2D; Input Shapes: 8,1,112,112,16;1,4,16,16;64; mac_ratio: 0.066542; vec_ratio: 0.608959; mte2_ratio: 0.880165; mte3_ratio: 0.973353; Hit Rate: 0.982700;

Op Name: MobilenetV3/expanded_conv_1/depthwise/depthwiseMobilenetV3/expanded_conv_1/depthwise/Relu; OP Type: DepthwiseConv2D; Input Shapes: 8,4,112,112,16;4,3,3,1,16,16;64; mac_ratio: 0.175088; vec_ratio: 0.119661; mte2_ratio: 0.879285; mte3_ratio: 0.185707; Hit Rate: 0.930730;

Op Name: MobilenetV3/expanded_conv_1/project/Conv2D; OP Type: Conv2D; Input Shapes: 8,4,56,56,16;4,2,16,16;24; mac_ratio: 0.276653; vec_ratio: 0.427420; mte2_ratio: 0.877128; mte3_ratio: 0.668421; Hit Rate: 0.989344;
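The Op Info lines are regular enough to be read into key/value pairs for further analysis. The sketch below assumes the layout shown above: fields are separated by a semicolon followed by a space, while the individual shapes inside Input Shapes are separated by a bare semicolon. `parse_op_info`, `parse_shapes`, and `dominant_ratio` are hypothetical helpers, and picking the largest of the four ratio fields is only an illustrative way to compare them, not a rule stated by the tool.

```python
from typing import Dict, List, Tuple

# The four ratio fields that appear in every Op Info line above.
RATIO_KEYS = ("mac_ratio", "vec_ratio", "mte2_ratio", "mte3_ratio")


def parse_op_info(line: str) -> Dict[str, str]:
    """Split one Op Info line into a field-name -> raw-value mapping."""
    fields = {}
    for part in line.strip().rstrip(";").split("; "):
        key, _, value = part.partition(": ")
        fields[key.strip()] = value.strip()
    return fields


def parse_shapes(shapes: str) -> List[List[int]]:
    """Turn '8,4,56,56,16;4,2,16,16;24' into a list of integer shapes."""
    return [[int(dim) for dim in shape.split(",")] for shape in shapes.split(";") if shape]


def dominant_ratio(fields: Dict[str, str]) -> Tuple[str, float]:
    """Return the name and value of the largest ratio field."""
    name = max(RATIO_KEYS, key=lambda key: float(fields[key]))
    return name, float(fields[name])


# The last Op Info line from the sample output.
LINE = (
    "Op Name: MobilenetV3/expanded_conv_1/project/Conv2D; OP Type: Conv2D; "
    "Input Shapes: 8,4,56,56,16;4,2,16,16;24; mac_ratio: 0.276653; "
    "vec_ratio: 0.427420; mte2_ratio: 0.877128; mte3_ratio: 0.668421; "
    "Hit Rate: 0.989344;"
)

info = parse_op_info(LINE)
print(info["Op Name"], parse_shapes(info["Input Shapes"]), dominant_ratio(info))
```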

Recommadation

Optimization suggestions.

1. Open AOE.

Enable the AOE function.

For details on using AOE, see the AOE Tool User Guide.