昇腾社区首页
中文
注册

算子融合规则配置功能(optimization_switch)

功能简介

算子编译时,可根据实际业务需要灵活地配置算子融合,以降低网络推理时间、提高整网性能。

本功能是算子融合规则的控制开关,与算子融合规则配置功能(fusion_switch_file)类似,差异如下:

config.fusion_config.fusion_switch_file仅能关闭图融合和UB融合的规则,并且需要单独配置Json文件;而本功能适用于所有融合规则的指定,不需要再单独设置Json文件。如果两个功能都配置,且配置了同一个融合规则,则以本功能配置为准。

使用方法

该功能通过torchair.get_npu_backend中compiler_config配置,示例如下,仅供参考不支持直接拷贝运行,参数介绍参见表1

1
2
3
4
5
6
import torch_npu, torchair
config = torchair.CompilerConfig()
# 算子融合规则配置开关
config.ge_config.optimization_switch = "Passname1:on;Passname2:off"
npu_backend = torchair.get_npu_backend(compiler_config=config)
opt_model = torch.compile(model, backend=npu_backend) 
表1 参数说明

参数名

说明

optimization_switch

算子编译时,融合规则的控制开关。取值格式为key-value键值对,形如"Passname1:on;Passname2:off",key为Pass名称,value为on(表示开)或off(表示关),不支持大小写模式匹配,多组配置使用英文分号分隔。可配置的融合规则请参见融合规则列表

融合规则列表

注意如下融合规则关闭后,可能会对功能使用有影响,请谨慎操作。更多融合规则的介绍请参见CANN 图融合和UB融合规则参考

  • AABiasaddConvFusion
  • AddNFusionPass
  • AddRmsNormFusionGraphPass
  • ADeformableConv2dPass
  • ADepthwiseFusionPass
  • ALSTMFusionPass
  • ApplyAddOutputPass
  • ApplyAddOutputPass
  • AReduceMeanFusionPass
  • AReduceSumFusionPass
  • ArgMaxWithFusionPass
  • AvgPool3DFusionPass
  • AvgPool3DGradFusionPass
  • AvgPoolGradFusionPass
  • AvgPoolQuantProcessFusionPass
  • BatchMatMulFusionPass
  • BatchMatmulV2QuantProcessFusionPass
  • BatchNormBnInferFusionPass
  • BatchNormGradBnInferGradFusion
  • BatchNormGradInfGradFusion
  • CastRemoveFusionPass
  • clip_by_norm_nodivsquaresum
  • CommonLSTMFusionPass
  • CommonSubexpressionEliminationPass:GE公共表达式消除Pass
  • ConstToAttrGatherV2Fusion
  • ConstToAttrPass
  • ConstToAttrPass
  • ConstToAttrReduceSumFusion
  • ConstToAttrResizeNearestNeighborGradFusion
  • ConstToAttrStridedSliceV2Fusion
  • Conv2DQuantProcessFusionPass
  • Conv2DTDQuantProcessFusionPass
  • COPYPass
  • DeConvQuantProcessFusionPass
  • DeformableOffsetsFusionPass
  • DepthwiseDfFusionPass
  • DepthwiseDwMulFusionPass
  • DepthwiseFusionPass
  • DepthwiseInsertTransDataFusionPass
  • DepthwiseToConv2dFusionPass
  • DreluFusionPass
  • DWConv2DQuantProcessFusionPass
  • DynamicGRUV2GradFusionPass
  • DynamicRNNFusionPass
  • DynamicRNNGradAFusionPass
  • DynamicRNNGradAlignFusionPass
  • DynamicRNNGradDAlignFusionPass
  • DynamicRNNGradDFusionPass
  • DynamicRNNGradFusionPass
  • DynamicRNNInsertTransposePass
  • DynamicRNNInsertTransposePass
  • DynamicRNNSeqFusionPass
  • EinsumPass
  • FCQuantProcessFusionPass
  • FixPipeAbilityProcessPass
  • FlattenV2Pass
  • FusedBatchNormBertFusionPass
  • FusedBatchnormFusionPass
  • FusedBatchNormGradFusionPass
  • FusedBatchNormGradFusionPass
  • Globalavgpoolpass
  • GroupConv2DQuantProcessFusionPass
  • HostBNFusionPass
  • HostShapeOptimizationPass:AI CPU断流水pass
  • InplaceAddRmsNormFusionPass
  • MapIndexFusionPass
  • MatMul2MatMulV2FusionPass
  • MatmulV2QuantProcessFusionPass
  • MaxPoolWithArgmaxFusionPass
  • NormalizeFusionPass
  • PackFusionPass
  • PassThroughFusionPass
  • PassThroughSecondFusionPass
  • PermuteFusionPass
  • PoolingQuantProcessFusionPass
  • PReluGradFusionPass
  • PriorBoxPass
  • ProposalFusionPass
  • RNNFusionPass
  • sedBatchNormGradFusionPass
  • SingleBatchNormFusion
  • SoftmaxGradExtFusion
  • SpatialTransformerDPass
  • SPPPass
  • TbeAntiquantMaxpoolingFusionPass
  • TbeConvCommonRules0FusionPass
  • TbeConvCommonRules2FusionPass
  • TbeConvDequantS16FusionPass
  • TbeConvRequantFusionPass
  • TfMergeConv2DBackpropInputFusionPass
  • TfMergeSubFusionPass
  • TfTagNoConstFoldingFusionPass
  • TransdataCastFusionPass
  • TransposedUpdateFusionPass
  • UnpackFusionPass
  • V100NotRequantFusionPass
  • V200NotRequantFusionPass
  • WeightQuantBatchMatmulV2TransposeFusionPass
  • YoloPass
  • YoloV2DetectionOutputPass
  • YoloV3DetectionOutputV2Pass
  • ZConcatDFusionPass
  • ZConcatExt2FusionPass
  • ZConfusionSoftmaxGradFusionPass
  • ZSplitVDFusionPass
  • ZSplitVFusionPass