ConvWeightCompressFusionPass

Description

For Cube operators, this pass compresses the filter (weight) by inserting a compression operator. The supported Cube operators are Conv2D, FullyConnection, and MatMulV2.

[Figure: graph structure before and after fusion]

Restrictions

The first node (Conv2D/FullyConnection/MatMulV2) must meet the following conditions:

  • The input data type must be int8 or uint8.
  • The node must be supported by the AI Core.
  • The value of groups must not be greater than 1.
  • Weight compression must be supported.
  • The filter input must come from an operator on the trustlist: GroupPadding, ConvBnFilterHost, ConvScaleFilterHost, Concatv2HostCpuOp, RequantHostCpuOp, QuantWeightRollBack, GatherV2, GatherV2D, SwapCo, ReverseV2D, ConcatV2, TransData, Cast, Reshape, TransposeD, ReFormat, SqueezeV2, UnsqueezeV2, Maximum, Add, Mul, Sub, and AscendWeightQuant.
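
The conditions above can be read as a single eligibility check. The following Python sketch is illustrative only and is not the pass's actual implementation: the CubeNode class, its field names, and the can_compress_weight function are hypothetical, and only the operator names and the trustlist entries come from this page.

```python
from dataclasses import dataclass

# Cube operator types the pass applies to (from the Description section).
CUBE_OPS = {"Conv2D", "FullyConnection", "MatMulV2"}

# Operator types allowed to produce the filter input (from the Restrictions list).
FILTER_TRUSTLIST = {
    "GroupPadding", "ConvBnFilterHost", "ConvScaleFilterHost",
    "Concatv2HostCpuOp", "RequantHostCpuOp", "QuantWeightRollBack",
    "GatherV2", "GatherV2D", "SwapCo", "ReverseV2D", "ConcatV2",
    "TransData", "Cast", "Reshape", "TransposeD", "ReFormat",
    "SqueezeV2", "UnsqueezeV2", "Maximum", "Add", "Mul", "Sub",
    "AscendWeightQuant",
}


@dataclass
class CubeNode:
    """Hypothetical stand-in for a graph node; not a real framework type."""
    op_type: str                    # e.g. "Conv2D"
    input_dtype: str                # e.g. "int8"
    aicore_supported: bool          # node can run on the AI Core
    groups: int                     # value of the groups attribute
    weight_compress_supported: bool
    filter_producer_type: str       # type of the operator feeding the filter


def can_compress_weight(node: CubeNode) -> bool:
    """Return True only if every listed restriction holds for the node."""
    return (
        node.op_type in CUBE_OPS
        and node.input_dtype in {"int8", "uint8"}
        and node.aicore_supported
        and node.groups <= 1
        and node.weight_compress_supported
        and node.filter_producer_type in FILTER_TRUSTLIST
    )
```

In this sketch, a quantized Conv2D whose filter is produced by AscendWeightQuant passes the check, while the same node with groups set to 2, or with a filter produced by an operator outside the trustlist, does not.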