ConcatQuantFusionPass

Description

Fuses the ConcatD/ConcatV2D+Quant subgraph into a Quant+ConcatD/ConcatV2D subgraph, so that quantization runs before concatenation. This fusion pattern reduces the amount of data to be moved and improves computing performance.
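
The reordering is valid because quantization is elementwise: applying it before or after concatenation gives the same result as long as every branch uses the same parameters. The following is a minimal pure-Python sketch. It assumes a generic affine int8 quantization (round(x * scale + offset), clamped to [-128, 127]). The quant and concat helpers and the parameter values are illustrative only and are not the operator's actual implementation.

```python
def quant(values, scale, offset):
    """Illustrative affine int8 quantization (assumed formula, not the real Quant kernel)."""
    return [max(-128, min(127, round(v * scale + offset))) for v in values]


def concat(*tensors):
    """Concatenate flat lists; stands in for ConcatD/ConcatV2D along one axis."""
    out = []
    for t in tensors:
        out.extend(t)
    return out


x0 = [0.5, -1.25, 3.0]
x1 = [2.0, 100.0, -200.0]
scale, offset = 2.0, 1.0  # hypothetical quantization parameters shared by both branches

# Original order: Concat followed by a single Quant.
before = quant(concat(x0, x1), scale, offset)

# Fused order: one Quant per branch, followed by Concat.
after = concat(quant(x0, scale, offset), quant(x1, scale, offset))

print(before == after)
```

In the fused order, Concat operates on the already quantized data, which is presumably narrower than the Float16/Float32 input and therefore cheaper to move.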

[Figures not reproduced: three alternative patterns, each shown as a Before and an After subgraph.]

Restrictions

  • In the scenario shown in Figure 1, the values of Quant0 and Quant1 must be the same.
  • Disable this fusion pattern when performing data comparison.
  • This fusion pattern is not supported when the data type of the current Quant output is int4.
  • The output node of Concat cannot be a StridedWrite operator.
  • When FixPipe is supported, ReLU can be LeakyRelu, Prelu, Relu6, or Relu.
  • When the Concat input format is NCHW and concat_dim_ is 1 or -3, or when the Concat input format is NHWC and concat_dim_ is 3 or -1, the merge axis is the C axis. In this case, the value of the C axis must be an integer multiple of K0, where K0 depends on the data type:
    K0=16 for Float16 or Float32 (the default), K0=32 for int8, and K0=64 for int4. This constraint applies to the following chip types:
    • Atlas 200/300/500 Inference Product
    • Atlas Training Series Product
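
The C-axis alignment rule above can be sketched as a small check. This is a hedged illustration only: K0_BY_DTYPE, is_c_axis_concat, and c_axis_aligned are hypothetical names, the data type strings are assumed spellings, and the shape is assumed to be the 4-D shape of a Concat input. None of these are identifiers taken from the pass itself.

```python
# K0 per data type, as listed in the restriction above.
K0_BY_DTYPE = {"float16": 16, "float32": 16, "int8": 32, "int4": 64}

# concat_dim_ values that select the C axis for each supported format.
C_AXIS_DIMS = {"NCHW": (1, -3), "NHWC": (3, -1)}


def is_c_axis_concat(fmt, concat_dim):
    """Return True if concat_dim selects the C axis for the given format."""
    return concat_dim in C_AXIS_DIMS.get(fmt, ())


def c_axis_aligned(fmt, shape, concat_dim, dtype):
    """Return True if the C-axis rule is satisfied or does not apply."""
    if not is_c_axis_concat(fmt, concat_dim):
        return True  # merge axis is not C, so the rule does not apply
    c_value = shape[fmt.index("C")]
    return c_value % K0_BY_DTYPE[dtype] == 0


print(c_axis_aligned("NCHW", (1, 32, 8, 8), 1, "int8"))
print(c_axis_aligned("NHWC", (1, 8, 8, 48), -1, "int8"))
```

The first call passes because 32 is a multiple of K0=32. The second fails because 48 is not.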