Fusion Support
AMCT currently supports the following forms of fusion. Each individual operator involved in a fusion form must meet the restrictions described in Quantization.
- Conv+BN fusion: Before AMCT quantization, "Conv+BN" fusion is performed on the "Conv2D+BatchNorm" composite in the model. The BatchNorm layer is removed.
- Depthwise_Conv+BN fusion: Before AMCT quantization, "Depthwise_Conv+BN" fusion is performed on the "DepthwiseConv2dNative+BatchNorm" composite in the model. The BatchNorm layer is removed.
- OP+(BiasAdd)+Mul fusion: Before AMCT quantization, "OP+(BiasAdd)+Mul" fusion is performed on the "Conv2D/MatMul/DepthwiseConv2dNative/Conv2DBackpropInput+Mul" and "Conv2D/MatMul/DepthwiseConv2dNative/Conv2DBackpropInput+BiasAdd+Mul" composites in the model. The Mul layer is removed.
In this scenario, the other input of Mul must be a Const with an empty shape, that is, a scalar.
- Group_conv+BN fusion: If the model uses the "Split+Multi-Conv2D+ConcatV2 (or Concat)" composite to represent Group_conv, the "Group_conv+BatchNorm" composite is fused before AMCT quantization. The BatchNorm layer is removed.
BN fusion applies to the following operators: FusedBatchNorm, FusedBatchNormV2, and FusedBatchNormV3.
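The BatchNorm and Mul layers can be removed because, at inference time, both are per-channel affine operations that can be absorbed into the weights and bias of the preceding operator. The following is a minimal sketch of that standard folding arithmetic in plain Python, not AMCT's implementation. It models one output pixel of a 1x1 convolution as a small matrix product, and the helper names (`linear`, `fold_bn`, `fold_scalar_mul`) and all numeric values are hypothetical.

```python
import math

def linear(x, weights, bias):
    """One output pixel of a 1x1 convolution: y[o] = sum_i w[o][i] * x[i] + b[o]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def batch_norm(y, gamma, beta, mean, var, eps):
    """Inference-mode batch normalization, applied per output channel."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]

def fold_bn(weights, bias, gamma, beta, mean, var, eps):
    """Absorb BatchNorm into the preceding weights and bias (assumed standard folding)."""
    folded_w, folded_b = [], []
    for row, b, g, bt, m, s in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(s + eps)            # per-channel multiplier
        folded_w.append([w * scale for w in row])
        folded_b.append((b - m) * scale + bt)
    return folded_w, folded_b

def fold_scalar_mul(weights, bias, scalar):
    """Absorb a Mul by a scalar constant into the preceding weights and bias."""
    return [[w * scalar for w in row] for row in weights], [b * scalar for b in bias]

# Illustrative values: 3 input channels, 2 output channels.
x = [0.5, -1.0, 2.0]
weights = [[0.2, -0.1, 0.4], [1.0, 0.3, -0.5]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.05, -0.3]
mean, var, eps = [0.4, -0.1], [0.9, 2.5], 1e-3

# Conv followed by BN versus a single conv with folded parameters.
bn_reference = batch_norm(linear(x, weights, bias), gamma, beta, mean, var, eps)
bn_fused = linear(x, *fold_bn(weights, bias, gamma, beta, mean, var, eps))

# Conv + BiasAdd + scalar Mul versus a single conv with scaled parameters.
scalar = 0.25
mul_reference = [scalar * v for v in linear(x, weights, bias)]
mul_fused = linear(x, *fold_scalar_mul(weights, bias, scalar))

print(bn_reference, bn_fused)
print(mul_reference, mul_fused)
```

In this sketch a scalar Mul scales every output channel by the same factor, so multiplying all weights and the bias by that factor gives the same result as the original composite.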
- Fusion of small BN operators into FusedBatchNormV3: This fusion applies only to post-training quantization (PTQ). Only the "Conv+BN" or "Conv+BiasAdd+BN" composite triggers it, and the small BN operators must take 4D inputs.
AMCT analyzes the composite of small BN operators generated by tf.keras.layers.BatchNormalization and replaces it with a single larger BN operator (FusedBatchNormV3) under the following conditions:
- tf.keras.layers.BatchNormalization is configured with fused=False and called on inputs with training=False. The network structures before and after fusion are as follows.
[Figure: network structures before and after fusion]
- tf.keras.layers.BatchNormalization is configured with fused=False and center=False and called on inputs with training=False. The network structures before and after fusion are as follows.
[Figure: network structures before and after fusion]
- tf.keras.layers.BatchNormalization is configured with fused=False and scale=False and called on inputs with training=False. The network structures before and after fusion are as follows.
[Figure: network structures before and after fusion]
- tf.keras.layers.BatchNormalization is configured with fused=False, scale=False, and center=False and called on inputs with training=False. The network structures before and after fusion are as follows.
[Figure: network structures before and after fusion]
- tf.keras.layers.BatchNormalization is configured with fused=False and called on inputs with training=False. The network structures before and after fusion are as follows.
[Figure: network structures before and after fusion]
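The scale and center variants above differ only in whether gamma and beta are present, so each chain of small operators computes the same function as one batch normalization with gamma=1 or beta=0 filled in. The sketch below illustrates this in plain Python for a single channel. It assumes the small operators follow the usual element-wise decomposition (add epsilon, reciprocal square root, multiply, subtract, add). The exact graph that AMCT matches is not given here, and the function names and values are hypothetical.

```python
import math

def small_op_bn(x, mean, var, eps, gamma=None, beta=None):
    """Batch norm written as a chain of element-wise steps (assumed decomposition)."""
    inv = 1.0 / math.sqrt(var + eps)   # add epsilon, then reciprocal square root
    if gamma is not None:              # present only when scale=True
        inv = inv * gamma
    shift = -mean * inv
    if beta is not None:               # present only when center=True
        shift = beta + shift
    return x * inv + shift             # final multiply and add

def single_bn(x, mean, var, eps, gamma=1.0, beta=0.0):
    """The same computation as one operator; missing gamma/beta default to 1 and 0."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta

x, mean, var, eps = 1.7, 0.4, 2.5, 1e-3

# The four scale/center combinations listed above.
cases = {
    "scale=True, center=True": {"gamma": 1.5, "beta": -0.3},
    "scale=True, center=False": {"gamma": 1.5},
    "scale=False, center=True": {"beta": -0.3},
    "scale=False, center=False": {},
}

differences = {}
for name, params in cases.items():
    small = small_op_bn(x, mean, var, eps, **params)
    single = single_bn(x, mean, var, eps, **params)
    differences[name] = abs(small - single)
    print(name, small, single)
```

Because the two forms agree up to floating-point rounding in every case, replacing the small operators with one BN operator does not change what the network computes, and the Conv+BN fusion described earlier can then apply.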