Fusion Functions and Graph Structure Optimization Implemented by the Tool

Fusion Functions

Currently, the following layer fusion types are supported:

  • Conv+BN fusion: Before quantization, AMCT fuses each "Conv+BatchNormalization" structure in the model. After fusion, the BatchNormalization layer is removed.
  • BN+Mul fusion: Before quantization, AMCT fuses each "BatchNormalization+Mul" structure in the model. After fusion, the Mul layer is removed.
  • BN+Add fusion: Before quantization, AMCT fuses each "BatchNormalization+Add" structure in the model. After fusion, the Add layer is removed.
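The arithmetic behind these three fusions can be sketched as follows. This is a minimal illustration, not AMCT's implementation. It assumes inference-mode BatchNormalization with per-channel parameters, an epsilon of 1e-5, and constant per-channel Mul and Add operands. A per-pixel linear map stands in for a 1x1 convolution, and all function names are hypothetical.

```python
import math

EPS = 1e-5  # assumed BatchNormalization epsilon


def linear(x, weight, bias):
    """Stand-in for a 1x1 convolution at one pixel: y[c] = sum_k W[c][k] * x[k] + b[c]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=EPS):
    """Inference-mode BatchNormalization applied per channel."""
    return [g * (v - m) / math.sqrt(s + eps) + b
            for v, g, b, m, s in zip(y, gamma, beta, mean, var)]


def fold_bn_into_conv(weight, bias, gamma, beta, mean, var, eps=EPS):
    """Return (weight', bias') so that linear(x, weight', bias') matches BN(linear(x, weight, bias))."""
    scale = [g / math.sqrt(s + eps) for g, s in zip(gamma, var)]
    new_weight = [[w * sc for w in row] for row, sc in zip(weight, scale)]
    new_bias = [(b - m) * sc + bt
                for b, m, sc, bt in zip(bias, mean, scale, beta)]
    return new_weight, new_bias


def fold_mul_into_bn(gamma, beta, factor):
    """BN(y) * factor equals BN with gamma * factor and beta * factor."""
    return ([g * f for g, f in zip(gamma, factor)],
            [b * f for b, f in zip(beta, factor)])


def fold_add_into_bn(gamma, beta, offset):
    """BN(y) + offset equals BN with unchanged gamma and beta + offset."""
    return list(gamma), [b + o for b, o in zip(beta, offset)]


# Illustrative values only: 3 input channels, 2 output channels.
x = [1.0, -2.0, 0.5]
weight = [[0.2, -0.1, 0.4], [1.5, 0.3, -0.7]]
bias = [0.1, -0.2]
gamma, beta = [1.2, 0.8], [0.05, -0.3]
mean, var = [0.3, -0.1], [0.9, 1.6]

reference = batch_norm(linear(x, weight, bias), gamma, beta, mean, var)
fused = linear(x, *fold_bn_into_conv(weight, bias, gamma, beta, mean, var))
```

In each case the removed layer's effect is absorbed into the parameters of the layer that remains, so `reference` and `fused` agree up to floating-point rounding.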

Graph Structure Optimization

If the model to be saved contains a MatMul+Add structure, the Dequant operator is inserted after the MatMul operator by default. It is inserted after the Add operator instead when the following conditions are met:

  • If the MatMul output is consumed by exactly one Add operator, and the other input of that Add operator is a one-dimensional constant, the Dequant operator is inserted after the Add operator so that this constant is quantized as a bias.

  • If one of the MatMul inputs is a constant (the weight), the length of the other input of the Add operator must equal the last dimension of that weight.
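The placement rule above can be sketched as a small decision function. This is an illustration under stated assumptions, not the tool's code: the `Node` and `Graph` classes and the `dequant_anchor` function are hypothetical, the graph is reduced to node lists plus the shapes of constant tensors, and both conditions are treated as required together.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    name: str
    op_type: str
    inputs: List[str]
    outputs: List[str]


@dataclass
class Graph:
    nodes: List[Node]
    # Shapes of constant tensors only, keyed by tensor name.
    constants: Dict[str, Tuple[int, ...]] = field(default_factory=dict)

    def consumers(self, tensor: str) -> List[Node]:
        return [n for n in self.nodes if tensor in n.inputs]


def dequant_anchor(graph: Graph, matmul: Node) -> Node:
    """Return the node after which Dequant would be inserted for this MatMul."""
    out = matmul.outputs[0]
    users = graph.consumers(out)
    # The MatMul output must feed exactly one node, and that node must be an Add.
    if len(users) != 1 or users[0].op_type != "Add":
        return matmul
    add = users[0]
    others = [t for t in add.inputs if t != out]
    # The other Add input must be a one-dimensional constant.
    if len(others) != 1:
        return matmul
    bias_shape = graph.constants.get(others[0])
    if bias_shape is None or len(bias_shape) != 1:
        return matmul
    # With a constant weight, the bias length must equal the weight's last dimension.
    for tensor in matmul.inputs:
        weight_shape = graph.constants.get(tensor)
        if weight_shape is not None and bias_shape[0] != weight_shape[-1]:
            return matmul
    return add


mm = Node("mm", "MatMul", ["x", "w"], ["mm_out"])
add = Node("add", "Add", ["mm_out", "b"], ["y"])
graph = Graph([mm, add], {"w": (16, 8), "b": (8,)})
print(dequant_anchor(graph, mm).name)  # add
```

Changing the shape of `b` to `(4,)`, or adding a second consumer of `mm_out`, makes the function fall back to the MatMul node, which corresponds to the default placement.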