Fusion Functions and Graph Structure Optimization Implemented by the Tool
Fusion Functions
Currently, the following layer fusion types are supported:
- Conv+BN fusion: Before quantization, AMCT fuses each "Conv+BatchNormalization" structure in the model. After fusion, the BatchNormalization layer is removed.
- BN+Mul fusion: Before quantization, AMCT fuses each "BatchNormalization+Mul" structure in the model. After fusion, the Mul layer is removed.
- BN+Add fusion: Before quantization, AMCT fuses each "BatchNormalization+Add" structure in the model. After fusion, the Add layer is removed.
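The arithmetic behind these three fusions can be sketched with NumPy. This is an illustrative sketch only, not AMCT's implementation: the function names (`batch_norm`, `fold_bn_into_conv`, `fold_mul_into_bn`, `fold_add_into_bn`) and the `eps` default are assumptions, BatchNormalization is assumed to run in inference mode with per-channel parameters, and the Mul/Add operand is assumed to be a constant (scalar or per-channel).

```python
import numpy as np


def batch_norm(x, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNormalization over the channel axis of an NCHW tensor."""
    shape = (1, -1, 1, 1)
    scale = gamma / np.sqrt(var + eps)
    return (x - mean.reshape(shape)) * scale.reshape(shape) + beta.reshape(shape)


def fold_bn_into_conv(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Conv+BN: return Conv weight and bias that absorb the BatchNormalization.

    weight has shape (C_out, C_in, kH, kW); all other arrays have shape (C_out,).
    """
    scale = gamma / np.sqrt(var + eps)
    fused_weight = weight * scale[:, None, None, None]
    fused_bias = (bias - mean) * scale + beta
    return fused_weight, fused_bias


def fold_mul_into_bn(gamma, beta, mul_const):
    """BN+Mul: BN(x) * c equals a BN with gamma and beta both scaled by c."""
    return gamma * mul_const, beta * mul_const


def fold_add_into_bn(gamma, beta, add_const):
    """BN+Add: BN(x) + c equals a BN with beta shifted by c."""
    return gamma, beta + add_const
```

In each case the fused layer produces the same output as the original pair of layers (up to floating-point rounding), which is why the second layer can be removed from the graph.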
Graph Structure Optimization
If the model to be saved contains a MatMul+Add structure, the Dequant operator is inserted after the MatMul operator by default. If the following conditions are met, the Dequant operator is inserted after the Add operator instead:
- The output of MatMul feeds exactly one Add operator, and the other input of that Add operator is a one-dimensional constant. In this case, the Dequant operator is inserted after the Add operator so that the constant input of the Add operator is quantized as a bias.
- If MatMul has a constant input (the weight), the length of the other input of the Add operator must be the same as the last dimension of the weight.
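The placement rule above can be expressed as a small decision function. This is a hedged sketch of one reading of the conditions, not AMCT code: the function name `dequant_anchor`, its parameters, and the `"MatMul"`/`"Add"` return values are hypothetical, and it assumes that all listed conditions must hold together for the Dequant operator to move after Add.

```python
from typing import Optional, Sequence


def dequant_anchor(
    num_matmul_consumers: int,
    consumer_is_add: bool,
    add_other_is_constant: bool,
    add_other_shape: Sequence[int],
    matmul_weight_shape: Optional[Sequence[int]] = None,
) -> str:
    """Return the operator after which Dequant would be inserted.

    matmul_weight_shape is None when MatMul has no constant (weight) input.
    """
    # The MatMul output must feed exactly one operator, and it must be an Add.
    if num_matmul_consumers != 1 or not consumer_is_add:
        return "MatMul"
    # The other input of Add must be a one-dimensional constant.
    if not add_other_is_constant or len(add_other_shape) != 1:
        return "MatMul"
    # With a constant weight, the constant's length must match the weight's last dimension.
    if matmul_weight_shape is not None and add_other_shape[0] != matmul_weight_shape[-1]:
        return "MatMul"
    return "Add"
```

For example, a MatMul with an 8x16 constant weight followed by a single Add whose other input is a constant of length 16 yields `"Add"`, while the same structure with a constant of length 8 falls back to `"MatMul"`.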
