Matmulv2FusionPass

Description

In the int8 weight quantization scenario, for a MatMulV2 operator that has three inputs, this pass inserts a TransposeD operator before input Tensor B so that Tensor B is transposed.
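
The rewrite can be pictured with a toy graph. The following is only a minimal sketch: the `Node` class, the dictionary-based graph, the node names, the `perm` value, and the treatment of the third input as a bias are all assumptions made for illustration and are not the framework's real API. It also leaves the transpose_b attribute untouched, because this page does not say what the pass does with it after the insertion.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a graph node; not a real framework class."""
    name: str
    op_type: str
    inputs: list = field(default_factory=list)  # names of the producer nodes
    attrs: dict = field(default_factory=dict)


def insert_transpose_before_b(graph, matmul_name):
    """Route input B (index 1) of a MatMulV2 node through a new TransposeD node."""
    matmul = graph[matmul_name]
    b_producer = matmul.inputs[1]
    transpose = Node(
        name=f"{matmul_name}_transpose_b",
        op_type="TransposeD",
        inputs=[b_producer],
        attrs={"perm": [1, 0]},  # assumed: swap the two axes of a 2-D tensor
    )
    graph[transpose.name] = transpose
    matmul.inputs[1] = transpose.name  # MatMulV2 now reads B from TransposeD
    return transpose


# Toy graph: x and an int8 weight feed MatMulV2; the third input is assumed to be a bias.
graph = {
    "x": Node("x", "Data"),
    "weight": Node("weight", "Const"),
    "bias": Node("bias", "Const"),
    "mm": Node("mm", "MatMulV2", inputs=["x", "weight", "bias"],
               attrs={"transpose_b": True}),
}
insert_transpose_before_b(graph, "mm")
print(graph["mm"].inputs)
```

After the call, the second input of `mm` is the new TransposeD node, which in turn reads from `weight`; the other two inputs are unchanged.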

Before:

After:

Restrictions

  • The OpDesc of MatMulV2 must define the transpose_b attribute, and its value must be True.
  • Tensor B must have exactly 2 dimensions, and its data type must be int8.
  • The IR of the matmul operator in the source framework must contain the transpose_b attribute.
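
As a rough illustration, the restrictions can be read as a precondition check. The sketch below is hypothetical: `pass_applies` and its parameters are invented for this example and do not correspond to any real framework function. It covers the three-input condition from the description and the first two restrictions; the restriction on the source framework IR is not modelled.

```python
def pass_applies(op_type, num_inputs, attrs, b_shape, b_dtype):
    """Return True when the conditions modelled in this sketch all hold."""
    return (
        op_type == "MatMulV2"
        and num_inputs == 3                   # three-input MatMulV2
        and attrs.get("transpose_b") is True  # attribute present and True
        and len(b_shape) == 2                 # Tensor B is 2-D
        and b_dtype == "int8"                 # Tensor B is int8
    )


print(pass_applies("MatMulV2", 3, {"transpose_b": True}, (64, 128), "int8"))     # True
print(pass_applies("MatMulV2", 3, {"transpose_b": False}, (64, 128), "int8"))    # False
print(pass_applies("MatMulV2", 3, {"transpose_b": True}, (2, 64, 128), "int8"))  # False
```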