Matmulv2FusionPass

Description

In the int8 weight quantization scenario, for a MatMulV2 operator that has three inputs, this pass inserts a TransposeD operator in front of input Tensor B, which is the tensor to be transposed.

Before: (figure not included)

After: (figure not included)
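
The effect of making the transpose explicit can be illustrated numerically. The following is a minimal sketch in plain Python, not the pass implementation: the helper names `transpose` and `matmul` are hypothetical, and the matrices are made-up values in the int8 range. It only shows that multiplying with transpose_b set to True gives the same result as feeding an explicitly transposed Tensor B into a plain multiplication, which is why a separate transpose operator on the B input can stand in for the attribute.

```python
def transpose(m):
    """Return the transpose of a 2-D matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b, transpose_b=False):
    """Plain matrix multiply; with transpose_b=True, b is read as its transpose."""
    if transpose_b:
        b = transpose(b)
    # zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical data: a is 2x3, b is 4x3 (a weight stored in transposed layout).
a = [[1, 2, 3], [4, 5, 6]]
b = [[1, 0, -1], [2, 1, 0], [0, 3, 1], [-2, 1, 4]]

implicit = matmul(a, b, transpose_b=True)   # transpose expressed as an attribute
explicit = matmul(a, transpose(b))          # transpose expressed as a separate step

assert implicit == explicit
```

Both paths produce a 2x4 result. What the pass does with the transpose_b attribute after inserting TransposeD is not stated here, so the sketch does not model it.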

Restrictions

The OpDesc of MatMulV2 must have the transpose_b attribute, and the attribute value must be True.
The number of dimensions of Tensor B must be 2, and the data type must be int8.
The IR of the matmul operator in the source framework must contain the transpose_b attribute.
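
The node-level restrictions above can be expressed as a simple predicate. This is a sketch under stated assumptions, not the framework API: the dictionary layout, the field names, and the function `pass_applies` are all hypothetical, and Tensor B is assumed to be the second input (index 1). The restriction on the source framework IR is not modeled because it cannot be checked from the node alone.

```python
def pass_applies(op):
    """Return True if a dict-modelled MatMulV2 node meets the node-level restrictions.

    The dict layout is hypothetical and used only for illustration.
    """
    attrs = op.get("attrs", {})
    inputs = op.get("inputs", [])

    # The pass targets MatMulV2 with three inputs.
    if op.get("type") != "MatMulV2" or len(inputs) != 3:
        return False

    # transpose_b must be present and set to True.
    if attrs.get("transpose_b") is not True:
        return False

    # Assumption: Tensor B is the second input.
    tensor_b = inputs[1]
    return len(tensor_b["shape"]) == 2 and tensor_b["dtype"] == "int8"


ok_node = {
    "type": "MatMulV2",
    "attrs": {"transpose_b": True},
    "inputs": [
        {"shape": [8, 16], "dtype": "float16"},
        {"shape": [32, 16], "dtype": "int8"},
        {"shape": [32], "dtype": "float16"},
    ],
}

# Same node, but without transpose_b set to True.
bad_node = dict(ok_node, attrs={"transpose_b": False})

assert pass_applies(ok_node)
assert not pass_applies(bad_node)
```

The input shapes and the data types of the first and third inputs are placeholders chosen for the example.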
