BatchMatMulV2ReshapeFusionPass

Description

Reshapes tensor A or tensor B from 1-dimensional to 2-dimensional.

Before: After:

When A is 3D and B is 2D, a Reshape operator flattens input A to 2D so that the computation can be performed by a MatMul operator. The fusion pattern is as follows:
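The equivalence behind this scenario can be illustrated with a minimal pure-Python sketch. It is not the pass implementation: the function names are hypothetical, tensors are modeled as nested lists, and reshaping the result back to (batch, m, n) is an assumption made here so that the two results can be compared directly.

```python
def matmul_2d(a, b):
    """Multiply an (m, k) matrix by a (k, n) matrix, both given as nested lists."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)] for row in a]


def batch_matmul_3d_2d(a, b):
    """Reference result: multiply every (m, k) slice of a (batch, m, k) tensor by b."""
    return [matmul_2d(matrix, b) for matrix in a]


def reshape_then_matmul(a, b):
    """Flatten A to (batch*m, k), run one 2D matmul, then restore (batch, m, n)."""
    batch, m = len(a), len(a[0])
    flat = [row for matrix in a for row in matrix]  # shape (batch*m, k)
    out = matmul_2d(flat, b)                        # shape (batch*m, n)
    # Assumption: the output is reshaped back to 3D for comparison.
    return [out[i * m:(i + 1) * m] for i in range(batch)]


a = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]  # shape (3, 2, 2)
b = [[1, 0, 2], [0, 1, 3]]                                     # shape (2, 3)
assert reshape_then_matmul(a, b) == batch_matmul_3d_2d(a, b)
```

Because B is shared by every batch, stacking the batch and M dimensions of A does not change any individual row-by-column product, which is why a single 2D multiplication gives the same values.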

Before: After:

Restrictions

  • The INT4 and INT8 data types are not supported.
  • In dynamic scenarios, only non-transposed left and right matrices are supported.
  • The left and right matrices cannot be in NZ format.
  • In the second graph fusion scenario (the left matrix is 3D and the right matrix is 2D), trans_flag of the left matrix must be False, and the value of batch*m must not exceed max(int64).
  • When the input operator type is MatMulV2, the left input of the node must be fp16; the pass does not apply if data_type != DT_FLOAT16.
  • If BatchMatMulV2 is followed by an Add, Relu, or AddN operator, graph fusion takes effect when (batch size > 50 and M < 32) or (batch size > 1 and M = 1).
  • For a single BatchMatMulV2 operator, graph fusion takes effect when (batch size > 4096 and M < 64) or (batch size > 1 and M = 1).
  • For the non-UB fusion scenario of Atlas Training Series Product, graph fusion can be enabled for trustlists.
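The batch size and M thresholds listed above can be summarized as a small predicate. This is only a sketch: the function and parameter names are hypothetical, it assumes that "and" binds more tightly than "or" in the two conditions, and it ignores the other restrictions (data type, format, transposition, trustlists).

```python
def fusion_takes_effect(batch, m, followed_by_add_relu_addn):
    """Hypothetical helper that mirrors only the documented batch/M thresholds."""
    # Common to both cases: batch size > 1 and M = 1.
    if batch > 1 and m == 1:
        return True
    if followed_by_add_relu_addn:
        # BatchMatMulV2 followed by an Add, Relu, or AddN operator.
        return batch > 50 and m < 32
    # Single BatchMatMulV2 operator.
    return batch > 4096 and m < 64


assert fusion_takes_effect(64, 16, True)
assert not fusion_takes_effect(64, 16, False)
assert fusion_takes_effect(8192, 48, False)
```

For example, batch size 64 with M = 16 meets the threshold only when an Add, Relu, or AddN operator follows, because the single-operator case requires batch size > 4096.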

Availability

Atlas 200/300/500 Inference Product

Atlas Training Series Product