SetQuantScalar

Function Description

Matmul computation supports int8 inputs with half or int8 outputs. In this scenario, the result accumulated in L0C must be dequantized when it is moved from L0C to GM, which requires calling a dequantization API. The API described in this section dequantizes all values of the output matrix with the same (per-tensor) dequantization coefficient.

Call this API before calling Iterate or IterateAll.

Prototype

__aicore__ inline void SetQuantScalar(const uint64_t quantScalar)

Parameters

Parameter      Input/Output    Description
-----------    ------------    -----------------------------------------------
quantScalar    Input           Dequantization coefficient applied to all values of the output matrix.

Returns

None

Availability

Precautions

None

Example

REGIST_MATMUL_OBJ(&pipe, GetSysWorkSpacePtr(), mm, &tiling);  // create and initialize the Matmul object
uint64_t quantScalar = 2;        // shared dequantization coefficient
mm.SetQuantScalar(quantScalar);  // must be called before Iterate/IterateAll
mm.SetTensorA(gm_a);             // left matrix (int8)
mm.SetTensorB(gm_b);             // right matrix (int8)
mm.SetBias(gm_bias);
mm.IterateAll(gm_c);             // output is dequantized on the L0C -> GM path