SetQuantScalar
Function Description
Matmul computation supports int8 inputs with half/int8 outputs. In this scenario, a dequantization API must be called so that, when data is moved from L0C to GM, the final result is dequantized to the half/int8 type. The API described in this section dequantizes all values of the output matrix using a single, shared dequantization coefficient.
Call this API before calling Iterate or IterateAll.
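The quantScalar argument is a raw uint64_t encoding of the dequantization coefficient. On some platforms this encoding is the bit pattern of a single-precision float placed in the low 32 bits; the exact layout is platform-specific, so the helper below (the name MakeQuantScalar is hypothetical) is only a sketch under that assumption. Consult the product documentation for the encoding that applies to your device.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: pack a float dequantization scale into the
// uint64_t value passed to SetQuantScalar. The assumption that the
// scale is the float's IEEE 754 bit pattern in the low 32 bits is
// platform-specific and must be verified against the product docs.
inline uint64_t MakeQuantScalar(float scale) {
    uint32_t bits = 0;
    std::memcpy(&bits, &scale, sizeof(bits));  // reinterpret float as raw bits
    return static_cast<uint64_t>(bits);        // place bits in the low 32 bits
}
```

Under this assumption, a scale of 0.5f would be packed as `MakeQuantScalar(0.5f)` and then passed to `mm.SetQuantScalar(...)` before `Iterate` or `IterateAll`.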
Prototype
__aicore__ inline void SetQuantScalar(const uint64_t quantScalar)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| quantScalar | Input | Dequantization coefficient |
Returns
None
Availability
Precautions
None
Example
REGIST_MATMUL_OBJ(&pipe, GetSysWorkSpacePtr(), mm, &tiling);
uint64_t ans = 2;
mm.SetQuantScalar(ans);
mm.SetTensorA(gm_a);
mm.SetTensorB(gm_b);
mm.SetBias(gm_bias);
mm.IterateAll(gm_c);
Parent topic: Matmul