SetQuantScalar
Function Description
Matmul computation supports int8 inputs with half/int8 outputs. In this scenario, a dequantization API must be called so that, when data is moved from L0C to GM, the final result is dequantized to the half/int8 type. The API described in this section dequantizes all values of the output matrix using a single, shared dequantization coefficient.
Call this API before calling Iterate or IterateAll.
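The quantScalar argument is a raw uint64_t encoding of the dequantization coefficient. On some platforms this encoding is the bit pattern of a single-precision float placed in the low 32 bits; the exact layout is platform-specific, so the helper below (the name MakeQuantScalar is hypothetical) is only a sketch under that assumption. Consult the product documentation for the encoding that applies to your device.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: pack a float dequantization scale into the
// uint64_t value passed to SetQuantScalar. The assumption that the
// scale is the float's IEEE 754 bit pattern in the low 32 bits is
// platform-specific and must be verified against the product docs.
inline uint64_t MakeQuantScalar(float scale) {
    uint32_t bits = 0;
    std::memcpy(&bits, &scale, sizeof(bits));  // reinterpret float as raw bits
    return static_cast<uint64_t>(bits);        // place bits in the low 32 bits
}
```

Under this assumption, a scale of 0.5f would be packed as `MakeQuantScalar(0.5f)` and then passed to `mm.SetQuantScalar(...)` before `Iterate` or `IterateAll`.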
Prototype
__aicore__ inline void SetQuantScalar(const uint64_t quantScalar)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| quantScalar | Input | Dequantization coefficient |
Returns
None
Availability
Precautions
None
Example
REGIST_MATMUL_OBJ(&pipe, GetSysWorkSpacePtr(), mm, &tiling);
uint64_t ans = 2;
mm.SetQuantScalar(ans);
mm.SetTensorA(gm_a);
mm.SetTensorB(gm_b);
mm.SetBias(gm_bias);
mm.IterateAll(gm_c);
Parent topic: Matmul