SetAntiQuantScalar

Function Description

Matmul supports a matrix A of the half type combined with a matrix B of the int8 type. In this scenario, a pseudo-quantization API must be called. Once it has been called, matrix B is converted to the half type while its data is moved from the GM to L1. The API described in this section applies the same scalar coefficients (offsetScalar and scaleScalar) to every element of matrix B.

Call this API before calling Iterate or IterateAll.
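
The effect of the scalar coefficients can be illustrated with a host-side reference in plain C++. This is a minimal sketch, not the Ascend C implementation. It assumes that the offset is added before the scale is multiplied, that is, dst = scaleScalar * (src + offsetScalar); this section only states that one coefficient is used for addition and the other for multiplication, so the order is an assumption. The names AntiQuantElement, AntiQuantMatrixB, and ReferenceMatmul are hypothetical helpers, and float stands in for the half type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference, not part of the Ascend C API.
// ASSUMPTION: dst = scaleScalar * (src + offsetScalar).
// float is used in place of half, which standard C++ does not provide.
inline float AntiQuantElement(int8_t src, float offsetScalar, float scaleScalar) {
    return scaleScalar * (static_cast<float>(src) + offsetScalar);
}

// Applies the same two scalars to every element of matrix B.
std::vector<float> AntiQuantMatrixB(const std::vector<int8_t>& b,
                                    float offsetScalar, float scaleScalar) {
    std::vector<float> out;
    out.reserve(b.size());
    for (int8_t v : b) {
        out.push_back(AntiQuantElement(v, offsetScalar, scaleScalar));
    }
    return out;
}

// Reference C = A (m x k) * dequantized B (k x n), all matrices row-major.
std::vector<float> ReferenceMatmul(const std::vector<float>& a,
                                   const std::vector<int8_t>& b,
                                   std::size_t m, std::size_t k, std::size_t n,
                                   float offsetScalar, float scaleScalar) {
    assert(a.size() == m * k && b.size() == k * n);
    const std::vector<float> bf = AntiQuantMatrixB(b, offsetScalar, scaleScalar);
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * bf[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
    return c;
}
```

A reference of this kind can be used to check on-device results: run it with the same offsetScalar and scaleScalar values that are passed to SetAntiQuantScalar, then compare the outputs within a half-precision tolerance.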

Prototype

__aicore__ inline void SetAntiQuantScalar(const SrcT offsetScalar, const SrcT scaleScalar)

Parameters

Parameter       Input/Output    Description
offsetScalar    Input           Pseudo-quantization coefficient, which is used for addition
scaleScalar     Input           Pseudo-quantization coefficient, which is used for multiplication

Returns

None

Availability

Precautions

None