SetAntiQuantScalar

Applicability

Product	Supported
Atlas A3 training products/Atlas A3 inference products	x
Atlas A2 training products/Atlas A2 inference products	x
Atlas 200I/500 A2 inference products	x
AI Core of Atlas inference products	√
Vector Core of Atlas inference products	x
Atlas training products	x

Function

Matmul supports a scenario in which matrix A is of the half type and matrix B is of the int8 type. In this scenario, a pseudo-quantization API must be called so that matrix B is converted to the half type while its data is moved from GM to L1. The API in this section applies pseudo-quantization to all elements of matrix B using the same quantization coefficients: a single offset and a single scale.

Call this API before calling Iterate or IterateAll.
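
The arithmetic can be illustrated with a small host-side reference model. This is a sketch, not Ascend C kernel code: the helper names AntiQuantRef and MatmulAntiQuantRef are hypothetical, float stands in for half, and the order of operations, (b + offsetScalar) * scaleScalar, is an assumption, because this section states only that the offset is added and the scale is multiplied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side reference model only; hypothetical helpers, not part of the Matmul API.
// ASSUMPTION: each int8 element b of matrix B is converted as (b + offset) * scale.
// float is used in place of half, which standard C++ does not provide.
inline float AntiQuantRef(std::int8_t b, float offsetScalar, float scaleScalar) {
    return (static_cast<float>(b) + offsetScalar) * scaleScalar;
}

// Computes C[m x n] = A[m x k] * antiquant(B[k x n]); all matrices are row-major.
// The same offsetScalar and scaleScalar are applied to every element of B.
inline std::vector<float> MatmulAntiQuantRef(const std::vector<float>& a,
                                             const std::vector<std::int8_t>& b,
                                             std::size_t m, std::size_t k, std::size_t n,
                                             float offsetScalar, float scaleScalar) {
    assert(a.size() == m * k);
    assert(b.size() == k * n);
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                // Convert the int8 element of B on the fly, then accumulate.
                acc += a[i * k + p] * AntiQuantRef(b[p * n + j], offsetScalar, scaleScalar);
            }
            c[i * n + j] = acc;
        }
    }
    return c;
}
```

For example, with A = [1, 2] (1 x 2), B = [3, -1] (2 x 1), offsetScalar = 1, and scaleScalar = 0.5, the converted B is [2, 0] and the model yields C = [2] under the assumed formula.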

Prototype

__aicore__ inline void SetAntiQuantScalar(const SrcT offsetScalar, const SrcT scaleScalar)

Parameters

Parameter	Input/Output	Description
offsetScalar	Input	Pseudo-quantization coefficient used for addition. Its data type is SrcT, which corresponds to the data type defined in A_TYPE.
scaleScalar	Input	Pseudo-quantization coefficient used for multiplication. Its data type is SrcT, which corresponds to the data type defined in A_TYPE.

Returns

None

Restrictions

None
