SetAntiQuantScalar
Applicability
| Product | Supported |
|---|---|
| | x |
| | x |
| | x |
| | √ |
| | x |
| | x |
Function
Matmul supports the combination of a half-type matrix A and an int8-type matrix B. In this scenario, the pseudo-quantization API must be called. After the call, matrix B is converted to the half type while its data is moved from GM to L1. The API described in this section applies pseudo-quantization to all data of matrix B using the same quantization coefficients.
Call this API before calling Iterate or IterateAll.
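The element-wise effect of using one coefficient pair for the whole of matrix B can be sketched with a host-side reference model. This is plain C++, not Ascend C code, and it is illustrative only: the helper names `AntiQuantElement` and `AntiQuantScalarReference` are hypothetical, `float` stands in for half, and the order of operations (offset added first, then scale multiplied) is an assumption that this section does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side reference model only; this is not Ascend C device code.
// float stands in for half, which standard C++ does not provide.
// ASSUMPTION: the offset is added before the scale is multiplied.
// The API description does not state the order of the two operations.
inline float AntiQuantElement(std::int8_t value, float offsetScalar, float scaleScalar)
{
    return (static_cast<float>(value) + offsetScalar) * scaleScalar;
}

// Applies the same coefficient pair to every element of matrix B
// (stored flat). bOut must already have the same size as bInt8.
inline void AntiQuantScalarReference(const std::vector<std::int8_t>& bInt8,
                                     float offsetScalar, float scaleScalar,
                                     std::vector<float>& bOut)
{
    assert(bOut.size() == bInt8.size());
    for (std::size_t i = 0; i < bInt8.size(); ++i) {
        bOut[i] = AntiQuantElement(bInt8[i], offsetScalar, scaleScalar);
    }
}
```

A model like this may be useful for generating expected values when checking operator output on the host, provided the assumed order of operations is first confirmed against the hardware behavior.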
Prototype
__aicore__ inline void SetAntiQuantScalar(const SrcT offsetScalar, const SrcT scaleScalar)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| offsetScalar | Input | Pseudo-quantization coefficient used for addition. The data type is determined by SrcT, which corresponds to the type defined in A_TYPE. |
| scaleScalar | Input | Pseudo-quantization coefficient used for multiplication. The data type is determined by SrcT, which corresponds to the type defined in A_TYPE. |
Returns
None
Restrictions
None