GetAscendAntiQuantTmpBufferFactorSize

Function

Obtains maxLiveNodeCount and extraBuf. When the temporary space has a fixed size, these two values can be used to calculate the maximum number of elements that the operator can compute at a time. maxLiveNodeCount indicates how many multiples of the data size of a single computation the temporary space must hold. extraBuf indicates the size of the extra temporary space.

Examples:

  • The operator implementation calls the AscendAntiQuant API. Reserve temporary space of size currBuff, then call GetAscendAntiQuantTmpBufferFactorSize to obtain maxLiveNodeCount and extraBuf. The maximum number of elements in a single computation of the operator is calculated as follows:

    currentShapeSize = (currBuff - extraBuf) / maxLiveNodeCount / typeSize

  • The operator implementation calls two kernel APIs, KernelIntf1 and KernelIntf2. Call the corresponding GetXxxTmpBufferFactorSize function for each of them (Xxx stands for the API to be called) to obtain its maxLiveNodeCount and extraBuf. The two pairs of output values, together with the existing temporary space, determine the maximum number of elements in a single computation (currentShapeSize):

    currentShapeSize1 = (currBuff - extraBuf1) / maxLiveNodeCount1 / typeSize

    currentShapeSize2 = (currBuff - extraBuf2) / maxLiveNodeCount2 / typeSize

    currentShapeSize = min(currentShapeSize1, currentShapeSize2)

Note that currBuff is the space available for API computation. Space used for other purposes, such as input and output, must be excluded. In addition, maxLiveNodeCount may be returned as 0. Check that it is nonzero before dividing to avoid a division-by-zero error.
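The calculations above can be sketched as a pair of small host-side helpers. This is a minimal sketch rather than part of the Ascend C API: the names MaxElementsPerPass and MaxElementsForBoth are hypothetical, typeSize is assumed to be the element size in bytes, and returning an empty value when maxLiveNodeCount is 0 is one possible way to surface that case to the caller.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical helper, not part of Ascend C. Applies
//   currentShapeSize = (currBuff - extraBuf) / maxLiveNodeCount / typeSize
// currBuff and extraBuf are in bytes; typeSize is assumed to be the element
// size in bytes. Returns std::nullopt when a divisor is 0, so the caller
// decides how to proceed instead of dividing by zero.
std::optional<uint32_t> MaxElementsPerPass(uint32_t currBuff, uint32_t extraBuf,
                                           uint32_t maxLiveNodeCount, uint32_t typeSize)
{
    if (maxLiveNodeCount == 0 || typeSize == 0) {
        return std::nullopt;
    }
    if (currBuff <= extraBuf) {
        return uint32_t{0};  // no space left once the extra buffer is subtracted
    }
    return (currBuff - extraBuf) / maxLiveNodeCount / typeSize;
}

// Hypothetical helper for the two-API case: the usable element count is the
// smaller of the two per-API limits. Empty if either limit is unavailable.
std::optional<uint32_t> MaxElementsForBoth(std::optional<uint32_t> first,
                                           std::optional<uint32_t> second)
{
    if (!first || !second) {
        return std::nullopt;
    }
    return std::min(*first, *second);
}
```

With illustrative values only (currBuff = 32768, typeSize = 2), the factors extraBuf = 1024 and maxLiveNodeCount = 2 give 7936 elements, the factors extraBuf = 0 and maxLiveNodeCount = 3 give 5461 elements, and the combined limit is 5461.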

Prototype

void GetAscendAntiQuantTmpBufferFactorSize(const ge::Shape& srcShape, const ge::Shape& scaleShape, bool isTranspose, ge::DataType inputDataType, ge::DataType outputDataType, uint32_t& maxLiveNodeCount, uint32_t& extraBuf)

Parameters

Table 1 Parameters

| Parameter | Input/Output | Description |
|---|---|---|
| srcShape | Input | Shape of the input srcTensor. |
| scaleShape | Input | Shape of the input scale. |
| isTranspose | Input | Whether to apply transposing. |
| inputDataType | Input | Input data type. The value is of type ge::DataType. For details about the definition of this type, see DataType. |
| outputDataType | Input | Output data type. The value is of type ge::DataType. For details about the definition of this type, see DataType. |
| maxLiveNodeCount | Output | Maximum number of live nodes, indicating how many multiples of the data size of a single computation the temporary space must hold. |
| extraBuf | Output | Size of the extra temporary space used, in bytes. |

Returns

None

Restrictions

If the currentShapeSize calculated from maxLiveNodeCount and extraBuf satisfies currentShapeSize × typeSize < 256 bytes, round currentShapeSize up to 256 bytes/typeSize.
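The restriction can be illustrated with a short helper. This is a sketch under stated assumptions: ApplyMinimumSize is a hypothetical name, typeSize is assumed to be the element size in bytes, and "rounded up" is read here as raising currentShapeSize to the ceiling of 256/typeSize elements whenever the computed size falls below 256 bytes.

```cpp
#include <cstdint>

// Minimum data size for a single computation, in bytes (from the restriction above).
constexpr uint32_t kMinBytes = 256;

// Hypothetical helper, not part of Ascend C. If currentShapeSize * typeSize is
// below 256 bytes, returns ceil(256 / typeSize); otherwise returns the input
// unchanged. A typeSize of 0 is passed through untouched to avoid dividing by zero.
uint32_t ApplyMinimumSize(uint32_t currentShapeSize, uint32_t typeSize)
{
    if (typeSize == 0) {
        return currentShapeSize;
    }
    // Widen before multiplying so the byte count cannot overflow 32 bits.
    const uint64_t bytes = static_cast<uint64_t>(currentShapeSize) * typeSize;
    if (bytes >= kMinBytes) {
        return currentShapeSize;
    }
    return (kMinBytes + typeSize - 1) / typeSize;
}
```

For example, with typeSize = 2 a computed currentShapeSize of 100 (200 bytes) becomes 128, while a value of 128 (exactly 256 bytes) is left unchanged.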

Example

// Shapes of the input tensor and of the scale tensor.
std::vector<int64_t> srcDims = { 64, 512 };
auto srcShape = ge::Shape(srcDims);
std::vector<int64_t> scaleDims = { 1, 512 };
auto scaleShape = ge::Shape(scaleDims);
bool isTranspose = false;
// Output values, filled in by the call below.
uint32_t maxLiveNodeCount = 0;
uint32_t extraBuf = 0;
AscendC::GetAscendAntiQuantTmpBufferFactorSize(srcShape, scaleShape, isTranspose, ge::DT_INT8, ge::DT_BF16, maxLiveNodeCount, extraBuf);