SoftmaxFlashV3 Tiling
Function Usage
Obtains the tiling parameters required by SoftmaxFlashV3.
Prototype
void GetSoftMaxFlashV3MaxMinTmpSize(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, uint32_t& maxValue, uint32_t& minValue, const bool isUpdate, const bool isBasicBlock = false);
void SoftMaxFlashV3TilingFunc(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const uint32_t localWorkSpaceSize, optiling::SoftMaxTiling& softmaxFlashV3Tiling, const bool isUpdate, const bool isBasicBlock = false);
Parameters
GetSoftMaxFlashV3MaxMinTmpSize parameters

| Parameter | Input/Output | Description |
|---|---|---|
| srcShape | Input | Shape of the input srcTensor. |
| dataTypeSize1 | Input | Data type size, in bytes, of the input srcTensor and expMaxTensor, that is, the size of type T in the SoftMaxFlashV3 kernel function. T currently supports only half, so this parameter must be set to 2. |
| dataTypeSize2 | Input | Data type size, in bytes, of the input inMeanTensor, inExpSumTensor, and inMaxTensor, that is, the size of type U in the SoftMaxFlashV3 kernel function. U currently supports only float, so this parameter must be set to 4. |
| maxValue | Output | Maximum size of the temporary space required for SoftMaxFlashV3 computation. The API does not use any space beyond this value. Between the minimum and the maximum, a larger temporary space can improve the computing performance of the API on the kernel to some extent. For better performance, reserve or allocate the space based on the actual buffer usage. A value of 0 means that no temporary space is required. NOTE: maxValue is for reference only and may be larger than the remaining space of the Unified Buffer. In this case, select a suitable temporary space size based on the remaining space of the Unified Buffer. |
| minValue | Output | Minimum size of the temporary space required for SoftMaxFlashV3 computation. For the API to work correctly, the temporary space reserved or allocated for the computation cannot be less than this value. A value of 0 means that no temporary space is required. |
| isUpdate | Input | Whether update is enabled in the SoftMaxFlashV3 computation. The value must be the same as isUpdate of the SoftmaxFlashV3 kernel API. |
| isBasicBlock | Input | Reserved for future use. The default value false must be used. |
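The maxValue and minValue descriptions above imply a simple selection rule for the temporary space size. The following is a minimal sketch of that rule, not part of the SoftMaxFlashV3 API: the helper name ChooseTmpBufferSize and the parameter freeUbBytes (the Unified Buffer space still available) are assumptions made for illustration, and minValue is treated as an inclusive lower bound, following the minValue description.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (an assumption, not provided by the tiling API).
// minValue / maxValue: values reported by GetSoftMaxFlashV3MaxMinTmpSize.
// freeUbBytes: assumed name for the Unified Buffer space still unallocated.
// Returns false if the remaining space cannot hold the minimum; otherwise
// writes the size to reserve into chosen and returns true.
bool ChooseTmpBufferSize(uint32_t minValue, uint32_t maxValue,
                         uint32_t freeUbBytes, uint32_t& chosen)
{
    if (freeUbBytes < minValue) {
        return false;  // below the minimum the computation is not guaranteed to be correct
    }
    // Space beyond maxValue is not used by the API, so cap the request there.
    // When maxValue is 0 this yields 0, meaning no temporary space is needed.
    chosen = std::min(maxValue, freeUbBytes);
    return true;
}
```

In host-side tiling code, a size chosen this way would typically be the value passed as localWorkSpaceSize to SoftMaxFlashV3TilingFunc.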
SoftMaxFlashV3TilingFunc parameters

| Parameter | Input/Output | Description |
|---|---|---|
| srcShape | Input | Shape of the input srcTensor. |
| dataTypeSize1 | Input | Data type size, in bytes, of the input srcTensor and expMaxTensor, that is, the size of type T in the SoftMaxFlashV3 kernel function. T currently supports only half, so this parameter must be set to 2. |
| dataTypeSize2 | Input | Data type size, in bytes, of the input inMeanTensor, inExpSumTensor, and inMaxTensor, that is, the size of type U in the SoftMaxFlashV3 kernel function. U currently supports only float, so this parameter must be set to 4. |
| localWorkSpaceSize | Input | Size of the remaining space that can be used for SoftmaxFlashV3 computation. The value must be greater than the minimum temporary space size returned by GetSoftMaxFlashV3MaxMinTmpSize. |
| isUpdate | Input | Whether update is enabled in the SoftMaxFlashV3 computation. The value must be the same as isUpdate of the SoftmaxFlashV3 kernel API. |
| isBasicBlock | Input | Reserved for future use. The default value false must be used. |
| softmaxFlashV3Tiling | Output | Tiling information required by the SoftMaxFlashV3 API. |
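The fixed-value constraints in the parameter descriptions (dataTypeSize1 must be 2, dataTypeSize2 must be 4, isBasicBlock must stay false) can be checked before the tiling functions are called. The sketch below is one way to do that. The function IsSupportedSoftMaxFlashV3Config and the constant names are hypothetical and are not part of the tiling API.

```cpp
#include <cstdint>

// Sizes in bytes of the only element types the parameter descriptions allow.
constexpr uint32_t kHalfSizeBytes = 2;   // T: half
constexpr uint32_t kFloatSizeBytes = 4;  // U: float

// Hypothetical pre-check (an assumption, not provided by the tiling API).
// Returns true only for the argument combination the documentation permits.
bool IsSupportedSoftMaxFlashV3Config(uint32_t dataTypeSize1,
                                     uint32_t dataTypeSize2,
                                     bool isBasicBlock)
{
    if (dataTypeSize1 != kHalfSizeBytes) {
        return false;  // srcTensor and expMaxTensor must be half
    }
    if (dataTypeSize2 != kFloatSizeBytes) {
        return false;  // inMeanTensor, inExpSumTensor, inMaxTensor must be float
    }
    return !isBasicBlock;  // reserved parameter: only the default false is valid
}
```

Running such a check before GetSoftMaxFlashV3MaxMinTmpSize and SoftMaxFlashV3TilingFunc keeps unsupported configurations out of the tiling step. The value of isUpdate is not checked here because either value is valid, provided it matches the kernel-side setting.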
Returns
Neither GetSoftMaxFlashV3MaxMinTmpSize nor SoftMaxFlashV3TilingFunc returns a value. Results are passed back through the output parameters.