SoftmaxFlashV2 Tiling
Function
Obtains the tiling parameters required by SoftmaxFlashV2.
Prototype
- APIs for obtaining the minimum and maximum temporary space required for kernel computation
uint32_t GetSoftMaxFlashV2MinTmpSize(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
uint32_t GetSoftMaxFlashV2MaxTmpSize(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
- Tiling computation APIs
- Computation API in the AscendC::optiling namespace
void SoftMaxFlashV2TilingFunc(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const uint32_t localWorkSpaceSize, optiling::SoftMaxTiling& softmaxFlashTiling, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
- Computation API in the AscendC namespace
void SoftMaxFlashV2TilingFunc(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const uint32_t localWorkSpaceSize, AscendC::tiling::SoftMaxTiling& softmaxFlashTiling, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| srcShape | Input | Shape of the input srcTensor. |
| dataTypeSize1 | Input | Size, in bytes, of the data type of the source data to be computed. For example, half = 2. |
| dataTypeSize2 | Input | Size, in bytes, of the data type of expSumTensor and maxTensor involved in the computation. For example, half = 2. |
| isUpdate | Input | Whether to enable the refresh function. The value must be consistent with that passed to the SoftmaxFlashV2 API in the kernel. |
| isBasicBlock | Input | Whether to enable basic block computation. The value can be obtained through the isBasicBlockInSoftmax API and must be the same as the template parameter of the API in the kernel. The default value is false. If the kernel API enables the template parameter SoftmaxConfig (that is, a constant shape is used), isBasicBlock must be obtained through the isBasicBlockInSoftmax API. |
| isFlashOutputBrc | Input | Whether to enable the non-extended mode of the output shape. In non-extended mode, the output data is not broadcast, and the output shape is (m, 1). The values are as follows: |
| Parameter | Input/Output | Description |
|---|---|---|
| srcShape | Input | Shape of the input srcTensor. |
| localWorkSpaceSize | Input | Size of the remaining space that can be used for SoftmaxFlashV2 computation. The value must be greater than the minimum temporary space size returned by the GetSoftMaxFlashV2MinTmpSize API. |
| dataTypeSize1 | Input | Size, in bytes, of the data type of the source data to be computed. For example, half = 2. |
| dataTypeSize2 | Input | Size, in bytes, of the data type of maxTensor and sumTensor involved in the computation. For example, half = 2. |
| isUpdate | Input | Whether to enable the refresh function. The value must be consistent with that passed to the SoftmaxFlashV2 API in the kernel. |
| isBasicBlock | Input | Whether to enable basic block computation. The value can be obtained through the isBasicBlockInSoftmax API and must be the same as the template parameter of the API in the kernel. The default value is false. If the kernel API enables the template parameter SoftmaxConfig (that is, a constant shape is used), isBasicBlock must be obtained through the isBasicBlockInSoftmax API. |
| isFlashOutputBrc | Input | Whether to enable the non-extended mode of the output shape. In non-extended mode, the output data is not broadcast, and the output shape is (m, 1). The values are as follows: |
| softmaxFlashTiling | Output | Tiling information required by the SoftmaxFlashV2 APIs. Both the optiling::SoftMaxTiling and AscendC::tiling::SoftMaxTiling types are supported. |
Returns
GetSoftMaxFlashV2MinTmpSize returns the minimum size (in bytes) of the temporary space required for SoftmaxFlashV2 computation.
GetSoftMaxFlashV2MaxTmpSize returns the maximum size (in bytes) of the temporary space required for SoftmaxFlashV2 computation.
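The two return values bound the localWorkSpaceSize that is later passed to SoftMaxFlashV2TilingFunc. The following is a minimal sketch of how host code might pick that value. ChooseLocalWorkSpaceSize and its freeBytes parameter are hypothetical names introduced for illustration and are not part of Ascend C. The sketch assumes that a workspace equal to the reported minimum is acceptable and that space beyond the reported maximum brings no benefit; check both assumptions against the constraint stated for localWorkSpaceSize.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an Ascend C API): selects a localWorkSpaceSize
// from the bytes still free in local memory and the bounds reported by
// GetSoftMaxFlashV2MinTmpSize and GetSoftMaxFlashV2MaxTmpSize.
// Returns false, leaving `chosen` untouched, when the free space is too small.
inline bool ChooseLocalWorkSpaceSize(uint32_t freeBytes,
                                     uint32_t minTmpSize,
                                     uint32_t maxTmpSize,
                                     uint32_t& chosen)
{
    // Precondition assumed by this sketch: the minimum never exceeds the maximum.
    assert(minTmpSize <= maxTmpSize);

    // Assumption: a workspace exactly equal to the minimum is accepted.
    if (freeBytes < minTmpSize) {
        return false;  // not enough room; the caller has to free space or shrink the shape
    }

    // Assumption: nothing is gained above the reported maximum, so cap the request.
    chosen = std::min(freeBytes, maxTmpSize);
    return true;
}
```

In a host-side tiling function, minTmpSize and maxTmpSize would come from the two Get APIs called with the same srcShape, dataTypeSize1, dataTypeSize2, isUpdate, isBasicBlock, and isFlashOutputBrc values, and the chosen size would then be passed as localWorkSpaceSize to SoftMaxFlashV2TilingFunc.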
Restrictions
None