SoftmaxFlashV2 Tiling
Function Usage
Obtains the tiling parameters required by SoftmaxFlashV2.
Prototype
uint32_t GetSoftMaxFlashV2MinTmpSize(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
uint32_t GetSoftMaxFlashV2MaxTmpSize(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
void SoftMaxFlashV2TilingFunc(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const uint32_t localWorkSpaceSize, optiling::SoftMaxTiling& softmaxFlashTiling, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
Parameters
Parameters of GetSoftMaxFlashV2MinTmpSize and GetSoftMaxFlashV2MaxTmpSize

| Parameter | Input/Output | Description |
|---|---|---|
| srcShape | Input | Shape of the input srcTensor. |
| dataTypeSize1 | Input | Size, in bytes, of the data type of the source data to be computed, for example, 2 for half. |
| dataTypeSize2 | Input | Size, in bytes, of the data type of maxTensor and sumTensor, for example, 2 for half. |
| isUpdate | Input | Whether to enable the refresh function. The value must match the one used by the SoftmaxFlashV2 API in the kernel. |
| isBasicBlock | Input | Whether to enable basic block computation. The value can be obtained through the isBasicBlockInSoftmax API and must match the template parameter of the kernel-side API. The default value is false. If the kernel-side API enables the SoftmaxConfig template parameter (that is, a constant shape is used), the value must be obtained through the isBasicBlockInSoftmax API. |
| isFlashOutputBrc | Input | Whether to enable the non-extended mode of the output shape. In non-extended mode, the output data is not broadcast and the output shape is (m, 1). Values: |
Parameters of SoftMaxFlashV2TilingFunc

| Parameter | Input/Output | Description |
|---|---|---|
| srcShape | Input | Shape of the input srcTensor. |
| localWorkSpaceSize | Input | Size of the remaining space available for SoftmaxFlashV2 computation. The value must be greater than the minimum temporary space size returned by GetSoftMaxFlashV2MinTmpSize. |
| dataTypeSize1 | Input | Size, in bytes, of the data type of the source data to be computed, for example, 2 for half. |
| dataTypeSize2 | Input | Size, in bytes, of the data type of maxTensor and sumTensor, for example, 2 for half. |
| isUpdate | Input | Whether to enable the refresh function. The value must match the one used by the SoftmaxFlashV2 API in the kernel. |
| isBasicBlock | Input | Whether to enable basic block computation. The value can be obtained through the isBasicBlockInSoftmax API and must match the template parameter of the kernel-side API. The default value is false. If the kernel-side API enables the SoftmaxConfig template parameter (that is, a constant shape is used), the value must be obtained through the isBasicBlockInSoftmax API. |
| isFlashOutputBrc | Input | Whether to enable the non-extended mode of the output shape. In non-extended mode, the output data is not broadcast and the output shape is (m, 1). Values: |
| softmaxFlashTiling | Output | Tiling information required by SoftmaxFlashV2. |
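The localWorkSpaceSize constraint can be checked on the host before tiling is computed. The following is a minimal sketch, not part of the SoftmaxFlashV2 tiling API: PickLocalWorkSpaceSize is a hypothetical helper that takes the values returned by GetSoftMaxFlashV2MinTmpSize and GetSoftMaxFlashV2MaxTmpSize plus the number of bytes the caller can spare. It assumes that a budget exactly equal to the minimum is acceptable (the description above words the requirement as "greater than") and that reserving more than the maximum brings no benefit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the SoftmaxFlashV2 tiling API.
// minTmpSize / maxTmpSize: values returned by GetSoftMaxFlashV2MinTmpSize
// and GetSoftMaxFlashV2MaxTmpSize for the same arguments.
// availableBytes: local memory the caller can spare for temporary data.
// On success, writes the value to pass as localWorkSpaceSize and returns true.
inline bool PickLocalWorkSpaceSize(uint32_t minTmpSize, uint32_t maxTmpSize,
                                   uint32_t availableBytes, uint32_t& localWorkSpaceSize)
{
    assert(minTmpSize <= maxTmpSize);
    // Assumption: a budget equal to the minimum is treated as sufficient.
    if (availableBytes < minTmpSize) {
        return false;  // Not enough space; the caller must shrink srcShape or free memory.
    }
    // Assumption: space beyond the reported maximum is not needed, so cap the request.
    localWorkSpaceSize = (availableBytes < maxTmpSize) ? availableBytes : maxTmpSize;
    return true;
}
```

If the helper returns false, the tiling function should not be called with that budget.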
Returns
GetSoftMaxFlashV2MinTmpSize returns the minimum size (in bytes) of the temporary space required for SoftmaxFlashV2 computation.
GetSoftMaxFlashV2MaxTmpSize returns the maximum size (in bytes) of the temporary space required for SoftmaxFlashV2 computation.
SoftMaxFlashV2TilingFunc has no return value; the tiling information is written to softmaxFlashTiling.
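The call order implied by the prototypes can be sketched as follows: query the temporary space sizes, then pass a workspace size and the same flags to the tiling function. Everything in the sketch namespace is a stand-in so that the example compiles without the toolkit headers: sketch::Shape replaces ge::Shape, sketch::SoftMaxTiling replaces optiling::SoftMaxTiling, and the three function bodies are placeholders whose numbers are made up. FlashV2Flags and BuildTiling are hypothetical names. Only the signatures and the argument flow follow this page.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

namespace sketch {

// Stand-in for ge::Shape: it only stores the dimensions.
class Shape {
public:
    explicit Shape(std::vector<int64_t> dims) : dims_(std::move(dims)) {}
    const std::vector<int64_t>& Dims() const { return dims_; }
private:
    std::vector<int64_t> dims_;
};

// Stand-in for optiling::SoftMaxTiling; the single field is hypothetical.
struct SoftMaxTiling {
    uint32_t workSpaceSize = 0;
};

// Placeholder sizing rule (made up, NOT the real formula).
inline uint32_t PlaceholderBytes(const Shape& srcShape, uint32_t bytesPerElement)
{
    uint32_t count = 1;
    for (int64_t d : srcShape.Dims()) {
        count *= static_cast<uint32_t>(d);
    }
    return count * bytesPerElement;
}

// Stubs with the signatures listed under Prototype; bodies are placeholders.
inline uint32_t GetSoftMaxFlashV2MinTmpSize(const Shape& srcShape, const uint32_t dataTypeSize1,
    const uint32_t dataTypeSize2, const bool isUpdate, const bool isBasicBlock = false,
    const bool isFlashOutputBrc = false)
{
    (void)dataTypeSize1; (void)isUpdate; (void)isBasicBlock; (void)isFlashOutputBrc;
    return PlaceholderBytes(srcShape, dataTypeSize2);
}

inline uint32_t GetSoftMaxFlashV2MaxTmpSize(const Shape& srcShape, const uint32_t dataTypeSize1,
    const uint32_t dataTypeSize2, const bool isUpdate, const bool isBasicBlock = false,
    const bool isFlashOutputBrc = false)
{
    (void)dataTypeSize1; (void)isUpdate; (void)isBasicBlock; (void)isFlashOutputBrc;
    return 4U * PlaceholderBytes(srcShape, dataTypeSize2);
}

inline void SoftMaxFlashV2TilingFunc(const Shape& srcShape, const uint32_t dataTypeSize1,
    const uint32_t dataTypeSize2, const uint32_t localWorkSpaceSize, SoftMaxTiling& softmaxFlashTiling,
    const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
{
    (void)srcShape; (void)dataTypeSize1; (void)dataTypeSize2;
    (void)isUpdate; (void)isBasicBlock; (void)isFlashOutputBrc;
    softmaxFlashTiling.workSpaceSize = localWorkSpaceSize;
}

// Hypothetical bundle so that all three calls (and the kernel) see the same flags.
struct FlashV2Flags {
    bool isUpdate;
    bool isBasicBlock;
    bool isFlashOutputBrc;
};

// Hypothetical wrapper: queries both sizes, reserves the maximum, then fills the tiling.
inline uint32_t BuildTiling(const Shape& srcShape, uint32_t dataTypeSize1, uint32_t dataTypeSize2,
                            const FlashV2Flags& flags, SoftMaxTiling& tiling)
{
    const uint32_t minSize = GetSoftMaxFlashV2MinTmpSize(srcShape, dataTypeSize1, dataTypeSize2,
        flags.isUpdate, flags.isBasicBlock, flags.isFlashOutputBrc);
    const uint32_t maxSize = GetSoftMaxFlashV2MaxTmpSize(srcShape, dataTypeSize1, dataTypeSize2,
        flags.isUpdate, flags.isBasicBlock, flags.isFlashOutputBrc);
    assert(minSize <= maxSize);
    const uint32_t localWorkSpaceSize = maxSize;  // Simplest case: enough memory for the maximum.
    SoftMaxFlashV2TilingFunc(srcShape, dataTypeSize1, dataTypeSize2, localWorkSpaceSize, tiling,
        flags.isUpdate, flags.isBasicBlock, flags.isFlashOutputBrc);
    return localWorkSpaceSize;
}

}  // namespace sketch
```

In real host code the stand-ins would be dropped in favor of the toolkit's own ge::Shape, optiling::SoftMaxTiling, and tiling functions. The point of the flags bundle is that isUpdate, isBasicBlock, and isFlashOutputBrc are written once and reused, which makes it harder for the size queries, the tiling call, and the kernel to drift apart.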