SoftmaxFlashV2 Tiling

Function

Obtains the tiling parameters required by SoftmaxFlashV2.

Prototype

  • APIs for obtaining the minimum and maximum temporary space required for kernel computation
    uint32_t GetSoftMaxFlashV2MinTmpSize(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
    
    uint32_t GetSoftMaxFlashV2MaxTmpSize(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
  • Tiling computation APIs
    • Computation API in the AscendC::optiling namespace
      void SoftMaxFlashV2TilingFunc(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const uint32_t localWorkSpaceSize, optiling::SoftMaxTiling& softmaxFlashTiling, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
      
    • Computation API in the AscendC namespace
      void SoftMaxFlashV2TilingFunc(const ge::Shape& srcShape, const uint32_t dataTypeSize1, const uint32_t dataTypeSize2, const uint32_t localWorkSpaceSize, AscendC::tiling::SoftMaxTiling& softmaxFlashTiling, const bool isUpdate, const bool isBasicBlock = false, const bool isFlashOutputBrc = false)
      

Parameters

Table 1 GetSoftMaxFlashV2MinTmpSize/GetSoftMaxFlashV2MaxTmpSize API parameters

| Parameter | Input/Output | Description |
| --- | --- | --- |
| srcShape | Input | Shape of the input srcTensor. |
| dataTypeSize1 | Input | Size, in bytes, of the data type of the source data to be computed, for example, 2 for half. |
| dataTypeSize2 | Input | Size, in bytes, of the data type of expSumTensor and maxTensor involved in computation, for example, 2 for half. |
| isUpdate | Input | Whether to enable the refresh (update) function. The value must be consistent with the one used by the SoftmaxFlashV2 API in the kernel. |
| isBasicBlock | Input | Whether to enable basic block computation. The value can be obtained through the isBasicBlockInSoftmax API and must be the same as the corresponding template parameter of the API in the kernel. The default value is false. If the API in the kernel enables the SoftmaxConfig template parameter (that is, a constant shape is used), the value must be obtained through the isBasicBlockInSoftmax API. |
| isFlashOutputBrc | Input | Whether to enable the non-extended mode of the output shape. In non-extended mode, the output data is not broadcast, and the output shape is (m, 1). false (default): disables the non-extended mode; the output shape is (m, 8) when the output data type is float and (m, 16) when it is half. true: enables the non-extended mode; the output shape is (m, 1), and mode in SoftmaxConfig of the kernel API must be set to SoftmaxMode::SOFTMAX_OUTPUT_WITHOUT_BRC. |

Table 2 SoftMaxFlashV2TilingFunc API parameters

| Parameter | Input/Output | Description |
| --- | --- | --- |
| srcShape | Input | Shape of the input srcTensor. |
| localWorkSpaceSize | Input | Size, in bytes, of the remaining space that can be used for SoftmaxFlashV2 computation. The value must be greater than the minimum temporary space size returned by the GetSoftMaxFlashV2MinTmpSize API. |
| dataTypeSize1 | Input | Size, in bytes, of the data type of the source data to be computed, for example, 2 for half. |
| dataTypeSize2 | Input | Size, in bytes, of the data type of expSumTensor and maxTensor involved in computation, for example, 2 for half. |
| isUpdate | Input | Whether to enable the refresh (update) function. The value must be consistent with the one used by the SoftmaxFlashV2 API in the kernel. |
| isBasicBlock | Input | Whether to enable basic block computation. The value can be obtained through the isBasicBlockInSoftmax API and must be the same as the corresponding template parameter of the API in the kernel. The default value is false. If the API in the kernel enables the SoftmaxConfig template parameter (that is, a constant shape is used), the value must be obtained through the isBasicBlockInSoftmax API. |
| isFlashOutputBrc | Input | Whether to enable the non-extended mode of the output shape. In non-extended mode, the output data is not broadcast, and the output shape is (m, 1). false (default): disables the non-extended mode; the output shape is (m, 8) when the output data type is float and (m, 16) when it is half. true: enables the non-extended mode; the output shape is (m, 1), and mode in SoftmaxConfig of the kernel API must be set to SoftmaxMode::SOFTMAX_OUTPUT_WITHOUT_BRC. |
| softmaxFlashTiling | Output | Tiling information required by the SoftmaxFlashV2 API. Both the optiling::SoftMaxTiling and AscendC::tiling::SoftMaxTiling types are accepted. |
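The output shapes listed for isFlashOutputBrc can be summarized in a small lookup. The following is only an illustrative sketch: ExpectedOutputShape and its outputTypeSize parameter are hypothetical names that are not part of the library, and the sketch assumes that the output data type is identified by its size in bytes (4 for float, 2 for half).

```cpp
#include <cstdint>
#include <stdexcept>
#include <utility>

// Hypothetical helper, not part of the library: returns the (rows, columns)
// output shape that the parameter tables describe for isFlashOutputBrc.
// Assumption: outputTypeSize is the byte size of the output data type.
std::pair<uint32_t, uint32_t> ExpectedOutputShape(uint32_t m, uint32_t outputTypeSize, bool isFlashOutputBrc)
{
    if (isFlashOutputBrc) {
        return {m, 1U};  // non-extended mode: output is not broadcast
    }
    switch (outputTypeSize) {
        case 4U:
            return {m, 8U};   // float output
        case 2U:
            return {m, 16U};  // half output
        default:
            // Only float and half are described in the tables.
            throw std::invalid_argument("unsupported output data type size");
    }
}
```

A host-side check of this kind may help when allocating the output buffers, since the buffer size changes with the value of isFlashOutputBrc.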

Returns

GetSoftMaxFlashV2MinTmpSize returns the minimum size (in bytes) of the temporary space required for SoftmaxFlashV2 computation.

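GetSoftMaxFlashV2MaxTmpSize returns the maximum size (in bytes) of the temporary space required for SoftmaxFlashV2 computation.

SoftMaxFlashV2TilingFunc has no return value. The tiling result is written to softmaxFlashTiling.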
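The relationship between the size query and the tiling call can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's own flow: RunTilingIfSpaceAllows is a hypothetical helper, and the two callables stand in for GetSoftMaxFlashV2MinTmpSize and SoftMaxFlashV2TilingFunc so that the sketch compiles without the Ascend C headers. It only encodes the documented rule that localWorkSpaceSize must be greater than the minimum temporary space size.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the library.
// getMinTmpSize: callable returning the minimum temporary space in bytes
//                (stand-in for a call to GetSoftMaxFlashV2MinTmpSize).
// tilingFunc:    callable taking localWorkSpaceSize in bytes
//                (stand-in for a call to SoftMaxFlashV2TilingFunc).
// Returns false, without computing the tiling, when the available space
// does not exceed the minimum.
template <typename MinFn, typename TilingFn>
bool RunTilingIfSpaceAllows(MinFn getMinTmpSize, TilingFn tilingFunc, uint32_t localWorkSpaceSize)
{
    const uint32_t minTmpSize = getMinTmpSize();
    if (localWorkSpaceSize <= minTmpSize) {
        return false;  // documented requirement: must be greater than the minimum
    }
    tilingFunc(localWorkSpaceSize);
    return true;
}

// Possible use in host code (shown as a comment because it needs the real headers):
//   RunTilingIfSpaceAllows(
//       [&] { return GetSoftMaxFlashV2MinTmpSize(srcShape, 2, 2, isUpdate); },
//       [&](uint32_t size) {
//           SoftMaxFlashV2TilingFunc(srcShape, 2, 2, size, softmaxFlashTiling, isUpdate);
//       },
//       localWorkSpaceSize);
```

The value returned by GetSoftMaxFlashV2MaxTmpSize is not used in this sketch; it can serve as an upper bound when deciding how much space to reserve.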

Restrictions

None