GetCumSumMaxMinTmpSize
Function Description
CumSum computation in the kernel requires temporary space that the developer must reserve or allocate. This API is called on the host to obtain the maximum and minimum sizes of that temporary space. Developers select a size within this range, record it as a tiling parameter, and pass it to the kernel.
- To ensure correct results, the reserved or allocated temporary space must not be smaller than the minimum size.
- Between the minimum and the maximum, a larger temporary space can improve the performance of the API in the kernel to some extent. Choose the size according to the buffer space that is actually available.
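The selection rule above can be sketched as a small host-side helper. This is a minimal sketch, not part of the Ascend C API: `ChooseCumSumTmpSize` and `availableBytes` are hypothetical names, and the caller is assumed to already know how many bytes of buffer space remain free.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not part of the Ascend C API).
// Picks a temporary-space size from the [minValue, maxValue] range reported by
// GetCumSumMaxMinTmpSize, capped by the bytes that are still free.
// Returns false when even minValue does not fit; tmpSize is left unchanged then.
inline bool ChooseCumSumTmpSize(uint32_t minValue, uint32_t maxValue,
                                uint32_t availableBytes, uint32_t &tmpSize)
{
    if (availableBytes < minValue) {
        return false;  // below the minimum, correct computation is not guaranteed
    }
    // Space beyond maxValue is not used by the API, so never request more.
    const uint32_t capped = std::min(maxValue, availableBytes);
    // Guard against an unexpected maxValue < minValue.
    tmpSize = std::max(minValue, capped);
    return true;
}
```

The chosen `tmpSize` would then be stored in the tiling data and read by the kernel. How it is stored depends on the operator's own tiling structure.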
Prototype
```cpp
void GetCumSumMaxMinTmpSize(const ge::Shape &srcShape, const uint32_t typeSize, const bool isLastAxis, const bool isReuseSource, uint32_t &maxValue, uint32_t &minValue)
```
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| srcShape | Input | Shape of the input. |
| typeSize | Input | Size of the operator input data type, in bytes. For example, set this parameter to 2 when the input data type is half. |
| isLastAxis | Input | Whether to compute along the last axis (true) or the first axis (false). |
| isReuseSource | Input | Whether to reuse the space of the source operand. |
| maxValue | Output | Maximum size of the temporary space required by CumSum computation. Any space exceeding this value will not be utilized by the API. NOTE: maxValue is for reference only and may be larger than the available space of the Unified Buffer. In this case, select a proper temporary space size based on the remaining space of the Unified Buffer. |
| minValue | Output | Minimum size of the temporary space required by CumSum computation. To ensure correct results, the temporary space reserved or allocated for the API computation must not be smaller than this value. |
Returns
None
Availability
Constraints
- For details about the alignment requirements of the operand address offset, see General Restrictions.
- The input supports only the two-dimensional structure.
- The size of inner (the inner-axis length of the input) in bytes, that is, inner × typeSize, must be an integer multiple of 32 bytes.
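The shape constraints can be checked on the host before calling the API. The following is a minimal sketch under stated assumptions: `IsCumSumShapeSupported` is a hypothetical helper, not part of the Ascend C API, and it assumes that inner is the last dimension of the two-dimensional shape and that its size in bytes is the dimension multiplied by typeSize. The address-offset alignment requirements in General Restrictions are not covered here.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical pre-check (not part of the Ascend C API).
// Assumption: "inner" is the last dimension of a 2-D shape and its size in
// bytes is dims[1] * typeSize.
inline bool IsCumSumShapeSupported(const std::vector<int64_t> &dims, uint32_t typeSize)
{
    constexpr int64_t kAlignBytes = 32;
    if (dims.size() != 2 || typeSize == 0) {
        return false;  // only two-dimensional input is supported
    }
    if (dims[0] <= 0 || dims[1] <= 0) {
        return false;  // reject empty or negative dimensions
    }
    const int64_t innerBytes = dims[1] * static_cast<int64_t>(typeSize);
    return innerBytes % kAlignBytes == 0;
}
```

For a 32 × 32 input of type half, the inner size is 32 × 2 = 64 bytes, which satisfies the 32-byte requirement. An inner length of 8 with half gives 16 bytes and would be rejected.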
Example
```cpp
// The input is a 32 × 32 matrix and the operator input data type is half (typeSize = 2).
// isLastAxis is set to true and isReuseSource is set to false.
uint32_t firstDim = 32;
uint32_t lastDim = 32;
std::vector<int64_t> srcShapeDims = {firstDim, lastDim};
auto srcShape = ge::Shape(srcShapeDims);
uint32_t maxValue = 0;
uint32_t minValue = 0;
AscendC::GetCumSumMaxMinTmpSize(srcShape, 2, true, false, maxValue, minValue);
```