GetSelectWithBytesMaskMaxMinTmpSize
Function Usage
The SelectWithBytesMask computation in the kernel requires temporary space. These host-side APIs return the maximum and minimum sizes of that temporary space. Select a size within this range as a tiling parameter and pass it to the kernel.
- To ensure correct results, the allocated temporary space must not be smaller than the minimum size.
- Between the minimum and the maximum, a larger temporary space can improve the API's computing performance in the kernel to some extent. For better performance, choose the size based on the actual buffer usage.
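The selection rule described above can be sketched as a small host-side helper. This is a minimal illustration only: `ChooseTmpSize` and its `availableBytes` argument are hypothetical names that are not part of the Ascend C API, and the sketch assumes the caller already knows how many bytes of the Unified Buffer remain free.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the Ascend C API).
// Picks the largest temporary-space size in [minValue, maxValue] that still
// fits in availableBytes. Returns false when even minValue does not fit,
// in which case `chosen` is left unchanged.
bool ChooseTmpSize(uint32_t minValue, uint32_t maxValue,
                   uint32_t availableBytes, uint32_t &chosen)
{
    assert(minValue <= maxValue);
    if (availableBytes < minValue) {
        return false;  // below the minimum the computation is not guaranteed to be correct
    }
    // maxValue may exceed the free space, so cap it at what is available.
    chosen = std::min(maxValue, availableBytes);
    return true;
}
```

A caller would pass the `minValue` and `maxValue` obtained on the host together with its own estimate of the remaining buffer space, then write the chosen size into the tiling data.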
Prototype
uint32_t GetSelectWithBytesMaskMinTmpSize(const ge::Shape &src0Shape, const ge::Shape &src1Shape, const uint32_t srcTypeSize, const ge::Shape &maskShape, const uint32_t maskTypeSize, const bool isReuseMask)
uint32_t GetSelectWithBytesMaskMaxTmpSize(const ge::Shape &src0Shape, const ge::Shape &src1Shape, const uint32_t srcTypeSize, const ge::Shape &maskShape, const uint32_t maskTypeSize, const bool isReuseMask)
void GetSelectWithBytesMaskMaxMinTmpSize(const ge::Shape &src0Shape, const ge::Shape &src1Shape, const uint32_t srcTypeSize, const ge::Shape &maskShape, const uint32_t maskTypeSize, const bool isReuseMask, uint32_t &maxValue, uint32_t &minValue)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| src0Shape | Input | Shape information of input src0. When src0 is a scalar, the shape should be {1}. |
| src1Shape | Input | Shape information of input src1. When src1 is a scalar, the shape should be {1}. |
| srcTypeSize | Input | Data type size of input srcTensor, in bytes. For example, if the data type is half, set this parameter to 2. |
| maskShape | Input | Shape information of input maskTensor. |
| maskTypeSize | Input | Data type size of input maskTensor, in bytes. For example, if the data type is bool, set this parameter to 1. |
| isReuseMask | Input | Whether to reuse the space of input maskTensor. The value must be the same as that in the kernel. |
| maxValue | Output | Maximum size of the temporary space required by SelectWithBytesMask computation. NOTE: maxValue is for reference only and may be larger than the available space of the Unified Buffer. In this case, select a proper temporary space size based on the remaining space of the Unified Buffer. |
| minValue | Output | Minimum size of the temporary space required by SelectWithBytesMask computation. |
Returns
GetSelectWithBytesMaskMinTmpSize returns the minimum temporary space size required by SelectWithBytesMask computation.
GetSelectWithBytesMaskMaxTmpSize returns the maximum temporary space size required by SelectWithBytesMask computation.
GetSelectWithBytesMaskMaxMinTmpSize returns no value; the results are written to the maxValue and minValue output parameters.
Examples
#include <vector>

// src0 is a 64 x 128 tensor, src1 is a scalar, and the mask matches src0.
std::vector<int64_t> shape0Vec = {64, 128};
std::vector<int64_t> shape1Vec = {1};
std::vector<int64_t> maskVec = {64, 128};
ge::Shape src0Shape(shape0Vec);
ge::Shape src1Shape(shape1Vec);
ge::Shape maskShape(maskVec);
uint32_t maxValue = 0;
uint32_t minValue = 0;
// srcTypeSize = 2 (for example, half), maskTypeSize = 1 (for example, bool), isReuseMask = false.
AscendC::GetSelectWithBytesMaskMaxMinTmpSize(src0Shape, src1Shape, 2, maskShape, 1, false, maxValue, minValue);