UnPad Tiling

Function Usage

Obtains the tiling parameters required by the UnPad API.

Prototype

void GetUnPadMaxMinTmpSize(const platform_ascendc::PlatformAscendC &ascendcPlatform, const ge::Shape &srcShape, const uint32_t typeSize, uint32_t &maxValue, uint32_t &minValue)

void UnPadTilingFunc(const ge::Shape srcShape, const uint32_t stackBufferSize, const uint32_t typeSize, optiling::UnPadTiling& tiling)

Parameters

Table 1 GetUnPadMaxMinTmpSize API parameters

| Parameter | Input/Output | Meaning |
|---|---|---|
| ascendcPlatform | Input | Hardware platform information that is passed in. For details about the definition of PlatformAscendC, see Overview. |
| srcShape | Input | Shape of the input tensor, which is two-dimensional. |
| typeSize | Input | Data type size of operator inputs, in bytes. For example, if the data type of operator inputs is half, set this parameter to 2. |
| maxValue | Output | Maximum size of the temporary space required by UnPad computation. Any space beyond this value is not used by the API. Between the minimum and the maximum, a larger temporary space can improve the computing performance of the API on the kernel to some extent. To achieve better performance, reserve or allocate the space based on the actual buffer usage. If the maximum space size is 0, no temporary space is required. NOTE: maxValue is for reference only and may be larger than the available space of the Unified Buffer. In this case, select a proper temporary space size based on the remaining space of the Unified Buffer. |
| minValue | Output | Minimum size of the temporary space required by UnPad computation. To ensure correct functionality, the temporary space reserved or allocated for the API computation cannot be smaller than this value. If the minimum space size is 0, no temporary space is required. |
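
The rule described for maxValue and minValue can be sketched as a small host-side helper. This is only an illustration: `ChooseUnPadTmpSize` and `freeUbBytes` are hypothetical names that are not part of the Ascend C API, and the helper assumes the caller already knows how much Unified Buffer space is still free.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the Ascend C API): choose a temporary
// buffer size for UnPad from the [minValue, maxValue] range reported by
// GetUnPadMaxMinTmpSize and the Unified Buffer space that is still free.
// Returns false when even minValue does not fit; tmpSize is then untouched.
bool ChooseUnPadTmpSize(uint32_t minValue, uint32_t maxValue,
                        uint32_t freeUbBytes, uint32_t &tmpSize)
{
    assert(minValue <= maxValue);
    if (freeUbBytes < minValue) {
        return false; // below minValue the API cannot compute correctly
    }
    // Space above maxValue is never used by the API, so cap the request there.
    tmpSize = std::min(maxValue, freeUbBytes);
    return true;
}
```

When the helper succeeds, the resulting size is a candidate for the stackBufferSize argument of UnPadTilingFunc. When both values are 0, the helper returns a size of 0, matching the case in which no temporary space is required.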

Table 2 UnPadTilingFunc API parameters

| Parameter | Input/Output | Meaning |
|---|---|---|
| srcShape | Input | Shape of the input tensor, which is two-dimensional. |
| stackBufferSize | Input | Size of the space that can be used for UnPad computation. |
| typeSize | Input | Data type size of operator inputs, in bytes. For example, if the data type of operator inputs is half, set this parameter to 2. |
| tiling | Output | Tiling information required by the UnPad API. |

Returns

None

Example

The following example shows how to obtain the tiling parameters on the host and how to use them in the kernel when the UnPad high-level API is used. In this example, the source shape is [320, 64], the target shape after unpadding is [320, 63], and the input data type is half.
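
To make the shapes concrete, here is a host-side reference sketch of the result the example expects. It assumes that unpadding drops the trailing element of each row; the text above states only the two shapes, so which column is removed is an assumption. `UnPadReference` is a hypothetical name, and the code does not reflect how the kernel API is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side reference only: copy a row-major [height, srcWidth] buffer into
// [height, dstWidth], dropping the trailing (srcWidth - dstWidth) elements of
// every row. Assumes the padding sits at the end of each row.
template <typename T>
std::vector<T> UnPadReference(const std::vector<T> &src, size_t height,
                              size_t srcWidth, size_t dstWidth)
{
    assert(dstWidth <= srcWidth);
    assert(src.size() == height * srcWidth);
    std::vector<T> dst;
    dst.reserve(height * dstWidth);
    for (size_t row = 0; row < height; ++row) {
        const auto rowBegin = src.begin() + static_cast<std::ptrdiff_t>(row * srcWidth);
        dst.insert(dst.end(), rowBegin, rowBegin + static_cast<std::ptrdiff_t>(dstWidth));
    }
    return dst;
}
```

With a 2-byte element type standing in for half, a [320, 64] input occupies 320 × 64 × 2 = 40960 bytes and the [320, 63] output occupies 320 × 63 × 2 = 40320 bytes.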

  1. Add the UnPadTiling structure to the TilingData structure as a field.
    BEGIN_TILING_DATA_DEF(TilingData)               // Register a tiling class, using the tiling name as the input parameter.
      TILING_DATA_FIELD_DEF(uint32_t, totalLength); // Add a tiling field for the total data volume to compute.
      TILING_DATA_FIELD_DEF(uint32_t, tileNum);     // Add a tiling field for the total number of data blocks to compute on each core.
      ...                                           // Add other tiling fields.
      TILING_DATA_FIELD_DEF_STRUCT(UnPadTiling, unpadTilingData); // Add the UnPadTiling structure to the TilingData structure.
    END_TILING_DATA_DEF;
    
  2. In the tiling implementation function, first call GetUnPadMaxMinTmpSize to obtain the maximum and minimum temporary space sizes that the UnPad API requires to complete computation, and choose an appropriate size within this range based on the actual buffer usage. Then call UnPadTilingFunc to obtain the tiling parameters required by the UnPad kernel API based on the input shape and the size of the remaining computing space.
    namespace optiling {
    const uint32_t BLOCK_DIM = 8;
    const uint32_t TILE_NUM = 8;
    static ge::graphStatus TilingFunc(gert::TilingContext* context)
    {
        TilingData tiling;
        uint32_t totalLength = context->GetInputTensor(0)->GetShapeSize();
        context->SetBlockDim(BLOCK_DIM);
        tiling.set_totalLength(totalLength);
        tiling.set_tileNum(TILE_NUM);
        // Set other tiling parameters.
        ...
        std::vector<int64_t> shapeVec = {320,64};
        ge::Shape srcShape(shapeVec);
        uint32_t maxValue = 0;
        uint32_t minValue = 0;
        auto platformInfo = context->GetPlatformInfo();
        auto ascendcPlatform = platform_ascendc::PlatformAscendC(platformInfo);
        AscendC::GetUnPadMaxMinTmpSize(ascendcPlatform, srcShape, sizeof(half), maxValue, minValue);
        // For reference only: minValue is passed here to ensure correct functionality. Developers can pass a proper space size as required.
        const uint32_t localWorkSpaceSize = minValue;
        AscendC::UnPadTilingFunc(srcShape, localWorkSpaceSize, sizeof(half), tiling.unpadTilingData);
        ...
        tiling.SaveToBuffer(context->GetRawTilingData()->GetData(), context->GetRawTilingData()->GetCapacity());
        context->GetRawTilingData()->SetDataSize(tiling.GetDataSize());
        context->SetTilingKey(1);
        return ge::GRAPH_SUCCESS;
    }
    } // namespace optiling
    
  3. In the kernel function, call GET_TILING_DATA to obtain TilingData, and then pass the UnPad tiling information in TilingData to the UnPad API for computation. For the complete kernel example, see Example.
    extern "C" __global__ __aicore__ void func_custom(GM_ADDR x, GM_ADDR y, GM_ADDR z, GM_ADDR workspace, GM_ADDR tiling)
    {
        GET_TILING_DATA(tilingData, tiling);
        KernelFunc op;
        op.Init(x, y, z, tilingData.totalLength, tilingData.tileNum, tilingData.unpadTilingData);
        if (TILING_KEY_IS(1)) {
            op.Process();
        }
    }
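
The host-to-kernel handoff in steps 1 to 3 can be pictured with a plain C++ mock: a structure with a nested tiling field is copied into a raw buffer on the host and rebuilt from the same bytes on the kernel side. Everything here is a hypothetical stand-in. `MockUnPadTiling`, its fields, `MockSaveToBuffer`, and `MockLoadFromBuffer` are illustrative names, and the real UnPadTiling layout and the code generated by the tiling macros are defined by the library and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in; the real UnPadTiling fields are defined by the library.
struct MockUnPadTiling {
    uint32_t srcHeight;
    uint32_t srcWidth;
    uint32_t tmpBufferSize;
};

// Mirrors the shape of TilingData in step 1: scalar fields plus a nested struct.
struct MockTilingData {
    uint32_t totalLength;
    uint32_t tileNum;
    MockUnPadTiling unpadTilingData;
};

// Host side: copy the structure into the raw tiling buffer if it fits.
bool MockSaveToBuffer(const MockTilingData &tiling, void *buffer, size_t capacity)
{
    if (buffer == nullptr || capacity < sizeof(MockTilingData)) {
        return false;
    }
    std::memcpy(buffer, &tiling, sizeof(MockTilingData));
    return true;
}

// Kernel side: rebuild the structure from the raw bytes.
MockTilingData MockLoadFromBuffer(const void *buffer)
{
    MockTilingData tiling{};
    std::memcpy(&tiling, buffer, sizeof(MockTilingData));
    return tiling;
}
```

The point of the sketch is that the nested field travels with the rest of the tiling data as one block, which is why the kernel can read `tilingData.unpadTilingData` directly after GET_TILING_DATA.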