Instructions for Use
Ascend C provides a set of Matmul tiling APIs for obtaining the tiling parameters required for Matmul kernel computation. You only need to pass in information about matrices A, B, and C and call the corresponding APIs to obtain the parameters of the TCubeTiling structure used in Init.
Matmul tiling APIs are classified into Matmul single-core tiling APIs, multi-core tiling APIs, and BatchMatmul tiling APIs. The process of obtaining tiling parameters is as follows:
- Create a single-core tiling object, multi-core tiling object, or BatchMatmul tiling object.
- Set the type information of matrices A, B, and C and of the bias, as well as the shape information M, N, Ka, and Kb.
- Call the GetTiling API to obtain the tiling information.
The following provides examples of using Matmul single-core and multi-core tiling APIs as well as BatchMatmul tiling APIs to obtain tiling parameters:
- Matmul single-core tiling
    auto ascendcPlatform = platform_ascendc::PlatformAscendC(context->GetPlatformInfo());
    matmul_tiling::MatmulApiTiling tiling(ascendcPlatform);
    tiling.SetAType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT16);
    tiling.SetBType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT16);
    tiling.SetCType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT);
    tiling.SetBiasType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT);
    tiling.SetShape(1024, 1024, 1024);
    tiling.SetOrgShape(1024, 1024, 1024); // Ka and Kb can have different lengths, for example, tiling.SetOrgShape(1024, 1024, 1024, 1280).
    tiling.SetBias(true);
    tiling.SetBufferSpace(-1, -1, -1); // Set the space that can be used. By default, all space of the AI processor is used.
    optiling::TCubeTiling tilingData;
    int ret = tiling.GetTiling(tilingData); // if ret = -1, get tiling failed
- Multi-core tiling
    auto ascendcPlatform = platform_ascendc::PlatformAscendC(context->GetPlatformInfo());
    matmul_tiling::MultiCoreMatmulTiling tiling(ascendcPlatform);
    tiling.SetDim(1);
    tiling.SetAType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT16);
    tiling.SetBType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT16);
    tiling.SetCType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT);
    tiling.SetBiasType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT);
    tiling.SetShape(1024, 1024, 1024);
    tiling.SetSingleShape(1024, 1024, 1024);
    tiling.SetOrgShape(1024, 1024, 1024);
    tiling.SetBias(true);
    tiling.SetBufferSpace(-1, -1, -1); // Set the space that can be used. By default, all space of the AI processor is used.
    optiling::TCubeTiling tilingData;
    int ret = tiling.GetTiling(tilingData); // if ret = -1, get tiling failed
- BatchMatmul tiling
    auto ascendcPlatform = platform_ascendc::PlatformAscendC(context->GetPlatformInfo());
    matmul_tiling::BatchMatmulTiling bmmTiling(ascendcPlatform);
    bmmTiling.SetDim(1);
    bmmTiling.SetAType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT16);
    bmmTiling.SetBType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT16);
    bmmTiling.SetCType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT);
    bmmTiling.SetBiasType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT);
    bmmTiling.SetBias(true);
    bmmTiling.SetShape(1024, 1024, 1024);
    bmmTiling.SetSingleShape(1024, 1024, 1024);
    bmmTiling.SetOrgShape(1024, 1024, 1024);
    bmmTiling.SetBufferSpace(-1, -1, -1); // Set the space that can be used. By default, all space of the AI processor is used.
    optiling::TCubeTiling tilingData;
    int ret = bmmTiling.GetTiling(tilingData); // ret == -1 indicates that obtaining the tiling failed
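All three examples end by checking the value returned by GetTiling. The following is a minimal sketch of wrapping that check in a helper, assuming only what the comments above state, namely that -1 signals failure. MockTiling, MockTilingData, and TryGetTiling are hypothetical names introduced so that the sketch compiles without the Ascend C headers; they are not part of the matmul_tiling API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for optiling::TCubeTiling (NOT the real structure).
struct MockTilingData {
    int32_t baseM = 0;
};

// Hypothetical stand-in for a tiling object (NOT a real matmul_tiling class).
// It only mimics the convention used in the examples: GetTiling returns -1 on failure.
class MockTiling {
public:
    void SetShape(int32_t m, int32_t n, int32_t k) {
        assert(m > 0 && n > 0 && k > 0);
        shapeSet_ = true;
    }
    int GetTiling(MockTilingData& out) const {
        (void)out;                   // The mock does not fill in any parameters.
        return shapeSet_ ? 0 : -1;   // Fails when no shape has been set.
    }
private:
    bool shapeSet_ = false;
};

// Returns true when GetTiling did not report the failure value -1.
// Works with any type that exposes a GetTiling(TilingData&) member.
template <typename Tiling, typename TilingData>
bool TryGetTiling(Tiling& tiling, TilingData& tilingData) {
    return tiling.GetTiling(tilingData) != -1;
}
```

In a real tiling function, the same helper shape could be applied to a MatmulApiTiling, MultiCoreMatmulTiling, or BatchMatmulTiling object, with the caller returning early when it yields false.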
The API list is as follows:
Matmul single-core tiling APIs (MatmulApiTiling):

| API | Function |
|---|---|
| SetAType | Sets the position, data format, data type, and transpose status of matrix A. |
| SetBType | Sets the position, data format, data type, and transpose status of matrix B. |
| SetCType | Sets the position, data format, and data type of matrix C. |
| SetDequantType | Sets the dequantization mode. |
| SetBiasType | Sets the position, data format, and data type of the bias. |
| SetShape | Sets the shapes singleM, singleN, and singleK of a single Matmul computation. The unit is the number of elements. |
| SetOrgShape | Sets the original complete shapes M, N, Ka, and Kb during Matmul computation. The unit is the number of elements. |
| SetALayout | Sets the layout axis information of matrix A. |
| SetBLayout | Sets the layout axis information of matrix B. |
| SetCLayout | Sets the layout axis information of matrix C. |
| SetBatchInfoForNormal | Sets the M, N, and K axes and the batch sizes of matrix A and matrix B. |
| SetBatchNum | Sets the maximum number of batches for multi-batch computation. |
| EnableBias | Sets whether the bias is used in computation. |
| SetBias | Sets whether the bias is used in computation. |
| SetFixSplit | Sets the fixed baseM, baseN, and baseK. The unit is the number of elements. |
| SetBufferSpace | Sets the size of the available L1/L0C/UB space during Matmul computation. The unit is byte. |
| SetTraverse | Sets the traversal mode, that is, M axis first or N axis first. |
| SetMadType | Sets whether to enable the HF32 mode. Not supported in the current version. |
| SetSplitRange | Sets the maximum and minimum values of baseM, baseN, and baseK. |
| SetMatmulConfigParams | Customizes the MatmulConfig parameters. |
| SetDoubleBuffer | Determines whether to enable double buffer for A, B, C, and bias, and whether to enable ND2NZ or NZ2ND conversion. This API is reserved and not supported in the current version. |
| GetBaseM | Obtains the baseM value. |
| GetBaseN | Obtains the baseN value. |
| GetBaseK | Obtains the baseK value. |
| GetTiling | Obtains tiling parameters. |
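The APIs above mix two units: shapes and base blocks are given in elements, whereas SetBufferSpace takes bytes. The following sketch illustrates the arithmetic that relates them: how many base blocks cover a single Matmul computation, and how many bytes one base block of A occupies. BaseBlock, BlockCounts, CountBlocks, and ABlockBytes are hypothetical names, and the block sizes in the usage note are illustrative only; the actual baseM, baseN, and baseK are determined by GetTiling and this is not the algorithm it uses.

```cpp
#include <cassert>
#include <cstdint>

// Rounds total / base up to the next integer; both arguments must be positive.
inline int64_t CeilDiv(int64_t total, int64_t base) {
    assert(total > 0 && base > 0);
    return (total + base - 1) / base;
}

// Hypothetical container for base block sizes, in elements.
struct BaseBlock {
    int64_t baseM;
    int64_t baseN;
    int64_t baseK;
};

// Number of base blocks along each axis.
struct BlockCounts {
    int64_t mBlocks;
    int64_t nBlocks;
    int64_t kBlocks;
};

// Counts the base blocks needed to cover a singleM x singleN x singleK computation.
inline BlockCounts CountBlocks(int64_t singleM, int64_t singleN, int64_t singleK,
                               const BaseBlock& base) {
    return BlockCounts{CeilDiv(singleM, base.baseM),
                       CeilDiv(singleN, base.baseN),
                       CeilDiv(singleK, base.baseK)};
}

// Bytes occupied by one baseM x baseK block of matrix A for a given element size.
inline int64_t ABlockBytes(const BaseBlock& base, int64_t bytesPerElement) {
    assert(bytesPerElement > 0);
    return base.baseM * base.baseK * bytesPerElement;
}
```

For example, with an assumed base block of 128 x 256 x 64 elements, a 1024 x 1024 x 1024 computation is covered by 8, 4, and 16 blocks along M, N, and K, and one block of A in a 2-byte data type occupies 128 * 64 * 2 = 16384 bytes.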
Matmul multi-core tiling APIs (MultiCoreMatmulTiling):

| API | Function |
|---|---|
| SetDim | Sets the number of cores that can participate in multi-core Matmul computation. |
| SetSingleRange | Sets the maximum and minimum values of singleCoreM, singleCoreN, and singleCoreK. |
| SetSingleShape | Sets the shapes singleCoreM, singleCoreN, and singleCoreK of the Matmul single-core computation. The unit is the number of elements. |
| GetSingleShape | Obtains the computed singleCoreM, singleCoreN, and singleCoreK. |
| SetAlignSplit | Sets the singleCoreM, singleCoreN, and singleCoreK alignment values during multi-core tiling. |
| GetCoreNum | Obtains the blockDim used after multi-core tiling. |
| SetSplitK | Sets K-axis splitting in multi-core tiling. The EnableMultiCoreSplitK API is recommended. |
| EnableMultiCoreSplitK | Enables K-axis splitting in multi-core tiling. |
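To illustrate how the single-core shape relates to the number of cores, the following sketch counts the cores needed under a simplifying assumption: each core computes one singleCoreM x singleCoreN block of C and the K axis is not split. CoresNeeded and FitsInDim are hypothetical names; the actual split is decided by the multi-core tiling object, and the blockDim that is really used should be read back with GetCoreNum.

```cpp
#include <cassert>
#include <cstdint>

// Rounds total / part up to the next integer; both arguments must be positive.
inline int64_t CeilDiv(int64_t total, int64_t part) {
    assert(total > 0 && part > 0);
    return (total + part - 1) / part;
}

// Cores needed when each core handles one singleCoreM x singleCoreN block of C.
// Assumption of this sketch: the K axis is not split across cores.
inline int64_t CoresNeeded(int64_t m, int64_t n,
                           int64_t singleCoreM, int64_t singleCoreN) {
    return CeilDiv(m, singleCoreM) * CeilDiv(n, singleCoreN);
}

// True when the split fits in the number of cores made available,
// that is, the value that would be passed to SetDim.
inline bool FitsInDim(int64_t coresNeeded, int64_t dim) {
    assert(dim > 0);
    return coresNeeded <= dim;
}
```

With the values from the multi-core example above (shape 1024 x 1024, single-core shape 1024 x 1024), this yields one core, which matches SetDim(1). An assumed single-core shape of 512 x 256 would instead require 2 * 4 = 8 cores.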
BatchMatmul tiling APIs (BatchMatmulTiling):

| API | Function |
|---|---|
| GetCoreNum | Obtains the blockDim used after multi-core tiling. |