SetMatmulConfigParams

Function

Sets the MatmulConfig parameters listed in Table 1 for tiling computation. These parameters must behave identically on the tiling side and in the kernel, so the values passed to this API must match the MatmulConfig parameters used in the kernel. For details about the MatmulConfig parameters, see Table 2.

Prototype

void SetMatmulConfigParams(int32_t mmConfigTypeIn = 1, bool enableL1CacheUBIn = false, ScheduleType scheduleTypeIn = ScheduleType::INNER_PRODUCT, MatrixTraverse traverseIn = MatrixTraverse::NOSET, bool enVecND2NZIn = false)

void SetMatmulConfigParams(const MatmulConfigParams& configParams)

Parameters

Table 1 Parameters

Parameter

Input/Output

Description

mmConfigTypeIn

Input

Matmul template type. It must match the template used when the Matmul object is created. Currently, the value can only be 0 or 1.

  • 0: Norm template.
  • 1 (default): MDL template.

enableL1CacheUBIn

Input

Whether to cache UB computing blocks in L1. It can be enabled in scenarios where the MTE3 and MTE2 pipelines run serially.

  • false (default): does not cache UB computing blocks in L1.
  • true: caches UB computing blocks in L1.

For the Atlas A3 training products/Atlas A3 inference products, this parameter is not supported.

For the Atlas A2 training products/Atlas A2 inference products, this parameter is not supported.

For the Atlas inference product's AI Core, this parameter is supported.

For the Atlas 200I/500 A2 inference products, this parameter is not supported.

scheduleTypeIn

Input

Matmul data movement mode. The values are as follows:

  • ScheduleType::INNER_PRODUCT (default): performs MTE1 cyclic movement in the K direction.
  • ScheduleType::OUTER_PRODUCT: performs MTE1 cyclic movement in the M or N direction.
  • ScheduleType::N_BUFFER_33: indicates the data movement mode of the NBuffer33 template. MTE2 transfers 1 × 3 base blocks of matrix A each time until all 3 × 3 base blocks of matrix A are loaded to the L1 buffer.

traverseIn

Input

Iteration order in which Matmul traverses matrix C. After one iteration computes a matrix C slice of size [baseM, baseN], Matmul automatically offsets to the output position of the next slice in this order. The values are as follows:

enum class MatrixTraverse{
    NOSET = 0,   // Invalid currently.
    FIRSTM,      // Offset to the M-axis direction and then to the N-axis direction.
    FIRSTN,      // Offset to the N-axis direction and then to the M-axis direction.
};
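
To make the two traversal orders concrete, the following minimal sketch lists the order in which [baseM, baseN] slices of matrix C would be visited. It uses a local copy of the MatrixTraverse enum instead of the Ascend C headers, the helper BlockOrder is hypothetical, and it assumes that "offset to the M-axis direction first" means the M block index varies fastest. Treat it as an illustration of the enum comments, not as the library's implementation.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Local stand-in for the enum shown above, so the sketch compiles without the Ascend C headers.
enum class MatrixTraverse { NOSET = 0, FIRSTM, FIRSTN };

// Hypothetical helper (not part of Ascend C): returns the (mIdx, nIdx) block indices of
// matrix C in visiting order.
// Assumption: FIRSTM advances along M before moving along N; FIRSTN does the reverse.
// NOSET is documented as invalid, so this sketch returns an empty order for it.
std::vector<std::pair<int32_t, int32_t>> BlockOrder(MatrixTraverse traverse, int32_t mBlocks, int32_t nBlocks)
{
    std::vector<std::pair<int32_t, int32_t>> order;
    if (traverse == MatrixTraverse::FIRSTM) {
        for (int32_t n = 0; n < nBlocks; ++n) {
            for (int32_t m = 0; m < mBlocks; ++m) {  // M index varies fastest
                order.emplace_back(m, n);
            }
        }
    } else if (traverse == MatrixTraverse::FIRSTN) {
        for (int32_t m = 0; m < mBlocks; ++m) {
            for (int32_t n = 0; n < nBlocks; ++n) {  // N index varies fastest
                order.emplace_back(m, n);
            }
        }
    }
    return order;
}
```

Under these assumptions, a 2 × 2 grid of slices is visited as (0,0), (1,0), (0,1), (1,1) with FIRSTM and as (0,0), (0,1), (1,0), (1,1) with FIRSTN.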

enVecND2NZIn

Input

Whether to enable ND2NZ.

configParams

Input

Configuration parameters, of the MatmulConfigParams type. The structure is defined as follows. For details about its members, see Table 2.

struct MatmulConfigParams
{
    int32_t mmConfigType;
    bool enableL1CacheUB;
    ScheduleType scheduleType;
    MatrixTraverse traverse;
    bool enVecND2NZ;
    MatmulConfigParams(int32_t mmConfigTypeIn = 1, bool enableL1CacheUBIn = false,
        ScheduleType scheduleTypeIn = ScheduleType::INNER_PRODUCT, MatrixTraverse traverseIn = MatrixTraverse::NOSET,
        bool enVecND2NZIn = false) {
        mmConfigType = mmConfigTypeIn;
        enableL1CacheUB = enableL1CacheUBIn;
        scheduleType = scheduleTypeIn;
        traverse = traverseIn;
        enVecND2NZ = enVecND2NZIn;
    }
};
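
As a rough illustration of how the constructor defaults behave, the sketch below re-declares the structure with local stand-in enums so that it compiles without the Ascend C headers. The numeric values of ScheduleType and the helper SameConfig are assumptions made for this example only. It shows that passing only the first argument leaves the remaining members at their defaults.

```cpp
#include <cstdint>

// Local stand-ins for the Ascend C enums. Only the enumerator names come from this page;
// the numeric values of ScheduleType are an assumption.
enum class ScheduleType { INNER_PRODUCT, OUTER_PRODUCT, N_BUFFER_33 };
enum class MatrixTraverse { NOSET = 0, FIRSTM, FIRSTN };

// Re-declaration of the structure above, written with a member initializer list.
struct MatmulConfigParams {
    int32_t mmConfigType;
    bool enableL1CacheUB;
    ScheduleType scheduleType;
    MatrixTraverse traverse;
    bool enVecND2NZ;
    MatmulConfigParams(int32_t mmConfigTypeIn = 1, bool enableL1CacheUBIn = false,
        ScheduleType scheduleTypeIn = ScheduleType::INNER_PRODUCT,
        MatrixTraverse traverseIn = MatrixTraverse::NOSET, bool enVecND2NZIn = false)
        : mmConfigType(mmConfigTypeIn), enableL1CacheUB(enableL1CacheUBIn),
          scheduleType(scheduleTypeIn), traverse(traverseIn), enVecND2NZ(enVecND2NZIn) {}
};

// Hypothetical helper (not part of Ascend C): true when both parameter sets hold the
// same values, for example to check that two call sites agree on the configuration.
inline bool SameConfig(const MatmulConfigParams& a, const MatmulConfigParams& b)
{
    return a.mmConfigType == b.mmConfigType && a.enableL1CacheUB == b.enableL1CacheUB &&
           a.scheduleType == b.scheduleType && a.traverse == b.traverse &&
           a.enVecND2NZ == b.enVecND2NZ;
}
```

With this definition, MatmulConfigParams(0) selects the Norm template and keeps every other member at its default, while MatmulConfigParams{1, false, ScheduleType::OUTER_PRODUCT, MatrixTraverse::FIRSTM} sets the first four members explicitly and leaves enVecND2NZ as false.
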
Table 2 Parameters in the MatmulConfigParams structure

Parameter

Description

mmConfigType

Matmul template type. It must match the template used when the Matmul object is created. Currently, the value can only be 0 or 1.

  • 0: Norm template.
  • 1 (default): MDL template.

enableL1CacheUB

Whether to cache UB computing blocks in L1. It can be enabled in scenarios where the MTE3 and MTE2 pipelines run serially.

  • false (default): does not cache UB computing blocks in L1.
  • true: caches UB computing blocks in L1.

scheduleType

Matmul data movement mode. The values are as follows:

  • ScheduleType::INNER_PRODUCT (default): performs MTE1 cyclic movement in the K direction.
  • ScheduleType::OUTER_PRODUCT: performs MTE1 cyclic movement in the M or N direction.
  • ScheduleType::N_BUFFER_33: indicates the data movement mode of the NBuffer33 template. MTE2 transfers 1 × 3 base blocks of matrix A each time until all 3 × 3 base blocks of matrix A are loaded to the L1 buffer.

traverse

Iteration order in which Matmul traverses matrix C. After one iteration computes a matrix C slice of size [baseM, baseN], Matmul automatically offsets to the output position of the next slice in this order. The values are as follows:

enum class MatrixTraverse{
    NOSET = 0,   // Invalid currently.
    FIRSTM,      // Offset to the M-axis direction and then to the N-axis direction.
    FIRSTN,      // Offset to the N-axis direction and then to the M-axis direction.
};

enVecND2NZ

Whether to enable ND2NZ.

Returns

None

Restrictions

  • This API must be called before the GetTiling API.
  • If the Matmul object uses the NBuffer33 template policy (that is, MatmulPolicy is set to NBuffer33MatmulPolicy), call this API with scheduleTypeIn set to ScheduleType::N_BUFFER_33 before calling the GetTiling API. This enables the tiling generation logic of the NBuffer33 template policy.
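
One way to keep the tiling side in step with the kernel is to derive the tiling-side values from a single description of the kernel configuration. The sketch below is only an illustration: KernelSetup, TilingChoice, MakeTilingChoice, and the local stand-in enum are hypothetical and not part of Ascend C. It encodes the second restriction: when the kernel uses NBuffer33MatmulPolicy, the schedule type to pass to SetMatmulConfigParams is ScheduleType::N_BUFFER_33.

```cpp
#include <cstdint>

// Local stand-in for the Ascend C enum; only the enumerator names come from this page.
enum class ScheduleType { INNER_PRODUCT, OUTER_PRODUCT, N_BUFFER_33 };

// Hypothetical description of how the kernel-side Matmul object is configured.
struct KernelSetup {
    int32_t templateType;      // 0: Norm template, 1: MDL template
    bool usesNBuffer33Policy;  // true when MatmulPolicy is NBuffer33MatmulPolicy
};

// Hypothetical pair of values to forward to SetMatmulConfigParams.
struct TilingChoice {
    int32_t mmConfigType;
    ScheduleType scheduleType;
};

// Derives the tiling-side values from the kernel description so the two cannot drift apart.
// The requested schedule type is overridden when the NBuffer33 policy is in use.
inline TilingChoice MakeTilingChoice(const KernelSetup& kernel,
                                     ScheduleType requested = ScheduleType::INNER_PRODUCT)
{
    TilingChoice choice{kernel.templateType, requested};
    if (kernel.usesNBuffer33Policy) {
        choice.scheduleType = ScheduleType::N_BUFFER_33;
    }
    return choice;
}
```

In a real tiling function, the resulting values would be passed to SetMatmulConfigParams before GetTiling is called, in line with the first restriction.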

Example

// Create the tiling object from the platform information of the current context.
auto ascendcPlatform = platform_ascendc::PlatformAscendC(context->GetPlatformInfo());
matmul_tiling::MatmulApiTiling tiling(ascendcPlatform);
// Describe the position, format, and data type of matrices A, B, C, and the bias.
tiling.SetAType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT16);
tiling.SetBType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT16);
tiling.SetCType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT);
tiling.SetBiasType(matmul_tiling::TPosition::GM, matmul_tiling::CubeFormat::ND, matmul_tiling::DataType::DT_FLOAT);
tiling.SetShape(1024, 1024, 1024);
tiling.SetOrgShape(1024, 1024, 1024);
tiling.SetBias(true);
tiling.SetBufferSpace(-1, -1, -1);
// Select the Norm template (mmConfigTypeIn = 0); the other parameters keep their defaults.
tiling.SetMatmulConfigParams(0);
// Alternatively, pass a MatmulConfigParams structure:
// tiling.SetMatmulConfigParams({1, false, ScheduleType::OUTER_PRODUCT, MatrixTraverse::FIRSTM});
// GetTiling must be called after SetMatmulConfigParams.
optiling::TCubeTiling tilingData;
int ret = tiling.GetTiling(tilingData);