GetBasicConfig
Applicability
| Product | Supported |
|---|---|
| | √ |
| | √ |
| | x |
| | x |
| | x |
| | x |
Function
Configures the parameters of the BasicBlock template and obtains the custom BasicBlock template. For details about the BasicBlock template, see Table 1.
Constant templates are recommended. The BasicBlock template makes only baseM, baseN, and baseK constant, whereas constant templates can also make singleCoreM, singleCoreN, and singleCoreK constant. For details about how to implement constant templates, see Function.
Prototype
__aicore__ constexpr MatmulConfig GetBasicConfig(const uint32_t basicM, const uint32_t basicN, const uint32_t basicK, const bool intrinsicsLimit = false, const bool batchLoop = false, const BatchMode bmmMode = BatchMode::BATCH_LESS_THAN_L1)
Parameters
Each parameter of this API sets the corresponding parameter of the MatmulConfig structure and has the same meaning as that parameter.
| Parameter | Input/Output | Description |
|---|---|---|
| basicM | Input | Sets the basicM parameter. Equivalent to the baseM parameter in the TCubeTiling structure. It indicates the length of the M axis of a base block during Matmul computation, in elements. |
| basicN | Input | Sets the basicN parameter. Equivalent to the baseN parameter in the TCubeTiling structure. It indicates the length of the N axis of a base block during Matmul computation, in elements. |
| basicK | Input | Sets the basicK parameter. Equivalent to the baseK parameter in the TCubeTiling structure. It indicates the length of the K axis of a base block during Matmul computation, in elements. |
| intrinsicsLimit | Input | Sets the intrinsicsCheck parameter. Specifies whether to enable cyclic data move-in from the Global Memory to the L1 Buffer when the inner axis (last axis) of the left or right matrix on a single core is greater than or equal to 65535 (number of elements). For example, for the left matrix A [M, K], if singleCoreK of the inner axis on a single core is greater than 65535 and this parameter is set to true, data is moved in cyclically in the API. Values: |
| batchLoop | Input | Sets the isNBatch parameter. Specifies whether to enable multi-batch input and output. This parameter is valid only for BatchMatmul. When it is enabled, only the Norm template is supported, and IterateNBatch must be called to implement multi-batch input and output. Values: |
| bmmMode | Input | Sets the batchMode parameter. This parameter is used in the BatchMatmul scenario. For details about BatchMatmul, see Batch Matmul basic functions. It specifies the relationship between the total amount of multi-batch data of input matrices A and B and the size of the L1 Buffer when the layout type is set to Normal. Values: |
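The parameter descriptions above map each argument onto a MatmulConfig parameter with a different name in three cases (intrinsicsLimit to intrinsicsCheck, batchLoop to isNBatch, bmmMode to batchMode). The following standalone sketch illustrates that mapping only. It is not the Ascend C implementation: `MockMatmulConfig`, `MockGetBasicConfig`, and the single-enumerator `BatchMode` are hypothetical stand-ins that mirror the names in the prototype, and the real MatmulConfig structure may contain further members.

```cpp
#include <cstdint>

// Hypothetical stand-in: only the enumerator that appears in the prototype is mirrored.
enum class BatchMode : uint8_t { BATCH_LESS_THAN_L1 };

// Hypothetical stand-in holding only the MatmulConfig parameters named in the table.
struct MockMatmulConfig {
    uint32_t basicM;
    uint32_t basicN;
    uint32_t basicK;
    bool intrinsicsCheck;
    bool isNBatch;
    BatchMode batchMode;
};

// Mock of the documented signature: each argument is copied into the
// parameter the table says it sets.
constexpr MockMatmulConfig MockGetBasicConfig(
    const uint32_t basicM, const uint32_t basicN, const uint32_t basicK,
    const bool intrinsicsLimit = false, const bool batchLoop = false,
    const BatchMode bmmMode = BatchMode::BATCH_LESS_THAN_L1)
{
    return MockMatmulConfig{basicM, basicN, basicK, intrinsicsLimit, batchLoop, bmmMode};
}

// The result is usable in constant expressions, as with a template argument.
constexpr MockMatmulConfig kCfg = MockGetBasicConfig(128, 256, 64, true);
static_assert(kCfg.basicM == 128 && kCfg.basicN == 256 && kCfg.basicK == 64,
              "base block shape is carried through unchanged");
static_assert(kCfg.intrinsicsCheck, "intrinsicsLimit sets intrinsicsCheck");
static_assert(!kCfg.isNBatch, "batchLoop defaults to false");
```

Because the function is constexpr, a mismatch between the intended and the configured values can be caught with static_assert at compile time rather than on the device.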
Returns
A MatmulConfig structure.
Restrictions
- When this API is used, the base block sizes baseM and baseN must meet the following requirements: singleCoreM must be exactly divisible by baseM, and singleCoreN must be exactly divisible by baseN.
- The values of basicM, basicN, and basicK passed to this API must match the baseM, baseN, and baseK values described in atlasascendc_api_07_0673.html#EN-US_TOPIC_0000002502733188__p17899165811566.
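When the shapes are known at compile time, the two restrictions above can be checked before the configuration is used. The following is a minimal sketch under that assumption. `TilingShape` and `IsBasicBlockShapeValid` are hypothetical names introduced for illustration and are not part of Ascend C; in a real operator the values would come from the tiling data.

```cpp
#include <cstdint>

// Hypothetical holder for the tiling values the restrictions refer to.
struct TilingShape {
    uint32_t singleCoreM;
    uint32_t singleCoreN;
    uint32_t baseM;
    uint32_t baseN;
    uint32_t baseK;
};

// Returns true only if both restrictions hold:
//   1. singleCoreM is a multiple of baseM and singleCoreN is a multiple of baseN.
//   2. basicM, basicN, and basicK equal baseM, baseN, and baseK.
constexpr bool IsBasicBlockShapeValid(const TilingShape& t, uint32_t basicM,
                                      uint32_t basicN, uint32_t basicK)
{
    return t.baseM != 0 && t.baseN != 0 &&
           t.singleCoreM % t.baseM == 0 &&
           t.singleCoreN % t.baseN == 0 &&
           basicM == t.baseM && basicN == t.baseN && basicK == t.baseK;
}

// Example values chosen for illustration: 512 = 4 * 128 and 1024 = 4 * 256.
constexpr TilingShape kTiling{512, 1024, 128, 256, 64};
static_assert(IsBasicBlockShapeValid(kTiling, 128, 256, 64),
              "shape satisfies both restrictions");
static_assert(!IsBasicBlockShapeValid(kTiling, 128, 128, 64),
              "basicN differs from baseN, so the shape is rejected");
```

The same predicate can be called at run time on the host when the tiling values are not compile-time constants.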
Examples
For a complete usage example of the BasicBlock template, see basic_block_matmul.
constexpr MatmulConfig MM_CFG = GetBasicConfig(128, 256, 64); // baseM, baseN, baseK
AscendC::Matmul<A_TYPE, B_TYPE, C_TYPE, BIAS_TYPE, MM_CFG> mm;
REGIST_MATMUL_OBJ(&pipe, GetSysWorkSpacePtr(), mm, &tiling);
mm.SetTensorA(gm_a);
mm.SetTensorB(gm_b);
mm.SetBias(gm_bias);
mm.IterateAll(gm_c);