GetSpecialMDLConfig

Applicability

Product	Supported
Atlas A3 training products / Atlas A3 inference products	√
Atlas A2 training products / Atlas A2 inference products	√
Atlas 200I/500 A2 inference products	x
AI Core of Atlas inference products	x
Vector Core of Atlas inference products	x
Atlas training products	x

Function

Configures the parameters of the SpecialMDL template and obtains the custom SpecialMDL template. For details about the SpecialMDL template, see Table 1.

Prototype

      
__aicore__ constexpr MatmulConfig GetSpecialMDLConfig(const bool intrinsicsLimit = false, const bool batchLoop = false, const uint32_t doMTE2Preload = 0, const bool isVecND2NZ = false, bool isPerTensor = false, bool hasAntiQuantOffset = false)

Parameters

Each parameter of this API sets the corresponding member of the MatmulConfig structure and has the same meaning as that member.

**Table 1** API parameters
Parameter	Input/Output	Description
intrinsicsLimit	Input	Sets the intrinsicsCheck parameter. Specifies whether data is moved in cyclically from the Global Memory to the L1 Buffer when the inner axis (last axis) of the left or right matrix on a single core is greater than or equal to 65535 elements. For example, for the left matrix A [M, K], if the inner axis singleCoreK on a single core is greater than or equal to 65535 and this parameter is set to true, the API moves the data in cyclically. Values: false (default): data is not moved in cyclically when the inner axis reaches this size. true: data is moved in cyclically when the inner axis reaches this size.
batchLoop	Input	Sets the isNBatch parameter. Whether to enable multi-batch input and output. This parameter is valid only for BatchMatmul. After this parameter is enabled, only the Norm template is supported, and IterateNBatch needs to be called to implement multi-batch input and output. Values: false (default): disables the multi-batch function. true: enables the multi-batch function.
doMTE2Preload	Input	Sets the doMTE2Preload parameter. Specifies whether to enable preloading in the M or N direction. Use it when the MTE2 pipeline gap is large and the M or N value is large; enabling it reduces the MTE2 pipeline gap and improves performance. Preloading is valid only for the MDL template. Values: 0 (default): disables preloading. 1: enables preloading in the M direction. 2: enables preloading in the N direction. Note: When preloading in the M or N direction is enabled, ensure that the data is fully loaded in the K direction and that DoubleBuffer is enabled in the M or N direction. The condition for full load with M-direction preloading is that singleCoreK/baseK is less than or equal to stepKa; with N-direction preloading, it is that singleCoreK/baseK is less than or equal to stepKb. For details about how to use this parameter, see Matmul operator sample for preloading in the M and N directions.
isVecND2NZ	Input	Reserved parameter. Retain the default value.
isPerTensor	Input	Sets the isPerTensor parameter. Specifies whether matrix B is quantized per tensor or per channel when the input type of matrix A is half and the input type of matrix B is int8_t. Values: true: quantization is per tensor. false (default): quantization is per channel.
hasAntiQuantOffset	Input	Sets the hasAntiQuantOffset parameter. Specifies whether the offset coefficient is used for matrix B quantization when the input type of matrix A is half and the input type of matrix B is int8_t. The default value is false.
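
The numeric conditions in Table 1 can be checked on the host before a configuration is chosen. The following is a minimal sketch in plain C++, not part of the Ascend C API: the helper names (NeedsCyclicMoveIn, IsKFullLoad, CeilDiv) and the constant kInnerAxisLimit are hypothetical, and rounding singleCoreK/baseK up to a whole number of blocks is an assumption, since the table does not state how a remainder is handled.

```cpp
#include <cstdint>

// Hypothetical helpers that mirror the conditions in Table 1.
// They are illustrative only and are not Ascend C APIs.

// Inner-axis threshold (in elements) from the intrinsicsLimit description.
constexpr uint32_t kInnerAxisLimit = 65535;

// True when the single-core inner axis is large enough that
// intrinsicsLimit = true (cyclic move-in) becomes relevant.
constexpr bool NeedsCyclicMoveIn(uint32_t innerAxisElems) {
    return innerAxisElems >= kInnerAxisLimit;
}

// Ceiling division. Assumption: a partial baseK block counts as a full step.
constexpr uint32_t CeilDiv(uint32_t a, uint32_t b) {
    return (a + b - 1) / b;
}

// Full load in the K direction: singleCoreK / baseK <= stepK.
// Pass stepKa for M-direction preloading and stepKb for N-direction preloading.
constexpr bool IsKFullLoad(uint32_t singleCoreK, uint32_t baseK, uint32_t stepK) {
    return CeilDiv(singleCoreK, baseK) <= stepK;
}

// Example values chosen for illustration only.
static_assert(NeedsCyclicMoveIn(65535), "threshold is inclusive");
static_assert(!NeedsCyclicMoveIn(65534), "below the threshold");
static_assert(IsKFullLoad(1024, 256, 4), "4 blocks fit in 4 steps");
static_assert(!IsKFullLoad(1024, 256, 2), "4 blocks do not fit in 2 steps");
```

If IsKFullLoad holds for stepKa, passing 1 as doMTE2Preload is a candidate; if it holds for stepKb, passing 2 is a candidate. DoubleBuffer must also be enabled in the corresponding direction, which this sketch does not check.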

Returns

MatmulConfig structure

Restrictions

None

Example

      
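// Obtain the SpecialMDL template with all parameters left at their defaults.
constexpr MatmulConfig MM_CFG = GetSpecialMDLConfig();
AscendC::Matmul<A_TYPE, B_TYPE, C_TYPE, BIAS_TYPE, MM_CFG> mm;
REGIST_MATMUL_OBJ(&pipe, GetSysWorkSpacePtr(), mm, &tiling);
// Set the inputs, then compute the full result into gm_c.
mm.SetTensorA(gm_a);
mm.SetTensorB(gm_b);
mm.SetBias(gm_bias);
mm.IterateAll(gm_c);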
