GetIBShareNormConfig

Applicability

Product	Supported
Atlas A3 training products / Atlas A3 inference products	√
Atlas A2 training products / Atlas A2 inference products	√
Atlas 200I/500 A2 inference products	x
AI Core of Atlas inference products	x
Vector Core of Atlas inference products	x
Atlas training products	x

Function

Configures the parameters of the IBShare template and obtains the custom IBShare template. For details about the IBShare template, see Table 1.

Prototype

      
__aicore__ constexpr MatmulConfig GetIBShareNormConfig(
    const bool intrinsicsLimit = false,
    const bool batchLoop = false,
    const bool isVecND2NZ = false,
    const BatchMode bmmMode = BatchMode::BATCH_LESS_THAN_L1,
    const bool isDoubleCache = false,
    const bool enUnitFlag = true)

Parameters

Each parameter of this API sets one field of the MatmulConfig structure and has the same meaning as that field.

**Table 1** API parameters
Parameter	Input/Output	Description
intrinsicsLimit	Input	Sets the intrinsicsCheck parameter. Specifies whether to move data cyclically from the Global Memory to the L1 Buffer when the inner axis (last axis) of the left or right matrix on a single core is greater than or equal to 65535 elements. For example, for the left matrix A [M, K], if the inner axis singleCoreK on a single core is greater than or equal to 65535 and this parameter is set to true, the API moves the data in cyclically. Values: false (default): data is not moved in cyclically when the inner axis of the left or right matrix on a single core is greater than or equal to 65535. true: data is moved in cyclically when the inner axis of the left or right matrix on a single core is greater than or equal to 65535.
batchLoop	Input	Sets the isNBatch parameter. Whether to enable multi-batch input and output. This parameter is valid only for BatchMatmul. After this parameter is enabled, only the Norm template is supported, and IterateNBatch needs to be called to implement multi-batch input and output. Values: false (default): disables the multi-batch function. true: enables the multi-batch function.
isVecND2NZ	Input	Sets the enVecND2NZ parameter. Whether to enable ND2NZ (converting data from ND format to NZ format) using vector instructions. To enable this function, you need to call SetLocalWorkspace. Values: false (default): disables ND2NZ using vector instructions. true: enables ND2NZ using vector instructions. For the AI Core of Atlas inference products, when the Unified Buffer space is sufficient (greater than twice the value of transLength in TCubeTiling), you are advised to enable this parameter for better data movement.
bmmMode	Input	Sets the batchMode parameter. This parameter is used in the BatchMatmul scenario. For details about BatchMatmul, see Basic Functions of Batch Matmul. When the layout type is set to Normal in the BatchMatmul scenario, this parameter specifies the relationship between the total amount of multi-batch data of input matrices A and B and the size of the L1 Buffer. Values: BatchMode::BATCH_LESS_THAN_L1 (default): total amount of multi-batch data < size of L1 Buffer. BatchMode::BATCH_LARGE_THAN_L1: total amount of multi-batch data > size of L1 Buffer. BatchMode::SINGLE_LARGE_THAN_L1: total amount of single-batch data > size of L1 Buffer.
isDoubleCache	Input	Sets the enableDoubleCache parameter. Whether to cache two blocks in L1 Buffer after the IBShare template is enabled. Values: false (default): caches one block in L1 Buffer. true: caches two blocks in L1 Buffer. Note: If this parameter is set to true, the base block size must be controlled to ensure that the cached data blocks do not exceed the L1 Buffer capacity.
enUnitFlag	Input	Sets the enUnitFlag parameter. Whether to enable the UnitFlag function to allow parallel execution of computation and data movement for performance improvement. By default, the function is enabled when the Norm and IBShare templates are used and disabled when the MDL template is used. Values: false: disables the UnitFlag function. true: enables the UnitFlag function.
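
The parameter-to-field mapping in Table 1 can be illustrated with a small stand-alone mock. This is only a sketch and does not use the Ascend C headers: MatmulConfigSketch, BatchModeSketch, and GetIBShareNormConfigSketch are hypothetical names introduced here, and the mock models only the six fields named in the table. The real MatmulConfig structure has other fields, and the real API may set them in ways this sketch does not show.

```cpp
// Stand-alone mock, NOT the Ascend C library. All names ending in "Sketch"
// are hypothetical; only the field names are taken from Table 1.
enum class BatchModeSketch {
    BATCH_LESS_THAN_L1,
    BATCH_LARGE_THAN_L1,
    SINGLE_LARGE_THAN_L1
};

// Models only the six MatmulConfig fields that this API's parameters set.
struct MatmulConfigSketch {
    bool intrinsicsCheck;        // set from intrinsicsLimit
    bool isNBatch;               // set from batchLoop
    bool enVecND2NZ;             // set from isVecND2NZ
    BatchModeSketch batchMode;   // set from bmmMode
    bool enableDoubleCache;      // set from isDoubleCache
    bool enUnitFlag;             // set from enUnitFlag
};

// Same parameter order and defaults as the prototype above.
constexpr MatmulConfigSketch GetIBShareNormConfigSketch(
    bool intrinsicsLimit = false,
    bool batchLoop = false,
    bool isVecND2NZ = false,
    BatchModeSketch bmmMode = BatchModeSketch::BATCH_LESS_THAN_L1,
    bool isDoubleCache = false,
    bool enUnitFlag = true)
{
    return MatmulConfigSketch{intrinsicsLimit, batchLoop, isVecND2NZ,
                              bmmMode, isDoubleCache, enUnitFlag};
}

// Default configuration: every switch off except UnitFlag.
constexpr MatmulConfigSketch kDefaultCfg = GetIBShareNormConfigSketch();

// Because the arguments are positional, the first four must be spelled out
// to reach isDoubleCache.
constexpr MatmulConfigSketch kDoubleCacheCfg = GetIBShareNormConfigSketch(
    false, false, false, BatchModeSketch::BATCH_LESS_THAN_L1, true);

static_assert(!kDefaultCfg.enableDoubleCache && kDefaultCfg.enUnitFlag,
              "defaults follow the prototype");
static_assert(kDoubleCacheCfg.enableDoubleCache,
              "fifth argument maps to enableDoubleCache");
```

Since the function is constexpr, the resulting configuration is a compile-time constant, which is what allows it to be passed as a template argument in the example in this topic.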

Returns

MatmulConfig structure

Restrictions

The IBShare template applies only to the MIX scenario and does not support the CUBE_ONLY scenario.

Example

      
// Obtain the IBShare template configuration with all default parameters.
constexpr MatmulConfig MM_CFG = GetIBShareNormConfig();
typedef AscendC::MatmulType<AscendC::TPosition::GM, CubeFormat::ND, half> aType;
// For matrix B, the trailing `true` is the IBSHARE template argument of MatmulType.
typedef AscendC::MatmulType<AscendC::TPosition::GM, CubeFormat::ND, half, true, LayoutMode::NONE, true> bType;
typedef AscendC::MatmulType<AscendC::TPosition::GM, CubeFormat::ND, float> cType;
typedef AscendC::MatmulType<AscendC::TPosition::GM, CubeFormat::ND, float> biasType;
// Use the typedef names declared above as the Matmul template arguments.
AscendC::Matmul<aType, bType, cType, biasType, MM_CFG> mm;
REGIST_MATMUL_OBJ(&pipe, GetSysWorkSpacePtr(), mm, &tiling);
mm.SetTensorA(gm_a);
mm.SetTensorB(gm_b);
mm.SetBias(gm_bias);
mm.IterateAll(gm_c);
