How to Enable Vector Cores During Vector Programming
For
This section describes how to enable Vector Cores in
- On the kernel side, use the macro KERNEL_TASK_TYPE_DEFAULT to enable Vector Cores when developing an operator. When the operator is executed, both AI Cores and Vector Cores are started, and the AI Cores are used as Vector Cores. The following code example shows how to enable Vector Cores:
    extern "C" __global__ __aicore__ void add_custom(__gm__ uint8_t *x, __gm__ uint8_t *y, __gm__ uint8_t *z, __gm__ uint8_t *workspace, __gm__ uint8_t *tiling)
    {
        GET_TILING_DATA(tilingData, tiling);
        if (workspace == nullptr) {
            return;
        }
        GM_ADDR usr = AscendC::GetUserWorkspace(workspace);
        KernelAdd op;
        op.Init(x, y, z, tilingData.blockDim, tilingData.totalLength, tilingData.tileNum);
        KERNEL_TASK_TYPE_DEFAULT(KERNEL_TYPE_MIX_VECTOR_CORE); // Enable Vector Cores.
        if (TILING_KEY_IS(1)) {
            op.Process1();
        } else if (TILING_KEY_IS(2)) {
            op.Process2();
        }
        // ...
    }
- In tiling development on the host, blockDim specifies the total number of AI Cores and Vector Cores to start. For example, if blockDim is set to 10 on the host, a total of 10 AI Cores and Vector Cores are started. To ensure that Vector Cores are started, set blockDim to a value greater than the number of AI Cores. Call the GetCoreNumAic API to obtain the number of AI Cores and the GetCoreNumVector API to obtain the number of Vector Cores. The following code snippets are examples for the kernel launch project and the custom operator project. In both, blockDim is set to the total number of AI Cores and Vector Cores, so all AI Cores and Vector Cores are started.
- Kernel launch project
    auto ascendcPlatform = platform_ascendc::PlatformAscendCManager::GetInstance();
    auto totalCoreNum = ascendcPlatform.GetCoreNumAic();
    // Replace ASCENDXXX with the actual version number.
    if (ascendcPlatform.GetSocVersion() == platform_ascendc::SocVersion::ASCENDXXX) {
        totalCoreNum = totalCoreNum + ascendcPlatform.GetCoreNumVector();
    }
    ...
    kernel_name<<<totalCoreNum, l2ctrl, stream>>>(argument list);
- Custom operator project
    // Example of a matching tiling function on the host:
    ge::graphStatus TilingFunc(gert::TilingContext* context)
    {
        // Enable Vector Cores and set blockDim to the sum of vector cores in AI Cores and Vector Cores.
        auto ascendcPlatform = platform_ascendc::PlatformAscendC(platformInfo);
        auto totalCoreNum = ascendcPlatform.GetCoreNumAic();
        // Replace ASCENDXXX with the actual version number.
        if (ascendcPlatform.GetSocVersion() == platform_ascendc::SocVersion::ASCENDXXX) {
            totalCoreNum = totalCoreNum + ascendcPlatform.GetCoreNumVector();
        }
        context->SetBlockDim(totalCoreNum);
        // A non-void function must return a value; report success to the framework.
        return ge::GRAPH_SUCCESS;
    }
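The blockDim rule described above can be checked on the host without any device. The following is a minimal sketch only: `CoreCounts`, `ComputeBlockDim`, and `StartsVectorCores` are hypothetical helper names that are not part of the Ascend C API, and the core counts used with them are made-up values standing in for the results of GetCoreNumAic and GetCoreNumVector.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the values returned by GetCoreNumAic() and
// GetCoreNumVector(); not part of the Ascend C API.
struct CoreCounts {
    uint32_t aiCores;
    uint32_t vectorCores;
};

// Returns the blockDim that starts every AI Core and, when Vector Cores are
// enabled, every Vector Core as well.
inline uint32_t ComputeBlockDim(const CoreCounts& counts, bool vectorCoresEnabled) {
    uint32_t blockDim = counts.aiCores;
    if (vectorCoresEnabled) {
        blockDim += counts.vectorCores;
    }
    return blockDim;
}

// Encodes the rule from the text: Vector Cores can be started only when
// blockDim is greater than the number of AI Cores.
inline bool StartsVectorCores(const CoreCounts& counts, uint32_t blockDim) {
    return blockDim > counts.aiCores;
}
```

For instance, with an assumed 8 AI Cores and 7 Vector Cores, `ComputeBlockDim` yields 15 when Vector Cores are enabled, whereas a blockDim of 8 would leave the Vector Cores idle according to `StartsVectorCores`.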
- Check whether the API supports the Atlas inference product's Vector Core by referring to the models supported in Ascend C API.
- After Vector Cores are enabled, inter-core synchronization instructions (such as IBSet, IBWait, and SyncAll) are not supported, because AI Cores and Vector Cores are executed separately and scheduled by different tasks.
- When overflow occurs during operator computation (the input contains Inf/NaN or the computation result exceeds the representable range), the results of AI Cores and Vector Cores are inconsistent. AI Cores support only the saturation mode, while Vector Cores support only the Inf/NaN mode.
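To illustrate why the two overflow modes produce different results, the following host-side sketch models both on ordinary `float` values. It is an illustration under stated assumptions, not the hardware implementation: `MulInfNanMode` and `MulSaturationMode` are hypothetical names, saturation is assumed to clamp an overflowed result to the largest finite value of the same sign, and NaN handling in saturation mode is not modeled.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Inf/NaN mode (modeled): overflow propagates as infinity, which is the
// default IEEE 754 behavior of float multiplication.
inline float MulInfNanMode(float a, float b) {
    return a * b;
}

// Saturation mode (modeled, assumption): an infinite result is clamped to the
// largest finite float with the same sign. NaN is passed through unchanged
// because its handling is not modeled here.
inline float MulSaturationMode(float a, float b) {
    const float result = a * b;
    if (std::isinf(result)) {
        return std::copysign(std::numeric_limits<float>::max(), result);
    }
    return result;
}
```

Under this model, multiplying the largest finite `float` by 2 gives infinity in Inf/NaN mode but the largest finite `float` in saturation mode, so the same operator can return different values depending on which type of core executes it.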