Async

Supported Products

Product                                                    Supported/Unsupported
Atlas A3 training products/Atlas A3 inference products     √
Atlas A2 training products/Atlas A2 inference products     √
Atlas 200I/500 A2 inference products                       x
Atlas inference product's AI Core                          x
Atlas inference product's Vector Core                      x
Atlas training products                                    x

Functions

Async provides a unified API for executing a given function on a specific engine (AIC or AIV). Code that uses it does not need to branch on the hardware type directly (for example, with ASCEND_IS_AIV or ASCEND_IS_AIC).
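The dispatch idea can be illustrated with a small host-side mock. This is a sketch under stated assumptions, not the Ascend C implementation: MockAsync, g_currentCore, CubeStep, VectorStep, MockKernel and RunOn are hypothetical names introduced here. The "current core" is modelled as an ordinary variable so that both paths can be exercised on a host compiler (C++17 or later is assumed for the auto template parameter).

```cpp
#include <string>
#include <vector>

// Hypothetical host-side stand-ins; none of these are Ascend C APIs.
enum class EngineType : int { AIC = 1, AIV = 2 };

// Stand-in for "which kind of core is running this code". Here it is a plain
// variable so that both paths can be exercised in one host program.
static EngineType g_currentCore = EngineType::AIC;

// Records which operations ran, so the behaviour can be inspected.
static std::vector<std::string> g_log;

void CubeStep(int tiles) { g_log.push_back("cube:" + std::to_string(tiles)); }
void VectorStep(int elems) { g_log.push_back("vector:" + std::to_string(elems)); }

// Mock of an Async-style dispatcher: call funPtr only when the current core
// matches the requested engine; otherwise do nothing.
template <EngineType engine, auto funPtr, class... Args>
void MockAsync(Args... args)
{
    if (g_currentCore == engine) {
        funPtr(args...);
    }
}

// Kernel body written once, with no explicit per-core branching.
void MockKernel()
{
    MockAsync<EngineType::AIC, CubeStep>(4);
    MockAsync<EngineType::AIV, VectorStep>(128);
}

// Runs the kernel body as if on the given core type and returns the log.
std::vector<std::string> RunOn(EngineType core)
{
    g_currentCore = core;
    g_log.clear();
    MockKernel();
    return g_log;
}
```

In this mock, RunOn(EngineType::AIC) records only the cube step and RunOn(EngineType::AIV) records only the vector step, although MockKernel itself contains no conditional on the core type.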

Prototype

template <EngineType engine, auto funPtr, class... Args>
__aicore__ void Async(Args... args)

Parameters

Table 1 Parameters in the template

Parameter        Description
engine           Engine mode. The values are AIC and AIV, as defined by the EngineType enumeration below.
funPtr           Function pointer that specifies the function to execute. Its signature and parameter types are determined by class... Args.
class... Args    Variadic template parameter pack that gives the list of function parameter types. It is used to pass the arguments to funPtr.

EngineType is defined as follows:

enum class EngineType : int32_t {
    AIC = 1, // Only AIC
    AIV = 2  // Only AIV
};

Table 2 Parameters

Parameter       Input/Output    Description
Args... args    Input           Corresponds to class... Args. These are the actual arguments passed to funPtr.
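To make the relationship between funPtr, class... Args and args concrete, the following host-side sketch uses the same template parameter shape as the prototype, without the engine parameter. It is illustrative only: Invoke, Consume, Tiling, Pipe and g_lastResult are hypothetical names, and C++17 or later is assumed. It shows that Args is deduced from the call arguments, which are taken by value as in the prototype, and that a shared object can be passed by pointer.

```cpp
#include <type_traits>

struct Tiling { int m; int n; };  // hypothetical stand-in for a tiling struct
struct Pipe { int uses = 0; };    // hypothetical stand-in for a pipe object

// Target function: takes the tiling by const reference and the pipe by pointer.
int Consume(const Tiling &tiling, Pipe *pipe)
{
    ++pipe->uses;
    return tiling.m * tiling.n;
}

// Holds the most recent result so that the effect of a call can be checked.
static int g_lastResult = 0;

// Hypothetical forwarder with the same (funPtr, Args...) shape as the prototype.
template <auto funPtr, class... Args>
void Invoke(Args... args)
{
    // Args is deduced from the call arguments; the check below fails at
    // compile time if funPtr cannot be called with those argument types.
    static_assert(std::is_invocable_v<decltype(funPtr), Args...>,
                  "funPtr must be callable with the deduced Args");
    g_lastResult = funPtr(args...);
}

// Usage: Args is deduced as <Tiling, Pipe *>. The tiling is copied into the
// forwarder and then bound to the const reference; the pipe is shared.
int Demo(Pipe *pipe)
{
    Tiling tiling{3, 4};
    Invoke<Consume>(tiling, pipe);
    return g_lastResult;
}
```

Because the arguments are taken by value, an object that the target function must modify or share is passed by pointer here, in the same way that the example in this document passes &pipe.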

Returns

None

Restrictions

None

Examples

extern "C" __global__ __aicore__ void baremix_custom(GM_ADDR a, GM_ADDR b, GM_ADDR bias, GM_ADDR c,
                                                     GM_ADDR workspace, GM_ADDR tilingGm)
{
    KERNEL_TASK_TYPE_DEFAULT(KERNEL_TYPE_MIX_AIC_1_2);
    AscendC::TPipe pipe;
    TCubeTiling tiling;
    CopyTiling(&tiling, tilingGm);
    // Executes aicOperation only on the AIC.
    Async<EngineType::AIC, aicOperation>(a, b, bias, c, workspace, tiling, &pipe);
    // Executes aivOperation only on the AIV.
    Async<EngineType::AIV, aivOperation>(c, tiling, &pipe);
}
 
The following is an example of the functions executed on the AIC and AIV:
    // Other code logic
    ...
__aicore__ inline void aicOperation(GM_ADDR a, GM_ADDR b, GM_ADDR bias, GM_ADDR c, GM_ADDR workspace, const TCubeTiling &tiling, AscendC::TPipe *pipe) {
    MatmulLeakyKernel<half, half, float, float> matmulLeakyKernel;
    matmulLeakyKernel.Init(a, b, bias, c, workspace, tiling, pipe);
    REGIST_MATMUL_OBJ(pipe, GetSysWorkSpacePtr(), matmulLeakyKernel.matmulObj, &matmulLeakyKernel.tiling);
    matmulLeakyKernel.Process(pipe);
}

__aicore__ inline void aivOperation(GM_ADDR c, const TCubeTiling &tiling, AscendC::TPipe *pipe) {
    LeakyReluKernel<float> leakyReluKernel;
    leakyReluKernel.Init(c, tiling, pipe);
    leakyReluKernel.Process(pipe);
}
    ...
    // Other code logic