Template Argument Declaration

Function

Defines the template argument declaration ASCENDC_TPL_ARGS_DECL and the template argument selection ASCENDC_TPL_ARGS_SEL (the available template combinations). For details, see Tiling Template Programming.

Prototype

// ParamStruct stores the template argument declaration (ASCENDC_TPL_ARGS_DECL) and the
// template argument selection (ASCENDC_TPL_ARGS_SEL) set by the user. It is used for the
// subsequent encoding and decoding between the TilingKey and the template arguments.
// Users can ignore this structure.
struct ParamStruct {
    const char* name;
    uint32_t paramType;
    uint8_t bitWidth;
    std::vector<uint64_t> vals;
    const char* macroType;
    ParamStruct(const char* inName, uint32_t inParamType, uint8_t inBitWidth, std::vector<uint64_t> inVals,
        const char* inMacroType):
        name(inName), paramType(inParamType), bitWidth(inBitWidth), vals(std::move(inVals)),
        macroType(inMacroType) {}
};
using TilingDeclareParams = std::vector<ParamStruct>;
using TilingSelectParams = std::vector<std::vector<ParamStruct>>;

// APIs for defining template arguments
#define ASCENDC_TPL_DTYPE_DECL(x, ...) ParamStruct{#x, ASCENDC_TPL_DTYPE, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "DECL"}
#define ASCENDC_TPL_DATATYPE_DECL(x, ...) ParamStruct{#x, ASCENDC_TPL_DTYPE, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "DECL"}
#define ASCENDC_TPL_FORMAT_DECL(x, ...) ParamStruct{#x, ASCENDC_TPL_FORMAT, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "DECL"}
#define ASCENDC_TPL_UINT_DECL(x, bw, ...) ParamStruct{#x, ASCENDC_TPL_UINT, bw, {__VA_ARGS__}, "DECL"}
#define ASCENDC_TPL_BOOL_DECL(x, ...) ParamStruct{#x, ASCENDC_TPL_BOOL, ASCENDC_TPL_1_BW, {__VA_ARGS__}, "DECL"}
#define ASCENDC_TPL_KERNEL_TYPE_DECL(x, ...) ParamStruct{#x, ASCENDC_TPL_SHARED_KERNEL_TYPE, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "DECL"}

#define ASCENDC_TPL_DTYPE_SEL(x, ...) ParamStruct{#x, ASCENDC_TPL_DTYPE, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "SEL"}
#define ASCENDC_TPL_DATATYPE_SEL(x, ...) ParamStruct{#x, ASCENDC_TPL_DTYPE, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "SEL"}
#define ASCENDC_TPL_FORMAT_SEL(x, ...) ParamStruct{#x, ASCENDC_TPL_FORMAT, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "SEL"}
#define ASCENDC_TPL_UINT_SEL(x, ...) ParamStruct{#x, ASCENDC_TPL_UINT, 0, {__VA_ARGS__}, "SEL"}
#define ASCENDC_TPL_BOOL_SEL(x, ...) ParamStruct{#x, ASCENDC_TPL_BOOL, ASCENDC_TPL_1_BW, {__VA_ARGS__}, "SEL"}
#define ASCENDC_TPL_KERNEL_TYPE_SEL(...) ParamStruct{"kernel_type", ASCENDC_TPL_KERNEL_TYPE, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "SEL"}
#define ASCENDC_TPL_DETERMINISTIC_SEL(...) ParamStruct{"deterministic", ASCENDC_TPL_DETERMINISTIC, ASCENDC_TPL_1_BW, {__VA_ARGS__}, "SEL"}
#define ASCENDC_TPL_SHARED_KERNEL_TYPE_SEL(x, ...) ParamStruct{#x, ASCENDC_TPL_SHARED_KERNEL_TYPE, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "SEL"}

#define ASCENDC_TPL_ARGS_DECL(x, ...) static TilingDeclareParams g_tilingDeclareParams{ __VA_ARGS__ }
#define ASCENDC_TPL_ARGS_SEL(...) { __VA_ARGS__}
#define ASCENDC_TPL_SEL(...) static TilingSelectParams g_tilingSelectParams{ __VA_ARGS__ }
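The following is a minimal, self-contained sketch of how these macros expand into the two static tables. It copies ParamStruct and a few of the macros from the prototype so that it compiles without the Ascend C headers. The numeric values given to ASCENDC_TPL_DTYPE, ASCENDC_TPL_BOOL, ASCENDC_TPL_1_BW, and ASCENDC_TPL_8_BW are placeholders, and MyAddCustom, D_T_X, IS_SPLIT, MY_TPL_FP16, MY_TPL_FP32, DeclaredParam, and SelectedCombo are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// ASSUMPTION: placeholder values. The real constants are provided by the
// Ascend C headers and are not listed in this document.
constexpr uint32_t ASCENDC_TPL_DTYPE = 0;
constexpr uint32_t ASCENDC_TPL_BOOL = 1;
constexpr uint8_t ASCENDC_TPL_1_BW = 1;
constexpr uint8_t ASCENDC_TPL_8_BW = 8;

// Copied from the prototype so that the sketch is self-contained.
struct ParamStruct {
    const char* name;
    uint32_t paramType;
    uint8_t bitWidth;
    std::vector<uint64_t> vals;
    const char* macroType;
    ParamStruct(const char* inName, uint32_t inParamType, uint8_t inBitWidth,
                std::vector<uint64_t> inVals, const char* inMacroType)
        : name(inName), paramType(inParamType), bitWidth(inBitWidth),
          vals(std::move(inVals)), macroType(inMacroType) {}
};
using TilingDeclareParams = std::vector<ParamStruct>;
using TilingSelectParams = std::vector<std::vector<ParamStruct>>;

#define ASCENDC_TPL_DTYPE_DECL(x, ...) ParamStruct{#x, ASCENDC_TPL_DTYPE, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "DECL"}
#define ASCENDC_TPL_BOOL_DECL(x, ...) ParamStruct{#x, ASCENDC_TPL_BOOL, ASCENDC_TPL_1_BW, {__VA_ARGS__}, "DECL"}
#define ASCENDC_TPL_DTYPE_SEL(x, ...) ParamStruct{#x, ASCENDC_TPL_DTYPE, ASCENDC_TPL_8_BW, {__VA_ARGS__}, "SEL"}
#define ASCENDC_TPL_BOOL_SEL(x, ...) ParamStruct{#x, ASCENDC_TPL_BOOL, ASCENDC_TPL_1_BW, {__VA_ARGS__}, "SEL"}
#define ASCENDC_TPL_ARGS_DECL(x, ...) static TilingDeclareParams g_tilingDeclareParams{ __VA_ARGS__ }
#define ASCENDC_TPL_ARGS_SEL(...) { __VA_ARGS__}
#define ASCENDC_TPL_SEL(...) static TilingSelectParams g_tilingSelectParams{ __VA_ARGS__ }

// Hypothetical user-defined data type IDs.
constexpr uint64_t MY_TPL_FP16 = 10;
constexpr uint64_t MY_TPL_FP32 = 20;

// Declaration: every value each template argument may take.
ASCENDC_TPL_ARGS_DECL(MyAddCustom,
    ASCENDC_TPL_DTYPE_DECL(D_T_X, MY_TPL_FP16, MY_TPL_FP32),
    ASCENDC_TPL_BOOL_DECL(IS_SPLIT, 0, 1));

// Selection: two combinations, each a subset of the declared values.
ASCENDC_TPL_SEL(
    ASCENDC_TPL_ARGS_SEL(
        ASCENDC_TPL_DTYPE_SEL(D_T_X, MY_TPL_FP16),
        ASCENDC_TPL_BOOL_SEL(IS_SPLIT, 0, 1)),
    ASCENDC_TPL_ARGS_SEL(
        ASCENDC_TPL_DTYPE_SEL(D_T_X, MY_TPL_FP32),
        ASCENDC_TPL_BOOL_SEL(IS_SPLIT, 0)));

// Small accessors so that the generated tables can be inspected.
const ParamStruct& DeclaredParam(std::size_t i) {
    assert(i < g_tilingDeclareParams.size());
    return g_tilingDeclareParams[i];
}

const std::vector<ParamStruct>& SelectedCombo(std::size_t i) {
    assert(i < g_tilingSelectParams.size());
    return g_tilingSelectParams[i];
}
```

In this sketch the first macro argument of ASCENDC_TPL_ARGS_DECL (here MyAddCustom) does not appear in the expansion, and each argument name is stored as a string through the # operator, so DeclaredParam(0).name is "D_T_X". In a real operator project the macros and constants come from the Ascend C headers and should not be redefined.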

Parameters

Table 1 Arguments in the tiling template

Macro

Function

Description

ASCENDC_TPL_ARGS_DECL(args0, ...)

Overall template argument declaration of an operator.

  • args0: operator type.
  • args1-argsn: definitions of the template parameters of the DTYPE, FORMAT, UINT, BOOL, and KERNEL_TYPE types. DTYPE parameters are defined by using ASCENDC_TPL_DTYPE_DECL or ASCENDC_TPL_DATATYPE_DECL, and the other types by using ASCENDC_TPL_FORMAT_DECL, ASCENDC_TPL_UINT_DECL, ASCENDC_TPL_BOOL_DECL, and ASCENDC_TPL_KERNEL_TYPE_DECL, respectively.

ASCENDC_TPL_DTYPE_DECL(args0, ...)

Defines the template parameter of the user-defined data type.

  • args0: argument name.
  • args1-argsn: enumerated values of the user-defined data type.

ASCENDC_TPL_DATATYPE_DECL(args0, ...)

Defines the template parameter of the native data type.

  • args0: argument name.
  • args1-argsn: one of two forms. The parameters either enumerate native data type options, or give the index of an input parameter (specified by ASCENDC_TPL_INPUT(x), where x is the index) or the index of an output parameter (specified by ASCENDC_TPL_OUTPUT(x), where x is the index). Note that only the first parameter takes effect if there are multiple parameters.
  • The following native data types are supported. For details about the data types, see C_DataType.
    C_DT_FLOAT
    C_DT_FLOAT16
    C_DT_INT8
    C_DT_INT32
    C_DT_UINT8
    C_DT_INT16
    C_DT_UINT16
    C_DT_UINT32
    C_DT_INT64
    C_DT_UINT64
    C_DT_DOUBLE
    C_DT_BOOL
    C_DT_COMPLEX64
    C_DT_BF16
    C_DT_INT4
    C_DT_UINT1
    C_DT_INT2
    C_DT_COMPLEX32
    C_DT_HIFLOAT8
    C_DT_FLOAT8_E5M2
    C_DT_FLOAT8_E4M3FN
    C_DT_FLOAT4_E2M1
    C_DT_FLOAT4_E1M2

ASCENDC_TPL_FORMAT_DECL(args0, ...)

Defines the template parameter of the format type. The following modes are supported:

1. The template parameters are all of the user-defined format type.

2. The template parameters are all of the native format type.

  • args0: argument name.
  • args1-argsn: the meaning depends on the mode.
    • Mode 1: enumerated values of the user-defined format.
    • Mode 2: either enumerated values of the native format, or the index of an input parameter (specified by ASCENDC_TPL_INPUT(x), where x is the index) or the index of an output parameter (specified by ASCENDC_TPL_OUTPUT(x), where x is the index). Note that only the first index value takes effect if there are multiple index values.
  • The following native format options are supported. For details about the data formats, see C_Format.
    C_FORMAT_NCHW
    C_FORMAT_NHWC
    C_FORMAT_ND
    C_FORMAT_NC1HWC0
    C_FORMAT_FRACTAL_Z
    C_FORMAT_NC1C0HWPAD
    C_FORMAT_NHWC1C0
    C_FORMAT_FSR_NCHW
    C_FORMAT_FRACTAL_DECONV
    C_FORMAT_C1HWNC0
    C_FORMAT_FRACTAL_DECONV_TRANSPOSE
    C_FORMAT_FRACTAL_DECONV_SP_STRIDE_TRANS
    C_FORMAT_NC1HWC0_C04
    C_FORMAT_FRACTAL_Z_C04
    C_FORMAT_CHWN
    C_FORMAT_FRACTAL_DECONV_SP_STRIDE8_TRANS
    C_FORMAT_HWCN
    C_FORMAT_NC1KHKWHWC0
    C_FORMAT_BN_WEIGHT
    C_FORMAT_FILTER_HWCK
    C_FORMAT_HASHTABLE_LOOKUP_LOOKUPS
    C_FORMAT_HASHTABLE_LOOKUP_KEYS
    C_FORMAT_HASHTABLE_LOOKUP_VALUE
    C_FORMAT_HASHTABLE_LOOKUP_OUTPUT
    C_FORMAT_HASHTABLE_LOOKUP_HITS
    C_FORMAT_C1HWNCoC0
    C_FORMAT_MD
    C_FORMAT_NDHWC
    C_FORMAT_FRACTAL_ZZ
    C_FORMAT_FRACTAL_NZ
    C_FORMAT_NCDHW
    C_FORMAT_DHWCN
    C_FORMAT_NDC1HWC0
    C_FORMAT_FRACTAL_Z_3D
    C_FORMAT_CN
    C_FORMAT_NC
    C_FORMAT_DHWNC
    C_FORMAT_FRACTAL_Z_3D_TRANSPOSE
    C_FORMAT_FRACTAL_ZN_LSTM
    C_FORMAT_FRACTAL_Z_G
    C_FORMAT_RESERVED
    C_FORMAT_ALL
    C_FORMAT_NULL
    C_FORMAT_ND_RNN_BIAS
    C_FORMAT_FRACTAL_ZN_RNN
    C_FORMAT_NYUV
    C_FORMAT_NYUV_A
    C_FORMAT_NCL
    C_FORMAT_FRACTAL_Z_WINO
    C_FORMAT_C1HWC0
    C_FORMAT_FRACTAL_NZ_C0_16
    C_FORMAT_FRACTAL_NZ_C0_32
    C_FORMAT_FRACTAL_NZ_C0_2
    C_FORMAT_FRACTAL_NZ_C0_4
    C_FORMAT_FRACTAL_NZ_C0_8

ASCENDC_TPL_UINT_DECL(args0, args1, args2, ...)

Template argument declaration of the unsigned integer (UINT) type.

  • args0: argument name.
  • args1: maximum bit width. The number of template arguments cannot exceed the maximum bit width.
  • args2: mode defined by the arguments. The following three modes are supported:
    • ASCENDC_TPL_UI_RANGE: range mode. The first value is the number of ranges. The values that follow are taken in pairs, each pair giving the start and end of one range (both inclusive). The number of pairs must equal the declared number of ranges.

      Example: ASCENDC_TPL_UINT_DECL(args0, args1, ASCENDC_TPL_UI_RANGE, 2, 0, 2, 3, 5) declares two ranges, {0, 2} and {3, 5}. Therefore, the valid values of the UINT parameter are {0, 1, 2, 3, 4, 5}.

    • ASCENDC_TPL_UI_LIST: exhaustive mode. Every valid argument value is listed explicitly.

      Example: ASCENDC_TPL_UINT_DECL(args0, args1, ASCENDC_TPL_UI_LIST, 10, 12, 13, 9, 8, 7, 6) enumerates the values [10, 12, 13, 9, 8, 7, 6]. Therefore, the valid values of the UINT parameter are {10, 12, 13, 9, 8, 7, 6}.

    • ASCENDC_TPL_UI_MIX: mixed mode. The first n values are a range-mode definition (the number of ranges followed by the start and end pairs), and the last m values are an exhaustive-mode list.

      Example:

      ASCENDC_TPL_UINT_DECL(args0, args1, ASCENDC_TPL_UI_MIX, 2, 0, 2, 3, 5, 10, 12, 13, 9, 8) declares two ranges, {0, 2} and {3, 5}, followed by the exhaustive values [10, 12, 13, 9, 8]. Therefore, the valid values of the UINT parameter are {0, 1, 2, 3, 4, 5, 10, 12, 13, 9, 8}.

  • args3-argsn: parameter values, interpreted according to the mode given in args2.
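The three UINT modes described above can be illustrated with a small expansion routine. This is only a sketch of the documented semantics, not the library's implementation: UiMode is a hypothetical stand-in for the ASCENDC_TPL_UI_RANGE, ASCENDC_TPL_UI_LIST, and ASCENDC_TPL_UI_MIX tokens (whose real values are not shown in this document), and ExpandUintSpec is a hypothetical helper that takes the values following the mode argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the three mode tokens.
enum class UiMode { Range, List, Mix };

// Expands the values that follow the mode argument into the valid values,
// in declaration order. Malformed input is rejected by assert in debug builds.
std::vector<uint64_t> ExpandUintSpec(UiMode mode, const std::vector<uint64_t>& vals) {
    std::vector<uint64_t> out;
    std::size_t pos = 0;
    if (mode == UiMode::Range || mode == UiMode::Mix) {
        assert(!vals.empty());
        const uint64_t rangeCount = vals[pos++];        // first value: number of ranges
        assert(vals.size() >= 1 + 2 * rangeCount);      // one start/end pair per range
        for (uint64_t r = 0; r < rangeCount; ++r) {
            const uint64_t first = vals[pos++];
            const uint64_t last = vals[pos++];
            assert(first <= last);
            for (uint64_t v = first;; ++v) {            // both ends are inclusive
                out.push_back(v);
                if (v == last) break;
            }
        }
        // In pure range mode nothing may follow the pairs.
        assert(mode != UiMode::Range || pos == vals.size());
    }
    // List mode, or the exhaustive tail of mixed mode: copy values as they are.
    for (; pos < vals.size(); ++pos) {
        out.push_back(vals[pos]);
    }
    return out;
}
```

For instance, ExpandUintSpec(UiMode::Mix, {2, 0, 2, 3, 5, 10, 12, 13, 9, 8}) yields the same eleven values as the mixed-mode example above.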

ASCENDC_TPL_BOOL_DECL(args0, ...)

Template argument declaration of the bool type.

args0: argument name.

args1-args2: The value can be 0 or 1.

ASCENDC_TPL_KERNEL_TYPE_DECL(args0, ...)

Defines the kernel type of the operator template parameter.

args0: argument name.

args1-argsn: one or more kernel types.

Currently, the following kernel types are supported:

  • ASCENDC_TPL_AIV_ONLY // Only the Vector core on the AI Core is started during operator execution.
  • ASCENDC_TPL_AIC_ONLY // Only the Cube core on the AI Core is started during operator execution.
  • ASCENDC_TPL_MIX_AIV_1_0 // Only the Vector core on the AI Core is started during operator execution in the AIC and AIV mixed scenario.
  • ASCENDC_TPL_MIX_AIC_1_0 // Only the Cube core on the AI Core is started during operator execution in the AIC and AIV mixed scenario.
  • ASCENDC_TPL_MIX_AIC_1_1 // Both the Cube core and Vector core on the AI Core are started during operator execution in the AIC and AIV mixed scenario, with the ratio of 1:1.
  • ASCENDC_TPL_MIX_AIC_1_2 // Both the Cube core and Vector core on the AI Core are started during operator execution in the AIC and AIV mixed scenario, with the ratio of 1:2.
  • ASCENDC_TPL_AICORE // Only the AI Core is started during operator execution.
  • ASCENDC_TPL_VECTORCORE // This parameter is reserved and is not supported in the current version.
  • ASCENDC_TPL_MIX_AICORE // This parameter is reserved and is not supported in the current version.
  • ASCENDC_TPL_MIX_VECTOR_CORE // The AI Core and Vector Core are both started during operator execution.

This API can only be used together with ASCENDC_TPL_SHARED_KERNEL_TYPE_SEL(args0,...).

Table 2 Definition of tiling template argument selection

Macro

Function

Description

ASCENDC_TPL_SEL(...)

Overall template argument selection of an operator.

Contains one or more template argument selections, each defined with ASCENDC_TPL_ARGS_SEL.

ASCENDC_TPL_ARGS_SEL(...)

Template argument selection of an operator.

A single combination of template arguments, composed of the ASCENDC_TPL_*_SEL entries listed below.

ASCENDC_TPL_KERNEL_TYPE_SEL(args0)

Sets the kernel type of the operator template parameter combination. However, this parameter cannot be passed as a template parameter of the kernel function.

args0: kernel type of the operator in the template parameter combination. If this parameter is not specified, the automatic derivation process is used. For all operators under ASCENDC_TPL_SEL, the kernel type must be the same.

Currently, the following kernel types are supported:

  • ASCENDC_TPL_AIV_ONLY // Only the Vector Core on the AI Core is started during operator execution.
  • ASCENDC_TPL_AIC_ONLY // Only the Cube Core on the AI Core is started during operator execution.
  • ASCENDC_TPL_MIX_AIV_1_0 // In the AIC and AIV mixed scenario, only the Vector Core on the AI Core is started during operator execution.
  • ASCENDC_TPL_MIX_AIC_1_0 // In the AIC and AIV mixed scenario, only the Cube Core on the AI Core is started during operator execution.
  • ASCENDC_TPL_MIX_AIC_1_1 // In the AIC and AIV mixed scenario, the Cube Core and Vector Core on the AI Core are started at the same time during operator execution, with the ratio of 1:1.
  • ASCENDC_TPL_MIX_AIC_1_2 // In the AIC and AIV mixed scenario, the Cube Core and Vector Core on the AI Core are started at the same time during operator execution, with the ratio of 1:2.
  • ASCENDC_TPL_AICORE // Only the AI Core is started during operator execution.
  • ASCENDC_TPL_VECTORCORE // This parameter is reserved and is not supported in the current version.
  • ASCENDC_TPL_MIX_AICORE // This parameter is reserved and is not supported in the current version.
  • ASCENDC_TPL_MIX_VECTOR_CORE // The AI Core and Vector Core are both started during operator execution.

    This API is used to configure the kernel type. The value range of the kernel type is the same as that of the KERNEL_TASK_TYPE_DEFAULT API. For details, see Setting the Kernel Type.

ASCENDC_TPL_DTYPE_SEL(args0, ...)

Template argument selection of the user-defined data type.

  • args0: argument name.
  • args1-argsn: subset of the argument ranges defined in ASCENDC_TPL_DTYPE_DECL.

ASCENDC_TPL_DATATYPE_SEL(args0, ...)

Template argument selection of the native data type.

  • args0: argument name.
  • args1-argsn: subset of the argument ranges defined in ASCENDC_TPL_DATATYPE_DECL.

ASCENDC_TPL_FORMAT_SEL(args0, ...)

Template argument selection of Format.

  • args0: argument name.
  • args1-argsn: subset of the argument ranges defined in ASCENDC_TPL_FORMAT_DECL.

ASCENDC_TPL_UINT_SEL(args0, args1, args2, ...)

Template argument selection of the UINT type.

  • args0: argument name.
  • args1: mode defined by an argument. The following values are supported:
    • ASCENDC_TPL_UI_RANGE: range mode.
    • ASCENDC_TPL_UI_LIST: exhaustive mode.
    • ASCENDC_TPL_UI_MIX: mixed mode.
  • args2-argsn: subset of the argument ranges defined in ASCENDC_TPL_UINT_DECL.

For details about how to configure the mode and arguments, see ASCENDC_TPL_UINT_DECL(args0, args1, args2, ...).

ASCENDC_TPL_BOOL_SEL(args0, ...)

Template argument selection of the bool type.

args0: argument name.

args1-args2: subset of the argument ranges defined in ASCENDC_TPL_BOOL_DECL.
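The DTYPE, DATATYPE, FORMAT, UINT, and BOOL selection macros above all require the selected values to be a subset of the values declared by the matching *_DECL macro. A minimal sketch of that rule follows. IsSelectionSubset and CheckSelection are hypothetical helpers, not part of the Ascend C API, and both operate on already-expanded value lists (no mode token or range encoding).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true when every selected value also appears among the declared values.
bool IsSelectionSubset(const std::vector<uint64_t>& declared,
                       const std::vector<uint64_t>& selected) {
    return std::all_of(selected.begin(), selected.end(), [&declared](uint64_t v) {
        return std::find(declared.begin(), declared.end(), v) != declared.end();
    });
}

// Debug-build guard that a unit test or build-time check could call.
void CheckSelection(const std::vector<uint64_t>& declared,
                    const std::vector<uint64_t>& selected) {
    assert(IsSelectionSubset(declared, selected) &&
           "selection is not covered by the declaration");
}
```

With a BOOL argument declared as {0, 1}, the selections {0}, {1}, and {0, 1} pass this check, while any other value fails it.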

ASCENDC_TPL_DETERMINISTIC_SEL(args0)

This template parameter combination is used to configure whether to enable deterministic computing.

args0: whether to enable deterministic computing. The value can be true, false, 1, or 0: true or 1 enables deterministic computing for the template parameter combination, and false or 0 disables it. Note that this value is not used as an input parameter of the operator template. When deterministic computing is enabled, -DDETERMINISTIC_MODE=1 is added during compilation, and a JSON file and an .o file whose names end with _deterministic are generated, for example, "AddCustomTemplate_816f04e052850554f4b3cacb35f8e8c6_deterministic.json" and "AddCustomTemplate_816f04e052850554f4b3cacb35f8e8c6_deterministic.o".

Note: If the deterministic computing version is compiled using the ASCENDC_TPL_DETERMINISTIC_SEL(true) API, the deterministic computing switch must also be turned on when the operator is called. For example, when a single operator is called through aclnn, use the aclrtCtxSetSysParamOpt API to perform the related configuration.
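Because the deterministic build is compiled with -DDETERMINISTIC_MODE=1, source code can branch on that macro at compile time. The sketch below shows only the preprocessor pattern. How an operator actually uses the flag is an assumption, and kDeterministicBuild and PartialSumSlots are hypothetical names that do not come from the Ascend C API.

```cpp
#include <cassert>

// True only when the compiler was invoked with -DDETERMINISTIC_MODE=1.
#if defined(DETERMINISTIC_MODE) && DETERMINISTIC_MODE == 1
constexpr bool kDeterministicBuild = true;
#else
constexpr bool kDeterministicBuild = false;
#endif

// Hypothetical helper: number of partial-sum slots a reduction would allocate.
// Assumed strategy: a deterministic build keeps one slot per block and sums
// the slots in a fixed order, while a default build accumulates into one slot.
int PartialSumSlots(int blockCount) {
    assert(blockCount > 0);
    return kDeterministicBuild ? blockCount : 1;
}
```

Compiling this sketch without the flag makes PartialSumSlots return 1 for any positive block count. Compiling it with -DDETERMINISTIC_MODE=1 makes it return the block count.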

This parameter is supported only by the following models:

  • Atlas A3 training products / Atlas A3 inference products
  • Atlas A2 training products / Atlas A2 inference products

ASCENDC_TPL_SHARED_KERNEL_TYPE_SEL(args0, ...)

Sets the kernel type of the operator template parameter combination. This parameter can be passed as a template parameter of the kernel function.

args0: argument name.

args1-argsn: one or more kernel types of the operator in the template parameter combination. This API cannot be used together with the ASCENDC_TPL_KERNEL_TYPE_SEL API.

If the KERNEL_TASK_TYPE_DEFAULT(value) API is also used, this API has a higher priority.

Returns

None

Constraints

After the values of template arguments are modified or added, the custom operator package needs to be recompiled. The original operator binary files cannot be used.