aclrtLaunchKernelAttrValue

typedef union aclrtLaunchKernelAttrValue {
    uint8_t schemMode;
    uint32_t localMemorySize;
    aclrtEngineType engineType;
    uint32_t blockDimOffset;
    uint8_t isBlockTaskPrefetch;
    uint8_t isDataDump;
    uint16_t timeout;
    aclrtTimeoutUs timeoutUs;
    uint32_t rsv[4];
} aclrtLaunchKernelAttrValue;
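
Because this is a union, all members share the same storage, so a given value carries only one attribute at a time. The following is a minimal sketch of filling such a value, not the runtime's own code. It uses a local mirror of the union (DemoLaunchKernelAttrValue) so that it compiles without the ACL headers. DemoEngineType, DemoTimeoutUs, and demo_make_timeout_value are hypothetical names, and the 32-bit field types of DemoTimeoutUs are an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for the runtime types; the real definitions come from the ACL headers. */
typedef enum { DEMO_ENGINE_AIC = 0, DEMO_ENGINE_AIV = 1 } DemoEngineType; /* hypothetical */
typedef struct {
    uint32_t timeoutLow;   /* field names from the text; 32-bit types assumed */
    uint32_t timeoutHigh;
} DemoTimeoutUs;

/* Local mirror of the documented union, for illustration only. */
typedef union DemoLaunchKernelAttrValue {
    uint8_t schemMode;
    uint32_t localMemorySize;
    DemoEngineType engineType;
    uint32_t blockDimOffset;
    uint8_t isBlockTaskPrefetch;
    uint8_t isDataDump;
    uint16_t timeout;
    DemoTimeoutUs timeoutUs;
    uint32_t rsv[4];
} DemoLaunchKernelAttrValue;

/* Build a value that carries only a timeout in seconds. */
DemoLaunchKernelAttrValue demo_make_timeout_value(uint16_t seconds)
{
    DemoLaunchKernelAttrValue v;
    memset(&v, 0, sizeof v);  /* clear the whole union, including the rsv bytes */
    v.timeout = seconds;      /* write exactly one member */
    return v;
}
```

Zeroing the whole value before writing a single member keeps the bytes that the chosen member does not cover at 0.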

Member

Description

schemMode

Scheduling mode.

The options are as follows:

  • 0: common scheduling mode. Operator execution starts as soon as any core is idle. For example, if blockDim is set to 8, the operator kernel function is to run on eight cores, and execution starts as long as one core is idle.
  • 1: batch scheduling mode. Operator execution starts only when all required cores are idle. For example, if blockDim is set to 8, the operator kernel function is to run on eight cores, and execution starts only when all eight cores are idle.
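
The difference between the two schemMode values can be modeled with a small predicate. This is a toy model of the start condition described above, not the scheduler's actual logic. demo_can_start and its parameters are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: models when execution may start for a given schemMode. */
bool demo_can_start(uint8_t schemMode, uint32_t idleCores, uint32_t blockDim)
{
    if (schemMode == 1) {
        return idleCores >= blockDim;  /* batch: all required cores must be idle */
    }
    return idleCores >= 1;             /* common: one idle core is enough */
}
```

With blockDim set to 8 and seven idle cores, this model lets the common mode start and makes the batch mode wait.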

localMemorySize

Size of the internal UB buffer of the Vector Core required for executing the Single Instruction Multiple Thread (SIMT) operator, in bytes.

This parameter is not supported currently, and the related configuration does not take effect.

engineType

Operator execution engine. For details about the values, see aclrtEngineType.

Only Atlas inference products support this parameter.

This parameter does not take effect for the following products:

  • Atlas A3 training products/Atlas A3 inference products
  • Atlas A2 training products/Atlas A2 inference products
  • Atlas 200I/500 A2 inference products
  • Atlas training products

blockDimOffset

Block dimension offset.

  • If blockDim ≤ Number of AI Cores, computation on the Vector Core is not required. In this case, set engineType to ACL_RT_ENGINE_TYPE_AIC (indicating computation on the AI Core) and set blockDimOffset to 0.
  • If blockDim > Number of AI Cores, then:
    • Deliver a task in a stream, set engineType to ACL_RT_ENGINE_TYPE_AIC (indicating computation on the AI Core), and set blockDimOffset to 0.
    • Deliver a task in another stream, set engineType to ACL_RT_ENGINE_TYPE_AIV (indicating computation on the Vector Core), and set blockDimOffset to aicoreblockdim. The formula for calculating aicoreblockdim is as follows:
      • If blockDim ≤ Number of AI Cores + Number of Vector Cores, aicoreblockdim = Number of AI Cores.
      • Otherwise, aicoreblockdim = Roundup((blockDim × Number of AI Cores) / (Number of AI Cores + Number of Vector Cores)).

Only Atlas inference products support this parameter.

This parameter is not supported by the following products:

  • Atlas A3 training products/Atlas A3 inference products
  • Atlas A2 training products/Atlas A2 inference products
  • Atlas 200I/500 A2 inference products
  • Atlas training products
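
The blockDimOffset rule for the Vector Core task can be sketched as a small helper. This is an illustration under stated assumptions, not the runtime's implementation: demo_aicore_block_dim is a hypothetical name, the core counts are passed in by the caller, and the round-up is assumed to apply to the whole quotient.

```c
#include <stdint.h>

/* Hypothetical helper: returns aicoreblockdim, the blockDimOffset for the
 * Vector Core task. Returns 0 when no Vector Core task is needed. */
uint32_t demo_aicore_block_dim(uint32_t blockDim, uint32_t numAiCores,
                               uint32_t numVectorCores)
{
    if (blockDim <= numAiCores) {
        return 0;  /* everything runs on the AI Cores; the offset stays 0 */
    }
    uint64_t total = (uint64_t)numAiCores + numVectorCores;
    if (blockDim <= total) {
        return numAiCores;
    }
    /* Round-up division, assuming Roundup covers the whole quotient. */
    uint64_t product = (uint64_t)blockDim * numAiCores;
    return (uint32_t)((product + total - 1) / total);
}
```

For example, with illustrative counts of 8 AI Cores and 7 Vector Cores, a blockDim of 12 yields an offset of 8, and a blockDim of 30 yields Roundup(240 / 15) = 16.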

isBlockTaskPrefetch

Whether to block hardware from prefetching the information of the current task when the task is delivered.

The options are as follows:

  • 0: Do not block.
  • 1: Block.

isDataDump

Whether to enable dump.

The options are as follows:

  • 0: Disable.
  • 1: Enable.

timeout

Timeout interval for the task scheduler to wait for task execution. This parameter applies only when an AI CPU or AI Core operator is executed.

The options are as follows:

  • 0: Wait forever.
  • > 0: Specific timeout interval, in seconds.

timeoutUs

Timeout interval for the task scheduler to wait for task execution, in microseconds.

If both timeoutLow and timeoutHigh in the aclrtTimeoutUs struct are set to 0, the wait is indefinite.

For the same kernel launch task, timeoutUs and timeout cannot both be configured. Otherwise, an error is reported.
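
A microsecond timeout can be prepared along the following lines. This sketch assumes that timeoutLow and timeoutHigh are 32-bit fields holding the lower and upper halves of a 64-bit microsecond count; check that against the aclrtTimeoutUs definition before relying on it. DemoTimeoutUs, demo_split_timeout_us, and demo_waits_forever are hypothetical names.

```c
#include <stdint.h>

/* Stand-in for aclrtTimeoutUs; field names are from the text, 32-bit types assumed. */
typedef struct {
    uint32_t timeoutLow;
    uint32_t timeoutHigh;
} DemoTimeoutUs;

/* Assumption: the two fields are the lower and upper 32 bits of the count. */
DemoTimeoutUs demo_split_timeout_us(uint64_t micros)
{
    DemoTimeoutUs t;
    t.timeoutLow = (uint32_t)(micros & 0xFFFFFFFFu);
    t.timeoutHigh = (uint32_t)(micros >> 32);
    return t;
}

/* Both fields set to 0 means the scheduler waits forever, per the description. */
int demo_waits_forever(DemoTimeoutUs t)
{
    return t.timeoutLow == 0 && t.timeoutHigh == 0;
}
```

Because timeout and timeoutUs cannot both be configured for the same task, a caller would set only one of the two attributes for a given launch.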

rsv

Reserved. The value is fixed at 0.