SetBlockDim

Function Usage

Sets blockDim, which refers to the number of Vector cores or Cube cores involved in computation.

Prototype

ge::graphStatus SetBlockDim(const uint32_t block_dim)

Parameters

Parameter: block_dim

Input/Output: Input

Description:

blockDim is the number of logical cores. Its value range is [1, 65535]. To fully utilize hardware resources, set this parameter to the number of physical cores or to a multiple of that number.

  • The runtime meaning of blockDim and the rules for setting it differ between coupling mode and separation mode:
    • Coupling mode: Because the Vector and Cube Units are integrated, blockDim is used to start multiple AI Core instances, without distinguishing between these units. The number of AI Cores can be obtained using GetCoreNumAiv or GetCoreNumAic.
    • Separation mode
      • For operators that use only Vector Units, blockDim sets the number of Vector (AIV) instances to be started. For example, if an AI Processor has 40 Vector cores, set blockDim to 40.
      • For operators that use only Cube Units, blockDim sets the number of Cube (AIC) instances to be started. For example, if an AI Processor has 20 Cube cores, set blockDim to 20.
      • Operators for Vector/Cube fusion computing are started in groups of AIVs and AICs, and blockDim sets the number of groups to be started. For example, if an AI Processor has 40 Vector cores and 20 Cube cores, a group consists of two Vector cores and one Cube core. Set blockDim to 20 so that 20 groups are started, covering all 40 Vector cores and 20 Cube cores. Note: In this scenario, blockDim (the number of logical cores) cannot exceed the number of physical cores, where one physical core contains two Vector cores and one Cube core.
      • The number of AIC and AIV cores can be obtained by calling GetCoreNumAic and GetCoreNumAiv, respectively.
  • When the device resource limit feature is used, the blockDim value set for the operator cannot exceed the number of cores returned by the corresponding PlatformAscendC API (such as GetCoreNum, GetCoreNumAic, or GetCoreNumAiv). For example, if aclrtSetStreamResLimit is used to limit the stream to 8 Vector cores, GetCoreNumAiv returns 8, and the blockDim value set for a Vector operator cannot exceed 8. Otherwise, the operator preempts resources of other streams and the resource limit becomes invalid.
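
The selection rules above can be sketched as a small standalone helper. This is only an illustration under stated assumptions: OpKind and ChooseBlockDim are hypothetical names that are not part of the Ascend C API, the core counts are assumed to have been queried beforehand (for example with GetCoreNumAiv and GetCoreNumAic), and the fused case assumes the 2:1 Vector-to-Cube grouping used in the example above.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical classification of an operator; not an Ascend C type.
enum class OpKind { kVectorOnly, kCubeOnly, kFused };

// Hypothetical helper: derives a blockDim value from core counts the caller
// has already queried. When a device resource limit is active, the queried
// counts already reflect it, so the result never exceeds the limit.
inline uint32_t ChooseBlockDim(OpKind kind, uint32_t aiv_cores, uint32_t aic_cores) {
  uint32_t dim = 0;
  switch (kind) {
    case OpKind::kVectorOnly:
      dim = aiv_cores;  // one logical core per Vector core
      break;
    case OpKind::kCubeOnly:
      dim = aic_cores;  // one logical core per Cube core
      break;
    case OpKind::kFused:
      // Assumption: one group = two Vector cores + one Cube core.
      dim = std::min(aic_cores, aiv_cores / 2u);
      break;
  }
  // Keep the result inside the documented range [1, 65535].
  return std::max<uint32_t>(1u, std::min<uint32_t>(dim, 65535u));
}
```

In a tiling function, the returned value would then be passed to SetBlockDim. In coupling mode the Vector and Cube counts describe the same AI Cores, so either count can be passed for a non-fused operator.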

Returns

ge::GRAPH_SUCCESS on success.

For details about the definition of graphStatus, see ge::graphStatus.

Constraints

None

Examples

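ge::graphStatus Tiling4XXX(TilingContext* context) {
  // Request 32 logical cores and propagate any failure to the caller.
  auto ret = context->SetBlockDim(32);
  if (ret != ge::GRAPH_SUCCESS) { return ret; }
  // ...
  return ge::GRAPH_SUCCESS;
}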