Control Units
The control units provide instruction control for the entire computing process and are responsible for running the entire AI Core. Figure 1 shows the control units of an AI Core. For details about each module, see Table 1.
| Control Unit/Instruction Queue | Description |
|---|---|
| Scalar Unit | Scalar compute unit. |
| Cube Queue | Cube instruction queue. Instructions in the same queue are executed in sequence, and instructions in different queues can be executed in parallel. |
| Vector Queue | Vector instruction queue. Instructions in the same queue are executed in sequence, and instructions in different queues can be executed in parallel. |
| MTE Queue | MTE instruction queue. Instructions in the same queue are executed in sequence, and instructions in different queues can be executed in parallel. |
| Event Sync | A module used to control the dependency and synchronization between instructions across queues. |

Instructions are loaded from system memory over the bus into the instruction cache module of the AI Core. What happens next depends on the instruction type:
- A scalar instruction is executed immediately by the Scalar Unit.
- Any other instruction is scheduled to one of five independent queues (Vector Queue, Cube Queue, and the MTE1, MTE2, and MTE3 Queues) and is then allocated to an execution unit for execution.
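
The dispatch rule described here can be illustrated with a small toy model. This is only a sketch for intuition, not how the hardware or Ascend C is implemented: the `Instr` class, the `dispatch` function, and the instruction names are all hypothetical, and the queue names are taken from the list above.

```python
from collections import deque
from dataclasses import dataclass

# Queue names follow the five queues named in the text.
QUEUES = ("Vector", "Cube", "MTE1", "MTE2", "MTE3")


@dataclass
class Instr:
    name: str  # hypothetical instruction label
    kind: str  # "Scalar" or one of QUEUES


def dispatch(program):
    """Route a program: scalar instructions run at once, all others are queued.

    Returns the names executed by the scalar unit and the per-queue FIFOs.
    Each FIFO preserves program order, which models in-order execution
    within one queue.
    """
    scalar_executed = []
    queues = {q: deque() for q in QUEUES}
    for instr in program:
        if instr.kind == "Scalar":
            scalar_executed.append(instr.name)
        else:
            queues[instr.kind].append(instr.name)
    return scalar_executed, queues


program = [
    Instr("compute_address", "Scalar"),
    Instr("copy_in", "MTE2"),
    Instr("vadd", "Vector"),
    Instr("vmul", "Vector"),
    Instr("copy_out", "MTE3"),
]
scalar_executed, queues = dispatch(program)
print(scalar_executed)         # ['compute_address']
print(list(queues["Vector"]))  # ['vadd', 'vmul']
```

In this model nothing orders `copy_in` in the MTE2 queue relative to `vadd` in the Vector queue, which is why cross-queue dependencies need explicit synchronization.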
Because instructions in different queues can be executed in parallel, data dependencies between them must be enforced explicitly. The Event Sync module controls this through the following synchronization instructions:
- PipeBarrier synchronizes the instructions in the same queue. Instructions after the barrier cannot issue until all instructions before the barrier are committed.
- SetFlag and WaitFlag are a pair of inter-queue synchronization instructions.
  - SetFlag: This instruction is executed only after all read and write operations of the preceding instructions are completed. It then sets the corresponding flag bit in hardware to 1.
  - WaitFlag: When this instruction is executed, if the corresponding flag bit is 0, the subsequent instructions in the queue are blocked. If the corresponding flag bit is 1, it is reset to 0 and the subsequent instructions are executed.
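
The flag semantics of SetFlag and WaitFlag can be made concrete with a minimal simulation. It is a sketch under simplifying assumptions, not the hardware behavior: each queue issues one instruction at a time in order, the `run` function and the operation tuples are hypothetical, and the flag name `"MTE2_to_V"` is only an illustrative label. PipeBarrier is not modeled, because this toy model already completes every instruction before issuing the next one in the same queue.

```python
def run(queues, order):
    """Simulate in-order queues that advance in parallel, synchronized by flags.

    queues: dict mapping queue name -> list of ops, where an op is
            ("exec", label), ("set", flag) or ("wait", flag).
    order:  the order in which queues are polled in each cycle; changing it
            models a different interleaving of the parallel queues.
    Returns the labels of "exec" ops in the order they were executed.
    """
    flags = {}                    # flag name -> 0 or 1, default 0
    pc = {q: 0 for q in queues}   # next op index per queue
    trace = []
    while any(pc[q] < len(queues[q]) for q in queues):
        progressed = False
        for q in order:
            if pc[q] >= len(queues[q]):
                continue
            op, arg = queues[q][pc[q]]
            if op == "wait":
                if flags.get(arg, 0) == 0:
                    continue      # flag is 0: this queue stays blocked
                flags[arg] = 0    # flag is 1: reset it and move on
            elif op == "set":
                flags[arg] = 1    # earlier ops in this queue are already done
            else:
                trace.append(arg)
            pc[q] += 1
            progressed = True
        if not progressed:
            raise RuntimeError("deadlock: every remaining queue waits on a flag")
    return trace


synced = {
    "MTE2": [("exec", "copy_in"), ("set", "MTE2_to_V")],
    "Vector": [("wait", "MTE2_to_V"), ("exec", "vadd")],
}
unsynced = {
    "MTE2": [("exec", "copy_in")],
    "Vector": [("exec", "vadd")],
}
# Poll the Vector queue first to model the unfavorable interleaving.
print(run(synced, ["Vector", "MTE2"]))    # ['copy_in', 'vadd']
print(run(unsynced, ["Vector", "MTE2"]))  # ['vadd', 'copy_in']
```

With the flag pair in place, `vadd` runs after `copy_in` regardless of the polling order. Without it, the vector instruction can run before its input data has been moved in.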
Ascend C provides APIs for synchronization control. In general, you do not need to consider synchronization when you follow the programming model and paradigm described in Programming Model, because the programming model implements synchronization control for you. Using the programming model and paradigm is recommended, because manual synchronization control can complicate programming.
Even so, understanding the basic principles of synchronization helps you understand and design parallel computing programs. In a few cases, you need to insert synchronization manually. For details, see When Do I Need to Manually Insert Synchronization.
