TPosition
To manage physical memory at different levels, Ascend C uses an abstract logical position (TPosition) to represent each level. The logical position stands in for the on-chip physical storage and hides the hardware architecture. The main TPosition types are VECIN, VECOUT, VECCALC, A1, A2, B1, B2, CO1, and CO2. VECIN, VECCALC, and VECOUT are used for vector programming, while A1, A2, B1, B2, CO1, and CO2 are used for matrix programming. For the basic concepts of TPosition, see Programming Paradigm. For the mapping between TPosition and physical storage, see Table 1.
TPosition is defined as follows:
    enum class TPosition : uint8_t {
        GM,
        A1,
        A2,
        B1,
        B2,
        C1,
        C2,
        CO1,
        CO2,
        VECIN,
        VECOUT,
        VECCALC,
        LCM = VECCALC,
        SPM,
        SHM = SPM,
        TSCM,
        C2PIPE2GM,
        C2PIPE2LOCAL,
        MAX,
    };
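The definition above relies on implicit enumerator numbering and declares two aliases (LCM and SHM). The following is a minimal standalone sketch, not the Ascend C header itself, that copies the enum as listed and checks at compile time what that numbering implies. The helper `ToIndex` is hypothetical and exists only for this illustration; the numeric values follow from the listing shown here and may differ in other versions of the library.

```cpp
#include <cstdint>

// Standalone copy of the enum as listed above, so the checks compile
// without the Ascend C headers. This is an illustration, not the real header.
enum class TPosition : std::uint8_t {
    GM, A1, A2, B1, B2, C1, C2, CO1, CO2,
    VECIN, VECOUT, VECCALC,
    LCM = VECCALC,   // alias: same underlying value as VECCALC
    SPM,
    SHM = SPM,       // alias: same underlying value as SPM
    TSCM, C2PIPE2GM, C2PIPE2LOCAL, MAX,
};

// Hypothetical helper (not part of Ascend C): underlying integer value.
constexpr unsigned ToIndex(TPosition pos) {
    return static_cast<unsigned>(pos);
}

// Aliases compare equal to the enumerators they are defined from.
static_assert(TPosition::LCM == TPosition::VECCALC, "LCM aliases VECCALC");
static_assert(TPosition::SHM == TPosition::SPM, "SHM aliases SPM");

// Implicit numbering starts at 0 and continues after each alias.
static_assert(ToIndex(TPosition::GM) == 0, "GM is the first enumerator");
static_assert(ToIndex(TPosition::VECCALC) == 11, "VECCALC follows VECOUT");
static_assert(ToIndex(TPosition::SPM) == 12, "SPM continues after LCM");
static_assert(ToIndex(TPosition::MAX) == 16, "MAX is the last enumerator");
```

Because LCM and SHM share values with VECCALC and SPM, code that switches on a TPosition cannot list both an alias and its original as separate cases.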
The enumerated values of TPosition are defined as follows.
| Enumerated Value | Description |
|---|---|
| GM | Global memory, corresponding to the external memory of AI Core. |
| VECIN | Used for vector computation; storage location of the move-in data. This location is used when data is moved in to the Vector Unit. |
| VECOUT | Used for vector computation; storage location of the move-out data. This location is used when moving out the result from the Vector Unit. |
| VECCALC | Used for vector/matrix computation. This location is used when temporary variables are required for the computation. |
| A1 | Used for matrix computation and used to store the entire matrix A, which is similar to the level-2 cache in the multi-level cache of the CPU. |
| B1 | Used for matrix computation and used to store the entire matrix B, which is similar to the level-2 cache in the multi-level cache of the CPU. |
| C1 | Used for matrix computation and used to store the entire bias matrix, which is similar to the level-2 cache in the multi-level cache of the CPU. |
| A2 | Used for matrix computation and used to store the split smaller matrix A, which is similar to the level-1 cache in the multi-level cache of the CPU. |
| B2 | Used for matrix computation and used to store the split smaller matrix B, which is similar to the level-1 cache in the multi-level cache of the CPU. |
| C2 | Used for matrix computation and used to store the split smaller bias matrix, which is similar to the level-1 cache in the multi-level cache of the CPU. |
| CO1 | Used for matrix computation and used to store the small-block result matrix C, which can be considered as Cube Out. |
| CO2 | Used for matrix computation and used to store the entire result matrix C, which can be considered as Cube Out. |
| SPM | Used to temporarily store data in the Unified Buffer when the Unified Buffer may overflow. |
| TSCM | Used to temporarily swap data to extra space for Matmul operation. TSCM is short for Temp Swap Cache Memory. |
| C2PIPE2GM | Used to store FixPipe quantization parameters. |
| C2PIPE2LOCAL | Reserved parameter for future use. |
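The split between vector and matrix positions described in this section can be sketched as two compile-time predicates. This is an illustration under stated assumptions: `IsVectorPosition` and `IsMatrixPosition` are hypothetical names that do not come from Ascend C, the enum is repeated so that the snippet compiles on its own, and the grouping simply follows the descriptions in this section (C1 and C2 are counted as matrix positions because their descriptions say they are used for matrix computation).

```cpp
#include <cstdint>

// Standalone copy of the enum from this section (not the real header).
enum class TPosition : std::uint8_t {
    GM, A1, A2, B1, B2, C1, C2, CO1, CO2,
    VECIN, VECOUT, VECCALC,
    LCM = VECCALC,
    SPM,
    SHM = SPM,
    TSCM, C2PIPE2GM, C2PIPE2LOCAL, MAX,
};

// Hypothetical predicate: positions this section lists for vector programming.
// VECCALC is grouped here following the introductory paragraph, although its
// description notes that it also serves matrix computation.
constexpr bool IsVectorPosition(TPosition pos) {
    return pos == TPosition::VECIN ||
           pos == TPosition::VECOUT ||
           pos == TPosition::VECCALC;
}

// Hypothetical predicate: positions whose descriptions in this section
// say they are used for matrix computation and hold matrix A, B, bias, or C.
constexpr bool IsMatrixPosition(TPosition pos) {
    return pos == TPosition::A1 || pos == TPosition::A2 ||
           pos == TPosition::B1 || pos == TPosition::B2 ||
           pos == TPosition::C1 || pos == TPosition::C2 ||
           pos == TPosition::CO1 || pos == TPosition::CO2;
}

static_assert(IsVectorPosition(TPosition::VECIN), "VECIN is a vector position");
static_assert(IsMatrixPosition(TPosition::CO1), "CO1 is a matrix position");
static_assert(!IsVectorPosition(TPosition::GM) && !IsMatrixPosition(TPosition::GM),
              "GM belongs to neither group");
```

Positions such as GM, SPM, TSCM, C2PIPE2GM, and C2PIPE2LOCAL fall into neither group in this sketch; how they should be classified in real code depends on the target hardware and is left open here.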