Overview
Basic APIs abstract the hardware capabilities and expose the capabilities of the chip, ensuring completeness and compatibility. APIs marked as Instruction Set Architecture Special Interface (ISASI) depend on the hardware architecture, so their compatibility across hardware versions cannot be guaranteed.
The APIs are classified into the following types based on their functions:
- Scalar compute APIs: These APIs are used to call the Scalar Unit to perform computation.
- Vector compute APIs: These APIs are used to call the Vector Unit to perform computation.
- Cube compute APIs: These APIs are used to call the Cube Unit to perform computation.
- Data movement APIs: The compute APIs perform computation based on data in the local memory. Therefore, data needs to be moved from the global memory to the local memory, computed by calling the compute APIs, and then moved from the local memory to the global memory. The APIs that move data are called data movement APIs, for example, the DataCopy API.
- Resource management APIs: These APIs, such as the AllocTensor and FreeTensor APIs, are used to allocate and manage memory.
- Synchronization control APIs: These APIs, such as the EnQue and DeQue APIs, implement communication and synchronization between tasks. Instructions issued by different APIs may depend on each other. As described in Abstract Hardware Architecture, instructions in different instruction queues are executed asynchronously and in parallel. To ensure that they execute in the correct logical order, synchronization instructions must be sent to the relevant units. Synchronization control APIs send these synchronization instructions internally, so you can simply call the APIs without concerning yourself with the internal implementation.
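The flow described above (move data in, compute on local data, move data out, with queues handing data between stages) can be sketched in plain C++. This is a host-side model for illustration only, not Ascend C code: Tensor, Tile, and AddByTiles are hypothetical names, std::queue merely stands in for the role that EnQue and DeQue play, and the stages run sequentially here rather than asynchronously.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical host-side model; none of these names belong to the real API.
using Tensor = std::vector<float>;

// One tile of both inputs, held in "local memory".
struct Tile {
    Tensor a;
    Tensor b;
};

// Adds two "global memory" tensors tile by tile:
// copy in -> compute -> copy out, with queues between the stages.
Tensor AddByTiles(const Tensor& gmA, const Tensor& gmB, std::size_t tileLen) {
    assert(gmA.size() == gmB.size());
    assert(tileLen > 0);

    Tensor gmOut(gmA.size(), 0.0f);
    std::queue<Tile> inQueue;     // data movement stage -> compute stage
    std::queue<Tensor> outQueue;  // compute stage -> data movement stage

    for (std::size_t offset = 0; offset < gmA.size(); offset += tileLen) {
        const std::size_t n = std::min(tileLen, gmA.size() - offset);
        const auto first = static_cast<std::ptrdiff_t>(offset);
        const auto last = static_cast<std::ptrdiff_t>(offset + n);

        // Copy in: global memory -> local memory, then enqueue the tile.
        Tile in{Tensor(gmA.begin() + first, gmA.begin() + last),
                Tensor(gmB.begin() + first, gmB.begin() + last)};
        inQueue.push(std::move(in));

        // Compute: dequeue the tile and operate on local data only.
        Tile local = std::move(inQueue.front());
        inQueue.pop();
        Tensor result(n);
        for (std::size_t i = 0; i < n; ++i) {
            result[i] = local.a[i] + local.b[i];
        }
        outQueue.push(std::move(result));

        // Copy out: dequeue the result and write it back to global memory.
        Tensor out = std::move(outQueue.front());
        outQueue.pop();
        std::copy(out.begin(), out.end(), gmOut.begin() + first);
    }
    return gmOut;
}
```

In this model, calling AddByTiles(a, b, 2) on two five-element tensors processes three tiles of 2, 2, and 1 elements. On real hardware the queues also carry the synchronization between units, which this sequential model does not reproduce.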
The APIs are classified into the following types based on their data operation methods:
- Contiguous compute APIs: These APIs support computation of the first n data elements of a tensor. They compute n contiguous data elements of the source operand and contiguously write the data to the destination operand to solve the contiguous compute problem of the one-dimensional tensor.
Add(dst, src1, src2, n);
- High-dimensional sharding APIs: These APIs support repeat and stride. They are flexible compute APIs that provide the same programming capabilities as the built-in APIs and fully leverage the hardware. For each operand, they support parameters such as DataBlock Stride, Repeat Stride, and Mask.
The following figure uses vector addition as an example to illustrate the characteristics of contiguous compute APIs and high-dimensional sharding APIs.
Figure 1 Characteristics of computation methods of compute APIs
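The difference between the two computation methods can also be sketched in plain C++. The model below is a simplified illustration, not the actual API: AddContiguous, AddStrided, Strides, kBlockLen, and kBlocksPerRepeat are hypothetical names and values. It also assumes that the mask selects the first mask elements of each repeat and that both strides are counted in data blocks; refer to the API reference for the real units and limits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model parameters, not taken from the hardware specification.
constexpr std::size_t kBlockLen = 8;         // elements per data block
constexpr std::size_t kBlocksPerRepeat = 8;  // data blocks handled per repeat

// Per-operand strides, both counted in data blocks (an assumption of this model).
struct Strides {
    std::size_t blockStride;   // step between data blocks inside one repeat
    std::size_t repeatStride;  // step between the first blocks of two repeats
};

// Contiguous form: add the first n elements.
void AddContiguous(std::vector<float>& dst, const std::vector<float>& src0,
                   const std::vector<float>& src1, std::size_t n) {
    assert(n <= dst.size() && n <= src0.size() && n <= src1.size());
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src0[i] + src1[i];
    }
}

// Element index addressed by (repeat r, block b, element e within the block).
std::size_t ElementIndex(const Strides& s, std::size_t r, std::size_t b,
                         std::size_t e) {
    return (r * s.repeatStride + b * s.blockStride) * kBlockLen + e;
}

// Strided form: each repeat handles up to kBlocksPerRepeat data blocks, and
// only the first `mask` elements of each repeat take part in the computation.
void AddStrided(std::vector<float>& dst, const std::vector<float>& src0,
                const std::vector<float>& src1, std::size_t mask,
                std::size_t repeatTimes, const Strides& dstS,
                const Strides& src0S, const Strides& src1S) {
    assert(mask <= kBlockLen * kBlocksPerRepeat);
    for (std::size_t r = 0; r < repeatTimes; ++r) {
        for (std::size_t lane = 0; lane < mask; ++lane) {
            const std::size_t b = lane / kBlockLen;  // data block in the repeat
            const std::size_t e = lane % kBlockLen;  // element in the data block
            const std::size_t d = ElementIndex(dstS, r, b, e);
            const std::size_t x = ElementIndex(src0S, r, b, e);
            const std::size_t y = ElementIndex(src1S, r, b, e);
            assert(d < dst.size() && x < src0.size() && y < src1.size());
            dst[d] = src0[x] + src1[y];
        }
    }
}
```

With a block stride of 1 and a repeat stride of kBlocksPerRepeat on every operand, and a full mask, AddStrided touches the same elements as AddContiguous. Changing the mask or the strides of a single operand lets the same call skip elements or gather non-adjacent data blocks, which is the flexibility the high-dimensional sharding APIs offer.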