Instructions
Description
The APIs provided in this section are used to implement the data queue and data forwarding functions on the device.
A queue can be operated by only a single producer or a single consumer. Perform enqueue or dequeue operations on a queue from a single thread; concurrent operations on the same queue are not supported.
Typical Application Scenario Example
You can use the queue management APIs and queue-based model loading APIs (aclmdlLoadFromFileWithQ or aclmdlLoadFromMemWithQ APIs) to implement data-driven model inference. The key API call sequence in this scenario is as follows:
- Call the acltdtCreateQueue API multiple times to create the queues that store model input and output data. Each input corresponds to one input queue ID, and each output corresponds to one output queue ID.
- Call a queue-based model loading API (aclmdlLoadFromFileWithQ or aclmdlLoadFromMemWithQ) to pass in the model, the input queue IDs, and the output queue IDs.
- Allocate the memory for storing the input and output data.
- Read the input data for model inference into the memory and call the acltdtEnqueueData API to enqueue each input.
At this point, AscendCL automatically triggers model inference based on the enqueued input data. After the inference is complete, the result data is automatically stored in the output queues.
- Call the acltdtDequeueData API to dequeue each output for post-processing.
- After obtaining the inference result, release resources promptly: free the input and output memory, call the aclmdlUnload API to unload the model, and call the acltdtDestroyQueue API to destroy each queue.