Preparations

Determining Service Process

Break the service process into modules by function, such as object detection, image classification, and property recognition. For example, the service process in Figure 1 is divided into the following stages: image obtaining, image decoding, image resizing, object detection, image cropping, image resizing, image classification, serialization, and result sending.

Figure 1 Typical service process
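
One way to make the modularization concrete is to write the stages down as data before choosing any plugins. The sketch below lists the stages named for Figure 1 in order. The `Stage` class, the `kind` labels, and the helper functions are illustrative assumptions and are not part of Vision SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One functional module of the service process (hypothetical helper)."""
    name: str
    kind: str  # illustrative grouping label, not a Vision SDK concept


# Stages in the order given for Figure 1.
SERVICE_PROCESS = [
    Stage("image obtaining", "input"),
    Stage("image decoding", "media processing"),
    Stage("image resizing", "media processing"),
    Stage("object detection", "inference"),
    Stage("image cropping", "media processing"),
    Stage("image resizing", "media processing"),
    Stage("image classification", "inference"),
    Stage("serialization", "output"),
    Stage("result sending", "output"),
]


def describe(process):
    """Render the process as a single left-to-right chain of stage names."""
    return " -> ".join(stage.name for stage in process)


def stages_of_kind(process, kind):
    """Return the names of all stages with the given kind, in process order."""
    return [stage.name for stage in process if stage.kind == kind]


print(describe(SERVICE_PROCESS))
print(stages_of_kind(SERVICE_PROCESS, "inference"))
```

Listing the stages this way makes it easy to see that image resizing occurs twice, once before detection and once before classification, so two separate resize steps have to be planned for.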

Searching for Proper Plugins

Match each service function to an existing Vision SDK plugin based on the plugin's function description and specifications. Table 1 lists the available plugins. For details about plugin usage, see Plugins.

If the plugins provided by Vision SDK cannot meet your functional requirements, develop a custom plugin as described in (Optional) Developing a Plugin.
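
The matching step can be treated as a lookup: for each required function, record the plugin that covers it, and flag any function without a match as a candidate for a custom plugin. The following is a minimal sketch of that bookkeeping. The requirement names and the `find_gaps` helper are hypothetical; only the plugin names come from Table 1.

```python
# Required function -> chosen Table 1 plugin, or None when nothing fits.
# "watermark removal" is a made-up requirement used only to show a gap.
REQUIREMENTS = {
    "decode JPEG input": "mxpi_imagedecoder",
    "resize to model input size": "mxpi_imageresize",
    "track objects across frames": "mxpi_motsimplesortV2",
    "watermark removal": None,
}


def find_gaps(requirements):
    """Return the required functions that no existing plugin covers, sorted."""
    return sorted(name for name, plugin in requirements.items() if plugin is None)


gaps = find_gaps(REQUIREMENTS)
for name in gaps:
    print(f"No existing plugin for {name!r}: consider a custom plugin")
```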

Table 1 Plugin list

| Plugin Type | Plugin Name | Function Description |
| --- | --- | --- |
| Input plugins | appsrc | Receives the data that the application sends to the stream and passes it to the downstream element. |
| Input plugins | mxpi_rtspsrc | Receives the video path passed in through the external API, pulls the video stream, stores the pulled raw stream in a buffer, and sends the data to the downstream plugin. |
| Output plugins | mxpi_dataserialize | Assembles the stream result into a JSON string for output. |
| Output plugins | appsink | Obtains data from streams. |
| Output plugins | fakesink | Dummy (black hole) plugin that swallows data in order to discard data that is not needed. |
| Output plugins | filesink | Writes the input data to a file and saves the file to the local host. |
| Streaming plugins | mxpi_parallel2serial | Outputs the data received on multiple input ports in sequence through a single output port. |
| Streaming plugins | mxpi_distributor | Sends data of a specified class or channel to different ports. |
| Streaming plugins | mxpi_synchronize | Pushes data to the output port only after all input ports have received data. |
| Streaming plugins | queue | Creates a buffer queue and hands output to a separate thread for subsequent processing, which decouples input from output. Data that has not yet been sent to the downstream plugin is held in the queue. |
| Streaming plugins | tee | Distributes each input data record to multiple outputs. |
| Streaming plugins | mxpi_datatransfer | Transfers memory data between the device and host. |
| Streaming plugins | mxpi_nmsoverlapedroiV2 | Filters out duplicate objects in the overlapping areas after an image is divided into blocks. |
| Streaming plugins | mxpi_roigenerator | Automatically generates object frames from the user-specified number, size, and overlap parameters of image blocks. |
| Streaming plugins | mxpi_semanticsegstitcher | Merges the images produced by semantic segmentation inference. |
| Streaming plugins | mxpi_objectselector | Filters the postprocessing results based on the area size, area boundary, and confidence threshold during multi-level inference. |
| Streaming plugins | mxpi_skipframe | Skips frames in the data stream. |
| Media data processing plugins | mxpi_imagedecoder | Decodes images in JPG, JPEG, or BMP format. |
| Media data processing plugins | mxpi_imageresize | Resizes decoded YUV and RGB images to the specified width and height. YUV_420 images are supported at both 4K and 8K. Other YUV types, such as YUV422 and YUV444, are supported only at 4K. The RGB format can be RGB888 or BGR888. |
| Media data processing plugins | mxpi_imagecrop | Crops images. |
| Media data processing plugins | mxpi_videodecoder | Decodes videos. Currently, only the H.264 and H.265 formats are supported. |
| Media data processing plugins | mxpi_videoencoder | Encodes videos. |
| Media data processing plugins | mxpi_imageencoder | Encodes images. |
| Media data processing plugins | mxpi_imagenormalize | Performs image normalization or standardization. |
| Media data processing plugins | mxpi_opencvcentercrop | Crops images around the image center. |
| Media data processing plugins | mxpi_warpperspective | Applies a perspective transformation. Used when a detection box produced by inference is a rotated rectangle that needs to be rectified into an upright rectangle. The output is the cropped image information of each detection box, and each cropped image is obtained through perspective transformation. |
| Media data processing plugins | mxpi_rotation | Rotates images. |
| Inference plugins | mxpi_modelinfer | Classifies or detects objects. (This plugin will no longer evolve as of this version. Use mxpi_tensorinfer instead.) |
| Inference plugins | mxpi_tensorinfer | Classifies or detects objects. |
| Model postprocessing plugins | mxpi_objectpostprocessor | Inherits the image postprocessing base class and postprocesses the output tensor of object detection model inference. |
| Model postprocessing plugins | mxpi_classpostprocessor | Inherits the model postprocessing base class and postprocesses the output tensor of classification model inference. |
| Model postprocessing plugins | mxpi_semanticsegpostprocessor | Inherits the image postprocessing base class and postprocesses the output tensor of semantic segmentation model inference. |
| Model postprocessing plugins | mxpi_textgenerationpostprocessor | Inherits the model postprocessing base class and postprocesses the output tensor of text generation (including translation, text recognition, and speech recognition) model inference. |
| Model postprocessing plugins | mxpi_textobjectpostprocessor | Inherits the image postprocessing base class and postprocesses the output tensor of text object bounding box detection model inference. |
| Model postprocessing plugins | mxpi_keypointpostprocessor | Inherits the image postprocessing base class and postprocesses the output tensor of posture detection model inference. |
| IVA plugins | mxpi_motsimplesort | Implements multi-object tracking. (This plugin will no longer evolve as of this version. Use mxpi_motsimplesortV2 instead.) |
| IVA plugins | mxpi_motsimplesortV2 | Implements multi-object tracking. |
| IVA plugins | mxpi_facealignment | Aligns detected object images. |
| IVA plugins | mxpi_qualitydetection | Diagnoses video quality: analyzes the quality of decoded video images and generates alarms when exceptions occur. |
| Debugging plugins | mxpi_dumpdata | Exports the MxpiBuffer data of the upstream plugin in JSON format. |
| Debugging plugins | mxpi_loaddata | Loads the files exported by mxpi_dumpdata and restores them to MxpiBuffer. This plugin must be used together with filesrc: as the upstream plugin of mxpi_loaddata, filesrc reads the file content and sends it to mxpi_loaddata. |
| Screen display plugins | mxpi_opencvosd | Calls basic on-screen display (OSD) functions to draw basic units, such as frames, text, lines, and circles, on images. |
| Screen display plugins | mxpi_object2osdinstances | Converts object frames to drawing units. |
| Screen display plugins | mxpi_class2osdinstances | Converts classification results to drawing units. |
| Screen display plugins | mxpi_osdinstancemerger | Aggregates drawing units from multiple input ports. |
| Screen display plugins | mxpi_channelselector | Passes through buffers of the specified channel ID, filters out buffers of other channels, and clears all metadata except the frame information. |
| Screen display plugins | mxpi_channelimagesstitcher | Combines multiple images into one large image, dynamically outputs the preprocessing information of each image, and provides that information to the coordinate assembly plugin. |
| Screen display plugins | mxpi_channelosdcoordsconverter | Converts coordinates across multiple channels: receives the drawing units and combination information (coordinate offsets) from each channel and outputs the aggregated coordinate conversion result. |
| Screen display plugins | mxpi_bufferstablizer | Automatically sends empty buffers when no buffer has been input for a specified period, until a new buffer arrives. |
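
To illustrate how the plugins in Table 1 can be lined up against the stages of the typical service process, the sketch below maps each stage to one or more plugin names and links them into an ordered chain. The plugin names come from Table 1. The stage-to-plugin choices, the `build_chain` helper, the numbered element names, and the `factory`/`next` layout of the result are assumptions made for illustration; see Plugins for the configuration format that Vision SDK actually expects.

```python
import json

# Illustrative stage -> plugin choices; plugin names are taken from Table 1.
STAGE_TO_PLUGINS = {
    "image obtaining": ["appsrc"],
    "image decoding": ["mxpi_imagedecoder"],
    "image resizing": ["mxpi_imageresize"],
    "object detection": ["mxpi_tensorinfer", "mxpi_objectpostprocessor"],
    "image cropping": ["mxpi_imagecrop"],
    "image classification": ["mxpi_tensorinfer", "mxpi_classpostprocessor"],
    "serialization": ["mxpi_dataserialize"],
    "result sending": ["appsink"],
}

SERVICE_PROCESS = [
    "image obtaining", "image decoding", "image resizing", "object detection",
    "image cropping", "image resizing", "image classification",
    "serialization", "result sending",
]


def build_chain(process, stage_to_plugins):
    """Link the plugins for each stage into an ordered chain (assumed layout)."""
    factories = []
    for stage in process:
        if stage not in stage_to_plugins:
            raise KeyError(f"no plugin chosen for stage {stage!r}")
        factories.extend(stage_to_plugins[stage])

    # Give each element a unique name by numbering repeated plugins.
    counts = {}
    names = []
    for factory in factories:
        index = counts.get(factory, 0)
        counts[factory] = index + 1
        names.append(f"{factory}{index}")

    chain = {}
    for i, (name, factory) in enumerate(zip(names, factories)):
        element = {"factory": factory}
        if i + 1 < len(names):
            element["next"] = names[i + 1]  # the last element has no successor
        chain[name] = element
    return chain


chain = build_chain(SERVICE_PROCESS, STAGE_TO_PLUGINS)
print(json.dumps(chain, indent=2))
```

Because image resizing and inference each appear twice in the process, the sketch numbers the elements (for example, `mxpi_imageresize0` and `mxpi_imageresize1`) so that each step can later be configured independently.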

Preparing an Inference Model

  1. Prepare an inference model that meets your service requirements and convert it to an OM model as described in CANN ATC Instructions.
  2. Prepare the model postprocessing. For the postprocessing that Vision SDK provides, see Model Support.
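
As a rough illustration of the conversion in step 1, the sketch below assembles an ATC command line without running it. The file names, the `Ascend310` target, the assumption that the source model is in ONNX format, and the `build_atc_command` helper are all hypothetical. Confirm the options and their values for your CANN version against CANN ATC Instructions before use.

```python
import shlex


def build_atc_command(model_path, output_prefix, soc_version, framework=5):
    """Assemble (but do not run) an ATC command line as a list of arguments."""
    return [
        "atc",
        f"--model={model_path}",
        f"--framework={framework}",  # 5 is assumed here to select ONNX
        f"--output={output_prefix}",
        f"--soc_version={soc_version}",
    ]


# Hypothetical input model, output name, and target chip.
cmd = build_atc_command("detector.onnx", "detector", "Ascend310")
print(shlex.join(cmd))
```

Building the command as a list keeps each option separate, so it can be reviewed or logged first and later passed to `subprocess.run` unchanged.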