Development Using APIs (C++)
The samples described in this section apply to the Atlas inference product and
Sample Overview
This section uses the Atlas inference product as an example to demonstrate how to develop an image object detection application with the Vision SDK C++ APIs. Figure 1 shows the inference process of an image object detection model. A YOLOv3 model built with the TensorFlow framework is used as an example.
Preparations
- Install and deploy Vision SDK, and then run the quick start sample.
Table 1 Required dependencies

| Dependency | Version | Link |
| --- | --- | --- |
| OS | For details, see Supported Hardware and Operating Systems. | - |
| System dependency | - | |
| CANN development kit | 8.1.RC1 | |
| npu-driver | Ascend HDK 25.0.RC1 | Click here. In the Select Resource area on the left, filter the required software packages, confirm the version information, and download the software packages. For details, see Driver and Firmware Installation and Upgrade Guides of each hardware product. |
| npu-firmware | Ascend HDK 25.0.RC1 | Same as npu-driver. |
| NumPy | 1.25.2 | pip3 install numpy==1.25.2 |
- Obtain the sample code.
- Log in to the development environment where Vision SDK is installed, and upload the sample code package.
- Decompress the sample code package and go to the decompressed directory.
    unzip YoloV3Infer.zip
    cd YoloV3Infer
The following is an example of the directory structure of the sample code.
    YoloV3Infer
    ├── model
    │   ├── yolov3.names                     # YOLOv3 postprocessing label file
    │   ├── yolov3_tf_bs1_fp16.cfg           # YOLOv3 postprocessing configuration file
    │   └── aipp_yolov3_416_416.aippconfig   # YOLOv3.om model AIPP conversion file
    ├── main.cpp                             # Main program file
    ├── CMakeLists.txt
    ├── run.sh                               # Script for running the program. It is recommended that you run "dos2unix run.sh" to format the script before running the program.
    ├── README.md
    └── test.jpg                             # Test image prepared by the user
- Prepare the yolov3_tf.pb model for inference by referring to section "Preparing a Model" in README.md (located in the directory decompressed in the previous step).
- Prepare image data for inference.
Use your own image for the test and rename it to test.jpg. The following image is used for demonstration.
Figure 2 test.jpg
If issues such as unavailable CMake occur on the openEuler system, see System Commands Yum and Cmake Are Unavailable to solve the issues.
Code Parsing
The key steps and code of this sample are described below. The snippets are excerpts and cannot be directly copied for compilation or running. For details about the complete sample code, see the sample file.
- Initialize resources and configure model-related variables, such as paths of the model, configuration file, and label.
    // Initialize resources and variables.
    const uint32_t YOLOV3_RESIZE = 416; // Image resizing size
    // Model path (OM model file automatically generated after the run.sh script is executed. The model file is stored in the ./model directory)
    std::string yolov3ModelPath = "./model/yolov3_tf_bs1_fp16.om";
    std::string yolov3ConfigPath = "./model/yolov3_tf_bs1_fp16.cfg"; // Postprocessing configuration file path
    std::string yolov3LabelPath = "./model/yolov3.names"; // Postprocessing label file path

    v2Param.deviceId = 0; // Configuration
    v2Param.labelPath = yolov3LabelPath;
    v2Param.configPath = yolov3ConfigPath;
    v2Param.modelPath = yolov3ModelPath;

    APP_ERROR ret = MxBase::MxInit();
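Because the OM model is generated by run.sh and the other paths are relative, a missing file is a common reason for a failure later in the pipeline. The following is a minimal sketch of a pre-check that could run before initialization. The helper name FindUnreadableFiles is hypothetical and is not part of Vision SDK; it uses only the C++ standard library.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not a Vision SDK API): returns the subset of `paths`
// that cannot be opened for reading from the current working directory.
std::vector<std::string> FindUnreadableFiles(const std::vector<std::string>& paths) {
    std::vector<std::string> unreadable;
    for (const std::string& path : paths) {
        std::ifstream file(path, std::ios::binary);
        if (!file.is_open()) {
            unreadable.push_back(path);
        }
    }
    return unreadable;
}
```

For example, calling FindUnreadableFiles({yolov3ModelPath, yolov3ConfigPath, yolov3LabelPath}) and logging any returned entries before MxInit would report a missing model file by name instead of as a later initialization error.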
- Preprocess the input data. After MxInit has initialized the resources, construct the ImageProcessor object, decode the image into an Image object, resize it, and convert it into the data format (tensor type) required for inference.
    // Preprocessing
    // Construct the image processing class.
    MxBase::ImageProcessor imageProcessor(deviceId);

    // Construct the decoded image class.
    MxBase::Image decodedImage;
    // Perform decoding based on the image path.
    ret = imageProcessor.Decode(imgPath, decodedImage, ImageFormat::YUV_SP_420);

    MxBase::Image resizeImage;
    // Resizing size
    MxBase::Size resizeConfig(YOLOV3_RESIZE, YOLOV3_RESIZE);
    // Perform resizing.
    ret = imageProcessor.Resize(decodedImage, resizeConfig, resizeImage, MxBase::Interpolation::HUAWEI_HIGH_ORDER_FILTER);

    std::string path = "./resized_yolov3_416.jpg";
    // Encode the resized image and output it to the specified path.
    ret = imageProcessor.Encode(resizeImage, path);

    // Convert the Image object to a Tensor.
    MxBase::Tensor tensorImg = resizeImage.ConvertToTensor();
    // Set the ID of the device where Tensor is located.
    ret = tensorImg.ToDevice(deviceId);
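The decode step above requests the YUV_SP_420 format, which stores a full-resolution Y plane followed by an interleaved UV plane at quarter resolution, that is, 1.5 bytes per pixel. The following sketch shows the size arithmetic only. The function names and the alignment parameters are assumptions for illustration; the actual stride alignment used by the hardware is not stated in this section and should be taken from the product documentation.

```cpp
#include <cstddef>
#include <cstdint>

// Rounds `value` up to the next multiple of `alignment` (alignment must be > 0).
constexpr std::uint32_t AlignUp(std::uint32_t value, std::uint32_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Bytes needed for one YUV420 semi-planar frame: Y plane (w * h) plus
// interleaved UV plane (w * h / 2). The alignment defaults of 2 only keep
// both dimensions even; pass larger values to model an assumed stride rule.
constexpr std::size_t Yuv420spBytes(std::uint32_t width, std::uint32_t height,
                                    std::uint32_t widthAlign = 2,
                                    std::uint32_t heightAlign = 2) {
    return static_cast<std::size_t>(AlignUp(width, widthAlign)) *
           AlignUp(height, heightAlign) * 3 / 2;
}
```

With the 416 x 416 resize used in this sample, Yuv420spBytes(416, 416) evaluates to 259584 bytes, which is a quick sanity check against the size of the tensor produced by preprocessing.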
- Construct the model object, pass the tensor built during preprocessing to the Infer API, and obtain the model output yoloV3Outputs.
    // Model inference
    // Construct the model class.
    MxBase::Model yoloV3(modelPath, deviceId);

    // Construct the batch tensor as the input parameter of the Infer API.
    std::vector<MxBase::Tensor> yoloV3Inputs = {tensorImg};
    // Perform model inference.
    std::vector<MxBase::Tensor> yoloV3Outputs = yoloV3.Infer(yoloV3Inputs);
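A standard YOLOv3 network produces three output feature maps, one per detection scale, at strides 32, 16, and 8 relative to the input. The sketch below computes the grid size and channel count expected for each scale. It assumes 3 anchors per scale and 80 classes (a COCO-style label file); neither value is stated in this section, and the tensor layout and output order of the converted OM model may differ, so compare against the real shapes returned by GetShape().

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct YoloHeadShape {
    std::uint32_t gridSize;  // Number of cells per side of the feature map
    std::uint32_t channels;  // anchors * (4 box values + 1 objectness + classes)
};

// Expected per-scale shapes for a standard YOLOv3 head.
// classCount and anchorsPerScale are assumptions, not values from the sample.
std::vector<YoloHeadShape> ExpectedYoloV3Heads(std::uint32_t inputSize,
                                               std::uint32_t classCount = 80,
                                               std::uint32_t anchorsPerScale = 3) {
    const std::uint32_t strides[] = {32, 16, 8};
    std::vector<YoloHeadShape> heads;
    for (std::uint32_t stride : strides) {
        heads.push_back({inputSize / stride, anchorsPerScale * (5 + classCount)});
    }
    return heads;
}

// Total number of candidate boxes across all scales before filtering.
std::size_t TotalCandidateBoxes(const std::vector<YoloHeadShape>& heads,
                                std::uint32_t anchorsPerScale = 3) {
    std::size_t total = 0;
    for (const YoloHeadShape& head : heads) {
        total += static_cast<std::size_t>(head.gridSize) * head.gridSize * anchorsPerScale;
    }
    return total;
}
```

For a 416 x 416 input, this gives grids of 13, 26, and 52 cells per side with 255 channels each, and 10647 candidate boxes in total under the stated assumptions.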
- Postprocess the model output. The postprocessing module provided by Vision SDK (or developed by yourself) can be used to obtain the bounding box and object class, and display them on the source image through OpenCV.
    // Postprocessing
    // Postprocessing source image information
    MxBase::ImageInfo imageInfo;
    imageInfo.oriImagePath = argv[1];
    imageInfo.oriImage = decodedImage;
    // Execute the postprocessing function.
    ret = YoloV3PostProcess(imageInfo, v2Param.configPath, v2Param.labelPath, yoloV3Outputs);

    // Main logic of the postprocessing function of YoloV3PostProcess
    // Create postprocessing configuration information.
    std::map<std::string, std::string> postConfig;
    postConfig.insert(pair<std::string, std::string>("postProcessConfigPath", yoloV3ConfigPath));
    postConfig.insert(pair<std::string, std::string>("labelPath", yoloV3LabelPath));
    // Initialize the postprocessing class.
    MxBase::Yolov3PostProcess yolov3PostProcess;
    APP_ERROR ret = yolov3PostProcess.Init(postConfig);

    // Postprocessing
    vector<MxBase::TensorBase> tensors;
    // Construct object detection information based on the model inference result. The information is required for postprocessing implemented by Vision SDK.
    // If the postprocessing function is user-defined, construct it based on the actual situation.
    vector<vector<MxBase::ObjectInfo>> objectInfos;
    auto shape = yoloV3Outputs[0].GetShape();
    MxBase::ResizedImageInfo imgInfo;
    // Image width prior to resizing
    imgInfo.widthOriginal = imageInfo.oriImage.GetOriginalSize().width;
    // Image height prior to resizing
    imgInfo.heightOriginal = imageInfo.oriImage.GetOriginalSize().height;
    // Image width after resizing.
    imgInfo.widthResize = YOLOV3_RESIZE;
    // Image height after resizing.
    imgInfo.heightResize = YOLOV3_RESIZE;
    // Image resizing mode.
    imgInfo.resizeType = MxBase::RESIZER_STRETCHING;
    std::vector<MxBase::ResizedImageInfo> imageInfoVec = {};
    imageInfoVec.push_back(imgInfo);
    // Perform postprocessing.
    ret = yolov3PostProcess.Process(tensors, objectInfos, imageInfoVec);

    // Use OpenCV to visualize the bounding box.
    cv::putText(imgBgr, objectInfos[i][j].className, cv::Point(x0 + 10, y0 + 10), cv::FONT_HERSHEY_SIMPLEX, 1.0, cv::Scalar(0, 255, 0), 4, 8);
    cv::rectangle(imgBgr, cv::Rect(x0, y0, x1 - x0, y1 - y0), cv::Scalar(0, 255, 0), 4);

    // Perform model postprocessing deinitialization.
    ret = yolov3PostProcess.DeInit();
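The original and resized dimensions are passed to postprocessing so that boxes predicted on the 416 x 416 input can be expressed in the coordinates of the source image. The following sketch illustrates that arithmetic for the stretching resize mode, plus the corner-to-rectangle conversion used when drawing. It is an illustration of the idea rather than the Vision SDK implementation, and the Box, Rect, MapToOriginal, and ToClampedRect names are hypothetical.

```cpp
#include <algorithm>

struct Box { float x0, y0, x1, y1; };      // Top-left and bottom-right corners
struct Rect { int x, y, width, height; };  // Same fields as an x/y/width/height rectangle

// Maps a box from resized (network input) coordinates back to the original
// image, assuming a plain stretch: each axis is scaled independently.
Box MapToOriginal(const Box& box, float widthResize, float heightResize,
                  float widthOriginal, float heightOriginal) {
    const float scaleX = widthOriginal / widthResize;
    const float scaleY = heightOriginal / heightResize;
    return {box.x0 * scaleX, box.y0 * scaleY, box.x1 * scaleX, box.y1 * scaleY};
}

// Converts corner coordinates to x/y/width/height, clamping the corners to
// the image bounds so that drawing never addresses pixels outside the image.
Rect ToClampedRect(const Box& box, int imageWidth, int imageHeight) {
    const int x0 = std::max(0, std::min(static_cast<int>(box.x0), imageWidth));
    const int y0 = std::max(0, std::min(static_cast<int>(box.y0), imageHeight));
    const int x1 = std::max(x0, std::min(static_cast<int>(box.x1), imageWidth));
    const int y1 = std::max(y0, std::min(static_cast<int>(box.y1), imageHeight));
    return {x0, y0, x1 - x0, y1 - y0};
}
```

The clamping step is a design choice for robustness: a box that extends slightly past the image edge is trimmed instead of producing a negative or oversized rectangle.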
- Perform deinitialization and destroy resources.
    // Deinitialization
    ret = MxBase::MxDeInit();
    if (ret != APP_ERR_OK) {
        LogError << "MxDeInit failed, ret=" << ret << ".";
        return ret;
    }
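Each step above returns an error code, so a real program may return early before it reaches the deinitialization call. One common C++ way to keep initialization and deinitialization paired is a small scope guard, sketched below. FakeInit, FakeDeInit, and LiveResources are hypothetical stand-ins used only to make the example self-contained; they are not Vision SDK APIs, and the complete sample may handle cleanup differently.

```cpp
#include <functional>
#include <utility>

// Runs the stored callback when the guard goes out of scope.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> onExit) : onExit_(std::move(onExit)) {}
    ~ScopeGuard() { if (onExit_) { onExit_(); } }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
private:
    std::function<void()> onExit_;
};

// Hypothetical stand-ins for an init/deinit pair that returns an error code.
using AppError = int;
const AppError kOk = 0;
int g_liveResources = 0;
AppError FakeInit() { ++g_liveResources; return kOk; }
AppError FakeDeInit() { --g_liveResources; return kOk; }
int LiveResources() { return g_liveResources; }

// Returns early on failure; the guard still releases what FakeInit acquired.
AppError RunPipeline(bool failMidway) {
    AppError ret = FakeInit();
    if (ret != kOk) {
        return ret;
    }
    ScopeGuard guard([] { FakeDeInit(); });
    if (failMidway) {
        return 1;  // Simulated failure in decode, inference, or postprocessing
    }
    return kOk;
}
```

Note that a guard discards the return value of the cleanup call, so a program that needs to log a deinitialization failure, as the snippet above does, would do that inside the callback.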
Inference Running
- Configure environment variables. (The default CANN installation path /usr/local/Ascend/cann and the Vision SDK installation path /usr/local/Ascend/mxVision-{version} are used as examples.)
    source /usr/local/Ascend/cann/set_env.sh
    source /usr/local/Ascend/mxVision-{version}/set_env.sh
- Perform inference. Before running the inference script, modify the MX_SDK_HOME variable in CMakeLists.txt based on the Vision SDK installation path.
    bash run.sh
If the following information is displayed, the inference ran successfully:
    yoloV3Outputs len=3
    ******YoloV3PostProcess******
    Size of objectInfos is 1
    objectInfo-0 ,Size:1
    *****objectInfo-0:0
    x0 is 410.738
    y0 is 27.4772
    x1 is 948.388
    y1 is 645.941
    confidence is 0.758505
    classId is 16
    className is dog
    ******YoloV3PostProcess end******
After the inference is complete, a result.jpg file is generated in the current directory. As shown in Figure 3, the image displays the bounding box and class of the detected object.
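The classId and className fields in the log are related through the label file: the class ID is an index into the list of names. The sketch below shows one way such a lookup could work. It assumes that yolov3.names contains one class name per line, that lines starting with # are comments, and that blank lines carry no class; the actual file format should be confirmed against the file shipped with the sample. All function names are hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads class names from a stream, one per line. Skips blank lines and
// lines starting with '#', and tolerates Windows (CRLF) line endings.
std::vector<std::string> LoadLabels(std::istream& input) {
    std::vector<std::string> labels;
    std::string line;
    while (std::getline(input, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        if (line.empty() || line[0] == '#') {
            continue;
        }
        labels.push_back(line);
    }
    return labels;
}

// Convenience wrapper for in-memory text, useful for testing.
std::vector<std::string> LoadLabelsFromString(const std::string& text) {
    std::istringstream input(text);
    return LoadLabels(input);
}

// Returns the name for classId, or "unknown" when the ID is out of range.
std::string ClassNameFor(const std::vector<std::string>& labels, std::size_t classId) {
    return classId < labels.size() ? labels[classId] : std::string("unknown");
}
```

To read the real file, pass a std::ifstream opened on ./model/yolov3.names to LoadLabels. If the class names in the output look shifted by one, the comment and blank-line assumptions above are the first thing to check.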

