Application Development (Object Detection + Image Classification)
In practice, object detection and image classification are usually combined: the YOLOv3 model first detects objects, and the ResNet-50 model then classifies the detected regions. This section describes how to use these two mxVision models together to detect objects and classify images.
Sample Overview

Prerequisites
- Log in to the development environment as the root user and create a code directory, for example, MyFirstApp, in any directory (for example, /home).
mkdir MyFirstApp
cd MyFirstApp
- Search the web for a typical sample image for image classification. The following figure is used as the inference data image for verification.
Figure 2 Typical sample image
Go to the MyFirstApp directory and create the data directory to store the image data required for inference. Upload the downloaded image to the data directory.
mkdir data
- Create the models directory and copy the .cfg and .names files corresponding to the YOLOv3 and ResNet-50 models from the mxVision installation directory to the models directory ({version} indicates the version number and is user-defined).
mkdir models
cp /home/mxVision-{version}/samples/mxVision/models/yolov3/yolov3_tf_bs1_fp16.cfg ./models
cp /home/mxVision-{version}/samples/mxVision/models/yolov3/yolov3.names ./models
cp /home/mxVision-{version}/samples/mxVision/models/resnet50/resnet50_aipp_tf.cfg ./models
cp /home/mxVision-{version}/samples/mxVision/models/resnet50/resnet50_clsidx_to_labels.names ./models
Download the OM files of the YOLOv3 and ResNet-50 models from the ModelZoo page in the Ascend community. Decompress the downloaded packages, locate the yolov3_tf_aipp.om and resnet50_tensorflow_{version}.om files in their om folders, and upload them to the models directory.
- Create the pipeline directory and the Sample.pipeline file.
mkdir pipeline
touch ./pipeline/Sample.pipeline
- Create the src directory and copy the CMakeLists.txt, main.cpp (application), and run.sh (compilation script) files from the mxVision installation directory to the src directory.
mkdir src
cp /home/mxVision-{version}/samples/mxVision/C++/CMakeLists.txt ./src
cp /home/mxVision-{version}/samples/mxVision/C++/main.cpp ./src
cp /home/mxVision-{version}/samples/mxVision/C++/run.sh ./src
After the preceding steps are complete, the following code directory is generated:
MyFirstApp
├── data
│   ├── dog1_1024_683.jpg               // Test image
├── models                              // Directory for storing the models
│   ├── resnet50_tensorflow_1.7.om      // OM model file
│   ├── resnet50_aipp_tf.cfg            // Model configuration file
│   ├── resnet50_clsidx_to_labels.names // File indicating the model output classes
│   ├── yolov3_tf_aipp.om               // OM model file
│   ├── yolov3_tf_bs1_fp16.cfg          // Model configuration file
│   ├── yolov3.names                    // File indicating the model output classes
├── pipeline                            // Directory for storing pipeline files
│   ├── Sample.pipeline                 // Pipeline file
├── src
│   ├── CMakeLists.txt                  // CMakeLists file
│   ├── main.cpp                        // Implementation file of the main function, for image classification
│   ├── run.sh                          // Compilation script
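The directory-creation steps above can be condensed into a single script. This is only a sketch: the cp commands that copy files from the mxVision installation and the model downloads are omitted, since they depend on your environment and version.

```shell
# Sketch: recreate the MyFirstApp directory skeleton from the steps above.
# Copying the .cfg/.names files and the downloaded OM models into ./models
# must still be done manually, as described in the preceding steps.
mkdir -p MyFirstApp/data MyFirstApp/models MyFirstApp/pipeline MyFirstApp/src
touch MyFirstApp/pipeline/Sample.pipeline
```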
Orchestrating the Pipeline File
Figure 3 shows each phase of the service process for object detection and image classification: the pipeline file is edited first, the mxVision plugin library is then invoked, and finally the two models run in sequence to provide the inference service. The pipeline file content in this section is orchestrated based on the service process shown in Figure 3.
Log in to the development environment, and run the cd /home/MyFirstApp/pipeline command to go to the pipeline directory. Run the vi Sample.pipeline command, and press I on the keyboard to enter the insert mode. The following is an example:
{
"classification+detection": { // classification indicates the name of the current service inference process.
"stream_config": {
"deviceId": "0" // deviceId indicates the ID of the processor to be used.
},
"appsrc0": { // appsrc0 indicates the name of the input element.
"props": {
"blocksize": "409600" // Size of data read by each buffer
},
"factory": "appsrc", // factory defines the element type.
"next": "mxpi_imagedecoder0" // Enter the connected downstream element. Here, an image decoding element is used.
},
"mxpi_imagedecoder0": { // Name of the image decoding element. The value 0 indicates the ID. If multiple image decoding elements are used in a process, name elements as 0, 1, 2, and so on.
"props": {
"handleMethod": "opencv" // Use OpenCV as the decoding method.
},
"factory": "mxpi_imagedecoder", // Use an image decoding plugin.
"next": "mxpi_imageresize0" // Enter the connected downstream element. Here, an image resizing element is used.
},
"mxpi_imageresize0": { // Name of the image resizing element
"props": {
"parentName": "mxpi_imagedecoder0", // Enter the connected upstream element. Here, an image decoding element is used.
"handleMethod": "opencv" // Use OpenCV as the decoding method.
"resizeHeight": "416", // Height after resizing
"resizeWidth": "416", // Width after resizing
"resizeType": "Resizer_Stretch" // The stretch mode is used for resizing.
},
"factory": "mxpi_imageresize", // Use an image resizing plugin.
"next": "mxpi_tensorinfer0" // Enter the connected downstream element. Here, a model inference element is used.
},
"mxpi_tensorinfer0": { // Name of the model inference element
"props": {
"dataSource": "mxpi_imageresize0", // Enter the connected upstream element. Here, an image resizing element is used.
"modelPath": "../models/yolov3_tf_aipp.om"// modelPath defines the model used for object detection. Change the file name based on the model obtained in 3.
"waitingTime": "2000", // Waiting time for batch combination allowed by the multi-batch model
"outputDeviceId": "-1" // Copy the memory to the specified location. If this parameter is set to -1, the memory is copied to the host.
},
"factory": "mxpi_tensorinfer", // Use a model inference plugin.
"next": "mxpi_objectpostprocessor0" // Enter the connected downstream element.
},
"mxpi_objectpostprocessor0": { // Name of the model postprocessing element
"props": {
"dataSource": "mxpi_tensorinfer0", // Enter the connected upstream element. Here, a model inference element is used.
"postProcessConfigPath": "../models/yolov3_tf_bs1_fp16.cfg",// postProcessConfigPath defines the model used for object detection. Change the file name based on the model obtained in 3.
"labelPath": "../models/yolov3.names",// labelPath specifies the class name file of the model output. Use the file obtained in 3.
"postProcessLibPath": "libyolov3postprocess.so" // postProcessLibPath specifies the dynamic library on which model postprocessing depends.
},
"factory": "mxpi_objectpostprocessor", // Use a model postprocessing plugin.
"next": "mxpi_imagecrop0" // Enter the connected downstream element. Here, an image cropping element is used.
},
"mxpi_imagecrop0": { // Name of the image cropping element
"props": {
"handleMethod": "opencv" // Use OpenCV as the decoding method.
},
"factory": "mxpi_imagecrop",// Use an image cropping plugin.
"next": "mxpi_imageresize1" // Enter the connected downstream element. Here, an image resizing element (No.1) is used.
},
"mxpi_imageresize1": { // Name of the image resizing element
"props": {
"handleMethod": "opencv" // Use OpenCV as the decoding method.
"resizeHeight": "280",// Height after resizing
"resizeWidth": "280",// Width after resizing
"resizeType": "Resizer_Stretch" // The stretch mode is used for resizing.
},
"factory": "mxpi_imageresize", // Use an image resizing plugin.
"next": "mxpi_opencvcentercrop1" // Enter the connected downstream element. Here, an image center cropping element is used.
},
"mxpi_opencvcentercrop1": { // Name of the image center cropping element
"props": {
"dataSource": "mxpi_imageresize1",// Enter the connected upstream element. Here, an image resizing element is used.
"cropHeight": "224",// Height of the cropped image
"cropWidth": "224" // Width of the cropped image
},
"factory": "mxpi_opencvcentercrop",// Use an image center cropping plugin.
"next": "mxpi_tensorinfer1" // Enter the connected downstream element. Here, a model inference element is used.
},
"mxpi_tensorinfer1": { // Name of the model inference element
"props": { // props indicates the element attribute, which can be used to load files in a specified directory.
"dataSource": "mxpi_opencvcentercrop1",// Enter the connected upstream element. Here, an image center cropping element is used.
"modelPath": "../models/resnet50_tensorflow_1.7.om",// modelPath defines the model used for the inference service. Change the file named based on the model obtained in 3.
"waitingTime": "2000", // Waiting time for batch combination allowed by the multi-batch model
"outputDeviceId": "-1" // Copy the memory to the specified location. If this parameter is set to -1, the memory is copied to the host.
},
"factory": "mxpi_tensorinfer", // Use a model inference plugin.
"next": "mxpi_classpostprocessor1" // Enter the connected downstream element. Here, a model postprocessing element is used.
},
"mxpi_classpostprocessor1": { // Name of the model postprocessing element
"props": { // props indicates the element attribute, which can be used to load files in a specified directory.
"dataSource": "mxpi_tensorinfer1",// Enter the connected upstream element. Here, a model inference element is used.
"postProcessConfigPath": "../models/resnet50_aipp_tf.cfg",// postProcessConfigPath specifies the model postprocessing configuration file. Use the file obtained in 3.
"labelPath": "../models/resnet50_clsidx_to_labels.names",// labelPath specifies the class name file of the model output. Use the file obtained in 3.
"postProcessLibPath": "libresnet50postprocess.so"// postProcessLibPath specifies the dynamic library on which model postprocessing depends.
},
"factory": "mxpi_classpostprocessor",// Use a model postprocessing plugin.
"next": "mxpi_dataserialize0" // Enter the connected downstream element. Here, a serialization element is used.
},
"mxpi_dataserialize0": { // Name of the serialization element
"props": {
"outputDataKeys": "mxpi_classpostprocessor1"// outputDataKeys specifies the index of the data to be output.
},
"factory": "mxpi_dataserialize",// Use a serialization plugin.
"next": "appsink0" // Enter the connected downstream element. Here, an output element is used.
},
"appsink0": { // Name of the output element
"props": {
"blocksize": "4096000" // Size of data read by each buffer
},
"factory": "appsink" // Use an output plugin.
}
}
}
After the editing is complete, press Esc to exit the insert mode and enter :wq! to save the file and exit.
The following is a supplement to the pipeline file to help you better understand how to compile the pipeline:
- next indicates the connection between elements.
- appsrc0 is used to send data to a stream, and appsink0 is used to obtain the inference result from the stream.
- mxpi_objectpostprocessor0 and mxpi_classpostprocessor1 are used to postprocess the output tensors of model inference.
- mxpi_dataserialize0 assembles the inference result into a JSON character string for output.
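Put together, a pipeline is simply a chain of named elements: data enters at appsrc0, each element names its successor in next, and results leave at appsink0. The following minimal skeleton illustrates this structure (the stream name minimal_stream and the single-element chain are illustrative only, not part of the sample):

```json
{
    "minimal_stream": {
        "stream_config": { "deviceId": "0" },
        "appsrc0": { "factory": "appsrc", "next": "mxpi_imagedecoder0" },
        "mxpi_imagedecoder0": { "factory": "mxpi_imagedecoder", "next": "appsink0" },
        "appsink0": { "factory": "appsink" }
    }
}
```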
Editing an Application
Open the main.cpp file copied to the src directory and adjust the paths and stream name as required. The main function is as follows:
int main(int argc, char* argv[])
{
    // 1. Parse the pipeline file.
    std::string pipelineConfigPath = "../pipeline/Sample.pipeline"; // Modify the pipeline file path.
    std::string pipelineConfig = ReadPipelineConfig(pipelineConfigPath);
    if (pipelineConfig == "") {
        LogError << "Read pipeline failed.";
        return APP_ERR_COMM_INIT_FAIL;
    }
    // 2. Initialize the stream manager.
    MxStream::MxStreamManager mxStreamManager;
    APP_ERROR ret = mxStreamManager.InitManager();
    if (ret != APP_ERR_OK) {
        LogError << GetError(ret) << "Failed to init Stream manager.";
        return ret;
    }
    // 3. Create a stream.
    ret = mxStreamManager.CreateMultipleStreams(pipelineConfig);
    if (ret != APP_ERR_OK) {
        LogError << GetError(ret) << "Failed to create Stream.";
        return ret;
    }
    // 4. Read the image to be inferred.
    MxStream::MxstDataInput dataBuffer;
    ret = ReadFile("../data/dog1_1024_683.jpg", dataBuffer); // Change the input image path.
    if (ret != APP_ERR_OK) {
        LogError << GetError(ret) << "Failed to read image file.";
        return ret;
    }
    std::string streamName = "classification+detection"; // Enter the stream name.
    int inPluginId = 0;
    // 5. Send the image to be inferred to the stream.
    ret = mxStreamManager.SendData(streamName, inPluginId, dataBuffer);
    if (ret != APP_ERR_OK) {
        LogError << GetError(ret) << "Failed to send data to stream.";
        delete dataBuffer.dataPtr;
        dataBuffer.dataPtr = nullptr;
        return ret;
    }
    // 6. Obtain the inference result.
    MxStream::MxstDataOutput* output = mxStreamManager.GetResult(streamName, inPluginId);
    if (output == nullptr) {
        LogError << "Failed to get pipeline output.";
        delete dataBuffer.dataPtr;
        dataBuffer.dataPtr = nullptr;
        return APP_ERR_COMM_FAILURE; // ret is still APP_ERR_OK here, so return an explicit error code.
    }
    std::string result = std::string((char *)output->dataPtr, output->dataSize);
    LogInfo << "Results:" << result;
    // 7. Destroy the streams and release resources.
    mxStreamManager.DestroyAllStreams();
    delete dataBuffer.dataPtr;
    dataBuffer.dataPtr = nullptr;
    delete output;
    return 0;
}
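The helpers ReadPipelineConfig and ReadFile used above are provided in the sample main.cpp. As a rough sketch of what ReadPipelineConfig does (an illustrative reimplementation, not the SDK source): it reads the whole pipeline file into a string and returns an empty string on failure, which is the condition checked in step 1.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Illustrative sketch of the ReadPipelineConfig helper: read the whole
// pipeline file into a string, returning "" on failure so that the
// caller can detect the error with a simple comparison.
std::string ReadPipelineConfig(const std::string &pipelineConfigPath)
{
    std::ifstream file(pipelineConfigPath, std::ios::binary);
    if (!file.is_open()) {
        return "";
    }
    std::ostringstream buffer;
    buffer << file.rdbuf(); // Copy the entire file content into the buffer.
    return buffer.str();
}
```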
Compiling and Running an Application
Log in as the root user, go to the MyFirstApp/src directory, and run the compilation script.
./run.sh
The following information is displayed, in which classId indicates the class ID, className indicates the class name, and confidence indicates the maximum confidence value of the class.
Results: {
"MxpiClass": [
{
"classId": 163,
"className": "beagle",
"confidence": 9.046875
}
]
}
The mapping between class labels and classes is related to the dataset used for model training. The model used in this example is trained based on the ImageNet dataset. You can view the mapping between dataset labels and classes on the Internet.
The mapping between class IDs and classes in the command output is as follows:
- "160" ["Rhodesian ridgeback"]
- "161" ["Afghan hound, Afghan"]
- "162" ["basset, basset hound"]
- "163" ["beagle"]
- "164" ["bloodhound, sleuthhound"]
- "165" ["bluetick"]
- "166" ["black-and-tan coonhound"]
