Creating an Inference Image

Based on whether a host directory is mounted to the container, the container can be deployed in either of the following modes. Select a mode as required; a minimal deployment sketch for both modes follows the notes below.
  • Mount a directory on the host to the container. In this mode, the KubeEdge native container mounting capabilities (such as volume mounting and capability sets) must be enabled, which poses container security risks. For details, see "Tool Usage" in the container_capability_tool document.
  • Install the driver in the image and do not mount the driver directory during deployment. This mode avoids the risk of container escape, but the driver in the image may not match the host driver. For details, see Inference Container Upgrade Constraints.
    Table 1 Image creation guide in edge scenarios

    | Product | Directory Not Mounted | Directory Mounted |
    | --- | --- | --- |
    | Atlas 200 AI accelerator module (RC) | N/A | Create an inference image by referring to "Software Installation > Deploying Containerized Applications" in the Atlas 200 AI Accelerator Module Software Installation and Maintenance Guide (RC). |
    | Atlas 500 AI edge station (model 3000) | Typical Service Scenarios | Create an inference image by referring to "Appendix > Creating and Starting a Container Image" in the Atlas 500 AI Edge Station (Model 3000). |
    | Atlas 500 Pro AI edge server (model 3000) | Typical Service Scenarios | Create an inference image by referring to "Appendix > Creating and Starting a Container Image" in the Atlas 500 AI Edge Station (Model 3000). See the note below. |
    NOTE:
    The container image creation method differs between the Atlas 500 Pro AI edge server (model 3000) and the Atlas 500 AI edge station (model 3000) because their driver installation paths differ. Configure the LD_LIBRARY_PATH environment variable in the Dockerfile based on the actual driver installation path, that is, replace /home/data/miniD/driver/lib64 with the actual driver installation path.

  • This section applies only to the Atlas 500 AI edge station (model 3000) and the Atlas 500 Pro AI edge server (model 3000). The Atlas 500 AI edge station (model 3000) is used as an example to describe how to create and deploy a container image.
  • For the Atlas 500 Pro AI edge server (model 3000), install the driver of the corresponding NPU device first, and then follow the procedure described in this section.
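
A minimal sketch of the two deployment modes, using plain docker run commands (the KubeEdge volume configuration is analogous). The image name infer-app:v1, the driver path, and the device nodes /dev/davinci0 and /dev/davinci_manager are illustrative assumptions; confirm the actual values on your device.

  # Mode 1: mount the host driver directory into the container
  # (requires the mounting capabilities and carries the security risks noted above).
  docker run -it \
      --device=/dev/davinci0 \
      --device=/dev/davinci_manager \
      -v /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro \
      infer-app:v1

  # Mode 2: the driver is packaged in the image (the approach in this section);
  # no host directory is mounted.
  docker run -it \
      --device=/dev/davinci0 \
      --device=/dev/davinci_manager \
      infer-app:v1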

Typical Service Scenarios

AI applications require the NPU computing power of edge devices. The following uses the AtlasEdge platform as an example.

The image must have the following characteristics:
  • No host path is mounted to the container.
  • The container runs as the HwHiAiUser user.
  • Deploying the container requires no privileged mode, capability sets, or host network (a verification sketch follows this list).
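
After the image is built, you can check that the container's default user takes effect. The image name infer-app:v1 and the numeric IDs below are illustrative assumptions:

  # Run the id command in the container; note that no --privileged,
  # --cap-add, or --network host option is required.
  docker run --rm infer-app:v1 id
  # Expected output is similar to:
  # uid=1000(HwHiAiUser) gid=1000(HwHiAiUser) groups=1000(HwHiAiUser)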

Prerequisites

  • Prepare the container OS image by yourself.
  • Prepare the offline inference engine package and service inference program package according to the following table.
Table 2 Required software

| Software Package | Description | How to Obtain |
| --- | --- | --- |
| Ascend-cann-nnrt_{version}_linux-aarch64.run | Offline inference engine package. {version} indicates the software package version. | Link |
| NPU driver | Driver package of the NPU device required in the operating environment. | The software package of the Atlas 500 AI edge station (model 3000) contains the NPU driver, so you do not need to install it again. For the Atlas 500 Pro AI edge server (model 3000), install Ascend HDK 22.0.RC5 or later by referring to the Ascend 310P NPU Driver and Firmware Installation Guide (AI Accelerator Card). |
| Dockerfile | Required for creating an image. | Prepared by users. |
| Service inference program package | A set of service inference programs, including model files, inference code, and configuration files. The package can be in .tar or .tar.gz format. Modify the commands in install.sh for installing the service inference programs as required. NOTE: The running user in the container must have the required permissions on the package; the model file is updated by updating the container; configuration files can be stored directly in the image or delivered through a ConfigMap. | Prepared by users. Upload the file through the Atlas IES. The file size (including the decompressed files) cannot exceed 512 MB. |
| install.sh | Installation script of the service inference programs. | - |
| run.sh | Running script of the service inference programs. | - |

Procedure

  1. Upload the software packages and scripts to the same directory (for example, /home/test).
    • Ascend-cann-nnrt_{version}_linux-aarch64.run
    • NPU driver
      Before building an image, copy the driver directory in the NPU driver installation path to the current directory. For the Atlas 500 AI edge station (model 3000), the NPU driver installation path is /home/data/miniD/driver. For the Atlas 500 Pro AI edge server (model 3000), use the actual NPU driver installation path.
      • To build an image on the Atlas 500 AI edge station (model 3000), run the cp -r /home/data/miniD/driver/ /home/test command.
      • To build an image on the Atlas 500 Pro AI edge server (model 3000), run the cp -r actual_NPU_installation_path/Ascend/driver/ /home/test command.
    • Service inference program package
    • Dockerfile, install.sh, and run.sh (the resulting directory layout is sketched after this list)
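    After this step, the contents of /home/test should look similar to the following (DIST.tar is an illustrative package name):

      /home/test
      ├── Ascend-cann-nnrt_{version}_linux-aarch64.run
      ├── driver/        # copied from the NPU driver installation path
      ├── DIST.tar       # service inference program package
      ├── Dockerfile
      ├── install.sh
      └── run.sh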
  2. Perform the following steps to create a Dockerfile:
    1. Log in as the root user and run the id HwHiAiUser command to query and record the UID and GID of the HwHiAiUser user in the build environment. (These values replace gid and uid in the Dockerfile example below; see the sample output that follows.)
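      The command output is similar to the following; the numeric values are examples and vary by environment:

        id HwHiAiUser
        uid=1000(HwHiAiUser) gid=1000(HwHiAiUser) groups=1000(HwHiAiUser)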
    2. Go to the software package upload directory in Step 1 and run the following command to create a Dockerfile:
      vi Dockerfile
    3. Enter the following content and enter :wq to save the file and exit. (The Ubuntu Arm OS is used as an example, and the content is only a template; adapt it to your services as required.)
      # Change the OS and version number as required.
      FROM ubuntu:18.04
      
      # Build Arguments
      ARG NNRT_PKG
      ARG DIST_PKG
      ARG ASCEND_BASE=/usr/local/Ascend
      
      # Set environment variables.
      ENV LD_LIBRARY_PATH=\
      $LD_LIBRARY_PATH:\
      $ASCEND_BASE/driver/lib64:\
      $ASCEND_BASE/nnrt/latest/acllib/lib64
      
      ENV ASCEND_AICPU_PATH=$ASCEND_BASE/nnrt/latest
      
      # Add the HwHiAiUser user for running the service container.
      # Replace gid and uid with the values recorded earlier.
      RUN umask 0022 && \
          groupadd -g gid HwHiAiUser && \
          useradd -u uid -g HwHiAiUser -m -d /home/HwHiAiUser HwHiAiUser
      
      # Set the directory for accessing the started container and the value of WORKDIR based on services. The following uses /home/HwHiAiUser/app as an example.
      RUN mkdir -p /home/HwHiAiUser/app
      WORKDIR /home/HwHiAiUser/app
      COPY $NNRT_PKG .
      COPY $DIST_PKG .
      COPY install.sh .
      
      # Install the NNRT.
      RUN bash ${NNRT_PKG} --quiet --install --install-for-all && \
      rm ${NNRT_PKG}
      
      # Install Driver
      # For the Atlas 500 AI edge station (model 3000) and Atlas 500 Pro AI edge server (model 3000), copy the driver directory in the NPU's driver installation path to the current directory in advance.
      # For the Atlas 500 AI edge station (model 3000), the NPU driver installation path is /home/data/miniD/driver. For the Atlas 500 Pro AI edge server (model 3000), use the actual NPU driver installation path.
      COPY driver/ $ASCEND_BASE/driver/
      
      # Install the dist package and run the service installation script.
      RUN bash install.sh && \
          rm $DIST_PKG && \
          rm install.sh
      
      COPY run.sh /home/HwHiAiUser/app/run.sh
      RUN chmod +x /home/HwHiAiUser/app/run.sh

      RUN chown -R HwHiAiUser:HwHiAiUser /home/HwHiAiUser/app/
      RUN chown -R HwHiAiUser:HwHiAiUser $ASCEND_BASE/nnrt
      RUN chown -R HwHiAiUser:HwHiAiUser $ASCEND_BASE/driver
      USER HwHiAiUser:HwHiAiUser

      # Execute run.sh when the container starts.
      CMD bash /home/HwHiAiUser/app/run.sh
    4. After creating the Dockerfile, run the following command to change the permission on the Dockerfile:
      chmod 600 Dockerfile
    5. Modify the install.sh and run.sh scripts based on service requirements, and prepare them in the same way as the Dockerfile.
    6. Example install.sh:
      #!/bin/bash
      # Access the container working directory.
      cd /home/HwHiAiUser/app
      # Decompress the service inference program package based on its format.
      # DIST indicates the name of the package prepared by the user.
      # For a .tar.gz package, use tar -xzvf DIST.tar.gz instead.
      tar -xvf DIST.tar
      Example run.sh:
      #!/bin/bash
      # Go to the directory where the executable file of the service inference programs is located. Use the actual path.
      cd /home/HwHiAiUser/app/FaceRecognition_b098/src/dist/HostCPU
      # Run the executable file. Replace the name with the actual executable.
      ./AclFaceRecognitionHostCPU
  3. Go to the directory where the software packages are stored and run the following command to create a container image:
    docker build -t image-name:tag --build-arg NNRT_PKG=nnrt-name --build-arg DIST_PKG=distpackage-name .

    If the message "Successfully built xxx" is displayed, the image is successfully built. For details about the command, see Table 3.

    Do not omit the period (.) at the end of the command; it specifies the build context (the current directory).

    Table 3 Command parameter description

    | Parameter | Description |
    | --- | --- |
    | image-name:tag | Specifies the image name and tag. Set this parameter as required. |
    | NNRT_PKG | nnrt-name specifies the name of the offline inference engine package, including the file name extension. Replace it with the actual name. |
    | DIST_PKG | distpackage-name specifies the name of the compressed service inference program package, including the file name extension. Replace it with the actual name. |
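
    For example, with illustrative names (replace them with the actual image name and package names):

      docker build -t infer-app:v1 \
          --build-arg NNRT_PKG=Ascend-cann-nnrt_{version}_linux-aarch64.run \
          --build-arg DIST_PKG=DIST.tar .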

  4. Run the following command to create a container image software package:
    docker save image-name:tag | gzip -c > image-name.tar.gz
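
    The package can then be transferred to the edge device or uploaded through the management platform. On a host running Docker, it can be loaded directly (docker load accepts gzip-compressed archives):

      docker load -i image-name.tar.gz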