Starting a Container in Manual Mounting Mode

This section describes how to start a container by manually mounting the host driver and firmware directories into it.

Prerequisites

  • The driver and firmware have been installed on the host. For details, see NPU Driver and Firmware Installation.
  • Docker has been installed on the host. If the hardware product is A800-9000/A800-9010/A300T-9000/Atlas 300T Pro and the host OS is Ubuntu 22.04, the Docker version must be 21.10 or later.
  • A device in the operating environment can be used by only one container. The device can be used by other containers only after the container that uses it exits.
  • Restart the Docker service so that configuration changes take effect:
    systemctl daemon-reload
    systemctl restart docker
    
  • After obtaining the required OS image, run the following command to view the image:
    docker images
    

Procedure

  1. Run the id HwHiAiUser command on the host to check the gid of HwHiAiUser on the host. In the example shown in Figure 1, the gid is 1001.

    If you specified a non-root user with --install-username=username --install-usergroup=usergroup when installing the driver package, note down that user's gid in the same way. The gid is required when creating the non-root user in the container: the non-root users that start the related processes on the host and in the container must belong to the same group.

    Figure 1 Checking the gid of HwHiAiUser on the host
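
    If you prefer to capture the gid in a script, it can be parsed out of the id output. A minimal sketch (the id output below is simulated with the example values from Figure 1; on a real host, capture it with id_output=$(id HwHiAiUser)):

    ```shell
    # Simulated `id HwHiAiUser` output, using the example values from Figure 1.
    id_output='uid=1001(HwHiAiUser) gid=1001(HwHiAiUser) groups=1001(HwHiAiUser)'

    # Extract the numeric gid from the gid=NNNN(...) field.
    gid=$(echo "$id_output" | sed -n 's/.*gid=\([0-9]*\).*/\1/p')
    echo "$gid"
    # → 1001
    ```

    The extracted value is what you substitute for gid in the groupadd command in Step 3.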
  2. Create and start a Docker container on the host. You can run the following commands to start the container. Modify the mounting information based on the product type and actual paths.
    docker run -it \
    --ipc=host \
    --device=/dev/davinci0 \
    --device=/dev/davinci_manager \
    --device=/dev/devmm_svm \
    --device=/dev/hisi_hdc \
    -v /usr/local/dcmi:/usr/local/dcmi:ro \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi:ro \
    -v /usr/local/Ascend/driver/lib64/common:/usr/local/Ascend/driver/lib64/common:ro \
    -v /usr/local/Ascend/driver/lib64/driver:/usr/local/Ascend/driver/lib64/driver:ro \
    -v /etc/ascend_install.info:/etc/ascend_install.info:ro \
    -v /etc/vnpu.cfg:/etc/vnpu.cfg:ro \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info:ro \
    docker_image_id /bin/bash
    
    • --ipc=host: configures shared memory with the host. Without this parameter, the shared memory in the container may be insufficient.
    • docker_image_id indicates the image ID. Replace it as required. You can run the docker images command to view the image information.

    The preceding mount directories are only examples. For details about the directories to be mounted for each product model, see Operation Guide.

    • In the multi-container scenario, to prevent malicious code in one container from allocating a large amount of kernel memory or even exhausting it, set --kernel-memory as required to cap the kernel memory a container can occupy; otherwise, other containers may be affected. This requires kmem accounting to be enabled in the host kernel.
    • In training scenarios, if the host OS is CentOS or BC-Linux, the maximum number of threads in Docker is 4092, which cannot meet the training requirements. In this case, add the --pids-limit 409600 parameter to configure the maximum number of Docker threads in CentOS/BC-Linux during container startup.
    After the container is started, run the following command to check the available Da Vinci devices in the Docker container:
    ls /dev/ | grep davinci
    

    For details about the command execution, see Figure 2. davinci_manager is the character device node of the management module, and davinci0 is the Da Vinci device used by the container.

    Figure 2 Querying available Da Vinci devices in the Docker container
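
    The grep filter above can be illustrated with a quick, self-contained sketch (a temporary directory stands in for /dev here, since the actual device nodes depend on the machine):

    ```shell
    # Simulate a /dev directory containing one NPU plus the manager node.
    devdir=$(mktemp -d)
    touch "$devdir/davinci0" "$devdir/davinci_manager" "$devdir/null" "$devdir/tty0"

    # The same filter used in the container: only Da Vinci nodes are listed.
    matches=$(ls "$devdir" | grep davinci)
    echo "$matches"

    # Clean up the simulation.
    rm -rf "$devdir"
    ```

    On a real container, the unfiltered /dev listing would also show standard nodes such as null and tty0; the filter keeps only the NPU-related entries.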
  3. Create the HwHiAiUser user in the container to start related processes. If you create a non-root user, ensure that its primary group is the same as that of the user who runs the driver; if it is not, add the user to the driver running user's group. Run the following command, replacing gid with the value obtained in Step 1. If ok is returned, the user was created successfully.
    groupadd -g gid HwHiAiUser && useradd -g HwHiAiUser -d /home/HwHiAiUser -m HwHiAiUser && echo ok
    
  4. Set the following environment variables to load the driver .so files in the container:
    export LD_LIBRARY_PATH=/usr/local/Ascend/driver/lib64/common:/usr/local/Ascend/driver/lib64/driver:${LD_LIBRARY_PATH}
    

    In the preceding command, /usr/local/Ascend is the default installation path; change it to the actual path. Environment variables set with the export command take effect immediately but are valid only in the current shell session.
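
    Because the export is session-local, a common approach is to append it to a shell startup file so it survives new sessions. A minimal sketch (written to a temporary file here for illustration; in the container you would append to ~/.bashrc instead):

    ```shell
    # Target startup file; in the container this would be ~/.bashrc.
    rcfile=$(mktemp)

    # Append the driver library paths so every new shell picks them up.
    echo 'export LD_LIBRARY_PATH=/usr/local/Ascend/driver/lib64/common:/usr/local/Ascend/driver/lib64/driver:${LD_LIBRARY_PATH}' >> "$rcfile"

    # Confirm the line landed in the file.
    grep -c 'Ascend/driver/lib64' "$rcfile"
    # → 1
    ```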

  5. Run the exit command to exit the container. In the path of the CANN package on the host, run the following command to copy the package to the container:
    docker cp /home/HwHiAiUser/Ascend-cann-nnrt_<version>_linux-<arch>.run container_id:/home/HwHiAiUser/software
    

    Change the paths as required.

    • /home/HwHiAiUser/ is the path for storing the software package on the host.
    • Replace Ascend-cann-nnrt_<version>_linux-<arch>.run with the actual CANN package name.
    • container_id indicates the container ID. You can run the docker ps -a command to view the container ID.
    • /home/HwHiAiUser/software is the path for storing the software package in the container. If the path does not exist, manually create it.
  6. Run the following commands to access the container again:
    docker start container_id 
    docker attach container_id
    
  7. Go to the directory where the CANN package is stored and install the required CANN software according to the host installation mode (Installing Dependencies or Installing CANN Packages).

Operation Guide

Mount the host directory to the container based on the product model. For details, see Table 1.

Table 1 Operation instructions

Product Model                  | Operation Guide
-------------------------------|----------------------------------------------------------------
A800-9000, A800-9010           | "Installation and Uninstallation in a Container" in Atlas Center Training Server 24.1.0 NPU Driver and Firmware Installation Guide
A300T-9000, Atlas 300T Pro     | "Installation and Uninstallation in a Container" in Atlas Center Training Card 24.1.0 NPU Driver and Firmware Installation Guide
A300-3000                      | "Installation and Uninstallation in a Container" in Atlas 300I Inference Card 24.1.0 NPU Driver and Firmware Installation Guide (Model 3000)
A300-3010                      | "Installation and Uninstallation in a Container" in Atlas 300I Inference Card 24.1.0 NPU Driver and Firmware Installation Guide (Model 3010)