Method 1: Mounting vNPUs Using Ascend Docker Runtime

This section describes how to use Ascend Docker Runtime, a container engine plugin, to mount vNPUs to a container.

Prerequisites

Obtain the Ascend-docker-runtime_{version}_linux-{arch}.run package and install it by referring to Ascend Docker Runtime.

Usage on Ascend Docker Runtime

Use either of the following methods:

  • Static virtualization: After creating a vNPU using the npu-smi tool, mount it to a container when starting the container. The following command mounts the vNPU whose ID is 100:
    docker run -it -e ASCEND_VISIBLE_DEVICES=100 -e ASCEND_RUNTIME_OPTIONS=VIRTUAL image-name:tag /bin/bash
  • Dynamic virtualization: When starting a container, run the following command to split four AI Cores from the physical device (ID 0) as a vNPU and mount it to the container. A virtual device created in this way is automatically destroyed when the container process ends.
    docker run -it --rm -e ASCEND_VISIBLE_DEVICES=0 -e ASCEND_VNPU_SPECS=vir04 image-name:tag /bin/bash
  NOTE:
  • To use dynamic virtualization, disable the vNPU restoration function.
  • You can query the available processor IDs as follows:
    • Physical processor ID:
      ls /dev/davinci*
    • Virtual processor ID:
      ls /dev/vdavinci*
  • image-name:tag: image name and tag, for example, ascend-tensorflow:tensorflow_TAG.
  • Do not redefine or hard-code environment variables such as ASCEND_VISIBLE_DEVICES, ASCEND_RUNTIME_OPTIONS, and ASCEND_VNPU_SPECS in the container image.
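The device-node queries above can be wrapped in a small helper. The following is a minimal sketch, assuming that physical and virtual processors appear as /dev/davinci<ID> and /dev/vdavinci<ID>, as the ls commands above suggest. list_npu_ids is a hypothetical helper name, and its optional second argument (a directory to scan instead of /dev) exists only so that the function can be tried on a host without NPUs.

```shell
#!/bin/sh
# Hypothetical helper, not part of Ascend Docker Runtime or npu-smi.
# Prints the numeric IDs of device nodes named <prefix><ID> in a directory.
#   $1: node name prefix, "davinci" (physical) or "vdavinci" (virtual)
#   $2: directory to scan (assumption for testing; defaults to /dev)
list_npu_ids() {
    prefix=$1
    dev_dir=${2:-/dev}
    for node in "$dev_dir/$prefix"*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$node" ] || continue
        # Strip the directory and the prefix, leaving the suffix.
        id=${node##*/"$prefix"}
        case $id in
            ''|*[!0-9]*) continue ;;   # keep purely numeric suffixes only
        esac
        printf '%s\n' "$id"
    done
}

list_npu_ids davinci     # physical processor IDs
list_npu_ids vdavinci    # virtual processor IDs
```

Filtering on a numeric suffix keeps entries that merely share the prefix (for example, a management node such as /dev/davinci_manager, if present) out of the ID list.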

Table 1 Parameter description

Parameter

Description

Example

ASCEND_VISIBLE_DEVICES

Specifies the NPU devices to be mounted to the container. This parameter is mandatory; without it, the NPU device fails to be mounted. When devices are specified by ID, you can specify a single ID, a comma-separated list, a range, or a combination of these. When devices are specified by NPU name, you can specify multiple names of the same NPU type.

Example:

  • Static virtualization:
    • ASCEND_VISIBLE_DEVICES=100 indicates that vNPU 100 is mounted to the container.
    • ASCEND_VISIBLE_DEVICES=101,103 indicates that vNPUs 101 and 103 are mounted to the container.
    • ASCEND_VISIBLE_DEVICES=100-102 indicates that vNPUs 100 to 102 (including vNPUs 100 and 102) are mounted to the container. The effect is the same as that of ASCEND_VISIBLE_DEVICES=100,101,102.
    • ASCEND_VISIBLE_DEVICES=100-102,104 indicates that vNPUs 100 to 102 and vNPU 104 are mounted to the container. The effect is the same as that of ASCEND_VISIBLE_DEVICES=100,101,102,104.
    • ASCEND_VISIBLE_DEVICES=AscendXXX-Y: XXX indicates the NPU model. The value can be 910, 310, or 310P. Y indicates the vNPU ID.
      • ASCEND_VISIBLE_DEVICES=Ascend910-101 indicates that vNPU 101 is mounted to the container.
      • ASCEND_VISIBLE_DEVICES=Ascend910-101,Ascend910-103 indicates that vNPUs 101 and 103 are mounted to the container.
      NOTE:
      • The NPU type must match the NPU in the actual environment. Otherwise, the mounting fails.
      • You cannot mix vNPU IDs and vNPU names in one value. For example, ASCEND_VISIBLE_DEVICES=100,Ascend910-101 is not supported.
      • For static virtualization, ASCEND_VISIBLE_DEVICES must be used together with ASCEND_RUNTIME_OPTIONS, whose value must contain VIRTUAL to indicate that a vNPU is mounted.
  • Dynamic virtualization:
    ASCEND_VISIBLE_DEVICES=0 indicates that a certain number of AI Cores are split from physical NPU 0.
    NOTE:
    • A dynamic virtualization command can specify only one physical NPU ID.
    • ASCEND_VISIBLE_DEVICES must be used together with ASCEND_VNPU_SPECS, which specifies the number of AI Cores to split from the specified NPU.
    • ASCEND_RUNTIME_OPTIONS is optional here, but its value can only be NODRV, which indicates that driver-related directories are not mounted.

ASCEND_RUNTIME_OPTIONS

Restricts how the processors specified by ASCEND_VISIBLE_DEVICES are mounted.

  • NODRV indicates that driver-related directories are not mounted.
  • VIRTUAL indicates that the virtual processor is mounted.
  • NODRV,VIRTUAL indicates that the virtual processor is mounted while driver-related directories are not mounted.

Example:

  • ASCEND_RUNTIME_OPTIONS=NODRV
  • ASCEND_RUNTIME_OPTIONS=VIRTUAL
  • ASCEND_RUNTIME_OPTIONS=NODRV,VIRTUAL

ASCEND_VNPU_SPECS

Splits a certain number of AI Cores from a physical NPU device as virtual devices. The value can be vir01, vir02, vir02_1c, vir04, vir04_3c, vir08, vir16, vir04_4c_dvpp, or vir04_3c_ndvpp.

  • Processors of Atlas training products (30 or 32 AI Cores) support vir02, vir04, vir08, and vir16.
  • Processors of Atlas inference products support vir01, vir02, vir02_1c, vir04, vir04_3c, vir04_4c_dvpp, and vir04_3c_ndvpp.

This parameter must be used together with ASCEND_VISIBLE_DEVICES, which specifies the physical NPU device to be virtualized.

Example: ASCEND_VNPU_SPECS=vir04 indicates that four AI Cores are split from the physical NPU as a vNPU that is mounted to the container.
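
The value rules in Table 1 can be sketched as a pre-flight check that runs before docker run. This is an illustration under stated assumptions, not the parsing logic of Ascend Docker Runtime: expand_device_ids and check_vnpu_env are hypothetical helper names, expand_device_ids handles only the numeric ID form (not AscendXXX-Y names), and the checks cover only the rules stated in this section.

```shell
#!/bin/sh
# Hypothetical helpers; Ascend Docker Runtime performs its own validation.

# Expand a numeric ID list with ranges, e.g. "100-102,104" -> "100,101,102,104".
# The body runs in a subshell so that the IFS change does not leak out.
expand_device_ids() (
    out=
    IFS=,
    for part in $1; do
        case $part in
            *-*) first=${part%-*}; last=${part#*-} ;;   # range "a-b", inclusive
            *)   first=$part; last=$part ;;             # single ID
        esac
        i=$first
        while [ "$i" -le "$last" ]; do
            out=${out:+$out,}$i
            i=$((i + 1))
        done
    done
    printf '%s\n' "$out"
)

# Check one combination of values against the rules in Table 1.
#   $1: mode, "static" or "dynamic"
#   $2: ASCEND_VISIBLE_DEVICES, $3: ASCEND_RUNTIME_OPTIONS, $4: ASCEND_VNPU_SPECS
# Prints "ok", or a reason and returns non-zero.
check_vnpu_env() {
    mode=$1
    devices=$2
    options=$3
    specs=$4
    case $mode in
        static)
            # Static virtualization: the options must contain VIRTUAL.
            case ",$options," in
                *,VIRTUAL,*) ;;
                *) echo "ASCEND_RUNTIME_OPTIONS must contain VIRTUAL"; return 1 ;;
            esac
            ;;
        dynamic)
            # Dynamic virtualization: exactly one physical NPU ID.
            case $devices in
                ''|*[!0-9]*) echo "specify exactly one physical NPU ID"; return 1 ;;
            esac
            [ -n "$specs" ] || { echo "ASCEND_VNPU_SPECS is required"; return 1; }
            # The only runtime option allowed here is NODRV.
            case $options in
                ''|NODRV) ;;
                *) echo "ASCEND_RUNTIME_OPTIONS can only be NODRV"; return 1 ;;
            esac
            ;;
        *) echo "unknown mode: $mode"; return 1 ;;
    esac
    echo ok
}

expand_device_ids 100-102,104                 # prints 100,101,102,104
check_vnpu_env static 100 VIRTUAL ""          # prints ok
check_vnpu_env dynamic 0 "" vir04             # prints ok
```

Running such a check before docker run surfaces a missing VIRTUAL option or a missing ASCEND_VNPU_SPECS value as a readable message instead of a failed mount.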