Application Scenarios and Solutions

Application Scenario

The Ascend virtual instance feature is suitable for scenarios in which multiple users run tasks concurrently and each task requires only a small amount of computing power. It does not support foundation model tasks, which require high computing power.

Virtualization

Table 1 lists the virtualization scenarios that are supported when the Ascend virtual instance is used on physical machines or virtual machines. This section describes these scenarios and the methods for partitioning Ascend devices into vNPUs.

You can partition vNPUs in either of the following ways:

Static virtualization: Use the npu-smi tool to create vNPUs manually. Static virtualization is supported on both physical machines and virtual machines.
Dynamic virtualization: Through software configuration, vNPUs are created automatically when a virtualization task request is received, the task is mounted, and the vNPUs are reclaimed after the task is complete.
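
As a rough illustration of static virtualization, the following sketch builds, without executing, the kind of npu-smi command line used to create and destroy a vNPU. The helper names are hypothetical. The `create-vnpu` and `destroy-vnpu` subcommands, the `-i`, `-c`, `-f`, and `-v` options, and the `vir04` template name reflect our understanding of npu-smi and should be treated as assumptions to verify against the npu-smi reference for your driver version and hardware.

```python
from typing import List


def create_vnpu_argv(card_id: int, chip_id: int, template: str) -> List[str]:
    """Build (but do not run) an npu-smi command line that creates one vNPU.

    Assumption: option names follow the npu-smi reference; the template
    (for example "vir04") must be one that the target hardware supports.
    """
    if card_id < 0 or chip_id < 0:
        raise ValueError("card_id and chip_id must be non-negative")
    if not template:
        raise ValueError("template must be a non-empty string")
    return ["npu-smi", "set", "-t", "create-vnpu",
            "-i", str(card_id), "-c", str(chip_id), "-f", template]


def destroy_vnpu_argv(card_id: int, chip_id: int, vnpu_id: int) -> List[str]:
    """Build (but do not run) an npu-smi command line that destroys one vNPU."""
    if min(card_id, chip_id, vnpu_id) < 0:
        raise ValueError("IDs must be non-negative")
    return ["npu-smi", "set", "-t", "destroy-vnpu",
            "-i", str(card_id), "-c", str(chip_id), "-v", str(vnpu_id)]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    print(" ".join(create_vnpu_argv(0, 0, "vir04")))
    print(" ".join(destroy_vnpu_argv(0, 0, 100)))
```

On a host where the driver and npu-smi are installed, the returned list could be passed to `subprocess.run(argv, check=True)`. Building the argument list separately keeps the command reviewable and avoids shell quoting issues.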

**Table 1** Application scenarios
| Scenarios Supported by Ascend Virtual Instance | Operation Process | Supported Ascend Hardware | Supported Virtualization Mode |
| --- | --- | --- | --- |
| Create vNPUs on physical machines and mount vNPUs to virtual machines. | Partition vNPUs on physical machines and mount vNPUs to virtual machines. | Atlas inference product: Atlas 300I Pro inference card, Atlas 300I Duo inference card, Atlas 300V video analysis card, Atlas 300V Pro video analysis card, Atlas 800 training server (model 9000), Atlas 800 training server (model 9010) | Static virtualization |
| Create vNPUs on physical machines and mount vNPUs to containers. | For details about how to create vNPUs on physical machines, see Creating vNPUs. For details about how to mount vNPUs to containers, see Mounting a vNPU. | Atlas inference product: Atlas 300I Pro inference card, Atlas 300V video analysis card, Atlas 300V Pro video analysis card, Atlas 300I Duo inference card, Atlas 200I SoC A1 core board | Static virtualization |
| | | Atlas inference product: Atlas 300I Pro inference card, Atlas 300V video analysis card, Atlas 300V Pro video analysis card, Atlas 200I SoC A1 core board | Dynamic virtualization: mounting with Ascend Docker Runtime; mounting with Kubernetes |
| | | Atlas training product: Atlas 300T training card (model 9000), Atlas 300T Pro training card (model 9000), Atlas 800 training server (model 9000), Atlas 800 training server (model 9010), Atlas 900 PoD (model 9000), Atlas 900T PoD Lite | Static virtualization |
| Create vNPUs on physical machines, mount vNPUs to virtual machines, and mount vNPUs to containers on virtual machines. | Partition vNPUs on physical machines and mount vNPUs to virtual machines. For details about how to mount vNPUs to containers on virtual machines, see Mounting a vNPU. | Atlas inference product: Atlas 300I Pro inference card, Atlas 300I Duo inference card, Atlas 300V video analysis card, Atlas 300V Pro video analysis card | Static virtualization |
| Directly pass through NPUs on physical machines to virtual machines, partition vNPUs on virtual machines, and mount vNPUs to containers on virtual machines. | Directly pass through NPUs on physical machines to virtual machines. For details about how to partition vNPUs on virtual machines, see Creating vNPUs. For details about how to mount vNPUs to containers on virtual machines, see Mounting a vNPU. | Atlas inference product: Atlas 300I Pro inference card, Atlas 300I Duo inference card, Atlas 300V video analysis card, Atlas 300V Pro video analysis card | Static virtualization |
| | | Atlas inference product: Atlas 300I Pro inference card, Atlas 300V video analysis card, Atlas 300V Pro video analysis card | Dynamic virtualization: mounting with Ascend Docker Runtime; mounting with Kubernetes |

Solutions for Mounting vNPUs to Containers

You can mount vNPUs to a container using either of the following methods:

Native Docker: Only static virtualization is supported, that is, the vNPUs are created in advance using the npu-smi tool. The vNPUs are mounted to a container when Docker starts the container.

Note: vNPUs cannot be mounted to a container that is started using native containerd.

MindCluster components:
- Ascend Docker Runtime: Ascend Docker Runtime (a container engine plugin) is used on its own. Both static and dynamic virtualization are supported. vNPUs are mounted to a container when the container is started through Ascend Docker Runtime.
- Kubernetes: Based on Ascend Device Plugin and Volcano, vNPUs are mounted to a container when Kubernetes starts the container. Both static and dynamic virtualization are supported.
  - Static virtualization: The vNPUs are created in advance using the npu-smi tool. When vNPU resources are needed, Ascend Device Plugin makes them available to upper-layer users through its device discovery, device allocation, and device health reporting functions. In this scenario, Volcano is optional.
  - Dynamic virtualization: Ascend Device Plugin reports the number of available AI Cores on the device where it is installed. When a virtualization task is submitted, Volcano schedules the task to a node that meets the task requirements. On receiving the request, Ascend Device Plugin on that node automatically partitions the vNPUs and mounts the task, which completes the dynamic virtualization process. You do not need to partition vNPUs in advance, and the vNPUs are automatically reclaimed after the task is complete. Dynamic virtualization therefore suits scenarios in which computing power requirements change continuously.
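
The dynamic virtualization flow described above (report free AI Cores, schedule the task to a node that fits, partition a vNPU, reclaim it after the task completes) can be modeled with a small toy simulation. This is a conceptual sketch only: the `Node` class, the `schedule` function, the first-fit policy, and the core counts are all hypothetical and do not represent the Ascend Device Plugin or Volcano APIs or their actual scheduling logic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Toy model of a node: a name, a total AI Core count, and live vNPUs."""
    name: str
    total_cores: int
    allocations: Dict[str, int] = field(default_factory=dict)  # task -> cores

    @property
    def free_cores(self) -> int:
        # What the device plugin would report as available AI Cores.
        return self.total_cores - sum(self.allocations.values())

    def split_vnpu(self, task: str, cores: int) -> None:
        """Carve out a vNPU with `cores` AI Cores for `task`."""
        if cores <= 0:
            raise ValueError("cores must be positive")
        if task in self.allocations:
            raise ValueError(f"task {task!r} already holds a vNPU")
        if cores > self.free_cores:
            raise RuntimeError(f"{self.name} has only {self.free_cores} free cores")
        self.allocations[task] = cores

    def reclaim(self, task: str) -> int:
        """Destroy the task's vNPU and return the number of cores freed."""
        return self.allocations.pop(task)


def schedule(nodes: List[Node], task: str, cores: int) -> Optional[Node]:
    """First-fit placement (an assumption): use the first node with room."""
    for node in nodes:
        if node.free_cores >= cores:
            node.split_vnpu(task, cores)
            return node
    return None  # No node currently meets the task requirements.


if __name__ == "__main__":
    cluster = [Node("node-a", 8), Node("node-b", 8)]  # illustrative core counts
    for task, cores in [("task-1", 4), ("task-2", 8), ("task-3", 4), ("task-4", 2)]:
        placed = schedule(cluster, task, cores)
        print(task, "->", placed.name if placed else "pending")
    cluster[0].reclaim("task-1")  # task-1 finishes; its cores become free again
    print("task-4 ->", schedule(cluster, "task-4", 2).name)
```

The model tracks free cores per node only. A real deployment partitions individual chips and, as far as we understand, is constrained to predefined vNPU specifications, so actual placement decisions can differ from this sketch.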
