Description
The virtual instance function virtualizes an NPU on a physical machine or virtual machine into several virtual NPUs (vNPUs) and mounts the vNPUs to containers. Virtualization management allocates and reclaims resources of different specifications in a unified manner, which suits scenarios where users allocate and release resources frequently.
Because several users can share the NPU resources of one server, NPU computing power becomes more accessible and affordable. Containers isolate each user's resources, which keeps the operating environment stable and secure, and unified allocation and reclamation makes it easier to manage multiple tenants.
For more details, see Virtual Instance.
Applicable Products
| Product Portfolio | Applicable Scenario | Virtualization Mode | Supported or Not |
|---|---|---|---|
| Atlas inference product | Divide vNPUs on physical machines and mount vNPUs to containers. | Static virtualization | Yes |
| Atlas inference product | Divide vNPUs on physical machines and mount vNPUs to containers. | Dynamic virtualization | Yes |
| Atlas inference product | Divide vNPUs on physical machines and mount vNPUs to virtual machines. | Static virtualization | Yes |
| | Divide vNPUs on physical machines, mount vNPUs to virtual machines, and mount vNPUs to containers on virtual machines. | Static virtualization | Yes |
| | Directly pass through NPUs to virtual machines from physical machines, divide vNPUs on virtual machines, and mount vNPUs to containers on virtual machines. | Static virtualization | Yes |
| Atlas inference product | Directly pass through NPUs to virtual machines from physical machines, divide vNPUs on virtual machines, and mount vNPUs to containers on virtual machines. | Dynamic virtualization | Yes |
| Atlas 800 training server | Divide vNPUs on physical machines and mount vNPUs to virtual machines. | Static virtualization | Yes |
| Atlas training product | Divide vNPUs on physical machines and mount vNPUs to containers. | Static virtualization | Yes |
| | - | - | No |
| | - | - | No |
| | - | - | No |
| | - | - | No |
| | - | - | No |
Instructions
- To use dynamic virtualization, go directly to Dynamic Virtualization. You do not need to run the npu-smi command to create vNPUs in advance.
- To use static virtualization, first create vNPUs as described in Creating vNPUs, and then mount the created vNPUs to containers.
- After a physical NPU of an Atlas inference product is virtualized into vNPUs, model performance may deteriorate when the vNPUs are used for inference. If it does, you are advised to allocate vNPUs using the template combination vir04 + vir04_3c or vir04 + vir02 + vir02_1c. For details about the hardware resources of each template, see "Virtualization Template" in Virtualization Rules.
- When using vNPUs to train a model, you can use the AOE tuning tool to further tune the model performance. For details, see CANN AOE Tuning Tool User Guide.
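The recommended template combinations above can be checked mechanically before vNPUs are created. The following is a minimal sketch, not part of any Ascend tooling: the function name `is_recommended_split` is hypothetical, the template names are taken from the text above, and the code only compares names without querying hardware.

```python
from collections import Counter

# Template combinations recommended in the text above for the case where
# inference performance deteriorates on vNPUs. Names are copied from the
# document; this sketch does not query or configure any real NPU.
RECOMMENDED_SPLITS = (
    Counter(["vir04", "vir04_3c"]),
    Counter(["vir04", "vir02", "vir02_1c"]),
)


def is_recommended_split(templates):
    """Return True if the templates match a recommended combination, in any order."""
    return Counter(templates) in RECOMMENDED_SPLITS


print(is_recommended_split(["vir04_3c", "vir04"]))  # order does not matter
print(is_recommended_split(["vir02", "vir02"]))     # not a recommended combination
```

Using a multiset comparison means that a plan listing the same template twice is not mistaken for a combination that lists it once.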
Restrictions
- After a physical NPU is virtualized into vNPUs, the physical NPU itself can no longer be mounted to a container or passed through directly to VMs.
- A vNPU can be used by only one job container. Multiple job containers cannot use the same vNPU.
- The two NPUs on an Atlas 300I Duo inference card must work in the same mode: either both use the virtual instance function, or both are used as whole NPUs. Configure the working mode according to your actual needs.
- Virtualization templates are used to allocate all NPU resources on a server. Standard cards of different specifications cannot be used together. For example, the Atlas 300V Pro video analysis card comes in 24 GB and 48 GB memory specifications, but cards with different memory specifications cannot be virtualized together. Likewise, Atlas training products with 30 AI Cores and Atlas training products with 32 AI Cores cannot be used together.
- On Atlas training products, the virtualization function is supported only when the NPUs work in AMP mode. It is not supported in SMP mode. To query and set the NPU working mode, make sure that the server OS is powered off, and then perform the following steps:
  1. Log in to the iBMC CLI.
  2. Run the ipmcget -d npuworkmode command to query the working mode of the NPUs. If the working mode is already AMP, you do not need to change it.
  3. Otherwise, run the ipmcset -d npuworkmode -v 0 command to set the NPU working mode to AMP.
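The decision in the steps above can be sketched as a small helper. This is an illustration only: the function name `commands_to_reach_amp` is hypothetical, the two command strings are the ones quoted in the steps, and the helper neither logs in to the iBMC nor parses its output (the output format is not described here, so the operator passes in the mode that was read).

```python
# Commands quoted in the procedure above; they are run on the iBMC CLI.
QUERY_COMMAND = "ipmcget -d npuworkmode"
SET_AMP_COMMAND = "ipmcset -d npuworkmode -v 0"


def commands_to_reach_amp(current_mode=None):
    """List the iBMC commands still to run, given the mode the operator observed.

    current_mode is None when the mode has not been queried yet. Otherwise it
    is the mode name the operator read from the query result, e.g. "AMP" or "SMP".
    """
    if current_mode is None:
        return [QUERY_COMMAND]      # nothing known yet: query first
    if current_mode.strip().upper() == "AMP":
        return []                   # already AMP: no switch needed
    return [SET_AMP_COMMAND]        # any other mode: switch to AMP


print(commands_to_reach_amp())
print(commands_to_reach_amp("AMP"))
print(commands_to_reach_amp("SMP"))
```

Returning the commands instead of executing them keeps the sketch safe to run anywhere and leaves the actual change to the operator, who must first confirm that the server OS is powered off.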
For details about how to query and set the NPU working mode, see "iBMC CLI > Server Commands > Querying and Setting the NPU processor Working Mode (npuworkmode)" in Atlas 800 Training Server iBMC User Guide (Model 9000).
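Two of the restrictions above lend themselves to a simple pre-deployment check: a vNPU may be used by only one job container, and both NPUs on an Atlas 300I Duo card must work in the same mode. The sketch below assumes a hypothetical planning format (pairs of container name and vNPU identifier, and one boolean per NPU); the function names and identifiers are illustrative and do not correspond to any Ascend API.

```python
from collections import defaultdict


def find_shared_vnpus(assignments):
    """Return vNPUs claimed by more than one job container.

    assignments is an iterable of (container_name, vnpu_id) pairs taken from a
    deployment plan. The result maps each offending vnpu_id to the sorted list
    of containers that claim it; an empty dict means the plan is valid.
    """
    users = defaultdict(set)
    for container, vnpu in assignments:
        users[vnpu].add(container)
    return {vnpu: sorted(names) for vnpu, names in users.items() if len(names) > 1}


def duo_modes_consistent(npu0_virtualized, npu1_virtualized):
    """Return True if both NPUs of a Duo card are virtualized or both are not."""
    return npu0_virtualized == npu1_virtualized


# Hypothetical plan: job-b and job-c both claim the same vNPU.
plan = [("job-a", "vnpu-0"), ("job-b", "vnpu-1"), ("job-c", "vnpu-1")]
print(find_shared_vnpus(plan))
print(duo_modes_consistent(True, False))
```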