Static vNPU Scheduling

Function Highlights

A training or inference job can be scheduled onto the vNPUs of a node. The static vNPU scheduling feature builds on the basic scheduling function of Kubernetes and works with Volcano or another scheduler to manage vNPU allocation and allocate resources to training and inference jobs.

Before You Start

Before using static vNPU scheduling, create the required vNPUs with the npu-smi tool. To use vNPU resources, mount the vNPUs to a container. To use computing power virtualization, first learn about the types, allocation rules, and allocation templates supported by Ascend AI processors. For details, see Virtual Instances.
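
As a rough illustration of how a job might request a vNPU that has already been created, the sketch below builds a Kubernetes Pod manifest as a plain Python dictionary. It is not taken from this document: the helper build_vnpu_pod, the image name, and in particular the resource name huawei.com/Ascend310P-2c are assumptions for illustration only. The actual resource name depends on the processor model and the allocation template, so check the allocatable resources reported on the node before using it.

```python
import json

# ASSUMPTION: example extended-resource name for a vNPU. The real name depends
# on the processor model and the allocation template used to create the vNPU.
VNPU_RESOURCE = "huawei.com/Ascend310P-2c"


def build_vnpu_pod(name, image, resource=VNPU_RESOURCE, scheduler="volcano"):
    """Return a Pod manifest (plain dict) that requests one vNPU.

    Hypothetical helper for illustration; it only assembles the manifest
    and does not contact a cluster.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Hand the Pod to Volcano instead of the default scheduler.
            "schedulerName": scheduler,
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        # Kubernetes requires requests and limits to match
                        # for extended resources when both are given.
                        "requests": {resource: 1},
                        "limits": {resource: 1},
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # The image name is a placeholder.
    pod = build_vnpu_pod("vnpu-demo", "example.com/infer:latest")
    print(json.dumps(pod, indent=2))
```

The printed JSON can be saved to a file and submitted with kubectl apply -f, since kubectl accepts JSON manifests as well as YAML. Keeping the resource name as a parameter makes it easy to switch to the name that the node actually advertises.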

Required Components

The following components need to be installed for training and inference jobs:
  • Volcano or other schedulers
  • Ascend Device Plugin
  • Ascend Docker Runtime
  • Ascend Operator
  • ClusterD
  • NodeD

Instructions

  1. Refer to Installation and Deployment for component installation.
  2. Refer to Full NPU Scheduling/Static vNPU Scheduling (Training) for feature usage.