Full NPU Scheduling

Function Highlights

You can schedule a training or inference job onto whole NPUs of a node so that the job occupies those NPUs exclusively. This feature builds on the basic scheduling function of Kubernetes and works with Volcano or another scheduler to select suitable NPUs based on their physical topology. This maximizes NPU performance for the job and lets the remaining resources be allocated efficiently.
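
As an illustration only, the following Python sketch builds a Pod manifest that requests whole NPUs and hands the Pod to Volcano. The resource name huawei.com/Ascend910, the helper build_full_npu_pod, and the image name are assumptions made for this example; check the resource name that Ascend Device Plugin reports on your nodes and the usage guide referenced under Instructions before relying on it.

```python
import json

# Assumption: the extended resource name advertised by Ascend Device Plugin.
# The actual name depends on the processor model installed in the cluster.
NPU_RESOURCE = "huawei.com/Ascend910"


def build_full_npu_pod(name: str, image: str, npu_count: int) -> dict:
    """Return a Pod manifest (hypothetical helper) requesting whole NPUs."""
    # Kubernetes extended resources accept whole numbers only, which matches
    # the exclusive, whole-NPU semantics described above.
    if isinstance(npu_count, bool) or not isinstance(npu_count, int) or npu_count < 1:
        raise ValueError("request a positive whole number of NPUs")
    npus = {NPU_RESOURCE: str(npu_count)}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Hand the Pod to Volcano instead of the default scheduler.
            "schedulerName": "volcano",
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "worker",
                    "image": image,  # hypothetical image name
                    # For extended resources, requests must equal limits.
                    "resources": {"requests": dict(npus), "limits": dict(npus)},
                }
            ],
        },
    }


if __name__ == "__main__":
    pod = build_full_npu_pod("npu-demo", "example.com/train:latest", 8)
    print(json.dumps(pod, indent=2))
```

The printed JSON can be saved to a file and submitted with kubectl apply -f. A real training job normally carries more configuration than this sketch shows.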

Volcano supports switch affinity scheduling and Ascend AI processor-based affinity scheduling. As the scheduler, it uses the interconnection topology and processing logic of the Ascend AI processors to maximize their computing performance. For details about both types of scheduling, see Affinity Scheduling.

Required Components

Volcano or another scheduler
Ascend Device Plugin
Ascend Docker Runtime
Ascend Operator
ClusterD
NodeD

Instructions

Refer to Installation and Deployment for component installation.
Refer to Full NPU Scheduling/Static vNPU Scheduling (Training) for feature usage.
