Working Principles

The cluster scheduling components periodically report node and processor information.
- kubelet reports the number of processors on a node to the node object.
- Ascend Device Plugin reports the processor memory and topology information.
  For a processor with on-chip memory, Ascend Device Plugin reports the following at startup: the processor memory details (node-label); the entire NPU information and physical processor ID, to device-info-cm; and the total number of schedulable processors (allocatable), the number of used processors (allocated), and basic processor information (device ip and super_device_ip), to the node for entire NPU scheduling.
- When a node is faulty, NodeD periodically reports the node health status, node hardware fault information, and node DPC shared storage fault information to node-info-cm.
After reading the information in device-info-cm and node-info-cm, ClusterD integrates the information into cluster-info-cm.
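The integration step can be pictured as a per-node merge of the two reports. The following is a minimal sketch only: the function name `build_cluster_info` and the field names (`healthy_processors`, `healthy`) are hypothetical, since the actual layouts of device-info-cm, node-info-cm, and cluster-info-cm are not described here.

```python
from typing import Any

NodeReports = dict[str, dict[str, Any]]


def build_cluster_info(device_info: NodeReports, node_info: NodeReports) -> NodeReports:
    """Merge per-node device and node reports into one cluster-wide view.

    Hypothetical stand-in for what ClusterD does; the real schemas differ.
    """
    cluster_info: NodeReports = {}
    # A node may appear in only one of the two sources, so take the union.
    for node_name in sorted(set(device_info) | set(node_info)):
        cluster_info[node_name] = {
            "device": device_info.get(node_name, {}),
            "node": node_info.get(node_name, {}),
        }
    return cluster_info


# Illustrative input; keys and values are made up for the example.
device_info = {"node-1": {"healthy_processors": [0, 1, 2, 3]}}
node_info = {"node-1": {"healthy": True}, "node-2": {"healthy": False}}
cluster_info = build_cluster_info(device_info, node_info)
```

In this sketch, a node that has reported through only one component still gets an entry, with an empty section for the missing report.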
Use kubectl or a deep learning platform to deliver OME-based SGLang inference jobs. Based on the inference job configuration, OME generates Deployment or LeaderWorkerSet (LWS) sub-workloads, and each sub-workload then generates multiple inference job pods. For details about Deployment and LeaderWorkerSet, see the OME documentation.
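The fan-out from job to sub-workload to pods can be sketched as follows. This is an illustration under stated assumptions, not OME's logic: the `InferenceJob` fields, the rule that a replica spanning more than one node maps to a LeaderWorkerSet, and the pod naming scheme are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InferenceJob:
    """Hypothetical job description; not the OME resource schema."""
    name: str
    replicas: int
    nodes_per_replica: int


def plan_sub_workload(job: InferenceJob) -> tuple[str, list[str]]:
    """Return the assumed sub-workload kind and the pod names it would create."""
    # Assumed rule: multi-node replicas need a leader/worker group,
    # single-node replicas fit a plain Deployment.
    kind = "LeaderWorkerSet" if job.nodes_per_replica > 1 else "Deployment"
    pods = [
        f"{job.name}-{replica}-{index}"  # made-up naming scheme
        for replica in range(job.replicas)
        for index in range(job.nodes_per_replica)
    ]
    return kind, pods


single_kind, single_pods = plan_sub_workload(InferenceJob("sglang", 2, 1))
multi_kind, multi_pods = plan_sub_workload(InferenceJob("sglang", 2, 2))
```

Whichever kind is chosen, the outcome relevant to the later steps is the same: a set of inference job pods that the scheduler must place.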
volcano-controller or LeaderWorkerSet creates a PodGroup for the job. For details about PodGroup, see the open source Volcano documentation.
For an SGLang inference job pod, volcano-scheduler selects a suitable node based on memory, CPU, labels, and affinity. It also takes processor topology into account and records the chosen processor and node hardware details in the pod annotation.
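The selection logic amounts to filtering nodes and then recording the choice. The sketch below is a simplified model, not volcano-scheduler code: the `Node` and `PodRequest` types, the annotation key `example/chosen-processors`, and the use of the lowest-numbered free processors as a stand-in for topology-aware selection are all assumptions made for illustration. Affinity is reduced to a plain label match.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu: int
    free_memory_gib: int
    labels: dict[str, str]
    free_processors: list[int]


@dataclass
class PodRequest:
    cpu: int
    memory_gib: int
    processors: int
    node_selector: dict[str, str] = field(default_factory=dict)


def select_node(pod: PodRequest, nodes: list[Node]) -> Optional[tuple[str, dict[str, str]]]:
    """Return (node name, pod annotation) for the first node that fits, else None."""
    for node in nodes:
        if node.free_cpu < pod.cpu or node.free_memory_gib < pod.memory_gib:
            continue  # not enough CPU or memory
        if any(node.labels.get(key) != value for key, value in pod.node_selector.items()):
            continue  # required labels do not match
        if len(node.free_processors) < pod.processors:
            continue  # not enough free processors
        # Stand-in for topology awareness: take the lowest-numbered free processors.
        chosen = sorted(node.free_processors)[: pod.processors]
        annotation = {"example/chosen-processors": ",".join(str(p) for p in chosen)}
        return node.name, annotation
    return None


nodes = [
    Node("node-1", free_cpu=8, free_memory_gib=32, labels={"accelerator": "npu"}, free_processors=[3]),
    Node("node-2", free_cpu=16, free_memory_gib=64, labels={"accelerator": "npu"}, free_processors=[5, 1, 4, 0]),
]
pod = PodRequest(cpu=4, memory_gib=16, processors=2, node_selector={"accelerator": "npu"})
result = select_node(pod, nodes)
```

Here node-1 passes the CPU, memory, and label checks but has only one free processor, so the pod lands on node-2 and the annotation records the processors picked for it.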
When kubelet creates a container for an OME-based SGLang inference job, it calls Ascend Device Plugin to mount the processors, and Ascend Device Plugin or volcano-scheduler writes the processor and node hardware information into the pod annotation. Ascend Docker Runtime assists in mounting the corresponding resources.
