Working Principles
- The cluster scheduling components periodically report node and processor information. kubelet reports the number of processors on a node to the node object.
- Ascend Device Plugin reports the processor memory and topology information.
For a processor with on-chip memory, Ascend Device Plugin reports its memory details as a node label (node-label) upon startup. It reports the entire NPU information and physical processor ID to device-info-cm. It also reports the total number of schedulable processors (allocatable), the number of used processors (allocated), and basic processor information (device ip and super_device_ip) to the node for entire NPU scheduling.
- When a node is faulty, NodeD periodically reports the node health status, node hardware fault information, and node DPC shared storage fault information to node-info-cm.
- After reading the information in device-info-cm and node-info-cm, ClusterD integrates the information into cluster-info-cm.
- Use kubectl or a deep learning platform to submit an AIBrix StormService inference job. aibrix-controller-manager generates RoleSet or PodSet sub-workloads based on the inference job configuration, and these sub-workloads then create multiple inference job pods. For details about RoleSet and PodSet, see the AIBrix documentation.
- volcano-controller creates a PodGroup for the job. For details about PodGroup, see the open-source Volcano documentation.
- volcano-scheduler selects a suitable node for each pod based on node memory, CPU, labels, and affinity, and writes the selected processor information and node hardware information to the pod annotation.
- When kubelet creates a container, Ascend Device Plugin is called to mount processors. Ascend Device Plugin or volcano-scheduler writes the processor and node hardware information to the pod annotation. Ascend Docker Runtime assists in mounting corresponding resources.
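The reporting and selection flow above can be illustrated with a minimal Python sketch. It is not the ClusterD or volcano-scheduler implementation: the DeviceInfo and NodeInfo classes, the field names, and the functions build_cluster_info and pick_node are hypothetical stand-ins for the contents of device-info-cm, node-info-cm, and cluster-info-cm, and the selection policy (the smallest healthy node that still fits the request) is an assumption chosen only to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceInfo:
    """Hypothetical stand-in for one node's entry in device-info-cm."""
    node: str
    allocatable: int  # total number of schedulable processors
    allocated: int    # number of processors already in use


@dataclass
class NodeInfo:
    """Hypothetical stand-in for one node's entry in node-info-cm."""
    node: str
    healthy: bool
    faults: list = field(default_factory=list)


def build_cluster_info(devices, nodes):
    """Merge per-node device and health reports into one cluster-wide view,
    loosely mirroring how ClusterD integrates both sources into cluster-info-cm."""
    health = {n.node: n for n in nodes}
    cluster = {}
    for d in devices:
        report = health.get(d.node)
        cluster[d.node] = {
            "free": d.allocatable - d.allocated,
            # Assumption: a node without a fault report is treated as healthy.
            "healthy": report.healthy if report else True,
            "faults": list(report.faults) if report else [],
        }
    return cluster


def pick_node(cluster, requested):
    """Return the healthy node with the fewest free processors that still
    satisfies the request, or None if no node fits (illustrative policy only)."""
    candidates = [
        (info["free"], name)
        for name, info in cluster.items()
        if info["healthy"] and info["free"] >= requested
    ]
    return min(candidates)[1] if candidates else None


devices = [
    DeviceInfo("node-a", allocatable=8, allocated=2),
    DeviceInfo("node-b", allocatable=8, allocated=6),
    DeviceInfo("node-c", allocatable=8, allocated=0),
]
nodes = [NodeInfo("node-c", healthy=False, faults=["hardware fault"])]

cluster = build_cluster_info(devices, nodes)
print(pick_node(cluster, requested=4))
```

In this sketch node-c has the most free processors but is skipped because it has a reported fault, which is the reason the health information and the processor information are combined before a node is selected.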