Scheduling Process of the Ascend AI Processor

The overall scheduling logic is illustrated below. Ascend Device Plugin discovers Ascend AI processor resources and reports them. Volcano here refers to the scheduler that Huawei adapted and modified from the open source Volcano framework.

Scheduling Processes

  • Scheduling process 1
    Figure 1 Scheduling process 1

    By default, the value of self-maintain-available-card in the Volcano startup YAML file is true. The scheduling process of the Ascend AI processor is as follows:

    1. Ascend Device Plugin reports the health status of the Ascend AI processor.
    2. A user calls kube-apiserver to create a job whose containers use NPUs, for example, a vcjob.
    3. Volcano calculates the available Ascend AI processors based on the node and ConfigMap information.
    4. Volcano writes the Ascend AI processor allocation information and timestamp to the pod's Annotations field based on the affinity scheduling principle. After writing the resource information, Volcano submits a pod binding request to Kubernetes.
    5. In each information reporting period, Ascend Device Plugin reads the mounted processor information from the pod's Annotations. If the information needs to be corrected, Ascend Device Plugin updates the pod's Annotations through kube-apiserver. The annotations that can be corrected are huawei.com/resource name, huawei.com/AscendReal, and ascend.kubectl.kubernetes.io/ascend-910-configuration.
    6. Once kubelet detects that a pod is scheduled to its node, it calls the Allocate function of Ascend Device Plugin to mount the NPU device. Ascend Docker Runtime can also be used to mount the NPU device.
    7. Ascend Device Plugin queries the list of pods in the Pending status on the node, selects the pod with the earliest timestamp written during affinity scheduling, reads the IDs of the devices allocated to that pod, and sends the device IDs to kubelet for device mounting.
  • Scheduling process 2
    Figure 2 Scheduling process 2
    If the value of self-maintain-available-card in the Volcano startup YAML file is set to false, the scheduling process of the Ascend AI processor is as follows:
    1. Ascend Device Plugin reports the health status of the Ascend AI processor.
    2. Ascend Device Plugin uses kube-apiserver to write the information about the idle Ascend AI processors (healthy Ascend AI processors – used Ascend AI processors) to the DeviceInfo field of ConfigMap mindx-dl-deviceinfo-nodeName.
    3. A user calls kube-apiserver to create a job whose containers use NPUs, for example, a vcjob.
    4. Volcano obtains the available Ascend AI processors based on DeviceInfo.
    5. Volcano writes the Ascend AI processor allocation information and timestamp to the pod's Annotations field based on the affinity scheduling principle. After writing the resource information, Volcano submits a pod binding request to Kubernetes.
    6. Once kubelet detects that a pod is scheduled to its node, it calls the Allocate function of Ascend Device Plugin to mount the NPU device. Ascend Docker Runtime can also be used to mount the NPU device.
    7. Ascend Device Plugin queries the list of pods in the Pending status on the node, selects the pod with the earliest timestamp written during affinity scheduling, reads the IDs of the devices allocated to that pod, and sends the device IDs to kubelet for device mounting.
    8. Ascend Device Plugin updates the list of allocatable Ascend AI processors in DeviceInfo.

Field Description

  1. Ascend Device Plugin (open source version) reports node resources in a ConfigMap. Each resource is reported in the format huawei.com/resource name:resource name+physical ID, as shown in Figure 3. The marked part in the figure is the list of available Ascend AI processors, calculated as all healthy Ascend AI processors minus the Ascend AI processors allocated by Volcano. The healthy Ascend AI processors are obtained by calling the NPU driver interface. The processors allocated by Volcano are obtained by traversing all pods on the current node that meet both of the following conditions: the pod status is neither Failed nor Succeeded, and the pod's Annotations contain the Ascend AI processor information allocated by Volcano.
    • To view the reported resource information, log in to the backend environment and run the kubectl describe cm mindx-dl-deviceinfo-{nodeName} -n kube-system command.
    • The huawei.com/resource name field will be deleted in later versions. By default, Volcano maintains the available processors of a node itself, and this field does not take effect. To make the field take effect, set the Volcano configuration parameter self-maintain-available-card to false.
    Figure 3 Node NPU resource information
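The calculation of the available list described in item 1 can be sketched as follows. This is a minimal illustration, not the Ascend Device Plugin implementation. It assumes that pods are represented as plain dictionaries with hypothetical "phase" and "annotations" keys, that huawei.com/Ascend910 is one concrete instance of huawei.com/resource name, and that the annotation value is a comma-separated list of names.

```python
# Assumption: one concrete instance of "huawei.com/resource name".
RESOURCE_KEY = "huawei.com/Ascend910"
# Pods in these phases no longer hold processors.
FINISHED_PHASES = {"Failed", "Succeeded"}


def allocated_by_volcano(pods):
    """Collect processor names that Volcano wrote to the annotations of live pods."""
    used = set()
    for pod in pods:
        if pod["phase"] in FINISHED_PHASES:
            continue
        # Assumption: the value is a comma-separated list such as
        # "Ascend910-0,Ascend910-1". Pods without the key contribute nothing.
        value = pod.get("annotations", {}).get(RESOURCE_KEY, "")
        used.update(name for name in value.split(",") if name)
    return used


def available_processors(healthy, pods):
    """Healthy processors minus those allocated by Volcano, ordered by physical ID."""
    free = set(healthy) - allocated_by_volcano(pods)
    return sorted(free, key=lambda name: int(name.rsplit("-", 1)[1]))
```

In the real component, the healthy list comes from the NPU driver interface and the pod list from kube-apiserver. Both are passed in as arguments here so that the subtraction can be read on its own.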
  2. Volcano calculates the available Ascend AI processors based on the node and ConfigMap information. (If self-maintain-available-card of Volcano is set to false, Volcano reads the DeviceInfo field, using huawei.com/resource name as the key, to obtain the available Ascend AI processors.) Based on the affinity scheduling policy, Volcano selects the Ascend AI processors that the job requires and that satisfy the affinity rules; these are the processors allocated to the job. Volcano then writes the allocation information to the pod's Annotations, as shown in the first part of Figure 4, followed by predicate-time, the timestamp of the resource allocation. This value does not need to be converted into a readable format; it is used directly for comparison. Once kubelet detects that a pod is scheduled to its node, it calls the Allocate function of Ascend Device Plugin to mount the NPU device.
    Figure 4 NPU information assigned to the pod
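The annotation write described in item 2 can be sketched as follows. This is an illustration under stated assumptions rather than Volcano's code: the function names are hypothetical, huawei.com/Ascend910 stands in for huawei.com/resource name, predicate-time is assumed to be the literal annotation key, and the nanosecond unit is an assumption. Kubernetes annotation values are strings, so the timestamp is stored as a string and converted back to an integer for comparison.

```python
import time

# Assumption: one concrete instance of "huawei.com/resource name".
RESOURCE_KEY = "huawei.com/Ascend910"
# Assumption: the timestamp is stored under this literal key.
PREDICATE_TIME_KEY = "predicate-time"


def annotate_allocation(annotations, devices, now_ns=None):
    """Return a copy of the annotations with the allocation and its timestamp added."""
    stamp = time.time_ns() if now_ns is None else now_ns
    updated = dict(annotations)
    updated[RESOURCE_KEY] = ",".join(devices)
    # Annotation values are strings, so the raw integer is stored as text.
    updated[PREDICATE_TIME_KEY] = str(stamp)
    return updated


def is_earlier(stamp_a, stamp_b):
    """Compare two stored timestamps numerically, not as strings."""
    return int(stamp_a) < int(stamp_b)
```

Comparing the values as integers matters because string comparison orders "900" after "1000". No conversion to a human-readable date is needed for the ordering.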
  3. Ascend Device Plugin receives the Allocate request (a two-processor task is used as an example). The device IDs passed to Allocate are selected arbitrarily by kubelet, as shown in the huawei.com/kltDev field in Figure 4. They may be Ascend AI processor IDs that do not meet the affinity rules, for example, Ascend910-7 and Ascend910-0.

    In this case, Ascend Device Plugin finds all pods on the current node that meet the following conditions: the pod is not in the Failed or Succeeded status, its Annotations contain the Ascend AI processor IDs allocated by Volcano, and the number of those IDs equals the number of Ascend AI processors allocated by kubelet.

    Ascend Device Plugin then selects the pod with the smallest predicate-time from the pods that meet the conditions, and changes its predicate-time to the maximum uint value so that the pod is not selected again. Finally, it parses Annotations to obtain the Ascend AI processors allocated by Volcano, for example, Ascend910-0 and Ascend910-1, returns information such as the mount path to kubelet, and writes the actually allocated Ascend AI processors to huawei.com/AscendReal under Annotations.
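The selection logic in item 3 can be sketched as follows. This is a simplified illustration, not the Ascend Device Plugin source. Pods are plain dictionaries with hypothetical "phase" and "annotations" keys, huawei.com/Ascend910 stands in for huawei.com/resource name, the 64-bit width of the maximum uint value is an assumption, and skipping pods that already carry that maximum value is one interpretation of how the marker prevents reselection.

```python
# Assumption: one concrete instance of "huawei.com/resource name".
RESOURCE_KEY = "huawei.com/Ascend910"
REAL_KEY = "huawei.com/AscendReal"
PREDICATE_TIME_KEY = "predicate-time"
FINISHED_PHASES = {"Failed", "Succeeded"}
# Assumption: "maximum uint value" is taken to be the 64-bit maximum.
MAX_UINT = 2**64 - 1


def pick_devices_for_allocate(pods, kubelet_devices):
    """Return the processors Volcano allocated to the oldest matching pod, or None."""
    candidates = []
    for pod in pods:
        if pod["phase"] in FINISHED_PHASES:
            continue
        annotations = pod["annotations"]
        devices = [d for d in annotations.get(RESOURCE_KEY, "").split(",") if d]
        # The pod must request as many processors as kubelet passed to Allocate.
        if not devices or len(devices) != len(kubelet_devices):
            continue
        if PREDICATE_TIME_KEY not in annotations:
            continue
        stamp = int(annotations[PREDICATE_TIME_KEY])
        if stamp == MAX_UINT:
            continue  # already served by an earlier Allocate call
        candidates.append((stamp, pod, devices))
    if not candidates:
        return None
    _, pod, devices = min(candidates, key=lambda item: item[0])
    # Mark the pod as served and record what was actually allocated.
    pod["annotations"][PREDICATE_TIME_KEY] = str(MAX_UINT)
    pod["annotations"][REAL_KEY] = ",".join(devices)
    return devices
```

The device IDs chosen by kubelet contribute only their count. The returned IDs are the ones Volcano wrote, which is how the affinity decision survives the arbitrary choice made by kubelet.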