(Optional) Integrating the Ascend Plugin to Extend Open-Source Volcano

The Volcano shipped with the cluster scheduling components adds NPU-related scheduling functions on top of open-source Volcano. These functions are implemented by Ascend-volcano-plugin. The open-source Volcano framework provides a plugin mechanism that lets users register scheduling plugins to implement different scheduling policies.

Ascend-volcano-plugin supports open-source Volcano v1.7.0 and v1.9.0, and does not modify the open-source Volcano framework.
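
Because only two Volcano versions are supported, it can help to validate the version string once and derive the upstream branch name from it before cloning or building. The following is a minimal sketch, not part of Ascend-volcano-plugin: the function name volcano_branch_for is hypothetical, and the release-1.9 branch name is assumed by analogy with the release-1.7 branch used in the procedure, so confirm it against the upstream repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a supported Volcano version to the upstream
# release branch to clone. Only v1.7.0 and v1.9.0 are accepted.
volcano_branch_for() {
    case "$1" in
        v1.7.0) echo "release-1.7" ;;
        v1.9.0) echo "release-1.9" ;;   # assumed by analogy with release-1.7
        *)
            echo "unsupported Volcano version: $1 (expected v1.7.0 or v1.9.0)" >&2
            return 1
            ;;
    esac
}
```

A possible use is git clone -b "$(volcano_branch_for v1.7.0)" https://github.com/volcano-sh/volcano.git, which fails early if an unsupported version is passed.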

Procedure

  1. Run the following commands in sequence to download the official open-source code of Volcano v1.7 to the $GOPATH/src/volcano.sh/ directory.
    mkdir -p $GOPATH/src/volcano.sh/
    cd $GOPATH/src/volcano.sh/ 
    git clone -b release-1.7 https://github.com/volcano-sh/volcano.git
  2. Rename the downloaded ascend-for-volcano source directory to ascend-volcano-plugin and upload it to the plugin path ($GOPATH/src/volcano.sh/volcano/pkg/scheduler/plugins/) of the official open-source Volcano code.
  3. Compile the open-source Volcano binaries and the Huawei NPU scheduling plugin .so file. Pass the open-source Volcano version to the build.sh script as its argument, for example, v1.7.0.
    cd $GOPATH/src/volcano.sh/volcano/pkg/scheduler/plugins/ascend-volcano-plugin/build
    chmod +x build.sh
    ./build.sh v1.7.0

    The compiled binaries and the plugin shared library (.so file) are stored in $GOPATH/src/volcano.sh/volcano/pkg/scheduler/plugins/ascend-volcano-plugin/output.

    For the list of compiled files, see Table 1.

    Table 1 Files in the output directory

    File                        Description
    volcano-npu-{version}.so    Shared library (.so file) of the Huawei NPU scheduling plugin
    Dockerfile-scheduler        Dockerfile for building the volcano-scheduler image
    Dockerfile-controller       Dockerfile for building the volcano-controller image
    volcano-v{version}.yaml     Volcano startup configuration file
    vc-scheduler                volcano-scheduler binary file
    vc-controller-manager       volcano-controller binary file

  4. Use either of the following methods to start volcano-scheduler.
    • Startup YAML file provided by cluster scheduling components
      1. Build the volcano-scheduler image. Set the image tag to match the open-source code version, for example, v1.7.0.
        docker build --no-cache -t volcanosh/vc-scheduler:v1.7.0 ./ -f ./Dockerfile-scheduler
      2. Start volcano-scheduler.
        kubectl apply -f volcano-v{version}.yaml
        Example output:
        namespace/volcano-system created
        namespace/volcano-monitoring created
        configmap/volcano-scheduler-configmap created
        serviceaccount/volcano-scheduler created
        clusterrole.rbac.authorization.k8s.io/volcano-scheduler created
        clusterrolebinding.rbac.authorization.k8s.io/volcano-scheduler-role created
        deployment.apps/volcano-scheduler created
        service/volcano-scheduler-service created
        serviceaccount/volcano-controllers created
        clusterrole.rbac.authorization.k8s.io/volcano-controllers created
        clusterrolebinding.rbac.authorization.k8s.io/volcano-controllers-role created
        deployment.apps/volcano-controllers created
        customresourcedefinition.apiextensions.k8s.io/jobs.batch.volcano.sh created
        customresourcedefinition.apiextensions.k8s.io/commands.bus.volcano.sh created
        customresourcedefinition.apiextensions.k8s.io/podgroups.scheduling.volcano.sh created
        customresourcedefinition.apiextensions.k8s.io/queues.scheduling.volcano.sh created
        customresourcedefinition.apiextensions.k8s.io/numatopologies.nodeinfo.volcano.sh created
    • Startup YAML file of the open-source Volcano
      1. Copy the volcano-npu-{version}.so file compiled in Step 3 to $GOPATH/src/volcano.sh/volcano in the open-source Volcano source tree, and add the line marked "Newly added" to the scheduler Dockerfile ($GOPATH/src/volcano.sh/volcano/installer/dockerfile/scheduler/Dockerfile).
        FROM golang:1.19.1 AS builder
        WORKDIR /go/src/volcano.sh/
        ADD . volcano
        RUN cd volcano && make vc-scheduler
        FROM alpine:latest
        COPY --from=builder /go/src/volcano.sh/volcano/_output/bin/vc-scheduler /vc-scheduler
        # Newly added: copy the NPU scheduling plugin into the plugins directory.
        COPY volcano-npu_*.so plugins/
        ENTRYPOINT ["/vc-scheduler"]
      2. Build the volcano-scheduler image. Set the image tag to match the open-source code version, for example, v1.7.0.
        cd $GOPATH/src/volcano.sh/volcano
        docker build --no-cache -t volcanosh/vc-scheduler:v1.7.0 ./ -f installer/dockerfile/scheduler/Dockerfile
      3. Modify volcano-development.yaml. The file path is $GOPATH/src/volcano.sh/volcano/installer/volcano-development.yaml.
        apiVersion: v1
        kind: ConfigMap
        metadata: 
          name: volcano-scheduler-configmap 
          namespace: volcano-system
        data:
          volcano-scheduler.conf: |
            actions: "enqueue, allocate, backfill"
            tiers:
            - plugins:
              - name: priority
              - name: gang
                enablePreemptable: false
              - name: conformance
              - name: volcano-npu_v7.3.0_linux-x86_64    # Custom scheduling plugin newly added to the ConfigMap. Ensure that the component version mapping is correct.
            - plugins:
              - name: overcommit
              - name: drf
                enablePreemptable: false
              - name: predicates
              - name: proportion
              - name: nodeorder
              - name: binpack
            configurations:           # Newly added block. This field is required for Volcano.
              - name: init-params
                arguments: {"grace-over-time":"900","presetVirtualDevice":"true","nslb-version":"1.0","shared-tor-num":"2","useClusterInfoManager":"false","super-pod-size": "48","reserve-nodes": "2"}
        ...
        kind: Deployment
        apiVersion: apps/v1
        metadata:
          name: volcano-scheduler
          namespace: volcano-system
          labels:
            app: volcano-scheduler
        spec:
          ...
          template:
        ...
                - name: volcano-scheduler
                  image: volcanosh/vc-scheduler:v1.7.0
                  args:
                    - --logtostderr
                    - --scheduler-conf=/volcano.scheduler/volcano-scheduler.conf
                    - --enable-healthz=true   
                    - --enable-metrics=true      
                    - --plugins-dir=plugins       # Load the custom plugin in the volcano-scheduler startup command.
                    - -v=3
                    - 2>&1
        ---
        # Source: volcano/templates/scheduler.yaml
        kind: ClusterRole
        apiVersion: rbac.authorization.k8s.io/v1
        metadata:
          name: volcano-scheduler
        rules:
        ...
          - apiGroups: ["nodeinfo.volcano.sh"]
            resources: ["numatopologies"]
            verbs: ["get", "list", "watch", "delete"]
          - apiGroups: [""]                                        # Newly added: get permission on services.
            resources: ["services"]
            verbs: ["get"]
          - apiGroups: [""]
            resources: ["configmaps"]
            verbs: ["get", "create", "delete", "update","list","watch"]    # Newly added: list and watch permissions on ConfigMaps.
          - apiGroups: ["apps"]
            resources: ["daemonsets", "replicasets", "statefulsets"]
            verbs: ["list", "watch", "get"]
        ...
      4. Start volcano-scheduler.
        kubectl apply -f installer/volcano-development.yaml
        Command output:
        namespace/volcano-system created
        namespace/volcano-monitoring created
        serviceaccount/volcano-admission created
        configmap/volcano-admission-configmap created
        clusterrole.rbac.authorization.k8s.io/volcano-admission created
        clusterrolebinding.rbac.authorization.k8s.io/volcano-admission-role created
        service/volcano-admission-service created
        deployment.apps/volcano-admission created
        job.batch/volcano-admission-init created
        customresourcedefinition.apiextensions.k8s.io/jobs.batch.volcano.sh created
        customresourcedefinition.apiextensions.k8s.io/commands.bus.volcano.sh created
        serviceaccount/volcano-controllers created
        clusterrole.rbac.authorization.k8s.io/volcano-controllers created
        clusterrolebinding.rbac.authorization.k8s.io/volcano-controllers-role created
        deployment.apps/volcano-controllers created
        serviceaccount/volcano-scheduler created
        configmap/volcano-scheduler-configmap created
        clusterrole.rbac.authorization.k8s.io/volcano-scheduler created
        clusterrolebinding.rbac.authorization.k8s.io/volcano-scheduler-role created
        service/volcano-scheduler-service created
        deployment.apps/volcano-scheduler created
        customresourcedefinition.apiextensions.k8s.io/podgroups.scheduling.volcano.sh created
        customresourcedefinition.apiextensions.k8s.io/queues.scheduling.volcano.sh created
        customresourcedefinition.apiextensions.k8s.io/numatopologies.nodeinfo.volcano.sh created
        mutatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-pods-mutate created
        mutatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-queues-mutate created
        mutatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-podgroups-mutate created
        mutatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-jobs-mutate created
        validatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-jobs-validate created
        validatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-pods-validate created
        validatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-queues-validate created
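
After either startup method, it is worth confirming that the volcano-scheduler Deployment actually became ready, since "created" in the output above only means that the objects were accepted by the API server. The following is a minimal sketch under stated assumptions: the function name check_volcano_scheduler and the 120-second timeout are hypothetical choices, and the namespace and Deployment name are taken from the YAML shown in this procedure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait for the volcano-scheduler Deployment to roll out,
# then list the pods in the volcano-system namespace.
check_volcano_scheduler() {
    local ns="volcano-system"
    # Fail if the Deployment is not ready within the (assumed) timeout.
    kubectl -n "$ns" rollout status deployment/volcano-scheduler --timeout=120s || return 1
    kubectl -n "$ns" get pods
}
```

If the rollout check fails, inspecting the scheduler log with kubectl -n volcano-system logs deployment/volcano-scheduler is a reasonable next step, for example to see whether the plugin .so file was found in the directory given by --plugins-dir.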