Installing Volcano

Volcano or another scheduler must be installed on the management node if you need any of the following functions: full NPU scheduling, static vNPU scheduling, dynamic vNPU scheduling, resumable training, elastic training, recovery of inference card faults, or rescheduling upon inference card faults.
If Volcano is used for job scheduling, you are advised not to use Docker or Containerd directly to create containers with NPUs mounted and run jobs in them. Otherwise, Volcano may encounter scheduling problems.
If you need only containerization and resource monitoring functions, you do not need to install Volcano. In this case, skip this section.
This section describes how to install the Volcano components vc-scheduler and vc-controller-manager. If you need other open-source Volcano components, install them yourself and ensure their security.
- In this document, Volcano refers to the Volcano delivered with the cluster scheduling components. To use another scheduler based on open-source Volcano, integrate the Ascend-volcano-plugin plugin and enable NPU scheduling by referring to (Optional) Integrating the Ascend Plugin to Extend Open-Source Volcano.
- NodeD 6.0.RC1 and later versions are incompatible with earlier Volcano versions. If you use NodeD 6.0.RC1 or later, you must also use Volcano 6.0.RC1 or later.
- When Volcano 6.0.RC2 or later is used as the scheduler, ClusterD must be installed. If ClusterD is not installed, you must modify the startup parameters of Volcano. Otherwise, Volcano cannot schedule jobs.

Procedure

Log in to the Kubernetes management node as the root user and check whether the Volcano image and version number are correct.

docker images | grep volcanosh

Command output:

volcanosh/vc-controller-manager      v1.7.0              84c73128cc55        3 days ago          44.5MB
volcanosh/vc-scheduler               v1.7.0              e90c114c75b1        3 days ago          188MB

If they are correct, proceed to the next step.
If they are not correct, create the images by referring to Preparing an Image.
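
The image check above can also be scripted. The following is a minimal sketch, not part of the product tooling: the function name check_volcano_images is hypothetical, and the tag v1.7.0 is taken from the sample output, so replace it with the version you actually expect. The function reads "repository:tag" lines from standard input and reports whether both Volcano images are present.

```shell
# Hypothetical helper: report whether both Volcano images exist with the
# expected tag. Reads "repository:tag" lines from standard input.
check_volcano_images() {
  expected_tag="$1"
  image_list=$(cat)
  status=0
  for repo in volcanosh/vc-controller-manager volcanosh/vc-scheduler; do
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if printf '%s\n' "$image_list" | grep -qxF "${repo}:${expected_tag}"; then
      echo "OK: ${repo}:${expected_tag}"
    else
      echo "MISSING: ${repo}:${expected_tag}"
      status=1
    fi
  done
  return "$status"
}

# Usage on the management node:
#   docker images --format '{{.Repository}}:{{.Tag}}' | check_volcano_images v1.7.0
```

A non-zero return value means that at least one image is missing or has a different tag, in which case the images need to be prepared before you continue.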

Copy the YAML file from the directory where the Volcano package was decompressed to any directory on the Kubernetes management node.
Skip this step if you do not need to modify the component startup parameters. Otherwise, modify the Volcano startup parameters in the corresponding startup YAML file based on your requirements. For details about common startup parameters, see Table 4 and Table 5.

Configure log dump for Volcano.

During installation, Volcano logs are written to the host drive through a mounted directory (/var/log/mindx-dl). By default, Volcano clears a log file when its size reaches 1.8 GB. To keep the drive space from being used up, configure log dump for Volcano. Table 1 describes the configuration items. If logs might be cleared before they are dumped, select a more frequent log dump policy to prevent log loss.

In the /etc/logrotate.d directory on the management node, run the following command to create a log dump configuration file:

vi /etc/logrotate.d/file_name

Example:

vi /etc/logrotate.d/volcano

Add the following content to the file and run the :wq command to save the file:

/var/log/mindx-dl/volcano-*/*.log {
     daily
     rotate 8
     size 50M
     compress
     dateext
     missingok
     notifempty
     copytruncate
     create 0640 hwMindX hwMindX
     sharedscripts
     postrotate
         chmod 640 /var/log/mindx-dl/volcano-*/*.log
         chmod 440 /var/log/mindx-dl/volcano-*/*.log-*
     endscript
}

Run the following commands in sequence to set the configuration file permission to 640 and owner to root:

chmod 640 /etc/logrotate.d/file_name
chown root /etc/logrotate.d/file_name

Example:

chmod 640 /etc/logrotate.d/volcano
chown root /etc/logrotate.d/volcano

**Table 1** Configuration items of Volcano log dump files
Configuration Item	Description	Possible Value
daily	Log dump frequency	daily: performs the dump check once a day; weekly: once a week; monthly: once a month; yearly: once a year.
rotate x	Number of times that log files are dumped before they are deleted	x indicates the number of backups. Examples: rotate 0 keeps no backup; rotate 8 keeps eight backups.
size xx	A log file is dumped only when its size reaches the value of this parameter.	Supported units: bytes (default), K, and M. For example, size 50M indicates that a log file is dumped when its size reaches 50 MB. NOTE: logrotate checks log file sizes only when it runs, at the configured dump frequency. A dump is triggered only if a log file exceeds the value of size at that time, so a log file is not dumped the moment it reaches the size limit.
compress	Whether to compress dumped logs using gzip	compress: use gzip for compression; nocompress: do not use gzip for compression.
notifempty	Whether to dump empty files	ifempty: dump empty files; notifempty: do not dump empty files.
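
Before relying on the scheduled run, you can dry-run the configuration. The sketch below assumes the file was saved as /etc/logrotate.d/volcano, as in the example above. The wrapper name dry_run_volcano_logrotate is hypothetical. The -d option is logrotate's debug mode, which prints what would be done without changing any log file or the logrotate state file.

```shell
# Hypothetical helper: dry-run a logrotate configuration file.
# logrotate -d only prints the planned actions; no log file is rotated.
dry_run_volcano_logrotate() {
  conf="${1:-/etc/logrotate.d/volcano}"
  if [ ! -f "$conf" ]; then
    echo "config not found: $conf" >&2
    return 1
  fi
  logrotate -d "$conf"
}

# Usage on the management node (as root):
#   dry_run_volcano_logrotate
#   dry_run_volcano_logrotate /etc/logrotate.d/volcano
```

If the debug output shows that a log file is not rotated although it looks large, compare its current size with the size value: with size 50M, files smaller than 50 MB at check time are left untouched.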

(Optional) In volcano-v{version}.yaml, configure the CPU and memory required by Volcano. For the recommended values, see the volcano-controller and volcano-scheduler tables in the official documentation of open-source Volcano.

...
kind: Deployment
...
  labels:
    app: volcano-scheduler
spec:
  replicas: 1
...
    spec:
...
          imagePullPolicy: "IfNotPresent"
          resources:
            requests:
              memory: 4Gi
              cpu: 5500m
            limits:
              memory: 8Gi
              cpu: 5500m
...
kind: Deployment
...
  labels:
    app: volcano-controller
spec:
...
    spec:
...
          resources:
            requests:
              memory: 3Gi
              cpu: 2000m
            limits:
              memory: 3Gi
              cpu: 2000m
...

(Optional) Optimize the scheduling time. The plugins used by Volcano can be configured in volcano-v{version}.yaml. For details, see "advanced Volcano configuration parameters" and "supported plugins" in the official documentation of open-source Volcano.

...
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill"
    tiers:
    - plugins:
      - name: priority
        enableNodeOrder: false
      - name: gang
        enableNodeOrder: false
      - name: conformance
        enableNodeOrder: false
      - name: volcano-npu_v7.3.0_linux-aarch64   # v7.3.0 indicates the MindCluster version. The number varies depending on the actual version.
    - plugins:
      - name: drf
        enableNodeOrder: false
      - name: predicates
        enableNodeOrder: false
        arguments:
          predicate.GPUSharingEnable: false
          predicate.GPUNumberEnable: false
      - name: proportion
        enableNodeOrder: false
      - name: nodeorder
      - name: binpack
        enableNodeOrder: false
...

(Optional) In volcano-v{version}.yaml, enable the Volcano health check interface and Prometheus information collection interface.

...
kind: Deployment
metadata:
  name: volcano-scheduler
  namespace: volcano-system
  labels:
    app: volcano-scheduler
spec:
  ...
  template:
...
        - name: volcano-scheduler
          image: volcanosh/vc-scheduler:v1.7.0
          args: [ ...
              ...
              --enable-healthz=true   # To ensure that the Volcano health check interface can be accessed, the value of this parameter must be true.
              --enable-metrics=true # To ensure that the Prometheus information collection interface can be accessed, the value of this parameter must be true.
              ...
...

**Table 2** Open interfaces of the cluster scheduling Volcano
Access Mode	Protocol	Method	Description	Component
http://podIP:11251/healthz	HTTP	GET	Health check interface	volcano-controller
http://podIP:11251/healthz	HTTP	GET	Health check interface	volcano-scheduler
http://volcano-scheduler-serviceIP:8080/metrics	HTTP	GET	Prometheus information collection interface	volcano-scheduler

(Optional) In volcano-v{version}.yaml, configure the following items that the cluster scheduling components provide for Volcano: the pod deletion mode during rescheduling, the virtualization mode, switch affinity scheduling, and self-maintenance of the available processor status.

...
data:
  volcano-scheduler.conf: |
...
    configurations:
      - name: init-params
        arguments: {"grace-over-time":"900","presetVirtualDevice":"true","nslb-version":"1.0","shared-tor-num":"2","useClusterInfoManager":"false","self-maintain-available-card":"true","super-pod-size": "48","reserve-nodes": "2","forceEnqueue":"true"}
...

**Table 3** Parameters
Parameter	Default Value	Description
grace-over-time	900	Maximum time, in seconds, allowed for deleting a pod in graceful deletion mode during rescheduling. The value ranges from 2 to 3600. In graceful deletion mode, the system waits for Volcano to complete its pod deletion operations during rescheduling. If the pod is still not deleted when this time elapses (900 seconds by default), it is forcibly deleted.
presetVirtualDevice	true	Virtualization mode. true: static virtualization; false: dynamic virtualization.
nslb-version	1.0	Switch affinity scheduling version. The value can be 1.0 or 2.0. NOTE: Switch affinity scheduling 1.0 supports Atlas training product and Atlas A2 training product as well as PyTorch and MindSpore. Switch affinity scheduling 2.0 supports Atlas A2 training product and PyTorch.
shared-tor-num	2	Maximum number of shared switches that can be used by a single task in switch affinity scheduling 2.0. The value can be 1 or 2. This parameter takes effect only when nslb-version is set to 2.0. For details about switch affinity scheduling (1.0 or 2.0), see Node-based Affinity.
useClusterInfoManager	true	Method of obtaining cluster information by Volcano. The options are as follows: true: read ConfigMap reported by ClusterD. false: read ConfigMap reported by Ascend Device Plugin and NodeD respectively. NOTE: By default, ConfigMap reported by ClusterD is used. In later versions, ConfigMap reported by Ascend Device Plugin and NodeD cannot be read.
self-maintain-available-card	true	Whether Volcano self-maintains the available processor status. The options are as follows: true: Volcano self-maintains the available processor status. false: Volcano obtains the available processor status based on the ConfigMap reported by ClusterD or Ascend Device Plugin.
super-pod-size	48	Number of nodes in an Atlas 900 A3 SuperPoD.
reserve-nodes	2	Number of reserved nodes in an Atlas 900 A3 SuperPoD. NOTE: If the value of reserve-nodes is greater than that of super-pod-size, reserve-nodes is reset as follows: to 2 if super-pod-size is greater than 2; to 0 if super-pod-size is less than or equal to 2.
forceEnqueue	true	Whether a job is forcibly added to the to-be-scheduled queue when cluster NPU resources are sufficient. true: If Enqueue is enabled in Volcano and the cluster NPU resources meet the job requirements, the job is forcibly added to the to-be-scheduled queue, regardless of whether other resources are sufficient. If the job stays in the to-be-scheduled queue for a long time, resources are pre-occupied, and other jobs may fail to be added to the queue. Other values: If cluster NPU resources are insufficient, the job is not allowed to enter the to-be-scheduled queue. If NPU resources meet the job requirements, all plugins jointly determine whether the job enters the to-be-scheduled queue. For details about this parameter, see Volcano Actions.
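
The value constraints in Table 3 can be checked before the YAML file is applied. The sketch below only restates the documented rules for grace-over-time and reserve-nodes. It is not Volcano's own validation code, and both function names are hypothetical.

```shell
# Hypothetical helper: succeed only if grace-over-time is within 2..3600 seconds.
valid_grace_over_time() {
  [ "$1" -ge 2 ] && [ "$1" -le 3600 ]
}

# Hypothetical helper: print the reserve-nodes value that takes effect,
# following the reset rule described in Table 3.
# $1: super-pod-size, $2: configured reserve-nodes
effective_reserve_nodes() {
  super_pod_size="$1"
  reserve_nodes="$2"
  if [ "$reserve_nodes" -gt "$super_pod_size" ]; then
    if [ "$super_pod_size" -gt 2 ]; then
      echo 2
    else
      echo 0
    fi
  else
    echo "$reserve_nodes"
  fi
}

# Examples:
#   effective_reserve_nodes 48 2    prints 2 (valid value, unchanged)
#   effective_reserve_nodes 48 60   prints 2 (reset, super-pod-size > 2)
#   effective_reserve_nodes 2 5     prints 0 (reset, super-pod-size <= 2)
```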

For details about how to configure open-source Volcano, see official documentation of open-source Volcano.
Kubernetes supports node affinity scheduling through the nodeAffinity field. For details about this field, see Kubernetes documentation. Volcano can also use this field. For details, see Scheduling.

(Optional) Optimize the scheduling time. For a single vcjob or acjob, Volcano can reduce the time needed to schedule 4,000 or 5,000 pods onto 4,000 or 5,000 nodes to about 5 minutes. To use this function, modify volcano-v{version}.yaml as follows.

To meet the reference time of about 5 minutes, ensure that the CPU frequency is at least 2.60 GHz and the APIServer latency is less than 80 ms.
If the native nodeAffinity and podAntiAffinity fields of Kubernetes are not used for scheduling, you can disable the nodeorder plugin to further reduce the scheduling time.

data:
  volcano-scheduler.conf: |

...
      - name: proportion
        enableNodeOrder: false
      - name: nodeorder
        enableNodeOrder: false # (Optional) Disable the nodeorder plugin when nodeAffinity and podAntiAffinity are not used for scheduling.
...
      containers:
        - name: volcano-scheduler
          image: volcanosh/vc-scheduler:v1.7.0
          command: ["/bin/ash"]
          # Add GOMEMLIMIT=15000000000 and GOGC=off before /vc-scheduler.
          args: ["-c", "umask 027; GOMEMLIMIT=15000000000 GOGC=off /vc-scheduler
                  --scheduler-conf=/volcano.scheduler/volcano-scheduler.conf
                  --plugins-dir=plugins
                  --logtostderr=false
                  --log_dir=/var/log/mindx-dl/volcano-scheduler
                  --log_file=/var/log/mindx-dl/volcano-scheduler/volcano-scheduler.log
                  -v=2 2>&1"]
          imagePullPolicy: "IfNotPresent"
          resources:
            requests:
              memory: 10000Mi   # Change 4 GiB to 10000 MiB.
              cpu: 5500m
            limits:
              memory: 15000Mi   # Change 8 GiB to 15000 MiB.
              cpu: 5500m
...
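
GOMEMLIMIT and GOGC are Go runtime environment variables: GOMEMLIMIT sets a soft memory limit in bytes, and GOGC=off disables the regular garbage collection trigger so that collection is driven by that limit. The sketch below is an arithmetic check, assuming the scheduler binary is built with a Go version that supports GOMEMLIMIT (Go 1.19 or later). It shows that the soft limit of 15000000000 bytes stays below the 15000Mi container limit. The function names are hypothetical.

```shell
# Hypothetical helper: convert MiB to bytes.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper: print how many bytes the container memory limit
# exceeds GOMEMLIMIT by. A positive result means the Go soft limit is
# reached before the container limit.
# $1: GOMEMLIMIT in bytes, $2: container memory limit in MiB
gomemlimit_headroom() {
  echo $(( $(mib_to_bytes "$2") - $1 ))
}

# Examples:
#   mib_to_bytes 15000                       prints 15728640000
#   gomemlimit_headroom 15000000000 15000    prints 728640000
```

If you change the memory limit, keeping GOMEMLIMIT somewhat below it preserves this margin. The exact margin to use is a tuning decision that this document does not specify.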

On the management node, run the following command in the directory where the YAML file is stored to start Volcano:

kubectl apply -f volcano-v{version}.yaml

Startup example:

namespace/volcano-system created
namespace/volcano-monitoring created
configmap/volcano-scheduler-configmap created
serviceaccount/volcano-scheduler created
clusterrole.rbac.authorization.k8s.io/volcano-scheduler created
clusterrolebinding.rbac.authorization.k8s.io/volcano-scheduler-role created
deployment.apps/volcano-scheduler created
service/volcano-scheduler-service created
serviceaccount/volcano-controllers created
clusterrole.rbac.authorization.k8s.io/volcano-controllers created
clusterrolebinding.rbac.authorization.k8s.io/volcano-controllers-role created
deployment.apps/volcano-controllers created
customresourcedefinition.apiextensions.k8s.io/jobs.batch.volcano.sh created
customresourcedefinition.apiextensions.k8s.io/commands.bus.volcano.sh created
customresourcedefinition.apiextensions.k8s.io/podgroups.scheduling.volcano.sh created
customresourcedefinition.apiextensions.k8s.io/queues.scheduling.volcano.sh created
customresourcedefinition.apiextensions.k8s.io/numatopologies.nodeinfo.volcano.sh created

Query the component status.

kubectl get pod -n volcano-system

If Running is displayed in the command output, the component is started successfully.

NAME                                   READY   STATUS    RESTARTS   AGE
volcano-controllers-5cf8d788d5-qdpzq   1/1     Running   0          1m
volcano-scheduler-6cffd555c9-45k7c     1/1     Running   0          1m

If the pod status of Volcano is CrashLoopBackOff, rectify the fault by referring to After Volcano Is Manually Installed, the Pod Status Is CrashLoopBackOff.
If the volcano-scheduler pod (volcano-scheduler-6cffd555c9-45k7c in this example) is in the Running state but the scheduling is abnormal, rectify the fault by referring to Volcano Works Abnormally, and "Failed to get plugin" Is Displayed in the Log.
After the component is installed, if the pod status of the component is not Running, refer to Component pods Are Not in the Running State.
After the component is installed, if the pod status of the component is ContainerCreating, refer to Cluster Scheduling Component Pods Are in the ContainerCreating State.
If the component fails to be started, refer to Cluster Scheduling Components Fail to Start and "get sem errno =13" Is Displayed in Logs.
If the component is started successfully, but the corresponding pod cannot be found, refer to YAML File for Starting a Component Is Successfully Executed, But the pod Corresponding to the Component Is Not Displayed.
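
The status check above can be narrowed to the pods that need attention. The following is a minimal sketch with a hypothetical function name. It reads the output of kubectl get pod, including the header line, from standard input and prints every pod whose STATUS column is not Running.

```shell
# Hypothetical helper: list pods whose STATUS is not Running.
# Reads `kubectl get pod` output (with header) from standard input and
# returns non-zero if at least one such pod is found.
report_not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1 ": " $3; bad = 1 }
       END { exit bad + 0 }'
}

# Usage on the management node:
#   kubectl get pod -n volcano-system | report_not_running
```

An empty output with a zero return value means that all listed pods are Running. Otherwise, match the reported status against the troubleshooting references above.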

Parameters

**Table 4** volcano-scheduler startup parameters
Parameter	Type	Default Value	Description
--log-dir	String	None	Log directory. The default value in the component startup YAML file is /var/log/mindx-dl/volcano-scheduler.
--log-file	String	None	Log file name. The default value in the component startup YAML file is /var/log/mindx-dl/volcano-scheduler/volcano-scheduler.log. NOTE: Dumped files are named in the format of "volcano-scheduler.log-dump triggering time.gz", for example, volcano-scheduler.log-20230926.gz.
--scheduler-conf	String	/volcano.scheduler/volcano-scheduler.conf	Absolute path of the configuration file of the scheduling component.
--logtostderr	Bool	false	Whether to print logs to the standard error stream instead of writing them to log files. true: yes; false: no.
-v	Integer	2	Log output level. 1: error; 2: warning; 3: info; 4: debug.
--plugins-dir	String	plugins	Path for loading the scheduler plugin.
--version	Bool	false	Whether to query the volcano-scheduler binary version. true: queries the version. false: does not query the version.
--log_file_max_size	Integer	1800	Maximum size of a log file, in MB. NOTE: When the size of a log file exceeds the threshold, the log content is cleared.
--leader-elect	Bool	false	Whether to enable leader election, which selects the primary instance when multiple replicas are started.
--percentage-nodes-to-find	Integer	100	Percentage of the total nodes in a cluster that are selected as available nodes during job scheduling.

**Table 5** volcano-controller startup parameters
Parameter	Type	Default Value	Description
--log-dir	String	None	Log directory. The default value in the component startup YAML file is /var/log/mindx-dl/volcano-controller.
--log-file	String	None	Log file name. The default value in the component startup YAML file is /var/log/mindx-dl/volcano-controller/volcano-controller.log. NOTE: Dumped files are named in the format of "volcano-controller.log-dump triggering time.gz", for example, volcano-controller.log-20230926.gz.
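--logtostderr	Bool	false	Whether to print logs to the standard error stream instead of writing them to log files. true: yes; false: no.
-v	Integer	4	Log output level. 1: error; 2: warning; 3: info; 4: debug.
--version	Bool	false	Whether to query the volcano-controller binary version. true: queries the version; false: does not query the version.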
--log_file_max_size	Integer	1800	Maximum size of a log file, in MB. NOTE: When the size of a log file exceeds the threshold, the log content is cleared.

Volcano is open-source software. Only common startup parameters are listed here. For details about other parameters, see the description of the open-source software.
