YAML Parameters

The following table describes only the fields related to MindCluster in the YAML file of OME Serving Runtime.

**Table 1** YAML parameters
Parameter	Value	Description
schedulerName	volcano	Volcano is used as the scheduler.
(Optional) host-arch	Arm: huawei-arm; x86_64: huawei-x86	Architecture of the node where a training job is executed. Set this parameter as required. In a distributed training job, ensure that all nodes running the training job have the same architecture.
sp-block	Number of processors on logical SuperPoDs. The value must be an integer multiple of the number of processors on a node, and the total number of processors requested by prefill/decode instances must be an integer multiple of the value.	Cluster scheduling components divide logical SuperPoDs on physical SuperPoDs based on the division policy for affinity scheduling of training jobs. If this field is not specified, Volcano sets the size of the logical SuperPoD of a job to the total number of NPUs configured for the job during scheduling. For details, see UnifiedBus Interconnect Device Network Description. NOTE: This field can be used only on the Atlas 900 A3 SuperPoD.
huawei.com/schedule_minAvailable	Integer	Minimum number of replicas that can be scheduled by a job. This field must be specified in the Deployment scenario. Set this field to the number of effective replicas of the engine or decoder based on the prefill instance or decode instance to which the field belongs. This field does not need to be specified in other scenarios.
pod-rescheduling	on: enables pod-level rescheduling; any other value, or field not set: disables pod-level rescheduling.	With pod-level rescheduling, when a job becomes faulty, only the faulty pods are deleted rather than all job pods in the PodGroup, and the controller creates new pods for rescheduling. NOTE: For OME inference jobs, set this field to on so that MindCluster can reschedule the faulty prefill/decode instances.
accelerator-type	Atlas 800I A2 inference server: module-910b-8; Atlas 800I A3 SuperPoD Server: module-a3-16; Atlas 900 A3 SuperPoD: module-a3-16-super-pod	Set this parameter based on the type of the node where a training job is executed.
huawei.com/Ascend910	Atlas 800I A2 inference server: 8; Atlas 900 A3 SuperPoD/Atlas 800I A3 SuperPoD Server: 16	Number of required NPUs. Currently, only full-server scheduling is supported. Set the value to the actual number of used NPUs.
env[name==ASCEND_VISIBLE_DEVICES].valueFrom.fieldRef.fieldPath	metadata.annotations['huawei.com/Ascend910']; the resource name must match the actual processor type used in the environment.	Ascend Docker Runtime reads the value of this parameter and mounts NPUs of the corresponding type to the container. NOTE: This parameter applies only to the full NPU scheduling feature of the Volcano scheduler. If you use static vNPU scheduling or another scheduler, delete this parameter from the example YAML file.
fault-scheduling	grace	Enables graceful deletion for the job. The original pod is gracefully deleted first. If graceful deletion has not succeeded within 15 minutes, the pod is forcibly deleted.
	force	Enables forcible deletion for the job. The original pod is forcibly deleted.
	off	Rescheduling upon faults is disabled.
	None (no fault-scheduling field)	Rescheduling upon faults is disabled.
	Other values	Rescheduling upon faults is disabled.
fault-retry-times	fault-retry-times > 0	Number of unconditional retries on the service plane. To rectify service plane faults, you must set this field to a value greater than 0.
	None (no fault-retry-times field) or 0	Unconditional retry is not triggered, and Volcano does not delete the faulty pod after a service plane fault occurs.
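
The two divisibility rules stated for sp-block can be checked before a job is submitted. The following Python snippet is a minimal sketch of that arithmetic only. The function name `check_sp_block` and its parameter names are hypothetical and are not part of MindCluster or Volcano; the job sizes in the example are made up for illustration.

```python
from typing import List


def check_sp_block(sp_block: int, npus_per_node: int, total_npus: int) -> List[str]:
    """Return the sp-block rules that are violated (empty list if none).

    Hypothetical helper: it only encodes the two rules from the table,
    it does not call any MindCluster or Volcano API.
    """
    if sp_block <= 0 or npus_per_node <= 0 or total_npus <= 0:
        raise ValueError("all arguments must be positive integers")

    problems = []
    # Rule 1: sp-block must be an integer multiple of the NPUs on one node.
    if sp_block % npus_per_node != 0:
        problems.append("sp-block is not a multiple of the NPUs per node")
    # Rule 2: the total NPUs requested by the prefill/decode instances
    # must be an integer multiple of sp-block.
    if total_npus % sp_block != 0:
        problems.append("total requested NPUs is not a multiple of sp-block")
    return problems


if __name__ == "__main__":
    # Assumed example: 4 nodes with 16 NPUs each, 64 NPUs in total.
    print(check_sp_block(sp_block=32, npus_per_node=16, total_npus=64))
    print(check_sp_block(sp_block=24, npus_per_node=16, total_npus=64))
```

In this assumed example, an sp-block of 32 satisfies both rules, while 24 violates both because it is neither a multiple of 16 nor a divisor of 64.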

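The dotted path in the ASCEND_VISIBLE_DEVICES row is easier to read when expanded into the nested structure it describes. The following Python snippet is a sketch that builds that structure together with the huawei.com/Ascend910 resource entry and prints it as JSON, which is also valid YAML. The function name `npu_container_fragment` is hypothetical, setting requests and limits to the same value is an assumption based on general Kubernetes extended-resource behavior, and where this fragment sits inside the OME Serving Runtime YAML is not covered here.

```python
import json

# Resource name taken from the table; change it if the processor type differs.
RESOURCE_NAME = "huawei.com/Ascend910"


def npu_container_fragment(npu_count: int) -> dict:
    """Build the container-level fields from the table as a nested dict.

    Hypothetical helper for illustration only.
    """
    return {
        "resources": {
            # Assumption: requests and limits carry the same NPU count.
            "requests": {RESOURCE_NAME: npu_count},
            "limits": {RESOURCE_NAME: npu_count},
        },
        "env": [
            {
                "name": "ASCEND_VISIBLE_DEVICES",
                "valueFrom": {
                    "fieldRef": {
                        # Expanded form of the dotted path in the table row.
                        "fieldPath": "metadata.annotations['%s']" % RESOURCE_NAME
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # 16 NPUs corresponds to the Atlas 900 A3 SuperPoD row in the table.
    print(json.dumps(npu_container_fragment(16), indent=2))
```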