Use on the CLI (Other Schedulers)

The procedure for using other schedulers on the CLI is the same as for Volcano; only the required job YAML files differ. Prepare and use the corresponding YAML files by referring to Use on the CLI (Volcano).

Procedure

Download YAML files from the cluster scheduling code repository.

**Table 1** YAML files of different hardware models
Job Type	Hardware Model	YAML File Name	How to Obtain
Jobs of Kubernetes or other schedulers	Atlas 200I SoC A1 core board	infer-310p-1usoc.yaml	Click here.
Jobs of Kubernetes or other schedulers	Inference nodes of other types	infer.yaml	Click here.

Upload the YAML file to any directory on the management node and modify the file content as required.

**Table 2** Parameters in the YAML file
Parameter	Value	Description
image	-	Inference image name. Change it based on your actual requirements. (It is the name of the image created in the image preparation section.)
replicas	Integer	Number of job replicas. Generally, the value is 1.
requests, limits	Inference server (equipped with Atlas 300I inference cards): huawei.com/Ascend310: number of processors; Atlas inference product in non-mixed insertion mode: huawei.com/Ascend310P: number of processors; Atlas inference product in mixed insertion mode: huawei.com/Ascend310P-V: number of processors, huawei.com/Ascend310P-VPro: number of processors, or huawei.com/Ascend310P-IPro: number of processors; Atlas 800I A2 inference server, A200I A2 Box heterogeneous component, or Atlas 800I A3 SuperPoD Server: huawei.com/Ascend910: number of processors. Example: huawei.com/Ascend310: 1	Type and number of requested NPUs. Change them as required. For requests and limits, the processor name and quantity must be the same.
(Optional) host-arch	ARM environment: huawei-arm; x86_64 environment: huawei-x86	Architecture of the node where an inference job is executed. Set this parameter as required. The Atlas 200I SoC A1 core board supports only huawei-arm.
servertype	soc	Server type. To schedule jobs to the Atlas 200I SoC A1 core board, add this parameter and mount the directory by referring to the infer-310p-1usoc.yaml file. This parameter is not required for other types of nodes.
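
The rule that requests and limits must name the same processor with the same quantity can be checked before a job is submitted. The following is a minimal sketch, not part of any Huawei or Kubernetes tool: the helper name check_npu_resources is hypothetical, and it assumes the resources section of a container has already been loaded into a Python dictionary.

```python
# Hypothetical pre-submission check; not part of any Huawei or Kubernetes tool.
NPU_PREFIX = "huawei.com/"


def check_npu_resources(resources):
    """Return a list of problems found in a container's `resources` mapping."""
    requests = {k: v for k, v in (resources.get("requests") or {}).items()
                if k.startswith(NPU_PREFIX)}
    limits = {k: v for k, v in (resources.get("limits") or {}).items()
              if k.startswith(NPU_PREFIX)}
    problems = []
    if not requests and not limits:
        problems.append("no huawei.com/* resource is requested")
    # Compare every NPU resource name that appears on either side.
    for name in sorted(set(requests) | set(limits)):
        if name not in requests:
            problems.append(f"{name} is set in limits but not in requests")
        elif name not in limits:
            problems.append(f"{name} is set in requests but not in limits")
        elif requests[name] != limits[name]:
            problems.append(
                f"{name}: requests={requests[name]} but limits={limits[name]}")
    return problems


if __name__ == "__main__":
    ok = {"requests": {"huawei.com/Ascend310P": 1},
          "limits": {"huawei.com/Ascend310P": 1}}
    bad = {"requests": {"huawei.com/Ascend310P": 1},
           "limits": {"huawei.com/Ascend310": 1}}
    print(check_npu_resources(ok))   # empty list: names and quantities agree
    print(check_npu_resources(bad))  # two problems: the names differ
```

An empty list means the two sections agree. Any other result should be fixed in the YAML file before the job is created.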

Select a YAML example as required and modify the file as follows.

**Table 3** Operation examples
Feature	Operation Reference
Full NPU scheduling	Creating a Single-Processor Job on Atlas Inference Products (non-Atlas 200I SoC A1 Core Board)
	Creating a Single-Processor Job on the Atlas 200I SoC A1 Core Board
	Creating a Single-Processor Job on the Atlas 800I A2 Inference Server
Static vNPU scheduling	Creating a Single-Processor Job on Atlas Inference Products (Non-Atlas 200I SoC A1 Core Board)

The following uses infer.yaml as an example to describe how to create a single-processor inference job in non-mixed insertion mode of the Atlas inference product (non-Atlas 200I SoC A1 core board).

```
apiVersion: batch/v1
kind: Job
metadata:
  name: resnetinfer1-1
spec:
  template:
    spec:
      nodeSelector:
        host-arch: huawei-arm    # (Optional) Set it as required.
      affinity:        # The job is not scheduled to the Atlas 200I SoC A1 core board.
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: servertype
                    operator: NotIn
                    values:
                      - soc
      containers:
      - image: ubuntu-infer:v1
...
        resources:
          requests:
            huawei.com/Ascend310P: 1
          limits:
            huawei.com/Ascend310P: 1
...
```

The following uses infer-310p-1usoc.yaml as an example to describe how to create a single-processor inference job on the Atlas 200I SoC A1 core board (non-mixed insertion mode).
```
apiVersion: batch/v1
kind: Job
metadata:
  name: resnetinfer1-1-1usoc
spec:
  template:
    spec:
      nodeSelector:
        host-arch: huawei-arm    # (Optional) Set it as required.
        servertype: soc          # The job is scheduled only to the Atlas 200I SoC A1 core board.
      containers:
      - image: ubuntu-infer:v1
...
        resources:
          requests:
            huawei.com/Ascend310P: 1
          limits:
            huawei.com/Ascend310P: 1
...
```
The directories and files mounted on Atlas 200I SoC A1 core board nodes differ from those mounted on other types of nodes. If a cluster contains Atlas 200I SoC A1 core board nodes and a job for Atlas inference products must not run on them, add the affinity field to the job YAML file, as in the infer.yaml example for non-mixed insertion mode. This prevents the job from being scheduled to nodes with the servertype=soc label and avoids inference failures.
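
The effect of the NotIn expression on the servertype label can be illustrated with a small simulation. This is only a sketch of the matching rule as generally documented for Kubernetes label selectors, where a node without the label also satisfies NotIn. The function matches_not_in and the node names are hypothetical and do not call any Kubernetes API.

```python
def matches_not_in(node_labels, key, values):
    """Mimic NotIn: match when the label is absent or its value is not listed."""
    return node_labels.get(key) not in values


# Hypothetical nodes and labels, for illustration only.
nodes = {
    "soc-node": {"host-arch": "huawei-arm", "servertype": "soc"},
    "infer-node": {"host-arch": "huawei-arm"},
}

# Keep only the nodes that the affinity rule would allow.
eligible = [name for name, labels in nodes.items()
            if matches_not_in(labels, "servertype", ["soc"])]
print(eligible)  # ['infer-node']
```

Under this rule, nodes that carry no servertype label at all remain eligible, so inference nodes of other types do not need an extra label.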

Refer to this configuration when using the full NPU scheduling feature. The following uses infer.yaml as an example to describe how to create a single-processor inference job on the Atlas 800I A2 inference server.

```
apiVersion: batch/v1
kind: Job
metadata:
  name: resnetinfer1-1
spec:
  template:
    spec:
      nodeSelector:
        host-arch: huawei-arm   # (Optional) Set it as required.
...
      containers:
      - image: ubuntu-infer:v1
...
        resources:
          requests:
            huawei.com/Ascend910: 1
          limits:
            huawei.com/Ascend910: 1
...
```

The following uses infer.yaml as an example to describe how to create a single-processor inference job using vNPUs of the Atlas inference product (non-Atlas 200I SoC A1 core board).

```
apiVersion: batch/v1
kind: Job
metadata:
  name: resnetinfer1-1
spec:
  template:
    spec:
      nodeSelector:
        host-arch: huawei-arm    # (Optional) Set it as required.
      containers:
      - image: ubuntu-infer:v1
...
        resources:
          requests:
            huawei.com/Ascend310P-2c: 1
          limits:
            huawei.com/Ascend310P-2c: 1
...
```
